MOnAMi

Analyzing Privacy Threats in VR/AR Devices (2024)

As new sensors are added to VR headsets, more data can be collected. This introduces a new potential threat to user privacy. We focused on the feasibility of extracting personal information from eye-tracking data. To this end, we designed a preliminary user study focusing on the pupil response to audio stimuli. We tested a variety of machine learning models on the collected data to determine the feasibility of obtaining information such as the age or gender of the participant. Several of the experiments show promise for obtaining this information. We were able to extract with reasonable certainty whether caffeine had been consumed and the gender of the participant. This demonstrates the largely unrecognized threat that embedded sensors pose to users. Further studies are planned to verify the results.

Biological Nitrate-dependent Ammonium Oxidation In An Anammox Reactor (2023)

Vu, Gia-Khoa

Biological ammonium oxidation is a central component of the global nitrogen cycle. Given the enormous quantities of nitrogen of anthropogenic origin in the environment, the removal of reactive nitrogen is in the interest of the environment and of public health. This thesis investigates conditions for anaerobic ammonium oxidation with nitrate in an anammox reactor. Two laboratory reactors, containing only ammonium and nitrate as electron donor and acceptor, were operated and monitored for a total of 116 days. In addition, batch cultures were grown from cells of one reactor, and their gas composition was examined as a function of different properties. A range of analytical quantification methods was used, and it was shown that degradation takes place under these conditions. Current research on this reaction is sparse, which lends this bachelor's thesis its relevance.

Lead Scoring with Machine Learning (2023)

Binte Ayaz, Safa

This thesis investigates the efficacy of four machine learning algorithms, namely linear regression, decision tree, random forest, and neural network, in the task of lead scoring. Specifically, the study evaluates the performance of these algorithms using datasets without sampling, with random under-sampling, and with over-sampling using SMOTE. The performance of each algorithm is measured using various metrics, including accuracy, AUC-ROC, specificity, sensitivity, precision, recall, F1 score, and G-mean. The results indicate that models trained on the dataset without sampling achieved higher accuracy than those trained on the dataset with either random under-sampling or over-sampling using SMOTE. However, the neural network demonstrated remarkable results on each dataset compared to the other algorithms. These findings provide valuable insights into the effectiveness of machine learning algorithms for lead scoring tasks, particularly when using different sampling techniques. The findings of this study can aid lead management practitioners in selecting the most suitable algorithm and sampling technique for their needs. Furthermore, the study contributes to the literature by providing a comprehensive evaluation of the performance of machine learning algorithms for lead scoring tasks. This thesis has practical implications for businesses looking to improve their lead management practices, and future research could extend the analysis to other machine learning algorithms or more extensive datasets.
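Several of the metrics listed above follow directly from the four cells of a binary confusion matrix. The sketch below is illustrative only: the function name `imbalance_metrics` and the counts are hypothetical and are not taken from the thesis, and it assumes every denominator is non-zero.

```python
import math

def imbalance_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)   # also called recall or true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    g_mean = math.sqrt(sensitivity * specificity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "g_mean": g_mean,
    }

# Made-up counts for an imbalanced dataset (50 positives, 950 negatives).
metrics = imbalance_metrics(tp=30, fp=20, tn=930, fn=20)
```

With counts like these, accuracy is dominated by the majority class, whereas the G-mean drops when either class is poorly recognized. This is one reason accuracy alone can favor models trained without resampling.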

Building a trustless connection between the Lightning Network and EVM-compatible blockchains (2023)

Käbisch, Tim

As the cryptocurrency ecosystem rapidly grows, interoperability has become increasingly crucial, enabling assets and data to interact seamlessly across multiple chains. This work describes the concept and implementation of a trustless connection between the Bitcoin Lightning Network and EVM-compatible blockchains, allowing the transfer of assets between the two ecosystems. Establishing such a connection can significantly contribute to the growth of both ecosystems, as they can benefit from each other’s advantages and open up new possibilities.

Experimental Impacts of Temperature and Predator Diversity on Collembola Communities and Decomposition Rates (2023)

Jahn, Susanne

To investigate the effects of climate change on interactions within ecosystems, a microcosm experiment was conducted. The effects of temperature increase and predator diversity on Collembola communities and their decomposition rate were investigated. The predators used were mites and chilopods, whose predation effects on several response variables were analysed. These variables included Collembola abundance, biomass and body mass as well as basal respiration and microbial biomass carbon. The response variables were tested against the predictors in several models. Temperature showed high significance in interaction with mite abundance in almost all models. Furthermore, the results for basal respiration and microbial biomass carbon support the suggestion of a trophic cascade within the animal interaction.

Recurrent unit together with reinforcement learning for graph networks (2023)

Panda, Subhashree

Recently, deep neural network architectures designed to work on graph-structured data have been attracting attention and are being implemented in various domains and applications. However, although research on learning representations (feature embeddings) from graph data is picking up pace, constructing graphs from a dataset remains a challenge. The ability to map the data to lower dimensions makes the task easier and simplifies many operations. The graph neural network (GNN) is one of the novel neural network models attracting attention, as it achieves strong results in various applications such as recommender systems, social networks, chemical synthesis, and many more. This thesis discusses a distinctive approach to a fundamental task on graphs: node classification. The feature embedding for a node is aggregated by applying a recurrent neural network (RNN); a GNN model is then trained to classify a node with the help of the aggregated features, and Q-learning helps optimize the shape of the neural networks. The thesis starts with the working principles of the feedforward neural network and of recurrent units such as the simple RNN, long short-term memory (LSTM), and gated recurrent unit (GRU), followed by the concepts of reinforcement learning (RL) and the Q-learning algorithm. An overview of the fundamentals of graphs follows, and then the GNN architecture and workflow. Some basic GNN models are discussed briefly before the thesis turns to the technical implementation details, the output of the model, and a comparison with a few other models such as GraphSAGE and the graph attention network (GAT).

Design and Development of a User-Centric Distributed Ground Station Network Providing Bi-Directional Communications Access to Small Satellites (2023)

Vogt, Jesco

In the field of satellites, it is common practice to combine multiple ground stations into one network to increase communication times with satellites. This work focuses on TIM, an international academic collaborative project. Important criteria for this project are elaborated and used to evaluate existing ground station networks. The work concludes that no appropriate solution is available for this specific use case and proposes one. The proposed ground station network software is then elaborated and evaluated.

Assessment of COI and 16S for insect species identification to determine the diet of city bats (2023)

Ngoufack Djoumessi, Kevine Phalone

Despite its numerous benefits to human living conditions, urbanization has also negatively affected humans, their environment, and other organisms that share urban habitats with humans. While some wild animals avoid living in urban areas, others are more tolerant of urban habitats or even prefer them. There are more than 1,400 species of bats in the world. Therefore, they have the potential to contribute significantly to mammalian biodiversity in urban areas. Insectivorous bat species play a key role in agriculture by improving yields and reducing chemical pesticide costs. Using metabarcoding, it is possible to determine the prey consumed by these nocturnal mammals based on the DNA fragments in their fecal pellets. This study aimed to evaluate COI and 16S metabarcodes for insect species identification to determine the diet of metropolitan bats. For this purpose, COI and 16S metabarcodes were extracted, amplified, and sequenced from 65 bat feces samples collected in the Berlin metropolitan area. Following taxonomic annotation, I found that 73% of all identified insects could only be detected using the COI method, while 15% could only be recovered using the 16S approach. Just 12% of all detected insects were identified by both markers. According to this result, COI is more suitable for the taxonomic identification of insects from bat feces. However, given the bias of COI primers, it is recommended to use both markers for a more precise estimation of species diversity. Additionally, based on the insect species identified, I found that urban bats fed mainly on Diptera, Coleoptera, and Lepidoptera. The bat species Nyctalus noctula was the most abundant in the samples. Its diet analysis revealed that 91% of the samples contained the insect species Chironomus plumosus. Fourteen pest insect species were also found in its diet.

Investigation of Tutte polynomial of Graphs (2023)

Stan, Natalia

The Tutte polynomial is an important tool in graph theory. This paper provides an introduction to the two-variable polynomial using the spanning subgraph and rank-generating polynomials. The equivalency of definitions is shown in detail, as well as evaluations and derivatives. The properties and examples of the polynomial, i.e. the universality, coefficient relations, closed forms and recurrence relations are mentioned. Moreover, the thesis contains the connection between the dichromate and other significant polynomials.
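As a small illustration of the rank-generating definition mentioned above, the Tutte polynomial of a graph G = (V, E) can be written as the sum over all edge subsets A of (x-1)^(r(E)-r(A)) (y-1)^(|A|-r(A)), where r(A) is |V| minus the number of connected components of (V, A). The sketch below evaluates this sum numerically by brute force. The function names are hypothetical, and the enumeration is exponential in the number of edges, so it is only suitable for very small graphs.

```python
from itertools import combinations

def rank(num_vertices, edges):
    """Rank r(A) = |V| - number of connected components of (V, A)."""
    parent = list(range(num_vertices))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = num_vertices
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return num_vertices - components

def tutte(num_vertices, edges, x, y):
    """Evaluate the Tutte polynomial at (x, y) via the subset expansion."""
    full_rank = rank(num_vertices, edges)
    total = 0
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            r = rank(num_vertices, subset)
            total += (x - 1) ** (full_rank - r) * (y - 1) ** (k - r)
    return total

# Triangle K3, whose Tutte polynomial is x^2 + x + y.
triangle = [(0, 1), (1, 2), (0, 2)]
spanning_trees = tutte(3, triangle, 1, 1)   # T(1, 1) counts spanning trees
edge_subsets = tutte(3, triangle, 2, 2)     # T(2, 2) equals 2^|E|
```

The two evaluations at the end are standard identities and serve as a quick sanity check of the expansion.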

Analysis of the Forensic Preparation of Biometric Facial Features for Digital User Authentication (2023)

Halbe, Navina

Biometrics has become a popular method of securing access to data as it eliminates the need for users to remember a password. Although exploiting the vulnerabilities of biometric systems increased with their usage, these could also be helpful during criminal casework. This thesis aims to evaluate approaches to bypass electronic devices with forged faces to access data for law enforcement. Here, obtaining the necessary data in a timely manner is critical. However, unlocking the devices with a password can take several years with a brute force attack. Consequently, biometrics could be a quicker alternative for unlocking. Various approaches were examined to bypass current face recognition technologies. The first approaches included printing the user's face on regular paper and aimed to unlock devices performing face recognition in the visible spectrum. Further approaches consisted of printing the user's infrared image and creating three-dimensional masks to bypass devices performing face recognition in the near-infrared. Additionally, the underlying software responsible for face recognition was reverse-engineered to get information about its operation mode. The experiments demonstrate that forged faces can partly bypass face recognition and obtain secured data. Devices performing face recognition in the visible spectrum can be unlocked with a printed image of the user's face. Regarding devices with advanced near-infrared face recognition, only one could be bypassed with a three-dimensional face mask. In addition, its underlying software provided evidence about the demands of face recognition. Other devices under attack remained locked, and their software provided no clues.

At once post-processing of fluid structures in micro gravity experiment (2023)

Nguyen, Ngoc Uyen

The GeoFlow II experiment aims to replicate Earth’s core dynamics using a rotating spherical container with controlled temperature differences and simulated gravity. During the GeoFlow II campaign, a massive dataset of images was collected, necessitating an automated system for image processing and fluid flow visualization in the northern hemisphere of the spherical container. Building on this, we aim to detect the special structures appearing in the post-processed images. Recognizing YOLOv5’s proficiency in object detection, we apply the YOLOv5 model to this task.

Analysis of Attention Learning Schemes and the Design of an Attention Integration into Learning Vector Quantization (2023)

Davies, Thomas

Machine learning models for time series have always been a special topic of interest due to their unique data structure. Recently, the introduction of attention improved the capabilities of recurrent neural networks and transformers with respect to learning tasks such as machine translation. However, these models are usually subsymbolic architectures, making their inner workings hard to interpret without comprehensive tools. In contrast, interpretable models such as learning vector quantization have a more transparent decision process. This thesis tries to merge attention as a machine learning function with learning vector quantization to better handle time series data. A design for such a model is proposed and tested with a dataset used in connection with attention-based transformers. Although the proposed model did not yield the expected results, this work outlines improvements for further research on this approach.

Analysis of Continuous Learning Strategies at the Example of Replay-Based Text Classification (2023)

Demus, Christoph

Continuous learning is a research field that has grown significantly in recent years due to highly complex machine and deep learning models. Whereas static models need to be retrained entirely from scratch when new data become available, continuous models progressively adapt to new data, saving computational resources. In this context, this work analyzes parameters impacting replay-based continuous learning approaches using the example of a data-incremental text classification task with an MLP and an LSTM. In general, replay was found to improve the results compared to naive approaches, but it does not reach the performance of a static model. In particular, performance increased with more replayed examples, and the number of training iterations has a significant influence, as it can partly control the stability-plasticity trade-off. In contrast, balancing the buffer and the strategy for selecting examples to store in the replay buffer were found to have only a minor impact on the results in the present case.

Counterfactual Explanations vs Adversarial Examples: An Investigation on their Differences (2023)

Tiembukong, Amadeo Tunyi

In this work, we identify similarities between Adversarial Examples and Counterfactual Explanations, extend differences already stated in previous works to other fields of AI, such as dimensionality, transferability, etc., and try to observe these similarities and differences in different classifiers with tabular and image data. We note that this topic is an open discussion and that the work here isn’t definitive and can be further extended or modified in the future if new discoveries are made.

Designing Account and Token Architectures for Decentralized Social Economies on a Blockchain-based Chat Application (2023)

Hildebrandt, Felix

Traditional user management on the Internet has historically required individuals to give up control over their identities. In contrast, decentralized solutions promise to empower users and foster decentralized interactions. Over the last few years, the development of decentralized accounts and tokens has significantly increased, aiming at broader user adoption and shared social economies. This thesis delves into smart contract standards and social infrastructure for Ethereum-based blockchains to enable identity-based data exchange between abstracted blockchain accounts. In this regard, the standardization landscapes of account and social token developments were analyzed in-depth to form guidelines that allow users to retain complete control over their data and grant access selectively. Based on the evaluations, a pioneering Solidity standard is presented, natively integrating consensual restrictive on-chain assets for abstracted blockchain accounts. Further, the architecture of a decentralized messaging service has been defined to outline how new token and account concepts can be intertwined with efficient and minimal data-sharing principles to ensure security and privacy, while merging traditional server environments with global ledgers.

Exploration of immune responses and reticulocyte production in children with malaria (2023)

Basu, Arjit

This thesis comprehensively explores factors contributing to malaria-induced anemia and severe malarial anemia (SMA). The study utilizes a comprehensive dataset to investigate immunological interactions, genetic variations, and temporal dynamics. Findings highlight the complex interplay between immune markers, genetic traits, and cohort-specific influences. Notably, age, HIV status, and genetic variations emerge as crucial factors influencing anemia risk. The incorporation of Poisson regression models sheds light on the genetic underpinnings of SMA, emphasizing the need for personalized interventions. Overall, this research provides valuable insights into the multifaceted nature of malaria-induced complications, paving the way for further molecular investigations and targeted interventions.

Transmission Dynamics of a COVID-19 Outbreak in a Homeless Shelter in Chicago, Illinois, USA (2023)

Rudanov, Dmitrii

In this thesis, we implement, correct, and modify the compartmental model described in “Transmission Dynamics of Large Coronavirus Disease Outbreak in Homeless Shelter, Chicago, Illinois, USA, 2020”. Our objective is to engage in reading and understanding scientific literature, reproduce the results, and modify or generalize an existing mathematical model. We provide an overview of epidemiological models, focusing on simple compartmental SEIR models. We correct inaccuracies and misprints in the original implementation and use the limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm to fit the model’s parameters. Furthermore, we modify the model by introducing an additional compartment. The resulting model has a more intuitive interpretation and relies on fewer assumptions. We also perform the fitting process for this alternative model. Finally, we demonstrate the advantages of our modified implementations and discuss other possible approaches.
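For orientation, a simple compartmental SEIR model of the kind mentioned above can be integrated numerically in a few lines. The sketch below uses the textbook SEIR equations with a forward Euler step. It is not the model from the thesis or from the cited paper, and the parameter values and the function name `simulate_seir` are hypothetical.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the basic SEIR equations.

    beta: transmission rate, sigma: rate E -> I, gamma: rate I -> R.
    Returns a list of (S, E, I, R) tuples, one per time step.
    """
    n = s0 + e0 + i0 + r0
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    trajectory = [(s, e, i, r)]
    for _ in range(int(round(days / dt))):
        new_exposed = beta * s * i / n * dt      # S -> E
        new_infectious = sigma * e * dt          # E -> I
        new_removed = gamma * i * dt             # I -> R
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        trajectory.append((s, e, i, r))
    return trajectory

# Illustrative parameters only; they are not fitted to any outbreak data.
traj = simulate_seir(beta=0.5, sigma=0.2, gamma=0.1,
                     s0=999, e0=0, i0=1, r0=0, days=100)
```

To fit parameters, one would typically wrap such a simulation in a loss function against observed case counts and pass it to an optimizer. In SciPy, for example, `scipy.optimize.minimize` with `method="L-BFGS-B"` provides a limited-memory BFGS variant, though whether the thesis used this particular routine is not stated.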

Cognitive Bias-Powered GLVQ: Illogical Machines (2023)

Saruhan, Mert

In this paper, we conduct experiments to optimize the learning rates for the Generalized Learning Vector Quantization (GLVQ) model. Our approach leverages insights from cognitive science rooted in the profound intricacies of human thinking. Recognizing that human-like thinking has propelled humankind to its current state, we explore the applicability of cognitive science principles in enhancing machine learning. Prior research has demonstrated promising results when applying learning rate methods inspired by cognitive science to Learning Vector Quantization (LVQ) models. In this study, we extend this approach to GLVQ models. Specifically, we examine five distinct cognitive science-inspired GLVQ variants: Conditional Probability (CP), Dual Factor Heuristic (DFH), Middle Symmetry (MS), Loose Symmetry (LS), and Loose Symmetry with Rarity (LSR). Our experiments involve a comprehensive analysis of the performance of these cognitive science-derived learning rate techniques across various datasets, aiming to identify optimal settings and variants of cognitive science GLVQ model training. Through this research, we seek to unlock new avenues for enhancing the learning process in machine learning models by drawing inspiration from the rich complexities of human cognition. Keywords: machine learning, GLVQ, cognitive science, cognitive bias, learning rate optimization, optimizers, human-like learning, Conditional Probability (CP), Dual Factor Heuristic (DFH), Middle Symmetry (MS), Loose Symmetry (LS), Loose Symmetry with Rarity (LSR).

Comparison of Generalized Learning Vector Quantization learning dynamic and numerical stability regarding the Crammer-normalization and the Hein-normalization for adversarial robustness (2023)

Boa, Asirifi

Adversarial robustness of a nearest prototype classifier supports safe deployment in sensitive fields of use. Much research has been conducted on the robustness of artificial neural networks against adversarial attacks, whereas nearest prototype classifiers have not achieved similar successes. This thesis presents the learning dynamics and numerical stability regarding the Crammer-normalization and the Hein-normalization for adversarial robustness of nearest prototype classifiers. The results of the conducted experiments are reported and analyzed to assess the bounds given by Saralajew et al. and Hein et al. for adversarial robustness of nearest prototype classifiers.

Transfer Learning : Offset-Learning for Learning Vector Quantization (2022)

Devineni, Tejaswini

In machine learning, Learning Vector Quantization (LVQ) is well known as a supervised learning method. LVQ has been studied for generating optimal reference vectors because of its simple and fast learning algorithm [12]. In many classification tasks, different variants of LVQ are considered when training a model. This thesis discusses two variants of LVQ, Generalized Matrix Learning Vector Quantization (GMLVQ) and Generalized Tangent Learning Vector Quantization (GTLVQ). A transfer learning technique for different variants of LVQ is then implemented and visualized, and the results are compared across different datasets.

Genetic pollen analysis based on dual-metabarcoding to portray honey bee foraging in different agro-environments (2022)

Pannicke, Birgit

Pollinating insects are of vital importance for the ecosystem, and their drastic decline has severe consequences for the environment and humankind. Understanding their interaction networks is the first step towards preserving these highly complex systems. For that purpose, the following study describes a protocol for the investigation of honey bee pollen samples from different agro-environmental areas by DNA extraction, PCR amplification and nanopore sequencing of the barcode regions rbcL and ITS. It was shown that the most abundant species were classified consistently by both DNA barcodes, while species richness was enhanced by single-barcode detection of less abundant species. The analysis of the different landscape variables showed a decline in species richness, Shannon diversity index, and species evenness with increasing organic crop area. However, sampling was only carried out in August, and further investigations are suggested to obtain a more complete picture of honey bee foraging throughout the seasons.

Towards a Sequence Evolutionary Model of Influenza A Neuraminidase based on Evolutionary Coupling Analyses and Interpretable Machine Learning Models (2022)

Reuss, Lynn Vivian

Influenza A viruses are responsible for the outbreak of epidemics as well as pandemics worldwide. The surface protein neuraminidase of this virus is responsible, among other things, for the release of virions from the cell and is thus of interest in pharmacological research. The aim of this work is to gain knowledge about evolutionary changes in sequences of influenza A neuraminidase through different methods. First, EVcouplings is used with the goal of identifying evolutionary couplings within the protein sequences, but this analysis was unsuccessful, probably due to the great sequence length of neuraminidase. Second, the natural vector method is used for sequence embedding, in the hope of visualizing the sequential progression of the virus protein over time. Last, interpretable machine learning methods are applied to examine whether the data can be classified by year and whether the extracted information conforms to the results from the EVcouplings analysis. In addition to the class label year, other labels such as groups or subtypes are used in classification, with varying results. For balanced classes the machine learning models performed adequately, but this was not the case for imbalanced data. Groups and subtypes can be classified with high accuracy, which was not the case for years, continents or hosts. Finally, to identify the minimal number of features necessary for linear separation of neuraminidase group 1 subtypes, a logistic regression was performed, resulting in the identification of 15 combinations of nine amino acid frequencies. Since neither the sequence embedding nor the machine learning methods showed neuraminidase evolution over time, further research is necessary, for example with a focus on one subtype with balanced data.

Neuron models: Convergence and Stability Analyses of Hebb, Oja and BCM learning rules (2022)

Heusel, Tabea

This Bachelor thesis investigates the learning rules of the Hebbian, Oja and BCM neuron models for their convergence to, and the stability of, the fixed points. Existing research is presented in a structured manner using consistent notation. Hebbian learning is neither convergent nor stable. Oja learning converges to a stable fixed point, which is the eigenvector corresponding to the largest eigenvalue of the covariance matrix of the input data. BCM learning converges to a fixed point which is stable, when assuming a discrete distribution of orthogonal inputs that occur with equal probability. Hebbian learning can therefore not be used in further applications, where convergence to a stable fixed point is required. Furthermore, this Bachelor thesis came to the conclusion that determining the fixed points of the BCM learning rule explicitly involves extensive calculation and other methods for verifying the stability of possible fixed points should be considered.
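The convergence statement for Oja learning can be illustrated numerically. The sketch below iterates the averaged (expected) form of Oja's rule, w ← w + η (C w − (wᵀ C w) w), for a fixed covariance matrix C, rather than the sample-by-sample stochastic rule. The matrix, step size, and function name are chosen purely for illustration and do not come from the thesis.

```python
import math

def oja_averaged(cov, w, eta=0.05, steps=2000):
    """Iterate the averaged Oja rule w <- w + eta * (C w - (w^T C w) w)."""
    for _ in range(steps):
        cw = [sum(c * wj for c, wj in zip(row, w)) for row in cov]
        wcw = sum(wi * ci for wi, ci in zip(w, cw))
        w = [wi + eta * (ci - wcw * wi) for wi, ci in zip(w, cw)]
    return w

# Symmetric 2x2 covariance matrix with eigenvalues (5 +/- sqrt(5)) / 2.
C = [[3.0, 1.0], [1.0, 2.0]]
w_final = oja_averaged(C, [1.0, 0.0])
norm = math.sqrt(sum(wi * wi for wi in w_final))
```

For this matrix, the principal eigenvector has component ratio (√5 − 1)/2, so `w_final` should approach a unit-length vector in that direction. Dropping the normalizing term −(wᵀ C w) w gives the plain Hebbian update, whose norm grows without bound, which matches the non-convergence noted above.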

Instantaneous learning for Learning Vector Quantization variant based on reject options (2022)

Amjad, Hassan

Digital data is rising day by day and so is the need for intelligent, automated data processing in daily life. In addition to this, in machine learning, a secure and accurate way to classify data is important. This holds utmost importance in certain fields, e.g. in medical data analysis. Moreover, in order to avoid severe consequences, the accuracy and reliability of the classification are equally important. So if the classification is not reliable, instead of accepting the wrongly classified data point, it is better to reject such a data point. This can be done with the help of some strategies by using them on top of a trained model or including them directly in the objective function of the desired training model. We discuss such strategies and analyze the results on data sets in this thesis.
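One way to picture a reject strategy applied on top of a trained prototype-based model is a certainty threshold on the relative distance to the two closest prototypes with different labels. The sketch below is an assumption made for illustration: the abstract does not say which strategies the thesis uses, and the function name, threshold, and prototypes are hypothetical. It assumes prototypes of at least two classes and a data point that does not coincide with prototypes of two different classes.

```python
def classify_with_reject(x, prototypes, threshold=0.2):
    """Nearest-prototype classification with a distance-based reject option.

    prototypes: list of (vector, label) pairs.
    Returns the predicted label, or None if the decision is rejected.
    """
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, vec)), label)
        for vec, label in prototypes
    )
    d_best, label_best = dists[0]
    # Distance to the closest prototype that carries a different label.
    d_other = next(d for d, label in dists if label != label_best)
    # Certainty in [0, 1]: near 0 close to the decision border, near 1 far from it.
    certainty = (d_other - d_best) / (d_other + d_best)
    return label_best if certainty >= threshold else None

protos = [([0.0, 0.0], "a"), ([4.0, 0.0], "b")]
confident = classify_with_reject([0.5, 0.0], protos)   # clearly closer to "a"
uncertain = classify_with_reject([1.9, 0.0], protos)   # near the border
```

Points near the class border receive a low certainty and are rejected instead of being assigned a possibly wrong label. Raising the threshold trades coverage for reliability.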

Comparison of Variation Autoencoder with Autoencoder Ala Principe (2022)

Naredla, Santhosh

In the past few years, generative models have become an interesting topic in the field of Machine Learning (ML). The Variational Autoencoder (VAE) is one of the popular frameworks of generative models, based on the work of D.P. Kingma and M. Welling [6] [7]. As an alternative to the VAE, the authors in [12] proposed and implemented an Information Theoretic Learning (ITL) based Autoencoder. VAE and ITL Autoencoder are a combination of neural networks and probabilistic graphical models (PGM) [7]. In modern statistics it is difficult to compute the approximation of the probability densities. In this paper we make use of the Variational Inference (VI) technique from machine learning, which approximates the distributions through optimization. The closeness between the distributions is measured by information theoretic divergence measures such as the Kullback-Leibler, Euclidean and Cauchy-Schwarz divergences. In this thesis, we study theoretical and experimental results of two different frameworks of generative models which generate images of MNIST handwritten characters [8] and Yale face database B [3]. The results obtained show that the proposed VAE and ITL Autoencoder are capable of generating the underlying structure of the example datasets.
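To make the three divergence measures concrete, the sketch below evaluates them for discrete probability vectors. This is a simplification for illustration: the autoencoders discussed above operate on continuous densities, and the example distributions and function names here are made up. The Kullback-Leibler version assumes q is positive wherever p is positive.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def euclidean_divergence(p, q):
    """Squared Euclidean distance between the two probability vectors."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def cauchy_schwarz_divergence(p, q):
    """-log of the normalized inner product of p and q."""
    cross = sum(pi * qi for pi, qi in zip(p, q))
    norm_p = sum(pi * pi for pi in p)
    norm_q = sum(qi * qi for qi in q)
    return -math.log(cross / math.sqrt(norm_p * norm_q))

p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
kl_pq = kl_divergence(p, q)
ed_pq = euclidean_divergence(p, q)
cs_pq = cauchy_schwarz_divergence(p, q)
```

All three are zero when the distributions coincide and positive otherwise. Unlike the Kullback-Leibler divergence, the Euclidean and Cauchy-Schwarz divergences are symmetric in their arguments.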

Computational modelling of a photorespiratory bypass in C3 metabolism to establish a synthetic C4 cycle (2022)

Schönherr, Lisa

Studying and understanding the metabolism of plants is essential to better adapt them to future climate conditions. Computational models of plant metabolism can guide this process by providing a platform for fast and resource-saving in silico analyses. The reconstruction of these models can follow kinetic or stoichiometric approaches, with Flux Balance Analysis being one of the most common for stoichiometric models. Advances in metabolic modelling over the years include the increasing number of compartments, the automation of the reconstruction process, the modelling of plant-environment interactions and genetic variants, and temporally and spatially resolved models. In addition, there is a growing focus on introducing synthetic pathways in plants to increase their agricultural potential regarding yield, growth and nutritional value. One example is the β-hydroxyaspartate cycle (BHAC) to bypass photorespiration. After the implementation in a stoichiometric C3 plant model, in silico flux analyses can help to understand the resulting metabolic changes. When compared with in vivo experiments with BHAC plants, the metabolic model can reproduce most results, with exceptions regarding growth and oxaloacetate. To evaluate whether the BHAC is suitable to establish a synthetic C4 cycle, the pathway is implemented in a two-cell type model that is capable of running a C4 cycle. The results show that the BHAC is only beneficial under light limitation in the bundle sheath cell. An additional engineering target for improved performance of plants is malate synthase. This work serves as the basis for further analyses combining the different factors boosting the advantages of the BHAC and for in vivo experiments in C3 and C4 plants.

Embeddings for Product Data (2022)

Khan, Muhammad Usman

The e-commerce industry has grown exponentially in the last decade, with giants like Amazon, eBay, Aliexpress, and Walmart selling billions of products. Machine learning techniques can be used within the e-commerce domain to improve the overall customer journey on a platform and increase sales. Product data, specifically, can be used for various applications, such as product similarity, clustering, recommendation, and price estimation. For data from these products to be used for such applications, we have to perform feature engineering. The idea is to transform these products into feature vectors before training a machine learning model on them. In this thesis, we propose an approach to create representations for heterogeneous product data from Unite’s platform in the form of structured tabular records. These tables consist of attributes holding different kinds of information, ranging from product IDs to long descriptions. Our model combines popular deep learning approaches used in natural language processing to create numerical representations for all products. These are dense representations, that is, arrays or matrices containing mostly non-zero elements. To evaluate the quality of these feature vectors, we validate how well the similarities between products are captured by these dense representations. The evaluations are further divided into two categories. The first category directly compares the similarities between individual products. The second category uses these dense vectors as inputs to any of the above-mentioned applications and then evaluates the quality of the dense representation vectors based on the accuracy or performance of the defined application. As a result, we explain the impact of different steps within our model on the quality of these learned representations.

Comparison of numerical properties comparing Automated Derivatives (Autograd) and explicit derivatives (Gradients) for Prototype based models (2022)

Yendamuri, Venkata Sai Sandeep

Differentiation is ubiquitous in the field of mathematics and especially in the field of Machine learning for calculations in gradient-based models. Calculating gradients might be complex and require handling multiple variables. Supervised Learning Vector Quantization models, which are used for classification tasks, also use the Stochastic Gradient Descent method for optimizing their cost functions. There are various methods to calculate these gradients or derivatives, namely Manual Differentiation, Numeric Differentiation, Symbolic Differentiation, and Automatic Differentiation. In this thesis, we evaluate each of the methods mentioned earlier for calculating derivatives and also compare the use of these methods for the variants of Generalized Learning Vector Quantization algorithms.
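
As a rough illustration of the manual-versus-numeric comparison described above, the sketch below uses the commonly cited GLVQ classifier function mu = (d+ - d-) / (d+ + d-), where d+ and d- are the distances to the closest correct and incorrect prototypes. The function names and the step size h are illustrative assumptions and are not taken from the thesis.

```python
def mu(d_plus, d_minus):
    """GLVQ classifier function; negative values mean a correct classification."""
    return (d_plus - d_minus) / (d_plus + d_minus)


def mu_grad_manual(d_plus, d_minus):
    """Hand-derived partial derivatives of mu with respect to d_plus and d_minus."""
    denom = (d_plus + d_minus) ** 2
    return 2.0 * d_minus / denom, -2.0 * d_plus / denom


def mu_grad_numeric(d_plus, d_minus, h=1e-6):
    """Central finite differences; h is an assumed step size."""
    g_plus = (mu(d_plus + h, d_minus) - mu(d_plus - h, d_minus)) / (2.0 * h)
    g_minus = (mu(d_plus, d_minus + h) - mu(d_plus, d_minus - h)) / (2.0 * h)
    return g_plus, g_minus


if __name__ == "__main__":
    print(mu_grad_manual(1.5, 2.5))
    print(mu_grad_numeric(1.5, 2.5))
```

The two results should agree up to the truncation and rounding error of the finite-difference scheme, which is the kind of numerical property such a comparison would examine.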

Prototype-based learning for sequences in molecular biology (2022)

Voigt, Julius

Sequences are an important data structure in molecular biology, but unfortunately it is difficult for most machine learning algorithms to handle them, as they rely on vectorial data. Recent approaches include methods that rely on proximity data, such as median and relational Learning Vector Quantization. However, many of them are limited in the size of the data they are able to handle. A standard method to generate vectorial features for sequence data does not exist yet. Consequently, a way to make sequence data accessible to preferably interpretable machine learning algorithms needs to be found. This thesis will therefore investigate a new approach called the Sensor Response Principle, which is being adapted to protein sequences. Accordingly, sequence similarity is measured via pairwise sequence alignments with different sequence alignment algorithms and various substitution matrices. The measurements are then used as input for learning with the Generalized Learning Vector Quantization algorithm. A special focus lies on sequence length variability as it is suspected to affect the sequence alignment score and therefore the discriminative quality of the generated feature vectors. Specific datasets were generated from the Pfam protein family database to address this question. Further, the impact of the number of references and choice of substitution matrices is examined.

Analysis, comparison, and implementation of machine learning algorithms for optimization of customer data deduplication problems in enterprise CX programs (2022)

Niazi, Jawad

In this thesis, we focus on using machine learning to automate manual or rule-based processes for the deduplication task of the data integration process in an enterprise customer experience program. We study the underlying theoretical foundations of the most widely used machine learning algorithms, including logistic regression, random forests, extreme gradient boosting trees, support vector machines, and generalized matrix learning vector quantization. We then apply those algorithms to a real, private data set and compare their performance using standard evaluation metrics for classification, such as the confusion matrix, precision, recall, the area under the precision-recall curve, and the area under the Receiver Operating Characteristic curve.

Implementation strategies for battle AI and its usage to increase replayability in action RPG games that focus on PVE combat (2022)

Kießling, Tina

The aim of this bachelor thesis is to find out how the use of artificial intelligence, specifically the one used in combat situations, can increase the playing time or even the replay value of games in the action role-playing genre. Thereby, it focuses mainly on combat situations between a player and an artificial intelligence. To begin with, this bachelor thesis examines the action role-playing genre in order to find a suitable definition for it. Accordingly, action role-playing games involve titles that send the player on a hero’s journey-like adventure in which they must prove their skills in combat against virtual opponents. The greatest challenge of these real-time battles comes from the required quick reflexes, skill queries and hand-eye coordination. Next, six means of increasing the replayability of a game are explored: Experience and Nostalgia, Variety and Randomness, Goals and Completion, Difficulty, Learning, and Social Aspect. The paper then proceeds to give an explanation for the term Artificial Intelligence and examines the various methods used to create intelligent behavior as well as the general advancement of the research field. Special attention is given to the implementation methods of Finite State Machines and Behavior Trees, as they are the most widely used methods for creating behavioral patterns of virtual characters. Finally, a study conducted as part of the bachelor thesis is described, which compares a mathematically balanced artificial intelligence with a behaviorally balanced one in terms of game performance regarding the willingness of test subjects to purchase and play through the game as well as its replay value. The thesis concludes with the findings that while the behavioral approach is more promising than the mathematical approach, a combination of the two methods ultimately leads to the best outcome. 
Furthermore, the study shows that the use of artificial intelligence to individualize gaming experiences is promising for the future of the gaming industry.

GANs for Numerical Simulation with Predefined Conditions on Statistical Properties of Images (2022)

Nguyen, Ngoc Hien

Simulating complex physical systems involves solving nonlinear partial differential equations (PDEs), which can be very expensive. Generative Adversarial Networks (GANs) have recently been used to generate solutions to PDE-governed complex systems without having to solve them numerically. However, concerns have been raised that a standard GAN cannot capture some important physical and statistical properties of a complex PDE-governed system. Further concerns include difficult and unstable training, the noisy appearance of generated samples, and the lack of robust methods for assessing sample quality apart from visual examination. In this thesis, a standard GAN is trained on a data set of heat transfer images. We show that the generated data set can capture the true distribution of the training data with respect to both visual and statistical properties, specifically the vertical statistical profile. Furthermore, we construct a GAN model that can be conditioned on a variance-induced class label. We show that the variance threshold t = 0.01 yields a good conditional class label, such that the generated images comply with the given conditions at an accuracy rate of 96%.

Determining of Classification Label Security/Certainty (2022)

Otoo, Nana Abeka

Classification label security determines the extent to which predicted labels from classification results can be trusted. The uncertainty surrounding classification labels is resolved by the security with which the classification is made. Classification label security is therefore highly significant for decision-making whenever we encounter a classification task. This thesis investigates the determination of classification label security by utilizing the fuzzy probabilistic assignments of Fuzzy c-means. The investigation is accompanied by implementation, experimentation, visualization and documentation of the results.
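
To make the idea of fuzzy assignments concrete, the following sketch computes the standard Fuzzy c-means membership of a point with respect to fixed cluster centers. How the thesis turns memberships into a security value is not stated in the abstract, so reading the largest membership as a certainty score is only an assumption here, as are the function name and the fuzzifier default m = 2.

```python
from math import dist


def fcm_memberships(x, centers, m=2.0):
    """Standard Fuzzy c-means memberships of point x for given centers (fuzzifier m > 1)."""
    d = [dist(x, c) for c in centers]
    if 0.0 in d:
        # x coincides with one or more centers: share full membership among them.
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    inverse = [di ** -power for di in d]
    total = sum(inverse)
    return [v / total for v in inverse]


if __name__ == "__main__":
    u = fcm_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)])
    # Assumed reading: the largest membership acts as a certainty score for the label.
    print(u, max(u))
```

Memberships close to 1/c (for c clusters) would indicate an insecure label under this reading, while a membership near 1 would indicate a secure one.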

Algorithms For Discrete Logarithms (2022)

Barua, Saumik

Due to its intractability, the Discrete Logarithm Problem (DLP) has been widely used in the field of cryptography, and the security of several cryptosystems is based on the hardness of computing discrete logarithms. In this paper, we start with topics from Number Theory and Abstract Algebra, as they enable one to study the nature of discrete logarithms in a comprehensive way, and then we concentrate on the application and computation of discrete logarithms. Applications of discrete logarithms such as the Diffie-Hellman key exchange and the ElGamal signature scheme, and several attacks on the DLP such as the Baby-step Giant-step method and the Silver-Pohlig-Hellman algorithm, are analyzed. We also focus on elliptic curves and the discrete logarithm over an elliptic curve. Attacks on the elliptic curve discrete logarithm problem (ECDLP) are discussed. Moreover, the extension of several discrete logarithm-based protocols to elliptic curves, such as the elliptic curve digital signature algorithm (ECDSA), is also discussed.
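
The Baby-step Giant-step method mentioned above can be sketched as follows for the multiplicative group modulo a prime p. This is a textbook-style illustration rather than the thesis's own code. The function name is an assumption, and the three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
from math import isqrt


def baby_step_giant_step(g, h, p):
    """Return x with g**x % p == h % p, or None if none exists (p prime, g not divisible by p)."""
    m = isqrt(p - 1) + 1  # m * m >= p - 1, so every exponent is i * m + j with 0 <= i, j < m
    # Baby steps: remember the smallest j for each value g^j.
    table = {}
    value = 1
    for j in range(m):
        table.setdefault(value, j)
        value = value * g % p
    # Giant steps: compare h * g^(-i*m) against the table.
    factor = pow(g, -m, p)
    gamma = h % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % p
    return None


if __name__ == "__main__":
    print(baby_step_giant_step(3, 13, 17))
```

The method trades memory for time: it stores about sqrt(p) table entries and performs about sqrt(p) multiplications, instead of trying up to p - 1 exponents one by one.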

Mathematical Aspects and Challenges of the Algorand Blockchain (2022)

Bafna, Mehul

With the growing market of cryptocurrencies, blockchain is becoming central to various research areas relevant from a mathematical and cryptographic point of view. Moreover, it is capable of transforming the traditional methods involving centralized network operations into decentralized peer-to-peer functionalities. At the same time, it provides an alternative to digital payments in a robust and tamperproof manner by adding the element of cryptography, consequently making it traversable for an individual who is a part of the blockchain network. Furthermore, for a blockchain to be optimal and efficient, it must handle the blockchain trilemma of security, decentralization, and scalability constraints in an effective manner. Algorand, a blockchain cryptocurrency protocol intended to solve blockchain’s trilemma, has been studied and discussed. It is a permissionless (public) blockchain protocol and uses pure proof of stake as its consensus mechanism.

Probabilistic Micropayments (2022)

Amirian, Shaghik

Probabilistic micropayments are an important cryptographic research topic in electronic commerce. They merit research aimed at efficient algorithms with low transaction costs and high computational speed. To delve into the topic, it is vital to examine cryptographic preliminaries such as hash functions and digital signatures. This thesis investigates the important probabilistic methods based on a centralized or decentralized network. First, centralized network methods such as lottery-based tickets, PayWord, coin-flipping, and MR2 are described, and an approach based on blind signatures is also discussed. Then, decentralized network methods such as MICROPAY3, a transferable scheme on the blockchain network, along with an efficient model for cryptocurrencies, are explained. We then compare the different probabilistic micropayment methods and address their drawbacks with a new technique. To put the results of the theoretical analysis of the different methods into context, we analyze the attacks that reduce the security and, therefore, the efficiency of the system. In particular, we discuss various methods for detecting double-spending and eclipse attacks.

Fermat’s Little Theorem : Proofs, Generalizations and Applications (2022)

Chilkuri, Deepthi Reddy

Fermat proposed Fermat’s little theorem in 1640, but a proof was not officially published until 1736. In this thesis, we mainly focus on different proofs of Fermat’s little theorem, such as the combinatorial proof by counting necklaces, multinomial proofs, the proof by modular arithmetic, the dynamical systems proof, and the group theory proof. We also concentrate on the generalizations of Fermat’s little theorem given by Euler and Laplace. Euler was the first to publish a proof of Fermat’s little theorem, and we go through three different proofs that he gave. The theorem has many applications in mathematics and cryptography. We focus on its applications in cryptography, such as primality testing and public-key cryptography. A primality test is used to determine whether a given number n is prime or composite. In this thesis, we concentrate on the Fermat primality test and the Miller-Rabin primality test, which is an extension of the Fermat primality test. We also discuss the most widely used public-key cryptosystem, i.e., the RSA algorithm, named after its developers R. Rivest, A. Shamir, and L. Adleman. The algorithm was published in 1978 and depends heavily on Fermat’s little theorem.
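
The relationship between the two primality tests named above can be illustrated with a short sketch. The function names and the fixed choice of bases 2, 3, 5 and 7 are assumptions made for this example. That choice is known to classify every n below 3,215,031,751 correctly, while for larger n a pass is only strong evidence of primality.

```python
def fermat_check(n, a):
    """True if n passes the Fermat test to base a, i.e. a^(n-1) = 1 (mod n)."""
    return pow(a, n - 1, n) == 1


def miller_rabin(n):
    """Miller-Rabin test with the fixed bases 2, 3, 5, 7 (an assumption of this sketch)."""
    bases = (2, 3, 5, 7)
    if n < 2:
        return False
    for p in bases:
        if n % p == 0:
            return n == p  # handles small n and obvious composites
    # Write n - 1 = 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


if __name__ == "__main__":
    print(fermat_check(2047, 2), miller_rabin(2047))
```

The composite 2047 = 23 * 89 passes the Fermat test to base 2, but the Miller-Rabin test with the additional bases rejects it, which shows why the stronger test is preferred in practice.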

Isolated Vertices in Random Subgraphs (2021)

Das, Meghodipa

A classical topic in the theory of random graphs is the probability of at least one isolated vertex in a given random graph. Isolated nodes have a large impact on social networks, which can be modeled by random graphs. We present the distribution of the number of isolated vertices using the probability generating function. We discuss the relationship between isolated edges and the extended cut and extended matching polynomials using the principle of inclusion-exclusion. We introduce an algorithm based on colored graphs for general graphs and apply it to the components of a graph as well. Finally, we implement the idea on special classes of graphs such as cycles, bipartite graphs, paths, and others. We discuss a recursive procedure based on the analogous coloring rules for ladder and fan graphs.

Introducing natural adversarial observations to a Deep Reinforcement Learning agent for Atari Games (2021)

Hanfeld, Pia

Deep Learning methods are known to be vulnerable to adversarial attacks. Since Deep Reinforcement Learning agents are based on these methods, they are prone to tiny input data changes. Three methods for adversarial example generation will be introduced and applied to agents trained to play Atari games. The attacks target either single inputs or can be applied universally to all possible inputs of the agents. They were able to successfully shift the predictions towards a single action or to lower the agent’s confidence in certain actions, respectively. All proposed methods had a severe impact on the agent’s performance while producing invisible adversarial perturbations. Since natural-looking adversarial observations should be completely hidden from a human evaluator, the negative impact on the performance of the agents should additionally be undetectable. Several variants of the proposed methods were tested to fulfil all posed criteria. Overall, seven generated observations for two of three Atari games are classified as natural-looking adversarial observations.

Characterization of metal ion dependent binding and folding of a ribosomal tertiary contact by optical spectroscopy (2021)

Winkler, Anne Katrin

We investigate the folding and thermodynamic stability of a tertiary contact of baker's yeast ribosomal ribonucleic acid (rRNA), which is supposed to be essential for the maturation process of ribosomes in eukaryotes at lower temperatures [1]. Ribosomes are cellular machines essential for all living organisms. RNA is at the center of these machines and responsible for the translation of genetic information into proteins [2,3]. The rRNA tertiary contact of interest was discovered only recently in Zurich by the research group of Vikram Govind Panse. Gerhardy et al. [1] showed in vitro that, within the 60S pre-ribosome under defined metal ion concentrations, the tertiary contact becomes visible between a GAAA-tetraloop and a kissing loop motif. Our aim is now to understand this RNA structure, especially the formation of the rRNA tertiary contact, in terms of thermodynamics and kinetics under various experimental conditions, such as temperature and the metal ion concentrations of K(I), Na(I) and Mg(II). To this end, we use optical spectroscopy methods such as UV/VIS spectroscopy and ensemble Förster or Fluorescence Resonance Energy Transfer (FRET) folding studies. Our findings will help to further characterize this newly discovered ribosomal RNA contact and to elucidate its function within the ribosomal maturation process.

Properties of Series-Parallel (SP-Graphs) (2021)

Asante, Shadrach

Several algorithms have been proposed for testing series-parallel graphs in linear time. We present alternative algorithms for testing series-parallel graphs and for computing their tree decompositions and independence number when the input is an undirected biconnected series-parallel graph. These algorithms run in (approximately) linear time.

VQ-VAE with Neural Gas and Fuzzy c-Means (2021)

Badran, Yahya

VQ-VAE is a successful generative model which can perform lossy compression. It combines deep learning with vector quantization to achieve a discrete compressed representation of the data. We explore using different vector quantization techniques with VQ-VAE, mainly neural gas and fuzzy c-means. Moreover, VQ-VAE consists of a non-differentiable discrete mapping which we will explore and propose changes to the original VQ-VAE loss to fit the alternative vector quantization techniques.

A process to classify exhaled breath by using ion mobility spectra (2021)

Schubert, Ronny

There are multiple ways to gain information about an individual and their health status, but an increasingly popular field in medicine is the analysis of human breath, which carries a lot of information about metabolic processes within the individual's body. The information in exhaled breath consists of volatile (organic) compounds (VOCs). These VOCs are products of metabolic processes within the individual's body and thus might be indicators of diseases disturbing those processes. The compounds are detected by mass-spectrometric (MS) or ion-mobility spectrometric (IMS) techniques, so their analysis is not limited to exhaled breath. The resulting data is spectral data, capturing concentrations of the VOCs indirectly through intensities. About 3000 VOCs [1] have already been identified in human exhaled breath. The number of research papers on VOC analysis and detection has risen almost steadily over the last decade [1]. Furthermore, the technique used to identify VOCs could also be used to capture biomarkers from alien species within the individual's body. Extracting VOCs from an individual can be done by non-invasive or minimally invasive techniques. However, the manual identification of VOCs and biomarkers related to a certain disease or infection is not feasible due to the complexity of the sample and often unknown metabolic products, so automated techniques are needed. [1–4] To establish breath analysis as a diagnostic tool, machine learning methods could be used. Machine learning has become a popular and common technique when dealing with medical data because it allows rapid analysis. Taking this advantage into account, breath analysis using machine learning could become the model of choice for diagnosis, keeping in mind that conventional methods are laboratory-based and thus sometimes need several days to identify the organism when trying to detect a bacterial infection. [5]

Detection and Classification of bluetooth- and WiFi-devices as assistance in IT forensic interventions (2021)

Voigt, Leon

The objective of this Bachelor Project is the creation of a tool that supports forensic investigators during IT forensic interventions. It uses Kismet as the base program and adds functionality to it via the plugin interface. The installation of the plugin and how it works are explained, and a recommendation on how to use it is given. To convey the underlying basics, an introduction to WLAN and Bluetooth is given. The tests that were performed with the new plugin are described along with their results. It is then briefly discussed why the tool is applicable for locating Wi-Fi devices, especially access points, but not Bluetooth devices. Based on this, a few ideas on how to improve the tool and what can be researched further in this area are provided.

Development of a genetic biomonitoring test for investigating plant-pollinator interactions (2021)

Prudnikow, Lisa Carolina

In this work, a protocol for portable nanopore sequencing of DNA from pollen collected from honey bees, bumble bees, and wild bees was developed. DNA metabarcoding is applied to identify genera within the mixed DNA samples. The DNA extraction and ITS and ITS2 PCR parameters tested for this purpose were applied to the collected pollen sample and the amplicons were then decoded using the Flongle sequencer adapter from Oxford Nanopore Technologies. It is shown that the main pollinator resources at the different sites can be identified in percentage proportions. The protocol generated in this study can be used for further ecological questions.

Reconstruction of Disrupted Online Time Series Data in Measurements for Fluctuating Production of Energy (2021)

Lodhi, Hasan

Over the past few years, wind and solar power plants have increasingly contributed to energy production. However, due to fluctuating energy sources, the energy production data contain disruptions. Such disrupted data lead to poor prediction performance, so the disrupted values need to be replaced by estimates. In this thesis, we provide a comparative study on estimating disrupted online data based on the data of similar groups of power plants. We apply three estimation techniques, namely mean, interpolation, and k-nearest neighbor (KNN), to estimate the disruptions in the training data. We then apply four clustering algorithms, namely k-means, neural gas, hierarchical agglomerative clustering, and affinity propagation, with two similarity measures, namely Euclidean distance and dynamic time warping (DTW), to form groups of power plants and compare the results. Experimental results show that when KNN estimation is applied to the data, and neural gas and agglomerative clustering with DTW are used to cluster the data, the cluster quality scores and execution time are better than those of the other combinations. We therefore choose KNN estimation to reconstruct the disrupted online data within each group of similar power plants.

Generation of transgenic zebrafish lines using CRISPR/Cas9 technology (2021)

Woidtke, Rachel

In this work, a transgenic zebrafish line that expresses the fluorophore dsRed under the endogenous zebrafish cochlin promoter is to be established using the CRISPR/Cas9 system. dsRed was cloned into a pBluescript vector, followed by the cloning of the cochlin locus into this vector. This bait construct was then to be microinjected into wild-type AB zebrafish embryos. The microinjection of Cas9 mRNA, single guide RNA and a bait construct was practiced with the tyrosinase gene, which was disrupted using CRISPR/Cas9.

Consideration of Different Variants of Large Margin Learning Vector Quantization (2021)

Maheshwari, Avinash

In machine learning, Learning Vector Quantization (LVQ) is well known as a supervised vector quantization method. LVQ has been studied for generating optimal reference vectors because of its simple and fast learning algorithm [2]. In many classification tasks, different variants are considered while training a model, and considering large margin variants of LVQ helps to obtain significant results [20]. Large margin LVQ (LMLVQ) aims to maximize the distance between the decision hyperplane and the data points. In this thesis, a comparison of different variants of Generalized Learning Vector Quantization (GLVQ) and large margin LVQ is presented, along with visualization, implementation and experimental results.

Introduction to Linear Codes & Binary Hamming Codes (2021)

Soufi, Rasem

Since its foundation as an application of algebra, coding theory has been steadily gaining importance. For instance, any communication system needs the concepts of coding theory to function efficiently. In this thesis, the reader will find an introductory explanation of linear codes and binary Hamming codes, including some of the algebraic tools used in their applications. All the described software applications are verified with SageMath 9.0 on Hochschule Mittweida’s JupyterHub.
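
As a small illustration of a binary Hamming code, the sketch below implements encoding and single-error correction for the [7,4] code in plain Python. It is not the SageMath code from the thesis. The bit layout (parity bits at positions 1, 2 and 4) and the function names are assumptions made for this example.

```python
# Parity-check matrix H: column i (1-based) is the binary expansion of i, lowest bit in row 0.
H = [[(i >> b) & 1 for i in range(1, 8)] for b in range(3)]


def syndrome(word):
    """Return H * word over GF(2) as a list of three bits."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]


def encode(data):
    """Encode 4 data bits; data sits at positions 3, 5, 6, 7 and parity at 1, 2, 4."""
    d1, d2, d3, d4 = data
    p1 = (d1 + d2 + d4) % 2
    p2 = (d1 + d3 + d4) % 2
    p4 = (d2 + d3 + d4) % 2
    return [p1, p2, d1, p4, d2, d3, d4]


def correct(word):
    """Correct at most one flipped bit; the syndrome spells the error position in binary."""
    s = syndrome(word)
    position = s[0] + 2 * s[1] + 4 * s[2]  # 0 means no error detected
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed


if __name__ == "__main__":
    codeword = encode([1, 0, 1, 1])
    received = list(codeword)
    received[4] ^= 1  # flip position 5 to simulate a transmission error
    print(codeword, received, correct(received))
```

Because every nonzero column of H is distinct, each single-bit error produces a unique syndrome, which is what allows the code to correct one error per 7-bit block.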

Agile Game Production : progress Tracking through Process Management – Agile Methodologies enhanced by BPMN Processes (2021)

Abraham, Tessa

The games industry has grown significantly over the last 30 years. Projects are getting bigger and more expensive, making it essential to plan, structure and track them more efficiently. The growth of projects has increased the administrative workload for producers, project managers and leads, as they have to monitor and control the progress of the project in order to keep a permanent overview of it. This is often accompanied by a lack of insight into the project and of basic communication within the team. Therefore, the goal of this thesis is to enhance conventional project management methods with process structures that occur frequently in game development. This thesis initially elaborates on what project management in the game industry actually is: Here, methods are considered, especially agile methods and progress tracking practices, which were created for software development and have become a standard in game development. Subsequently, an example is used to demonstrate how process management can function within the development of video games. Based on this, the ideal is depicted, which is implemented and used in a tool at the German games studio KING Art GmbH. This ideal is compared with expert interviews in order to verify its general validity in the industry. By integrating process structures, the administrative effort can be reduced and communication within game development can be simplified, while the current project status can be permanently presented. This benefits both project management and leads, as well as the entire team. Further application tests of this theory would have to be organized to check scalability and to draw comparisons to other applications.
