VQ-VAE is a successful generative model that performs lossy compression. It combines deep learning with vector quantization to obtain a discrete compressed representation of the data. We explore using alternative vector quantization techniques with VQ-VAE, mainly neural gas and fuzzy c-means. Moreover, VQ-VAE contains a non-differentiable discrete mapping, which we examine, and we propose changes to the original VQ-VAE loss to fit the alternative vector quantization techniques.
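The discrete mapping the abstract refers to can be made concrete with a minimal sketch of the VQ-VAE quantization step: each encoder output is snapped to its nearest codebook vector. The names (`quantize`, `codebook`) and the toy 2-D codebook are illustrative, not from the thesis; in the original VQ-VAE, gradients are passed through this non-differentiable lookup with a straight-through estimator.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(z, codebook):
    """Return (index, vector) of the codebook entry nearest to z."""
    idx = min(range(len(codebook)), key=lambda k: sq_dist(z, codebook[k]))
    return idx, codebook[idx]

# Toy codebook; the encoder output [0.9, 1.1] maps to entry 1.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
idx, e = quantize([0.9, 1.1], codebook)
```

Neural gas or fuzzy c-means would change how this codebook is learned and how assignments are made, but the nearest-neighbour lookup above is the baseline they replace.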
This Master's thesis evaluates how well deep learning models are suited to detecting toxicity in digital spaces. To this end, the Transformer architecture is adapted across several pre-trainings based on BERT, DistilBERT, RoBERTa, and GPT-2, using the GermEval datasets from 2018, 2019, and 2021, which carry binary toxicity annotations. The models are fine-tuned both with supervised learning and with semi-supervised learning via a GAN. The program code used is provided in the appendix of this thesis.
Fine-tuning via GAN is an unusual approach to automated NLP tasks. As a result of this thesis, its effectiveness on binary text classification tasks in the German language domain can be confirmed.
At the time of retrieval, each online source was saved to a single HTML file using the Firefox add-on "SingleFile". The HTML markup as well as the media files, stylesheets, and script files are stored in compressed form within that file. Each online source was registered with woleet.io during the saving process, so that the integrity of the HTML file can be verified later. To this end, Woleet stores the signature and timestamp of a file within the Bitcoin blockchain. The integrity of a file can be checked at gildas-lormeau.github.io/singlefile-woleet/index.html.
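Anchoring services like Woleet rest on standard cryptographic hashing: the blockchain stores a fingerprint, and verification means recomputing it. As a rough local illustration (not the thesis's actual tooling or Woleet's API), a file's SHA-256 fingerprint can be computed like this:

```python
import hashlib

def file_sha256(path):
    """Compute the SHA-256 hex digest of a file, reading it in chunks
    so that arbitrarily large archives fit in constant memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()
```

If the digest matches the anchored one, the saved HTML file is byte-for-byte unchanged since registration.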
Introducing natural adversarial observations to a Deep Reinforcement Learning agent for Atari Games
(2021)
Deep learning methods are known to be vulnerable to adversarial attacks. Since deep reinforcement learning agents are based on these methods, they are likewise susceptible to tiny changes in their input data. Three methods for generating adversarial examples are introduced and applied to agents trained to play Atari games. The attacks either target single inputs or can be applied universally to all possible inputs of an agent. They were able to shift the predictions towards a single action or to lower the agent's confidence in certain actions, respectively. All proposed methods had a severe impact on the agents' performance while producing imperceptible adversarial perturbations. Since natural-looking adversarial observations should be completely hidden from a human evaluator, the negative impact on the agents' performance should additionally be undetectable. Several variants of the proposed methods were tested against all posed criteria. Overall, seven generated observations for two of the three Atari games are classified as natural-looking adversarial observations.
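The abstract does not spell out the three attack methods, but the classic template for such imperceptible perturbations is the fast gradient sign method (FGSM): nudge every input element by a small epsilon in the direction that increases the loss. A minimal sketch on a flattened observation, with all names illustrative:

```python
def sign(x):
    """Sign of x as -1, 0, or 1."""
    return (x > 0) - (x < 0)

def fgsm(obs, grad, eps):
    """FGSM-style step: perturb each element of the observation by
    +/- eps according to the sign of the loss gradient. The result
    differs from the original by at most eps per element."""
    return [o + eps * sign(g) for o, g in zip(obs, grad)]

# Toy flattened observation and a hypothetical loss gradient.
obs = [0.5, 0.2, 0.8]
grad = [0.3, -0.1, 0.0]
adv = fgsm(obs, grad, 0.01)  # each element nudged by at most 0.01
</```

A universal variant would compute one perturbation that degrades the agent across many observations rather than recomputing `grad` per input.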
Embeddings for Product Data
(2022)
The e-commerce industry has grown exponentially in the last decade, with giants like Amazon, eBay, AliExpress, and Walmart selling billions of products. Machine learning techniques can be used within the e-commerce domain to improve the overall customer journey on a platform and increase sales. Product data in particular can be used for various applications, such as product similarity, clustering, recommendation, and price estimation. For this product data to be usable in such applications, we have to perform feature engineering: the idea is to transform the products into feature vectors before training a machine learning model on them. In this thesis, we propose an approach to create representations for heterogeneous product data from Unite's platform in the form of structured tabular records. These tables consist of attributes carrying different information, ranging from product IDs to long descriptions. Our model combines popular deep learning approaches from natural language processing to create numerical representations with mostly non-zero elements, known as dense representations, for all products. To evaluate the quality of these feature vectors, we validate how well the similarities between products are captured by the dense representations. The evaluation is divided into two categories. The first category directly compares the similarities between individual products. The second category uses the dense vectors as inputs to one of the above-mentioned applications and evaluates their quality based on the accuracy or performance of that application. As a result, we explain the impact of the different steps within our model on the quality of the learned representations.
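The standard way to compare dense embeddings like these, and a plausible building block of the first evaluation category, is cosine similarity: the angle between two vectors, ignoring their magnitude. A minimal sketch (the thesis may use a different similarity measure):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two dense vectors: 1.0 for parallel
    vectors, 0.0 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Two products whose embeddings point in nearly the same direction would then be ranked as highly similar regardless of vector length.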
In this thesis, two novel methods for removing undesired background illumination are developed: a wavelet-analysis-based approach and an enhancement of a deep learning method. These methods are compared with conventional methods using real confocal microscopy images and synthetically generated microscopy images. The synthetic images were created with a generator introduced in this thesis.
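Neither the wavelet nor the deep learning method is reproduced here, but the shared idea behind background-illumination removal — estimate a slowly varying background and subtract it from the signal — can be illustrated with a 1-D moving-average toy (all names illustrative; real methods operate on 2-D images with far more careful estimators):

```python
def estimate_background(signal, window=5):
    """Crude background estimate: a centered moving average, which keeps
    slow trends and smooths out fine structure."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def remove_background(signal, window=5):
    """Subtract the estimated background, leaving only fine detail."""
    return [s - b for s, b in zip(signal, estimate_background(signal, window))]
```

A wavelet approach replaces the moving average with coarse-scale wavelet coefficients, which separate background from detail more sharply.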
The goal of this thesis is to evaluate the classification capability of an MVCNN approach on the subproblem of classifying procedurally generated, idealized representations of OCT scans. To this end, a tool for creating such scenes is developed, along with an algorithm for computing the volume of intersecting meshes, which is used for the automatic labeling of these scenes.
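The intersecting-mesh volume algorithm is not detailed in the abstract, but the standard primitive such algorithms build on is the signed volume of a closed triangle mesh via the divergence theorem: sum the signed tetrahedra spanned by the origin and each outward-oriented face. A sketch (names illustrative), checked on a unit right tetrahedron of volume 1/6:

```python
def cross(a, b):
    """Cross product of two 3-D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mesh_volume(vertices, faces):
    """Signed volume of a closed, outward-oriented triangle mesh:
    one sixth of the sum of scalar triple products over all faces."""
    return sum(dot(vertices[i], cross(vertices[j], vertices[k]))
               for i, j, k in faces) / 6.0

# Unit right tetrahedron, faces wound so normals point outward.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
```

Computing the volume of the *intersection* of two meshes additionally requires clipping one mesh against the other before applying this formula.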
Drought is one of the most common and dangerous threats plants have to face, costing the global agricultural sector billions of dollars every year and leading to the loss of tons of harvest. Until people drastically reduce their consumption of animal products or cellular agriculture comes of age, more and more crops will need to be produced to sustain the ever-growing human population. Even then, as more areas on Earth become prone to drought due to climate change, we may still have to find or breed plant varieties better suited to grow and prosper in these changing environments.
Plants respond to drought stress with a complex interplay of hormones, transcription factors, and many other functional or regulatory proteins, and mapping out this web of agents is no trivial task. Over the last two to three decades, machine learning has become immensely popular and is increasingly used to find patterns in situations too complex for the human mind to grasp. Even though much of the hype is focused on the latest developments in deep learning, relatively simple methods often yield superior results, especially when data is limited and expensive to gather.
This Master's thesis, conducted at the IPK in Gatersleben, develops an approach for shedding light on the phenotypic and transcriptomic processes that occur when a plant is subjected to stress. It centers on a random forest feature selection algorithm, and although it is used here to illuminate the drought stress response in Arabidopsis thaliana, it can be applied to all kinds of stresses in all kinds of plants.
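The thesis's random forest feature selection is not reproduced here, but the underlying idea — rank features by how much predictions degrade when their information is destroyed — can be illustrated with a pure-Python permutation-importance toy (all names and the tiny dataset are illustrative):

```python
import random

def accuracy(model, X, y):
    """Fraction of samples the model predicts correctly."""
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature, seed=0):
    """Drop in accuracy after shuffling one feature column — a simple
    proxy for the importance scores a random forest provides."""
    rng = random.Random(seed)
    col = [x[feature] for x in X]
    rng.shuffle(col)
    X_perm = [x[:feature] + [c] + x[feature + 1:] for x, c in zip(X, col)]
    return accuracy(model, X, y) - accuracy(model, X_perm, y)

# Toy data: the label depends only on feature 0, so feature 1 is useless.
X = [[0, 1], [1, 0], [0, 0], [1, 1]]
y = [0, 1, 0, 1]
model = lambda x: x[0]
```

A random forest additionally provides impurity-based importances for free, which is one reason it is a popular backbone for feature selection on expensive transcriptomic data.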