MOnAMi | Search

Embeddings for Product Data (2022)

The E-commerce industry has grown exponentially in the last decade, with giants like Amazon, eBay, Aliexpress, and Walmart selling billions of products. Machine learning techniques can be used within the e-commerce domain to improve the overall customer journey on a platform and increase sales. Product data, in specific, can be used for various applications, such as product similarity, clustering, recommendation, and price estimation. For data from these products to be used for such applications, we have to perform feature engineering. The idea is to transform these products into feature vectors before training a machine learning model on them. In this thesis, we propose an approach to create representations for heterogeneous product data from Unite’s platform in the form of structured tabular records. These tables consist of attributes having different information ranging from product-ids to long descriptions. Our model combines popular deep learning approaches used in natural language processing to create numerical representations, which contain mostly non-zeros elements in an array or matrix called as dense representation for all products. To evaluate the quality of these feature vectors, we validate how well the similarities between products are captured by these dense representations. The evaluations are further divided into two categories. The first category directly compares the similarities between individual products. On the other hand, the second category uses these dense vectors in any of the above- mentioned applications as inputs. It then evaluates the quality of these dense representation vectors based on the accuracy or performance of the defined application. As result, we explain the impact of different steps within our model on the quality of these learned representations.

Introducing natural adversarial observations to a Deep Reinforcement Learning agent for Atari Games (2021)

Hanfeld, Pia

Deep Learning methods are known to be vulnerable to adversarial attacks. Since Deep Reinforcement Learning agents are based on these methods, they are prone to tiny input data changes. Three methods for adversarial example generation will be introduced and applied to agents trained to play Atari games. The attacks target either single inputs or can be applied universally to all possible inputs of the agents. They were able to successfully shift the predictions towards a single action or to lower the agent’s confidence in certain actions, respectively. All proposed methods had a severe impact on the agent’s performance while producing invisible adversarial perturbations. Since natural-looking adversarial observations should be completely hidden from a human evaluator, the negative impact on the performance of the agents should additionally be undetectable. Several variants of the proposed methods were tested to fulfil all posed criteria. Overall, seven generated observations for two of three Atari games are classified as natural-looking adversarial observations.

VQ-VAE with Neural Gas and Fuzzy c-Means (2021)

Badran, Yahya

VQ-VAE is a successful generative model which can perform lossy compression. It combines deep learning with vector quantization to achieve a discrete compressed representation of the data. We explore using different vector quantization techniques with VQ-VAE, mainly neural gas and fuzzy c-means. Moreover, VQ-VAE consists of a non-differentiable discrete mapping which we will explore and propose changes to the original VQ-VAE loss to fit the alternative vector quantization techniques.

A Multi-Omics Perspective on Arabidopsis Drought Response : Discovering Novel Regulators with a machine learning based Gene-to-Phene Approach (2020)

Psaroudakis, Dennis

Drought is one of the most common and dangerous threats plants have to face, costing the global agricultural sector billions of dollars every year and leading to the loss of tons of harvest. Until people drastically reduce their consumption of animal products or cellular agriculture comes of age, more and more crops will need to be produced to sustain the ever growing human population. Even then, as more areas on earth are becoming prone to drought due to climate change, we may still have to find or breed plant varieties more suitable to grow and prosper in these changing environments. Plants respond to drought stress with a complex interplay of hormones, transcription factors, and many other functional or regulatory proteins and mapping out this web of agents is no trivial task. In the last two to three decades or so, machine learning has become immensely popular and is increasingly used to find patterns in situations that are too complex for the human mind to overlook. Even though much of the hype is focused on the latest developments in deep learning, relatively simple methods often yield superior results, especially when data is limited and expensive to gather. This Master Thesis, conducted at the IPK in Gatersleben, develops an approach for shedding light on the phenotypic and transcriptomic processes that occur when a plant is subjected to stress. It centers around a random forest feature selection algorithm and although it is used here to illuminate drought stress response in Arabidopsis thaliana, it can be applied to all kinds of stresses in all kinds of plants.

Dehazing for Biomedical Microscopy Image Data (2020)

Kohl, Marcel

In this thesis two novel methods for removing undesired background illumination are de-veloped. These include a wavelet analysis based approach and an enhancement of a deep learning method. These methods have been compared with conventional methods, using real confocal microscopy images and synthetic generated microscopy images. These synthetic images were created utilizing a generator introduced in this thesis.

Refine

Document Type

Year of publication

Language

Keywords

Author

Institute

5 search hits