Volltext-Downloads (blau) und Frontdoor-Views (grau)

Embeddings for Product Data

  • The E-commerce industry has grown exponentially in the last decade, with giants like Amazon, eBay, Aliexpress, and Walmart selling billions of products. Machine learning techniques can be used within the e-commerce domain to improve the overall customer journey on a platform and increase sales. Product data, in specific, can be used for various applications, such as product similarity, clustering, recommendation, and price estimation. For data from these products to be used for such applications, we have to perform feature engineering. The idea is to transform these products into feature vectors before training a machine learning model on them. In this thesis, we propose an approach to create representations for heterogeneous product data from Unite’s platform in the form of structured tabular records. These tables consist of attributes having different information ranging from product-ids to long descriptions. Our model combines popular deep learning approaches used in natural language processing to create numerical representations, which contain mostly non-zeros elements in an array or matrix called as dense representation for all products. To evaluate the quality of these feature vectors, we validate how well the similarities between products are captured by these dense representations. The evaluations are further divided into two categories. The first category directly compares the similarities between individual products. On the other hand, the second category uses these dense vectors in any of the above- mentioned applications as inputs. It then evaluates the quality of these dense representation vectors based on the accuracy or performance of the defined application. As result, we explain the impact of different steps within our model on the quality of these learned representations.

Download full text files

Export metadata

Additional Services

Search Google Scholar


Author:Muhammad Usman Khan
Advisor:Thomas Villmann, Rudolf Sailer
Document Type:Master's Thesis
Year of Completion:2022
Granting Institution:Hochschule Mittweida
Release Date:2023/02/06
GND Keyword:Electronic Commerce; Maschinelles Lernen; Deep learning
Page Number:93
Institutes:Angewandte Computer‐ und Bio­wissen­schaften
DDC classes:006.31 Maschinelles Lernen
Open Access:Frei zugänglich
Licence (German):License LogoUrheberrechtlich geschützt