Analysis, comparison, and implementation of machine learning algorithms for optimization of customer data deduplication problems in enterprise CX programs
- This thesis uses machine learning to automate the manual or rule-based processes used for deduplication, a task within the data integration process of an enterprise customer experience (CX) program. We study the theoretical foundations of several widely used machine learning algorithms: logistic regression, random forests, extreme gradient boosted trees, support vector machines, and generalized matrix learning vector quantization. We then apply these algorithms to a real, private data set and compare their performance using standard classification metrics: the confusion matrix, precision, recall, the area under the precision-recall curve, and the area under the receiver operating characteristic (ROC) curve.
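As a rough illustration of the evaluation metrics named in the abstract, the following minimal sketch computes a binary confusion matrix, precision, recall, and the area under the ROC curve in plain Python. It is not taken from the thesis: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, and the label 1 is assumed to mean "this record pair is a duplicate".

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 = duplicate pair (assumed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision = tp / (tp + fp); recall = tp / (tp + fn). Empty denominators yield 0.0."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive scores higher than a randomly chosen negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and classifier scores for six record pairs.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

print(confusion_counts(y_true, y_pred))  # (tp, fp, fn, tn)
print(precision_recall(y_true, y_pred))
print(roc_auc(y_true, scores))
```

The pairwise AUROC above is quadratic in the number of examples and is meant only to make the definition concrete. Deduplication data is typically heavily imbalanced, with far fewer duplicate pairs than non-duplicate pairs, which is one reason precision-recall metrics are commonly reported alongside the ROC curve.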
Author: | Jawad Niazi |
---|---|
Advisor: | Thomas Villmann, Florian Sölch |
Document Type: | Master's Thesis |
Language: | English |
Year of Completion: | 2022 |
Granting Institution: | Hochschule Mittweida |
Release Date: | 2023/02/07 |
GND Keyword: | Machine learning; Algorithm |
Page Number: | 43 |
Institutes: | Angewandte Computer‐ und Biowissenschaften |
DDC classes: | 006.31 Machine learning |
Open Access: | Freely accessible |
Licence: | Protected by copyright |