Document Type
- Master's Thesis
Year of publication
- 2022
Language
- English
Keywords
- Algorithm
- Machine learning
This thesis uses machine learning to automate the manual or rule-based processes used for deduplication, one step of the data integration process, in an enterprise customer experience program. We study the theoretical foundations of several widely used machine learning algorithms, including logistic regression, random forests, extreme gradient boosting trees, support vector machines, and generalized matrix learning vector quantization. We then apply these algorithms to a real, private data set and compare their performance using standard classification metrics: the confusion matrix, precision, recall, the area under the precision-recall curve, and the area under the receiver operating characteristic (ROC) curve.
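The evaluation metrics named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not taken from the thesis: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, with label 1 standing for "this record pair is a duplicate". The area under the precision-recall curve is approximated here by average precision, which is one common estimator and not necessarily the one the thesis uses.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels; 1 = duplicate pair (assumed encoding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall(y_true, y_pred):
    """Precision = tp / (tp + fp); recall = tp / (tp + fn)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Average precision: mean of the precision values at the rank of each
    positive, with examples ranked by descending score (assumes no tied scores)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Hypothetical toy data: true labels and classifier scores for eight record pairs.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
# Assumed decision threshold of 0.5 turns scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]
```

Precision and recall depend on the chosen threshold, while the two area-based metrics summarize ranking quality across all thresholds, which is why comparisons of classifiers typically report both kinds.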