Classification label security describes the extent to which the labels predicted by a classifier can be trusted. It quantifies the uncertainty attached to a predicted label by stating how securely the classification was made. Classification label security is therefore important for decision-making whenever a classification task is encountered. This thesis investigates how to determine classification label security using the fuzzy probabilistic assignments of Fuzzy c-means. The investigation is accompanied by implementation, experimentation, visualization, and documentation of the results.
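As an illustration of how fuzzy assignments can yield a trust score, the sketch below computes the standard Fuzzy c-means membership of a point for fixed cluster centers. The `label_security` function is a hypothetical proxy (the gap between the two largest memberships) chosen for this example; it is not necessarily the measure used in the thesis, and the centers and the fuzzifier `m=2.0` are assumed values.

```python
import math

def fcm_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of point x in each cluster, for fixed centers."""
    dists = [math.dist(x, c) for c in centers]
    # A point that coincides with a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    # u_k = 1 / sum_j (d_k / d_j)^(2 / (m - 1))
    return [1.0 / sum((d_k / d_j) ** exponent for d_j in dists) for d_k in dists]

def label_security(memberships):
    """Hypothetical security score: gap between the two largest memberships."""
    top, second = sorted(memberships, reverse=True)[:2]
    return top - second

# Assumed toy setup: two cluster centers on a line.
centers = [(0.0, 0.0), (4.0, 0.0)]
near = fcm_memberships((0.5, 0.0), centers)    # close to the first center
border = fcm_memberships((2.0, 0.0), centers)  # halfway between both centers
```

Under this proxy, a point close to one center receives a score near 1, while a point on the boundary between two clusters receives a score of 0.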
Data streams change their statistical behaviour over time. These changes can occur gradually or abruptly for unforeseen reasons, and they may affect the expected outcome. It is therefore important to detect concept drift as soon as it occurs. In this thesis we chose a distance-based methodology to detect the presence of concept drift in data streams. We used generalized learning vector quantization (GLVQ) and generalized matrix learning vector quantization (GMLVQ) classifiers to calculate the distances between prototypes and data points. Chi-square and Kolmogorov–Smirnov tests are used to compare the distance distributions of the test and training data sets in order to indicate the presence of drift.
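The distance-based comparison can be sketched as follows. This is a minimal illustration, not the thesis implementation: the prototypes are hand-picked rather than learned by GLVQ or GMLVQ, the distance is plain Euclidean, and the decision uses the large-sample approximation of the two-sample Kolmogorov–Smirnov critical value. All function names and data are assumptions made for the example.

```python
import math
from bisect import bisect_right

def nearest_prototype_distances(points, prototypes):
    """Distance from each point to its closest prototype (Euclidean)."""
    return [min(math.dist(p, w) for w in prototypes) for p in points]

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in a + b
    )

def drift_detected(train_d, test_d, alpha=0.05):
    """Flag drift when the KS statistic exceeds the approximate critical value."""
    n, m = len(train_d), len(test_d)
    coeff = math.sqrt(-0.5 * math.log(alpha / 2.0))  # about 1.36 for alpha = 0.05
    critical = coeff * math.sqrt((n + m) / (n * m))
    return ks_statistic(train_d, test_d) > critical

# Assumed toy data: prototypes stand in for a trained classifier.
prototypes = [(0.0, 0.0), (5.0, 5.0)]
train = [(0.05 * i, 0.0) for i in range(30)]
same = [(0.05 * i + 0.01, 0.0) for i in range(30)]    # nearly identical stream
drifted = [(0.05 * i + 2.0, 0.0) for i in range(30)]  # shifted stream

train_d = nearest_prototype_distances(train, prototypes)
same_d = nearest_prototype_distances(same, prototypes)
drift_d = nearest_prototype_distances(drifted, prototypes)
```

Here `drift_detected(train_d, same_d)` stays false while `drift_detected(train_d, drift_d)` becomes true, because every distance in the shifted stream exceeds the largest training distance.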
In this study we evaluated how a simple autoencoder can be used to train a Generalized Learning Vector Quantization (GLVQ) classifier. Specifically, we proved that the bottlenecks of an autoencoder serve as an "information filter" that tries to best represent the desired output in that particular layer in the statistical sense of mutual information.
The autoencoder model was trained on a purely unsupervised task and exploited this setting by learning feature representations. As a result, the model achieved a significant accuracy. Implementation and tuning of the model were carried out using TensorFlow [1].
An additional study was dedicated to improving the traditional GLVQ algorithm taken from sklearn-lvq [2] using the bottleneck of an autoencoder.
The study revealed the potential of autoencoder bottlenecks as a pre-processing tool for improving the accuracy of GLVQ. Specifically, the model reached an accuracy of 75% with GLVQ, compared with about 62% for the original algorithm. The research also exposed the need for further improvement of the model for the present problem.
Learning Vector Quantization is a classifier that, in its original form, learns in Euclidean space. Time series data require a dedicated distance measure, not only because of the relation between the time points but also because the time series differ in length. Dynamic Time Warping is used as such a distance measure. This thesis examines its implementation and its time and space complexity.
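A minimal sketch of Dynamic Time Warping for one-dimensional series of different lengths is given below. It is the textbook dynamic-programming formulation with an absolute-difference local cost, not the implementation examined in the thesis. The full cost matrix needs O(n·m) time and space; this variant keeps only two rows, which reduces memory to O(m) when only the distance (and not the warping path) is needed.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # prev[j] holds the accumulated cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]

# Series of different lengths can still be compared directly.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
offset = dtw_distance([0, 0, 0], [1, 1])
```

In the first call the repeated value is absorbed by the warping, so the distance is zero even though the lengths differ.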
This thesis explains the algorithmic foundations of the machine learning methods LVQ1 and LVQ3. For LVQ3, several approaches to adapting the learning rate are considered and subsequently compared. To this end, four different experiments are carried out using two data sets that originate from medical image data.
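The basic LVQ1 rule can be sketched as follows: the prototype closest to a sample is moved toward the sample if their labels agree and away from it otherwise. The linear learning-rate decay in `train_lvq1` is an assumption made for illustration only; it is not one of the adaptation schemes compared in the thesis, and LVQ3 itself is not shown here.

```python
import math

def lvq1_step(prototypes, labels, x, y, lr):
    """Apply one LVQ1 update in place and return the winning prototype index."""
    winner = min(range(len(prototypes)), key=lambda k: math.dist(prototypes[k], x))
    # Attract the winner on a label match, repel it on a mismatch.
    sign = 1.0 if labels[winner] == y else -1.0
    prototypes[winner] = [
        w + sign * lr * (xi - w) for w, xi in zip(prototypes[winner], x)
    ]
    return winner

def train_lvq1(prototypes, labels, data, epochs=10, lr0=0.1):
    """Run LVQ1 over the data with an assumed linearly decaying learning rate."""
    protos = [list(p) for p in prototypes]
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)  # hypothetical schedule
        for x, y in data:
            lvq1_step(protos, labels, x, y, lr)
    return protos

# Assumed toy example: two prototypes with labels "a" and "b".
protos = [[0.0, 0.0], [1.0, 1.0]]
labels = ["a", "b"]
winner = lvq1_step(protos, labels, (0.2, 0.0), "a", lr=0.5)
```

After this single step the first prototype has moved halfway toward the sample, since the labels match and the learning rate is 0.5.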