Refine
Document Type
- Master's Thesis (5)
- Bachelor Thesis (2)
Language
- English (7) (remove)
Keywords
- Algorithmus (7) (remove)
Institute
Workload Optimization Techniques for Password
Guessing Algorithms on Distributed Computing Platforms
(2019)
The following thesis covers several ways to optimize distributed computing platforms for cryptanalytic purposes. After an introduction on password storage, password guessing attacks and distributed computing in general, a set of inital benchmark results for a variety of different devices will be analyzed. The shown results are mainly based on utilization of the open source password recovery tool Hashcat. The second part of this work shows an algorithmic implementation for information retrieval and workload generation. This thesis can be used for the conception of a distributed computing system, inventory analysis of available hardware devices, runtime and cost estimations for specific jobs and finally strategic workload distribution.
Path decomposition of a graph has received an important amount of interest over the past decades because of its applications in algorithmic graph theory and in real life problems. For the computation of a path decomposition of small width, we use different heuritics approaches. One of the most useful method is by Bodlaender and Kloks. In this thesis, we focus on the computation, applications, transformation and approximation of a path decomposition of small width.
It is easy to convert a path decomposition in to nice path decomposition with same width, which is more convinent to use to find the graph parameters like independent sets, chromatic polynomials etc. Inspired by [28], we find an algorithm to compute the chromatic polynomial of a graph via nice path decomposition with small width.
Anomaly Detection is a very acute technical problem among various business enterprises. In this thesis a combination of the Growing Neural Gas and the Generalized Matrix Learning Vector Quantization is presented as a solution based on collected theoretical and practical knowledge. The whole network is described and implemented along with references and experimental results. The proposed model is carefully documented and all the further open researching questions are stated for future investigations.
This thesis investigates the efficacy of four machine learning algorithms, namely linear regression, decision tree, random forest and neural network in the task of lead scoring. Specifically, the study evaluates the performance of these algorithms using datasets without sampling and with random under-sampling and over-sampling using SMOTE. The performance of each algorithm is measure using various performance metrics, including accuracy, AUC-ROC, specificity, sensitivity, precision, recall, F1 score, and G-mean. The results indicate that models trained on the dataset without sampling achieved higher accuracy than those trained on the dataset with either random under-sampling or random over-sampling using SMOTE. However, the neural network demonstrated remarkable results on each dataset compared to the other algorithms. These findings provide valuable insights into the effectiveness of machine learning algorithms for lead scoring tasks, particularly when using different sampling techniques. The findings of this study can aid lead management practices in selecting the most suitable algorithm and sampling technique for their needs. Furthermore, the study contributes to the literature by providing a comprehensive evaluation of the performance of machine learning algorithms for lead scoring tasks. This thesis has practical implications for businesses looking to improve their lead management practices, and future research could extend the analysis to other machine learning algorithms or more extensive datasets.
In this work a second version for the Python implementation of an algorithm called Probabilistic Regulation of Metabolism (PROM) was created and applied to the metabolic model iSynCJ816 for the organism Synechocystis sp. PCC 6803. A crossvalidation was performed to determine the minimal amount of expression data needed to produce meaningful results with the PROM algorithm. The failed reproduction of the results of a method called Integrated and Deduced Regulation of Metabolism (IDREAM) is documented and causes for the failed reproduction are discussed.
Prototype-based classification methods like Generalized Matrix Learning Vector Quantization (GMLVQ) are simple and easy to implement. An appropriate choice of the activation function plays an important role in the performance of (deep) multilayer perceptrons (MLP) that rely on a non-linearity for classification and regression learning. In this thesis, successful candidates of non-linear activation functions are investigated which are known for MLPs for application in GMLVQ to realize a non-linear mapping. The influence of the non-linear activation functions on the performance of the model with respect to accuracy, convergence rate are analyzed and experimental results are documented.
In this thesis, we focus on using machine learning to automate manual or rule-based processes for the deduplication task of the data integration process in an enterprise customer experience program. We study the underlying theoretical foundations of the most widely used machine learning algorithms, including logistic regression, random forests, extreme gradient boosting trees, support vector machines, and generalized matrix learning vector quantization. We then apply those algorithms to a real, private data set and use standard evaluation metrics for classification, such as confusion matrix, precision, and recall, area under the precision-recall curve, and area under the Receiver Operating Characteristic curve to compare their performances and results.