Refine
Document Type
- Master's Thesis (121) (remove)
Year of publication
Language
- English (121) (remove)
Keywords
- Maschinelles Lernen (24)
- Vektorquantisierung (8)
- Blockchain (7)
- Algorithmus (5)
- Bioinformatik (5)
- Neuronales Netz (5)
- Deep learning (4)
- Kryptologie (4)
- Virtuelle Währung (4)
- China (3)
VQ-VAE is a successful generative model which can perform lossy compression. It combines deep learning with vector quantization to achieve a discrete compressed representation of the data. We explore using different vector quantization techniques with VQ-VAE, mainly neural gas and fuzzy c-means. Moreover, VQ-VAE consists of a non-differentiable discrete mapping which we will explore and propose changes to the original VQ-VAE loss to fit the alternative vector quantization techniques.
Protein structures are essential elements in every biological system evolved on earth, where they function as stabilizing elements, signaltransducers or replication machin eries. They are consisting of linear-bonded amino acids, which determine the three-dimensional structure of the protein, whereas the structure in turn determines the function. The native and biological active structure ofa protein can be understood as the folding state of a polypeptide chain at the global minimum of free energy.
By means of protein energy profiling, which is an approach derived from statistical physics it is possible to assign a so called energy profile to a protein structure. Such an energy profile describes the local energetic interaction features of every amino acid within the structure and introduces an energetic point of view, instead of a structural or sequential onto proteins.
This work aims to give a perspective to the question of how we may gain pattern information out of energy profiles. The concrete subjects are energy-mapped Pfam family alignments and investigations on finding motifs or patterns indiscretizised energy profile segments.
In Machine Learning, Learning Vector Quantization(LVQ) is well known as supervised learning method. LVQ has been studied to generate optimal reference vectors because of its simple and fast learning algorithm [12]. In many tasks of classification, different variants of LVQ are considered while training a model. In this thesis, the two variants of LVQ, Generalized Matrix Learning Vector Quantization(GMLVQ) and Generalized Tangent Learning Vector Quantization(GTLVQ) have been discussed. And later, transfer learning technique for different variants of LVQ has been implemented, visualized and we have compared the results using different datasets.
Community acquired pneumonia (CAP) is a very common, yet infectious and sometimes lethal disease. Therefor, this disease is connected to high costs of diagnosis and treatment. To actually reduce the costs for health care in this matter, diagnosis and treatment must get cheaper to conduct with no loss in predictive accuracy. One effective way in doing so would be the identification of easy detectable and highly specific transcriptomic markers, which would reduce the amount of work required for laboratory tests by possibly enhanced diagnosis capability.
Transcriptomic whole blood data, derived from the PROGRESS study was combined with several documented features like age, smoking status or the SOFA score. The analysis pipeline included processing by self organizing maps for dimensionality and noise reduction, as well as diffusion pseudotime (DPT). Pseudotime enabled modelling a disease run of CAP, where each sample represented a state/time in the modelled run. Both methods combined resulted in a proposed disease run of CAP, described by 1476 marker genes. The additional conduction of a geneset analysis also provided information about the immune related functions of these marker genes.
Influenza A viruses are responsible for the outbreak of epidemics as well as pandemics worldwide. The surface protein neuraminidase of this virus is responsible, among other things, for the release of virions from the cell and is thus of interest in pharmacological research. The aim of this work is to gain knowledge about evolutionary changes in sequences of influenza A neuraminidase through different methods. First, EVcouplings is used with the goal of identifying evolutionary couplings within the protein sequences, but this analysis was unsuccessful. This is probably due to the great sequence length of neuraminidase. Second, the natural vector method will be used for sequence embedding purposes, in hopes to visualize sequential progression of the virus protein over time. Last, interpretable machine learning methods will be applied to examine if the data is classifiable by the different years and to gain information if the extracted information conform to the results from the EVcouplings analysis. Additionally to using the class label year, other labels such as groups or subtypes are used in classification with varying results. For balanced classes the machine learning models performed adequately, but this was not the case for imbalanced data. Groups and subtypes can be classified with a high accuracy, which was not the case for the years, continents or hosts. To identify the minimal number of features necessary for linear separation of neuraminidase group 1 subtypes, a logistic regression was performed at last, resulting in the identification of 15 combinations of nine amino acid frequencies. Since the sequence embedding as well as the machine learning methods did not show neuraminidase evolution over time, further research is necessary, for example with focus on one subtype with balanced data.
Proteins are involved in almost every aspect of life, mediating a wide range of cellular tasks. The protein sequence dictates the spatial arrangement of the residues and thus ultimately the function of a rotein. Huge effort is put into cumbersome structure eludication experiments which obtain models describing the observed spatial conformation of a protein, enabling users to predict their function, to understand their mode of action or to design tailored drugs to cure disease caused by misfolded or misregulated proteins.
However, the result of structure determination experiments are merely models of reality, made under simplifying assumptions - sometimes containing major undetected errors. On the other hand, such experiments are resource demanding and they cannot supply the actual demand.
Thus, scientists are predicting the structure of proteins in silico, resulting in models that are even
more prone to error.
In consequence, the structure biologists search after a practicable definition of structure quality and over the last two decades several model quality assessment programs emerged, measuring the local and global quality of peculiar structures. Seven representatives were studied, regarding the paradigms they follow and the features they use to describe the quality of residues. Their predications were compared, showing that there is almost no common ground among the tools.
Is there a way to combine their statements anyway?
Finally, the accumulated knowledge was used to design a novel evaluation tool, addressing problems previously spotted. Thereby, high quality of its predication as well as superior usability was
key. The strategy was compared to existing approaches and evaluated on suitable datasets.
The theoretical foundations of enterprise management using information technology were reviewed; analysis of the effectiveness of the use of information systems in the enterprise; ways of improving the enterprise management mechanism using information systems (on example of Mars Wrigley Confectionery Belarus) have been developed.
nicht vorhanden
This master thesis covers the topics of Studying customers’ behavior on the example of skin care brand Nivea. There are presented theoretical basis for the following research about marketing, customers’ behavior and conducting marketing research properly. Then, there is the analysis of German market. Since Nivea is the brand of Beiersdorf company, there is a description of Beiersdorf’s activity and operation work. The main idea of the paper work is to analyze customers’ behavior of Nivea. Therefore, the work contains huge research about the brand along with its’ micro- and macroenvironment. There also were conducted an in-depth interview and a survey to understand customers’
current needs. With all the results the author of the work proposed some ideas for Nivea brand.
This study shows the potential for the make-or-buy theory in several scenarios – production, assembling and development. The evaluation of these possibilities is conducted, based on Bosch’s core competencies. A decision model is developed to support the decision making process. Based on these results, the serial production at RBAC in China is planned and suggestions for setting up the assembly line are given
After the expression of the titin-Hsp27-construct with the following purification supplies no satisfied results which makes the realization of the atomic force microscopy not possible. The devel-opment of the structure model by using different bioinformatic methods can establish a model for the protein sequence. As bioinformatic methods the template search by different BLAST runs and free available software like SwissModel, Pcons, ModWeb and other tools are used. Nevertheless, the generated model is not the native conformation and has to be analyzed with other software until a stable conformation of the structure can be predicted. Depending on the time which is provided the generated model is a good approach for the aim this master thesis has.
As widely discussed in literature spatial patterns of amino acids, so-called structural motifs, play an important role in protein function. The functional responsible part of a protein often lies in an evolutionary highly conserved spatial arrangement of only few amino acids, which are held in place tightly by the rest of the structure. In general, these motifs can mediate various functional interactions, such as DNA/RNA targeting and binding, ligand interactions, substrate catalysis, and stabilization of the protein structure.
Hence, characterizing and identifying such conserved structural motifs can contribute to understanding of structurefunction relationships in diverse protein families. Therefore and because of the rapidly increasing number of solved protein structures, it is highly desirable to identify, understand and moreover to search for structural scattered amino acid motifs. The aim of this work was the development and the implementation of a matching algorithm to search for such small structural motifs in large sets of target structures. Furthermore, motif matches were extensively analyzed, statistically assessed and functionally classified. Following a novel approach, hierarchical clustering was combined with functional classification and used to deduce evolutionary structure-function relationships. The proposed methods were combined and implemented to a feature-rich and easy-to-use command line software tool, which is freely available and contributes to the field of structural bioinformatic research.
The emerging Internet of Things (IoT) technology interconnects billions of embedded devices with each other. These embedded devices are internet-enabled, which collect, share, and analyze data without any human interventions. The integration of IoT technology into the human environment, such as industries, agriculture, and health sectors, is expected to improve the way of life and businesses. The emerging technology possesses challenges and numerous
security threats. On these grounds, it is a must to strengthen the security of IoT technology to avoid any compromise, which affects human life. In contrast to implementing traditional cryptosystems on IoT devices, an elliptic curve cryptosystem (ECC) is used to meet the limited resources of the devices. ECC is an elliptic curve-based public-key cryptography which provides equivalent security with shorter key size compared to other cryptosystems such as Rivest–Shamir–Adleman (RSA). The security of an ECC hinges on the hardness to solve the elliptic curve discrete logarithm problem (ECDLP). ECC is faster and easier to implement and also consumes less power and bandwidth. ECC is incorporated in internationally recognized standards for lightweight applications due to the
benefits ECC provides.
Recently a deep neural network architecture designed to work on graph- structured data have been capturing notice as well as getting implemented in various domains and application. However, learning representation (feature embedding) from graphical data picking pace in research and constructing graph(s) from dataset remains a challenge. The ability to map the data to lower dimensions further makes the task easier while providing comfort in applying many operations. Graph neural network (GNN) is one of the novel neural network models that is catching attention as it is outperforming in various applications like recommender systems, social networks, chemical synthesis, and many more. This thesis discusses a unique approach for a fundamental task on graphs; node classification. The feature embedding for a node is aggregated by applying a Recurrent neural network (RNN), then a GNN model is trained to classify a node with the help of aggregated features and Q learning supports in optimizing the shape of neural networks. This thesis starts with the working principles of the Feedforward neural network, recurrent units like simple RNN, Long short-term memory (LSTM), and Gated recurrent unit (GRU), followed by concepts of Reinforcement learning (RL) and the Q learning algorithm. An overview of the fundamentals of graphs, followed by the GNN architecture and workflow, is discussed subsequently. Some basic GNN models are discussed in brief later before it approaches the technical implementation details, the output of the model, and a comparison with a few other models such as GraphSage and Graph attention network (GAN).
Over the past few years, wind and solar power plants have increasingly contributed to energy production. However, due to fluctuating energy sources, the energy production data contain disruption. Such disrupted data lead to the wrong prediction performance, and they need to be estimated by other values. In this thesis, we provide a comparative study to estimate the online disrupted data based on the data of similar groups of power plants, We apply three estimation techniques, e.g., mean, interpolation, and k-nearest neighbor to estimate the disruption on training data. We then apply four clustering algorithms, e.g., k-means, neural gas, hierarchical agglomerative, and affinity propagation, with two similarity measures, e.g., euclidean and dynamic time warping to form groups of power plants and compare the results. Experimental results show that when KNN estimation is applied to data, and neural gas and agglomerative with dtw are used to cluster the data, the cluster quality scores and execution time give better results compared to others. Therefore, we conclude and choose KNN estimation to reconstruct the online disrupted data on each group of a similar power plants.
The application described in this thesis has been created, built and designed to help nurses or any medical personnel all around the world in being able to access a real-time database to store patient records like Patient Name, Patient ID, Patient Age and Date of Birth, and the Symptoms that the patient is experiencing. A real-time database is a live database where all changes made to it are reflected across all devices accessing it. This application will be beneficial especially in countries where access to a computer or medical equipment is not always possible. A phone is always ready use and at the reach of the hand, users of this application will always be able to access the data at any given time and place. We will be able to add a new patient or search for existing patients. In addition, this application allows us to take RAW medical images that can be used to identify anomalies in the blood sample. RAW images are important for this application because they’re uncompressed, which means, they do not lose any quality or details. The users of this application are the medical personnel that will be taking care of the patients. These users will have to create a profile on the database in order to use the application, since their data, like user ID, will be used in order to control the behaviour of the data retrieved and stored. We will also discuss the current and future features of this application, as well as, the benefits of this application when it comes to the medical personnel, as well as patients. Finally, we will also go
over the implementation of such application from a hardware perspective, as well as a software one.
Sequences are an important data structure in molecular biology, but unfortunately it is difficult for most machine learning algorithms to handle them, as they rely on vectorial data. Recent approaches include methods that rely on proximity data, such as median and relational Learning Vector Quantization. However, many of them are limited in the size of the data they are able to handle. A standard method to generate vectorial features for sequence data does not exist yet. Consequently, a way to make sequence data accessible to preferably interpretable machine learning algorithms needs to be found. This thesis will therefore investigate a new approach called the Sensor Response Principle, which is being adapted to protein sequences. Accordingly, sequence similarity is measured via pairwise sequence alignments with different sequence alignment algorithms and various substitution matrices. The measurements are then used as input for learning with the Generalized Learning Vector Quantization algorithm. A special focus lies on sequence length variability as it is suspected to affect the sequence alignment score and therefore the discriminative quality of the generated feature vectors. Specific datasets were generated from the Pfam protein family database to address this question. Further, the impact of the number of references and choice of substitution matrices is examined.
The occurence of prostate cancer (PCa) has been consistently rising since three decades and remains the third leading cause of cancer-related deaths after lung and bowel cancer in Germany. Despite of new methods of early detection, such as prostate-specific antigen (PSA) testing, it persists to be the most common cancer in german men with over 63,400 new diagnoses in Germany every year and exhibits high prevalence in other countries of Northern andWestern Europe as well [64]. Men over the age of 70 are most commonly affected by the lethal disease, whereas an indisposition before 50 is rare. The malignant prostate tumor can be healed through operation or irradiation while the cancer hasn’t reached the stage of metastasis in which other therapeutic methods have to be employed [14] [15]. In the metastatic phase, the patient usually exhibits symptoms when the tumors size affects the urethra or the cancer spreads to other tissue, often the bones [16].
The high prevalence of this disease marks the importance of further research into prognosis and diagnosis methods, whereby identification of further biomarkers in PCa poses a major topic of scientific analysis. For this task, the effectiveness of high-throughput RNA sequencing of the transcriptome (RNA molecules of an organism or specific cell type) is frequently exploited [66]. RNA sequencing or RNA-Seq in short, offers the possibility of transcriptome assessment, enabling the identification of transcriptional aberrations in diseases as well as uncharacterized RNA species such as non-coding RNAs (ncRNAs) which remain undetected by conventional methods [49]. To alleviate interpretation of the sequenced reads they are assembled to reconstruct the transcriptome as close to the original state as possible, thus enabling rapid detection of relevant biomolecules in the data [49]. Transcriptomic studies often require highly accurate and complete gene annotations on the reference genome of the examined organism. However, most gene annotations and reference genomes are far from complete, containing a multitude of unidentified protein-coding and non-coding genes and transcripts. Therefore, refinement of reference genomes and annotations by inclusion of novel sequences, discovered in high quality transcriptome assemblies, is necessary [24].
Several algorithms have been proposed for the testing of series-parallel graphs in linear time. We give our alternate algorithms for testing series-parallel graphs, their tree decompositions, and the independence number when the input is undirected biconnected series-parallel graphs, which run (approximately) linearly in polynomial time.
Probabilistic Micropayments
(2022)
Probabilistic micropayments are important cryptography research topics in electronic commerce. The Probabilistic micropayments have the potential to be researched in order to obtain efficient algorithms with low transaction costs and high speeding computer power. To delve into the topic, it is vital to scrutinize the cryptographic preliminaries such as hash functions and digital signatures. This thesis investigates the important probabilistic methods based on a centralized or decentralized network. Firstly, centralized networks such as lottery-based tickets, Payword, coin-flipping, and MR2 are described, and an approach based on blind signatures is also discussed. Then, decentralized network methods such as MICROPAY3, a transferable scheme on the blockchain network, along with an efficient model for cryptocurrencies, are explained. Then we compare the different probabilistic micropayment methods by improving their drawback with a new technique. To set the results from the theoretical analysis of different methods into some context, we analyze the attacks that reduce the security and, therefore, the system’s efficiency. Particularly, we discuss various methods for detecting double-spending and eclipse attacks occurrence
Robust soft learning vector quantization (RSLVQ) is a probabilistic approach of Learning vector quantization (LVQ) algorithm. Basically, the RSLVQ approach describes its functionality with respect to Gaussian mixture model and its cost function is defined in terms of likelihood ratio. Our thesis work involves an approach of modifying standard RSLVQ with non-Gaussian density functions like logistic, lognormal, and Cauchy (referred as PLVQ). In this approach, we derive new update rules for prototypes using gradient of cost function with respect to non-Gaussian density functions. We also derive new learning rules for the model parameters like s and s, by differentiating the cost function with respect to parameters. The main goal of the thesis is to compare the performance results of PLVQ model with Gaussian-RSLVQ model. Therefore, the performance of these classification models have been tested on the Iris and Seeds dataset. To visualize the results of the classification models in an adequate way, the Principal component analysis (PCA) technique has been used.
This thesis investigated the generation of laser induced periodic surface structures (LIPSS) using femtosecond laser irradiation at a central wavelength of 775 nm.
The metals stainless steel and copper as well as a semiconducting thin film, ITO on glass substrate were investigated. The impact of the processing parameters was studied for single and multiple pulse irradiation to determine the ablation threshold of the materials
and the different types of LIPSS. These observations allowed the optimisation of area structuring with regards to processing speed and LIPSS quality.
The feasibility of the LIPSS generation in dynamic, real time polarisation control was then explored. By using a fast response, liquid-crystal polarisation rotation device, the direction of the linear polarisation of the laser beam could be dynamically controlled and synchronised to the scanning during laser processing. As a result, a range of complex micro- and nano-scale patterns with orthogonal direction of LIPSS were created. The samples were analysed using optical and electron microscopy. The orientation of the LIPSS was determined also from detection of light diffracted by the LIPSS.
Finally, two applications of large area LIPSS patterning were demonstrated, information encoding on metals and periodic structuring of a thin film conducting oxide for solar cells.
Path decomposition of a graph has received an important amount of interest over the past decades because of its applications in algorithmic graph theory and in real life problems. For the computation of a path decomposition of small width, we use different heuritics approaches. One of the most useful method is by Bodlaender and Kloks. In this thesis, we focus on the computation, applications, transformation and approximation of a path decomposition of small width.
It is easy to convert a path decomposition in to nice path decomposition with same width, which is more convinent to use to find the graph parameters like independent sets, chromatic polynomials etc. Inspired by [28], we find an algorithm to compute the chromatic polynomial of a graph via nice path decomposition with small width.
In the practice of software engineering, project managers often face the problem of software project management.
It is related to resource constrained project scheduling
problem. In software project scheduling, main resources are considered to be the employees with some skill set and required amount of salary. The main purpose of software
project scheduling is to assign tasks of a project to the available employees such that the total cost and duration of the project are minimized, while keeping in check that
the constraints of software project scheduling are fulfilled. Software project scheduling (SPSP) has complex combined optimization issues and its search space increases exponentially when number of tasks and employees are increased, this makes software project scheduling problem (SPSP) a NP-Hard problem. The goal of software project scheduling problem is to minimize total cost and duration of project which makes it multi-objective problem. Many algorithms are proposed up till now that claim to give near optimal results for NP-Hard problems, but only few are there that gives feasible set of solutions for software project scheduling problem, but still we want to get more efficient algorithm to get feasible and efficient results.
Nowadays, most of the problems are being solved by using nature inspired algorithms because these algorithms provide the behavior of exploration and exploitation. For solving
software project scheduling (SPSP) some of these nature inspired algorithms have been used e.g. genetic algorithms, Ant Colony Optimization algorithm (ACO), Firefly etc.
Nature inspired algorithms like particle swarm optimization, genetic algorithms and Ant Colony Optimization algorithm provides more promising result than naive and greedy algorithms. However there is always a quest and room for more improvement. The main purpose of this research is to use bat algorithm to get efficient results and solutions for software project scheduling problem. In this work modified bat algorithm is implemented where a different approach of random walk is used. The contributions of this thesis are to: (1) To adapt and apply modified multi-objective bat algorithm for solving software project scheduling (SPSP) efficiently, (2) to adapt and apply other nature inspired algorithms like genetic algorithms for solving software project scheduling (SPSP) and (3) to compare and analyze the results obtained by applied nature inspired algorithms and provide the conclusion.
Massive multiple-input multiple-output (MIMO), eine Technik bei der die Basisstation einer Mobilfunkzelle mit einer großen Anzahl an Antennen ausgestattet ist, wird derzeit als eine vielversprechende Schlüsseltechnologie zur Erfüllung der Anforderungen zukünftiger drahtloser Kommunikationsnetze der fünften Generation betrachtet. Die zuversichtlichen Angaben über die Leistung solcher Systeme beruht allerdings auf einer theoretischen, bisher kaum praktisch verizierten Annahme, dass die drahtlosen Übertragungskanäle verschiedener Nutzer aufgrund der hohen Anzahl an Antennen voneinander unabhängig sind. Das heißt, dass sogenannte günstige Übertragungsbedingungen herrschen. Die vorliegende Masterarbeit untersucht diese neuartigen Systeme unter zwei verschiedenen Perspektiven.
Im ersten Teil dieser Arbeit wird der Einfluss von realistischen Übertragungsbedingungen auf die Performance von massive MIMO Systemen evaluiert. Dazu werden entsprechende numerische Systemsimulationen durchgeführt und mit den Ergebnissen von praktischen massive MIMO Messkampagnen verglichen.
Die Untersuchungen ergeben, dass die sogenannten günstigen Übertragungsbedingungen in realistischen Umgebungen nur bedingt beobachtet werden können. Daher führen traditionelle Kanalmodelle zu einer ungenauen Abschätzung der Leistung von praktischen massive MIMO Systemen. Um diesem Problem zu begegnen, wird deshalb eine neuartige Parametrisierung des traditionellen Kronecker-Modells vorgeschlagen, sodass relevante Kenngrößen realistischer Kanäle mit diesem Modell präzise widergespiegelt werden.
Anschließend folgt eine Untersuchung verschiedener Methoden zur Kanalschätzung in massive MIMO Systemen unter den verschiedenen Kanalmodellen mittels numerischer Simulationen. Die Experimente zeigen auf, dass Schätzmethoden, welche speziell für massive MIMO unter der Annahme von günstigen Übertragungsbedinungen hergeleitet wurden, eine signifikante Leistungsminderung unter realistischen Kanalmodellen erfahren.
Im zweiten Teil dieser Arbeit liegt der Fokus auf der Anwendung von massive MIMO Systemen in sogenannten Internet of Things (IoT) Netzwerken. Die typischerweise hohe Anzahl an aktiven IoT-Geräten macht die Anwendung von effizienten Scheduling-Algorithmen notwendig. Daher wird ein Downlink-Scheduling-Algorithmus präsentiert, welcher sich die Eigenschaften von massive MIMO Systemen und die typischen Anforderungen an die Datenraten von IoT-Geräten zunutze macht. Im Speziellen wird vorgeschlagen, die IoT-Nutzer in Gruppen aufzuteilen und die verschiedenen Gruppen nacheinander zu versorgen. Die Gruppengröße wird dabei mit Hilfe asymptotischer Eigenschaften von massive MIMO Systemen hergeleitet.
Um die Gruppenmitglieder zu selektieren, wird eine modifizierte Version des populären Semi-Orthogonal-User-Selection (SUS) Algorithmus vorgeschlagen. Die anschließend durchgeführten numerischen Simulationen bestätigen, dass die modifizierte Version von SUS die Nachteile des originalen Algorithmus eliminiert, was wiederum zu verbesserten Datenraten in dem betrachteten System führt.
Computationally solving eigenvalue problems is a central problem in numerical analysis and as such has been the subject of extensive study. In this thesis we present four different methods to compute eigenvalues, each with its own characteristics, strengths and weaknesses. After formally introducing the methods we use them in various numerical experiments to test speed of convergence, stability as well as performance when used to compute eigenfaces, denoise images and compute the eigenvector centrality measure of a graph.
Cryptorchidism describes a disease, in which one or both testes do not descend into the scrotum properly. With a prevalence of up to 10%, cryptorchidism is one of the most common birth defects of the male genital tract. Despite its associated health risks and accompanying economic damage, resulting from surgery and losses in breeding, studies on canine cryptorchidism and its causes are relatively rare. In this study a relational database for genetic causes of cryptorchidism was established and used as a basis for the identification of candidate genes. Associated regions were analysed by nanopore sequencing with the goal to identify genetic variants correlated with cryptorchidism in German Sheep Poodle.
A Protein is a large molecule that consists of a vast number of atoms; one can only imagine the complexity of such a molecule. Protein is a series of amino acids that bind to each other to form specific sequences known as peptide chains. Proteins fold into three-dimensional conformations (or so-called protein’s native structure) to perform their functions. However, not every protein folds into a correct structure as a result of mutations occurring in their amino acid sequences. Consequently, this mutation causes many protein misfolding diseases. Protein folding is a severe problem in the biological field. Predicting changes in protein stability free energy in relation to the amino acid mutation (ΔΔG) aids to better comprehend the driving forces underlying how proteins fold to their native structures. Therefore, measuring the difference in Gibbs free energy provides more insight as to how protein folding occurs. Consequently, this knowledge might prove beneficial in designing new drugs to treat protein misfolding related diseases. The protein-energy profile aids in understanding the sequential, structural, and functional relationship, by assigning an energy profile to a protein structure. Additionally, measuring the changes in the protein-energy profile consequent to the mutation (ΔΔE) by using an approach derived from statistical physics will lead us to comprehend the protein structure thoroughly. In this work, we attempt to prove that ΔΔE values will be approximate to ΔΔG values, which can lead the future studies to consider that the energy profile is a good predictor of protein binding affinity as Gibbs free energy to solve the protein folding problem.
Anomaly Detection is a very acute technical problem among various business enterprises. In this thesis a combination of the Growing Neural Gas and the Generalized Matrix Learning Vector Quantization is presented as a solution based on collected theoretical and practical knowledge. The whole network is described and implemented along with references and experimental results. The proposed model is carefully documented and all the further open researching questions are stated for future investigations.
Mathematics Behind the Zcash
(2020)
Among all the new developed cryptocurrencies from Bitcoin, Zcash comes out to be the strongest cryptocurrency providing both transparency and anonymity to the transactions and its users by deploying the strong mathematics of zk-SNARKs.
We discussed the zero knowledge proofs which is a basic building block for providing the functionality to zk-SNARKs. It offers schnorr and sigma protocols with interactive and noninteractive versions. Non-interactive proofs are further used in Zcash transactions where the validation of sent transaction is proved by cryptographic proof.
Further, we deploy zk-SNARKs proofs following common reference string as public parameter when transaction is made. The proof allows sender to prove that she knows a secret for an instance such that the proof is succinct, can be verified very efficiently and does not leak the
secret. Non-malleability, small proofs and very effective verification make zk-SNARKs a classic tool in Zcash. Since we deal with NP problems therefore we have considered the elliptic curve cryptography to provide the same security like RSA but with smaller parameter size.
Lastly, we explain Zcash transaction process after minting the coin, the corresponding transaction completely hides the sender, receiver and amount of transaction using zero knowledge proof.
As future considerations, we talk about the improvements that can be done in term of decentralization, efficiency by comparing with top ranked cryptocurrencies namely Ethereum and Monero, privacy preserving against the thread of quantum computers and enhancements in shielded transactions.
Soft Learning Vector Quantisation (SLVQ) andRobust Soft Learning Vector Quantisation (RSLVQ) are supervised data classification methods, that have been applied successfully to real world classification problems. The performance of SLVQ and RSLVQ, however, reduces, when they are applied tomore complicated classification problems. In this thesis, we have introducedmodi-fications to SLVQand RSLVQ, in order to havemore capable versions of them. A few possibilities to modify SLVQ and RSLVQ are considered, some of them are not successful enough and they have been included for the sake of completeness. The fruits of the thesis are plenty, including Tangent Soft Learning Vector Quantisation-Strong (TSLVQ-S), together with its more stable version Tangent Robust Soft Learning Vector Quantisation-Strong (TRSLVQ-S), Attraction Soft Learning Vector Quantisation (ASLVQ) and Grassmannian Soft Learning Vector Quantisation (GSLVQ).
With the growing market of cryptocurrencies, blockchain is becoming central to various research areas relevant from a mathematical and cryptographic point of view. Moreover, it is capable of transforming the traditional methods involving centralized network operations into decentralized peer-to-peer functionalities. At the same time, it provides an alternative to digital payments in a robust and tamperproof manner by adding the element of cryptography, consequently making it traversable for an individual who is a part of the blockchain network. Furthermore, for a blockchain to be optimal and efficient, it must handle the blockchain trilemma of security, decentralization, and scalability constraints in an effective manner. Algorand, a blockchain cryptocurrency protocol intended to solve blockchain’s trilemma, has been studied and discussed. It is a permissionless (public) blockchain protocol and uses pure proof of stake as its consensus mechanism.
This Master Thesis covers two main Topics: Sharing Economy and Risk Management and combines them in frames of this paper in order to provide a methodology (Uber was chosen as an example) of how a risk management process may be applied to a Sharing Economy business, as well as which types of risks are of special relevance for those types of businesses.
This thesis investigates the efficacy of four machine learning algorithms, namely linear regression, decision tree, random forest and neural network in the task of lead scoring. Specifically, the study evaluates the performance of these algorithms using datasets without sampling and with random under-sampling and over-sampling using SMOTE. The performance of each algorithm is measure using various performance metrics, including accuracy, AUC-ROC, specificity, sensitivity, precision, recall, F1 score, and G-mean. The results indicate that models trained on the dataset without sampling achieved higher accuracy than those trained on the dataset with either random under-sampling or random over-sampling using SMOTE. However, the neural network demonstrated remarkable results on each dataset compared to the other algorithms. These findings provide valuable insights into the effectiveness of machine learning algorithms for lead scoring tasks, particularly when using different sampling techniques. The findings of this study can aid lead management practices in selecting the most suitable algorithm and sampling technique for their needs. Furthermore, the study contributes to the literature by providing a comprehensive evaluation of the performance of machine learning algorithms for lead scoring tasks. This thesis has practical implications for businesses looking to improve their lead management practices, and future research could extend the analysis to other machine learning algorithms or more extensive datasets.
Die vorliegende Arbeit befasst sich mit der Analyse der kritischen Erfolgsfaktoren für die Zulassung europäischer Industrieprodukte in Indien, anhand eines europäisch entwickelten und produzierten Produktes für die indische Rolling Stock Industrie. Die dabei berücksichtigen Themenschwerpunkte, die im Detail betrachtet sind über: Welche Standards werden derzeit in Indien bzw. in Europa offiziell für den Zulassungsprozess herangezogen? Aktuelle Situation erfassen. Vergleich der technischen Zulassung Standards zwischen IR (Indischen Railway) Standards und
The aim of this master thesis is to describe the key factors of successful energy efficiency projects. In particular, local conditions of such projects in Kazakhstan will be emphasized and a country-specific guideline will be provided at the end. The following topics will be covered in this thesis: energy efficiency technologies, financing, and capacities. The first part examines the energy efficiency approaches and their potential in the local industry. The second part deals with available financing methods, their specific characteristics and appropriateness for overcoming investment barriers in Kazakhstan. The third part of the master thesis concerns necessary project capacities. The application of the three elements for successful project implementation is described in the end.
A classical topic in the theory of random graphs is the probability of at least one isolated vertex in a given random graph. An isolated node has a huge impact on social networks which can be given by a random graph. We present a distribution on the number of isolated vertex using the probability generating function. We discuss the relationship between isolated edges and extended cut polynomials, extended matching polynomials using the principle of inclusion exclusion. We introduce an algorithm based on colored graphs for general graphs. We apply this to the components of a graph as well. Finally, we implement the idea on a special class of graphs like cycle, bipartite graph, path, and others. We discuss recursive procedure based on the analogous coloring rules for ladder and fan graphs.
In bioinformatics one important task is to distinguish between native and mirror protein models based on the structural information. This information can be obtained from the atomic coordinates of the protein backbone. This thesis tackles the problem of distinction of these conformations, looking at the statistics of the dihedral angles’ distribution regarding the protein backbone. This distribution is visualized in Ramachandran plots. By means of an interpretable machine learning classification method – Generalized Matrix Learning Vector Quantization – we are able to distinguish between native and mirror protein models with high accuracy. Further, the classifier model supplies supplementary information on the important distributional regions for distinction, like α-helices and β-strands.
Vicia faba leaves and calli were transformed using CRISPR Cas RNP. Two kinds of CPP fused SpyCas9 were used with sgRNA7, sgRNA5 or sgRNA13 targeting PDS exon 1, PDS exon 2 or MgCh exon 3 respectively. RNP were applied using high pressure spraying, biolistic delivery, incubation in RNP solution and infiltration of leaf tissue. A PCR and restriction enzyme based approach was used for detection of mutation. Screening of 679 E. coli colonies containing the cloned fragments resulted in detection of 14 mutations. Most of the 14 mutations were deletions of sizes 150, 500 or 730 bp. 5 out of the 14 mutations were point mutations located two to three bp upstream of PAM.
Neural networks have become one of the most powerful algorithms when it comes to learning from big data sets and it is used extensively for classification. But the deeper the network models, the lesser is the interpretability of such models. Although many methods exist to explain
the output of such networks, the lack of interpretability makes them black boxes. On the other hand, prototype-based machine learning algorithms are known to be interpretable and robust.
Therefore, the aim of this thesis is to find a way to interpret the functioning of the neural networks by introducing a prototype layer to the neural network architecture. This prototype layer will train alongside the neural network and help us interpret the model. We present architectures of neural networks consisting of autoencoders and prototypes that perform activity recognition from heart rates extracted from ECG signals. These prototypes represent the different activity groups that the heart rates belong to and thereby aid in interpretability.
This thesis focuses on the introduction of a process for the fracture toughness testing of epoxy resin systems, in the light of the linear elastic fracture mechanic approach. Based on the requirements of ISO 13586, SENB-specimen were designed and especially the precracking process was analysed and the tapping process was optimized by designing and testing a drop-weight device. After successful validating the test process using specimen made of Araldite LY556, the in uence of GNP loading on the fracture toughness was analysed. The pure epoxy showed a KIc of 0.73 MPap
m, being perfectly in line with the manufacturers datasheet. A peak in fracture toughness of 0.83 MPap
m was archived at 1 wt% and a loading rate of 10 mm/min, showing a decreasing trend as the loading is increased further. As the loading rate is increased, the fracture toughness reduces slightly for 0.5 wt% and 2 wt% GNP, but
drops signicantly for 1 wt% GNP obliterating the peak. The load vs. displacement curves showed quasi-brittle material behaviour. The fracture surfaces were analysed using SEM and while the neat resin did not show any features, did the reinforced samples show pattern of crack pinning in connection with bridging and pull-out. The resulting improvement is less signicant as observed by other researchers for larger GNPs. This is in line with the general idea, that small particles are not able to yield as high improvements, but the signicant decrease for higher loading rates is not observed or described so far. It is suspected that tests at lower loading rates (e.g. 1 or 0.5 mm/min) show an even higher fracture toughness.
Introducing natural adversarial observations to a Deep Reinforcement Learning agent for Atari Games
(2021)
Deep Learning methods are known to be vulnerable to adversarial attacks. Since Deep Reinforcement Learning agents are based on these methods, they are prone to tiny input data changes. Three methods for adversarial example generation will be introduced and applied to agents trained to play Atari games. The attacks target either single inputs or can be applied universally to all possible inputs of the agents. They were able to successfully shift the predictions towards a single action or to lower the agent’s confidence in certain actions, respectively. All proposed methods had a severe impact on the agent’s performance while producing invisible adversarial perturbations. Since natural-looking adversarial observations should be completely hidden from a human evaluator, the negative impact on the performance of the agents should additionally be undetectable. Several variants of the proposed methods were tested to fulfil all posed criteria. Overall, seven generated observations for two of three Atari games are classified as natural-looking adversarial observations.
Internationalization and business expansion appear to be the most challenging processes in business conduction today. Every step of the foreign market entry process and overseas operations establishment is full of obvious risks and hidden pitfalls. Theoretical background, multiplied with the vital practice, is playing the key role in such a complicated business process; such information can be used as a guideline by further market entrants and players. At present, Germany with its well-developed engineering industry represents a broad space for research of internationalization process in its different forms, as well as can show both successful and negative results of foreign market entries.
Glycans play an important role in the intracellular interactions of pathogenic bacteria. Pathogenic bacteria possess binding proteins capable of recognizing certain sugar motifs on other cells, which are found in glycan structures. Artificial carbohydrate synthesis allows scientists to recreate those sugar motifs in a rational, precise, and pure form. However, due to the high specificity of sugar-binding proteins, known as lectins, to glycan structures, methods for identifying suitable binding agents need to be developed. To tackle this hurdle, the Fraunhofer Institute for Cell Therapy and Immunology (Fraunhofer IZI) and the Max-Planck Institute of Colloids and Interfaces (MPIKG) developed a binding assay for the high throughput testing of sugar motifs that are presented on modular scaffolds formed by the assembly of four DNA strands into simple, branched DNA nanostructures. The first generation of this assay was used in combination with bacteria that express a fluorescent protein as a proof-of-concept. Here, the assay was optimized to be used with bacteria not possessing a marker gene for a fluorescent protein by staining their genomic DNA with SYBR® Green. For the binding assay, DNA nanostructures were combined with artificially synthesized mannose polymers, typical targets for many lectins on the surface of bacteria, presenting them in a defined constellation to bind bacteria strongly due to multivalent cooperativity. The testing of multiple mannose polymers identified monomeric mannose with a 5’-carbon linker and 1,2-linked dimeric mannose with linker as the best binding candidates for E. coli, presumably due to binding with the FimH protein on the surface. Despite similarities between the FimH proteins of E. coli and K. pneumoniae, binding was only observed between E. coli and the different sugar molecules on DNA structures. Furthermore, the degree of free movement seemed to affect the binding of mannose polymers to targeted proteins, since when utilizing a more flexible DNA nanostructure, an increase in binding could be observed. An alternative to the simple DNA nanostructures described above is the use of larger, more complex DNA origami structures consisting of several hundred strands. DNA origami structures are capable of carrying dozens of modifications at the same time. The results for the DNA origami structure showed a successful functionalization with up to 71 1,2-linked dimeric mannose with linker molecules. These results point towards a solution for the high-throughput analysis of potential binding agents for pathogenic bacteria e.g. as an alternative treatment for antibiotic-resistant.
Digital data is rising day by day and so is the need for intelligent, automated data processing in daily life. In addition to this, in machine learning, a secure and accurate way to classify data is important. This holds utmost importance in certain fields, e.g. in medical data analysis. Moreover, in order to avoid severe consequences, the accuracy and reliability of the classification are equally important. So if the classification is not reliable, instead of accepting the wrongly classified data point, it is better to reject such a data point. This can be done with the help of some strategies by using them on top of a trained model or including them directly in the objective function of the desired training model. We discuss such strategies and analyze the results on data sets in this thesis.
For the first time it was discovered that ultraviolet radiation with a wavelength of 200 to 400 nm (maximum 365 nm) radiated from a distance of 40 cm (intensity: 3500 mW/cm²) to PMMA altered its surface wettability as well as a roughness at the nanoscale that was observed with an atomic force microscope (AFM). The roughness rises and falls again in a short time ( 1-2days ) after 75 min and 180 min irradiation time. However , during the next 10 days roughness became stabilized and there was no influence of UV if PMMA was stored in air or in a Petri dish out of glass.
This master thesis was developed based on public information about Linde AG. It analyzed and evaluated macroeconomic factors influencing the pеrformance of the company. Microeconomic and macroeconomic indicators play the central role for the financial management of each global company. Thus, performance measurement is important for understanding the vаlue and extent of the environment. The study of the thesis aims at estimating the extent to which a company may opеrate on the global market and what factors contribute to its performance the most.
Firstly, the thesis examines theoretical background based on the previous researches. It defines the specific macroeconomic and microeconomic factors and their role in the company’s performance. Afterwards the thesis analyses Linde AG activities on domestic and foreign markets. The present structure, the current position in the markets and financial indicators are analyzed. The correlation and regression analysis were developed with the aim to find the links between the company’s performance and the macroeconomic environment. It is believed that inflation, exchange and interest rates as well as stock market index have a significant influence on the Linde’s performance.
The results showed that the indicators of inflation rate and stock market index play a significant role in the Linde’s performance. Thus, when it comes to exchanging rates, more data needs to be evaluated in order to derive concrete conclusions.
Purpose: The study is aimed to determine the Incentives for German SMEs to offshore their business activities in India and China.
Design: This study is based on quantitative approach. Primary and secondary data is being used in the study. The data was collected from individuals working in different SMEs in Germany, having relative offshoring experience. Theories from the articles, peer reviewed journal along with relevant books were consulted throughout the study.
Findings: The findingssuggest that the benefits and advantages of offshoring strategy in India and China are cost efficiency and technology. Moreover, the challenges that are being faced by the firms while executive offshoring strategy is cultural mix especially language/cultural barriers, security issues and loss of market performance.
Originality and Value: The study on incentives of German SMEs to offshore business activities in India and China enables me to understand why companies are interested in offshoring strategy in low cost countries for expanding their business while evaluating the challenges, merits and demerits of offshoring
Implementation of a customised business model for innovative engineering consultancy services
(2019)
Business development is vital for every organisation who intend to grow. It follows expansion through organic and inorganic means. Also, there are many innovative business styles which help organisations to expand. This thesis shows how engineering services organisation chose its form of business expansion
The following thesis explains how engineering service sector company uses its expertise to expand its business towards consultancy market with the demonstration of the real-life executed business model.
The thesis provides a solution for the following issues
1) What is the best in-house strategy to be developed for business expansion in the service industry?
2) How did the niche market experiences help for business expansion?