The endogenous steroid hormone 17β-estradiol (E2) is a central player in a wide range of physiological and behavioral processes and diseases in vertebrates. As a consequence, it is a main target for molecular design and drug discovery efforts in medicine and environmental sciences, which require in-depth knowledge of protein-ligand binding processes. This work develops a bioinformatic framework based on local and global structure similarity for the characterization of E2-protein interactions in all 35 publicly available three-dimensional structures of estradiol-protein complexes. Subsequently, it uses the resulting data to identify four geometrically conserved estradiol-binding residue motifs, against which the Protein Data Bank is queried. As a result of this database query, 15 hits present in seven protein structures are found. Five of these structures do not contain E2 as a ligand and had thus not been included in this work’s initial data set. One of these newly detected structures is structurally and functionally dissimilar to, as well as evolutionarily distant from, all other proteins analyzed in this work. Nevertheless, the ability of this protein to actually bind estradiol must be further analyzed. Finally, geometrically conserved E2-protein interactions are identified and a new research direction using these conserved interaction ensembles for the detection of novel estradiol targets is proposed.
Proteins are macromolecules that consist of linearly linked amino acids. They are essential elements in various metabolic processes. The three-dimensional structure of a protein is determined by the order of its amino acids, also referred to as the protein sequence. This conformation corresponds to the structural state in which the protein is functionally active. However, the relationships between protein sequence, structure and function are not yet fully understood. Additionally, information about structural properties or even the entire protein structure is crucial for understanding the dynamics that define protein functionality and mechanisms. From this, the role of a protein in its molecular context can be described closely. For instance, interactions can be investigated and understood as a dynamic biological network that is sensitive to alterations, i.e. changes caused by diseases. Such knowledge can aid in drug design, where compounds need to be specifically tailored and adjusted to their molecular targets. Protein energy profile-based methods can be applied to investigate protein structures with respect to dynamics and alterations. The publications included in this work discuss the general scientific potential of energy profile-based techniques and algorithms. On the one hand, changes in stability caused by protein mutations and protein-ligand interactions are discussed in the context of energy profiles. On the other hand, energetic relations to protein sequence, structure and function are elucidated in detail. Finally, the presented discussions focus on recent enhancements of the eProS (energy profile suite) database and toolbox. eProS freely provides all elucidated methodologies to the scientific community. Thus, one can address biological questions with the presented methods at hand. Additionally, eProS provides annotations linked to external databases. This ensures a broad view of biological data and information.
In particular, energetic characteristics can be identified which contribute to a protein’s structure and function.
The automatic comparison of RNA/DNA, or more generally nucleotide, sequences is a complex task that requires careful design because of its computational complexity. While alignment-based models suffer from high computational cost, alignment-free models have to deal with appropriate data preprocessing and consistently designed mathematical data comparison. This work deals with the latter strategy. In particular, a systematic categorization is proposed, which emphasizes two key concepts that have to be combined for a successful comparison analysis: 1) the data transformation, comprising adequate mathematical sequence coding and feature extraction, and 2) the subsequent (dis-)similarity evaluation of the transformed data by means of problem-specific but mathematically consistent proximity measures. Approaches from different categories of the introduced scheme are examined with regard to their suitability to distinguish natural RNA virus sequences from artificially generated ones that preserve biological features to varying degrees. The challenge in this application is the limited additional biological information available, such that the decision has to be made solely on the basis of the sequences and their inherent structural characteristics. To address this, the present work focuses on interpretable, dissimilarity-based classification models of machine learning, namely variants of Learning Vector Quantizers. These methods are known to be robust and highly interpretable, and therefore allow the applied data transformations to be evaluated together with the chosen proximity measure with respect to the given discrimination task. First analysis results are provided and discussed, serving as a starting point for more in-depth analysis of this problem in the future.
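The two-step scheme named in the abstract above (transform the sequences, then compare the transformed data) can be illustrated with a minimal sketch. It assumes k-mer frequency vectors as the data transformation and the Euclidean distance as the proximity measure. Both are common alignment-free choices picked here for illustration; the abstract does not say which codings or measures the thesis evaluates. The function names `kmer_profile` and `euclidean` are hypothetical.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=3, alphabet="ACGU"):
    """Step 1 (transformation): relative k-mer frequencies in a fixed order.

    Assumption for this sketch: DNA input is mapped to the RNA alphabet,
    and k-mers containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def euclidean(u, v):
    """Step 2 (proximity): Euclidean distance between two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Usage: compare two toy sequences without aligning them.
a = kmer_profile("AUGGCUACGUAGCUAGCUAG")
b = kmer_profile("AUGGCUACGAAGCUAGCUUG")
print(round(euclidean(a, b), 4))
```

Because the profile has a fixed length of 4^k regardless of sequence length, it can be fed directly to a vector-based, prototype-based classifier such as the Learning Vector Quantizers mentioned in the abstract.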
Obesity is a major public health issue in many countries, and its development leads to many severe conditions. Adipose tissue (AT) is commonly called fat; in males, visceral adipose tissue (VAT) is dominant. Estrogens play an important role in many pathological processes.
In this study, ER-beta, one of the subtypes of the estrogen receptor, is activated in VAT by treatment with KB, a specific ligand.
I investigated the metabolic effect of KB treatment on VAT using bioinformatics methods.
In this thesis, I applied several bioinformatics methods, such as differential gene expression analysis, pathway analysis, RNA splicing analysis and SNP calling, to predict the effect of KB treatment on VAT. A list of candidate genes, pathways and SNPs was identified, which could provide clues to the genetic mechanism underlying the KB treatment effect. The results show that KB treatment has a significant effect on VAT.
In this work, a second version of the Python implementation of the Probabilistic Regulation of Metabolism (PROM) algorithm was created and applied to the metabolic model iSynCJ816 of the organism Synechocystis sp. PCC 6803. A cross-validation was performed to determine the minimal amount of expression data needed to produce meaningful results with the PROM algorithm. The failed reproduction of the results of a method called Integrated and Deduced Regulation of Metabolism (IDREAM) is documented, and causes of the failure are discussed.
In bioinformatics, one important task is to distinguish between native and mirror protein models based on structural information. This information can be obtained from the atomic coordinates of the protein backbone. This thesis tackles the problem of distinguishing these conformations by looking at the statistics of the distribution of the protein backbone's dihedral angles. This distribution is visualized in Ramachandran plots. By means of an interpretable machine learning classification method, Generalized Matrix Learning Vector Quantization, we are able to distinguish between native and mirror protein models with high accuracy. Further, the classifier model supplies supplementary information on the distributional regions that are important for the distinction, such as α-helices and β-strands.
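To illustrate why dihedral angles carry the native/mirror signal, the following minimal sketch computes a signed dihedral angle from four points and shows that reflecting the coordinates flips its sign. It uses the standard atan2 formulation of the dihedral angle. The helper names (`dihedral`, `mirror`) and the toy coordinates are assumptions made for illustration and are not taken from the thesis.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = _dot(b2, _cross(n1, n2))
    x = math.sqrt(_dot(b2, b2)) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def mirror(points):
    """Reflect points through the yz-plane (x -> -x)."""
    return [(-x, y, z) for (x, y, z) in points]


# Toy coordinates (hypothetical, not real backbone atoms).
pts = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(dihedral(*pts))          # approximately +90 degrees
print(dihedral(*mirror(pts)))  # approximately -90 degrees
```

For backbone atoms, φ is the dihedral over C(i-1), N(i), Cα(i), C(i) and ψ the one over N(i), Cα(i), C(i), N(i+1). Since a reflection negates every dihedral angle, a mirror model's Ramachandran distribution is the native one inverted through the origin, which is the kind of distributional difference a classifier can pick up.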