In this work, a new method for predicting the Xaa-proline (where Xaa is any amino acid) cis/trans isomerization was investigated. By extracting twelve structural features (real secondary structure, inside/outside classification, properties of the environment around proline and of proline itself), a support vector machine (SVM) based prediction approach was developed. The Java software Xaa-PIPT was developed for structural feature extraction. Based on 4397 prolines (2199 cis and 2198 trans) extracted from non-redundant, globular proteins, a classifier was trained using the radial basis function (RBF) kernel. In ten-fold cross-validation it achieved an accuracy of 70.0478 %, a Matthews correlation coefficient (MCC) of 0.4223, a sensitivity of 0.5433 and a specificity of 0.8576. Based on this classifier, a lightweight and easy-to-use Java software tool, called m Xaa-PIPT, was developed for the prediction of the Xaa-proline cis/trans isomerization. It was shown that there are correlations between the environment surrounding a proline and its isomerization state. m Xaa-PIPT can be used to evaluate low-resolution protein structures and theoretical models and to improve their quality by predicting the Xaa-proline isomerization.
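The four quality measures reported above are all derived from the confusion matrix of a binary classifier. The following minimal Java sketch shows how they relate to each other. It is an illustration only, not code from Xaa-PIPT: the class name ConfusionMatrixMetrics, its method names, and the counts in main are hypothetical, and which isomer is treated as the positive class is an assumption that the abstract does not state.

```java
// Illustrative sketch: standard binary-classification metrics computed from
// confusion-matrix counts. Names and example counts are hypothetical and are
// not taken from the thesis software.
final class ConfusionMatrixMetrics {

    private ConfusionMatrixMetrics() {
    }

    // Fraction of all predictions that are correct.
    static double accuracy(long tp, long tn, long fp, long fn) {
        return (double) (tp + tn) / (tp + tn + fp + fn);
    }

    // True positive rate: correctly predicted positives among all positives.
    static double sensitivity(long tp, long fn) {
        return (double) tp / (tp + fn);
    }

    // True negative rate: correctly predicted negatives among all negatives.
    static double specificity(long tn, long fp) {
        return (double) tn / (tn + fp);
    }

    // Matthews correlation coefficient; returns 0 when the denominator is 0.
    static double mcc(long tp, long tn, long fp, long fn) {
        double numerator = (double) tp * tn - (double) fp * fn;
        double denominator = Math.sqrt(
                (double) (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn));
        return denominator == 0.0 ? 0.0 : numerator / denominator;
    }

    public static void main(String[] args) {
        // Made-up counts for demonstration, not results from the thesis.
        long tp = 60, fn = 40, tn = 80, fp = 20;
        System.out.printf("accuracy    = %.4f%n", accuracy(tp, tn, fp, fn));
        System.out.printf("sensitivity = %.4f%n", sensitivity(tp, fn));
        System.out.printf("specificity = %.4f%n", specificity(tn, fp));
        System.out.printf("MCC         = %.4f%n", mcc(tp, tn, fp, fn));
    }
}
```

On a class-balanced data set such as the one described above, accuracy is approximately the mean of sensitivity and specificity, which is consistent with the reported values.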
As widely discussed in the literature, spatial patterns of amino acids, so-called structural motifs, play an important role in protein function. The functionally relevant part of a protein often lies in an evolutionarily highly conserved spatial arrangement of only a few amino acids, which are held tightly in place by the rest of the structure. In general, these motifs can mediate various functional interactions, such as DNA/RNA targeting and binding, ligand interactions, substrate catalysis, and stabilization of the protein structure.
Hence, characterizing and identifying such conserved structural motifs can contribute to the understanding of structure-function relationships in diverse protein families. For this reason, and because of the rapidly increasing number of solved protein structures, it is highly desirable to identify, understand, and search for structurally scattered amino acid motifs. The aim of this work was the development and implementation of a matching algorithm to search for such small structural motifs in large sets of target structures. Furthermore, motif matches were extensively analyzed, statistically assessed, and functionally classified. Following a novel approach, hierarchical clustering was combined with functional classification and used to deduce evolutionary structure-function relationships. The proposed methods were combined and implemented in a feature-rich and easy-to-use command-line software tool, which is freely available and contributes to the field of structural bioinformatics research.
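To make the idea of searching for a small structural motif more concrete, the following Java sketch compares a motif with one candidate set of residues by their pairwise distances. It is a simplified illustration and not the matching algorithm developed in the thesis, which the abstract does not describe in detail. The class name MotifDistanceCheck, the representation of each residue as a single 3D point, and the tolerance parameter are all assumptions made for this example.

```java
// Illustrative sketch only: a distance-based comparison between a motif and
// one candidate, each given as an ordered list of 3D points (one point per
// residue). This is NOT the thesis's matching algorithm; names are hypothetical.
final class MotifDistanceCheck {

    private MotifDistanceCheck() {
    }

    // Euclidean distance between two 3D points.
    static double distance(double[] a, double[] b) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        double dz = a[2] - b[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    // Largest absolute difference between corresponding pairwise distances
    // in the motif and in the candidate.
    static double maxDistanceDeviation(double[][] motif, double[][] candidate) {
        if (motif.length != candidate.length) {
            throw new IllegalArgumentException("motif and candidate differ in size");
        }
        double max = 0.0;
        for (int i = 0; i < motif.length; i++) {
            for (int j = i + 1; j < motif.length; j++) {
                double deviation = Math.abs(
                        distance(motif[i], motif[j])
                                - distance(candidate[i], candidate[j]));
                max = Math.max(max, deviation);
            }
        }
        return max;
    }

    // A candidate is accepted if no pairwise distance deviates by more than
    // the given tolerance (same length unit as the coordinates).
    static boolean matches(double[][] motif, double[][] candidate, double tolerance) {
        return maxDistanceDeviation(motif, candidate) <= tolerance;
    }
}
```

Because pairwise distances do not change under rotation and translation, this check needs no prior superposition of the structures. It cannot, however, distinguish a motif from its mirror image, and it assumes that the residue correspondence between motif and candidate is already known.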