Document Type
- Bachelor Thesis (2)
Year of publication
- 2012 (2)
Language
- English (2)
Keywords
- cis-trans-Isomerie (2)
In this work, a new method for predicting the cis/trans isomerization of Xaa-proline peptide bonds (where Xaa is any amino acid) was investigated. A support vector machine (SVM) based prediction approach was developed from twelve extracted structural features (real secondary structure, inside/outside classification, and properties of proline itself and of the environment around it). The Java software Xaa-PIPT was developed for structural feature extraction. A classifier with the radial basis function (RBF) kernel was trained on 4397 prolines (2199 cis and 2198 trans) extracted from non-redundant, globular proteins. In ten-fold cross-validation it achieved an accuracy of 70.0478 %, a Matthews correlation coefficient (MCC) of 0.4223, a sensitivity of 0.5433 and a specificity of 0.8576. Based on this classifier, a lightweight and easy-to-use Java software tool, called m Xaa-PIPT, was developed for predicting the Xaa-proline cis/trans isomerization. It was shown that the environment surrounding a proline correlates with its isomerization state. m Xaa-PIPT can be used to evaluate low-resolution protein structures and theoretical models, and to improve their quality by predicting the Xaa-proline isomerization.
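The four quality measures quoted in the abstract above all derive from a binary confusion matrix. The following Java sketch shows the standard formulas only; it is not taken from Xaa-PIPT or m Xaa-PIPT. The class name `ConfusionMetrics`, the choice of cis as the positive class, and the counts in `main` are illustrative assumptions and are not results from the thesis.

```java
// Minimal sketch of standard binary-classification metrics.
// Assumption: "positive" stands for cis, "negative" for trans; the abstract does not state this.
public final class ConfusionMetrics {
    private ConfusionMetrics() {}

    // Fraction of all samples that were classified correctly.
    public static double accuracy(long tp, long fn, long tn, long fp) {
        return (double) (tp + tn) / (tp + fn + tn + fp);
    }

    // True positive rate: correctly predicted positives over all actual positives.
    public static double sensitivity(long tp, long fn) {
        return (double) tp / (tp + fn);
    }

    // True negative rate: correctly predicted negatives over all actual negatives.
    public static double specificity(long tn, long fp) {
        return (double) tn / (tn + fp);
    }

    // Matthews correlation coefficient; returns 0 when the denominator vanishes.
    public static double mcc(long tp, long fn, long tn, long fp) {
        double numerator = (double) tp * tn - (double) fp * fn;
        double denominator = Math.sqrt(
                (double) (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn));
        return denominator == 0.0 ? 0.0 : numerator / denominator;
    }

    public static void main(String[] args) {
        // Hypothetical counts chosen for illustration, not data from the thesis.
        long tp = 50, fn = 50, tn = 80, fp = 20;
        System.out.printf("accuracy=%.4f sensitivity=%.4f specificity=%.4f mcc=%.4f%n",
                accuracy(tp, fn, tn, fp), sensitivity(tp, fn),
                specificity(tn, fp), mcc(tp, fn, tn, fp));
    }
}
```

On a nearly balanced data set such as the one described, a sensitivity well below the specificity means that the classifier recognizes one class much more reliably than the other, which the accuracy alone would hide.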
This bachelor thesis deals with the cis/trans isomerization of Xaa-Pro peptide bonds (Xaa = any amino acid), its quantitative survey, and the selection of 3D structure information for prediction with a support vector machine (SVM). The occurrence of cis, trans and cis/trans conformations in membrane proteins is quantified and evaluated. The 3D structure information comprises 12 features that describe proline itself and the amino acids around it. These include the inside/outside classification, the real secondary structure, an energy consideration, and five further properties of the amino acids occurring within a defined radius of the proline. From this information, a data set was created for the SVM, which is used to predict both unknown and known Xaa-Pro isomerization states. The analysis methods were implemented in the platform-independent programming language Java. Two programs emerged from the work: Xaa-PIPT, for the quantitative detection and extraction of structural information, and m Xaa-PIPT, for the pure prediction of Xaa-Pro isomerism in protein structures. 389 membrane proteins from the PDB (Protein Data Bank) served as the basis. The data were also statistically analysed and evaluated.
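Several of the features mentioned above depend on the amino acids found within a defined radius of the proline. A minimal Java sketch of such a neighbourhood query follows. It is not the Xaa-PIPT implementation: the class `RadiusNeighborhood`, the `Residue` type, the use of a single representative coordinate per residue, and the radius values are all hypothetical, since the abstract specifies none of them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: collect residues lying within a given radius of a central residue.
public final class RadiusNeighborhood {
    private RadiusNeighborhood() {}

    // Assumption: each residue is reduced to one representative 3D point (in angstroms).
    public static final class Residue {
        public final String name;
        public final double x, y, z;

        public Residue(String name, double x, double y, double z) {
            this.name = name;
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    // Returns all candidates (except the center itself) whose distance to the center is <= radius.
    public static List<Residue> within(Residue center, List<Residue> candidates, double radius) {
        double radiusSquared = radius * radius; // compare squared distances to avoid a square root
        List<Residue> result = new ArrayList<>();
        for (Residue r : candidates) {
            if (r == center) {
                continue;
            }
            double dx = r.x - center.x;
            double dy = r.y - center.y;
            double dz = r.z - center.z;
            if (dx * dx + dy * dy + dz * dz <= radiusSquared) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up coordinates for illustration only.
        Residue pro = new Residue("PRO", 0, 0, 0);
        List<Residue> all = List.of(
                pro,
                new Residue("ALA", 3, 0, 0),
                new Residue("GLY", 0, 4, 3),
                new Residue("LEU", 10, 0, 0));
        for (Residue r : within(pro, all, 5.0)) {
            System.out.println(r.name);
        }
    }
}
```

Properties of the returned residues (for example counts per amino acid class) could then be aggregated into feature values for the SVM data set; which properties the thesis actually uses is not stated in the abstract.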