Community-acquired pneumonia (CAP) is a very common infectious disease that is sometimes lethal, and it is therefore associated with high diagnosis and treatment costs. To reduce these health care costs, diagnosis and treatment must become cheaper without any loss in predictive accuracy. One effective way to achieve this would be to identify easily detectable and highly specific transcriptomic markers, which could reduce the laboratory workload while possibly improving diagnostic capability.
Transcriptomic whole-blood data derived from the PROGRESS study were combined with several documented features such as age, smoking status, and the SOFA score. The analysis pipeline comprised self-organizing maps for dimensionality and noise reduction, as well as diffusion pseudotime (DPT). Pseudotime made it possible to model a disease course of CAP in which each sample represents a state or time point of the modelled course. Combining both methods yielded a proposed disease course of CAP described by 1476 marker genes. An additional gene set analysis provided information about the immune-related functions of these marker genes.
Differentiation is ubiquitous in mathematics and especially in machine learning, where gradient-based models depend on it. Calculating gradients can be complex and may require handling multiple variables. Supervised Learning Vector Quantization models, which are used for classification tasks, also use stochastic gradient descent to optimize their cost functions. There are various methods for calculating these gradients or derivatives, namely manual, numeric, symbolic, and automatic differentiation. In this thesis, we evaluate each of these methods for calculating derivatives and compare their use in variants of the Generalized Learning Vector Quantization algorithm.
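As a rough illustration of how three of these approaches differ, the following minimal sketch differentiates a hypothetical test function f(x) = x^3 + 2x manually, numerically (central differences), and automatically (forward mode with dual numbers). The function, the class name `Dual`, and the step size are assumptions made for this example and are not taken from the thesis; symbolic differentiation is omitted because it would need a computer algebra library.

```python
class Dual:
    """Dual number value + deriv*eps with eps**2 == 0 (forward-mode AD)."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def f(x):
    # Hypothetical test function f(x) = x^3 + 2x; works for floats and Duals.
    return x * x * x + 2 * x


def manual_derivative(x):
    # Derivative worked out by hand: f'(x) = 3x^2 + 2.
    return 3 * x * x + 2


def numeric_derivative(func, x, h=1e-5):
    # Central difference; approximate, with an error that depends on h.
    return (func(x + h) - func(x - h)) / (2 * h)


def automatic_derivative(func, x):
    # Seed the derivative part with 1 and read it off the result.
    return func(Dual(x, 1.0)).deriv
```

Under these assumptions, the manual and automatic variants agree to machine precision at a given point, while the numeric variant only approximates the same value.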
In this work, a transgenic zebrafish line that expresses the fluorophore dsRed under the endogenous zebrafish cochlin promoter is to be established using the CRISPR/Cas9 system. dsRed was cloned into a pBluescript vector, followed by cloning of the cochlin locus into the same vector. This bait construct was then to be microinjected into wild-type AB zebrafish embryos. The microinjection of Cas9 mRNA, single guide RNA, and a bait construct was practiced on the tyrosinase gene, which was disrupted using CRISPR/Cas9.
We investigate the folding and thermodynamic stability of a tertiary contact of baker's yeast ribosomal ribonucleic acid (rRNA), which is thought to be essential for the maturation process of ribosomes in eukaryotes at lower temperatures [1]. Ribosomes are cellular machines essential for all living organisms. RNA is at the center of these machines and is responsible for the translation of genetic information into proteins [2,3]. The rRNA tertiary contact of interest was discovered only recently in Zurich by the research group of Vikram Govind Panse. Gerhardy et al. [1] showed in vitro that, within the 60S pre-ribosome and under defined metal ion concentrations, the tertiary contact becomes visible between a GAAA tetraloop and a kissing loop motif. Our aim is to understand this RNA structure, especially the formation of the rRNA tertiary contact, in terms of thermodynamics and kinetics under various experimental conditions, such as temperature and the concentration of the metal ions K(I), Na(I), and Mg(II). To this end, we use optical methods such as UV/VIS spectroscopy and ensemble Förster (fluorescence) resonance energy transfer (FRET) folding studies. Our findings will help to further characterize this newly discovered ribosomal RNA contact and to elucidate its function within the ribosomal maturation process.
Connectomics, a relatively new research field in the neurosciences, aims to fully understand and map the neural circuits and fine neuronal structures of the nervous system in a variety of organisms. This detailed information will provide insight into how our brain is influenced by different genetic and psychiatric diseases, how memory traces are stored, and how ageing affects brain structure. It is beyond question that new methods for data acquisition will produce large amounts of neuronal image data. This data will exceed the zettabyte range and is impossible to annotate manually for visualization and analysis. Machine learning algorithms, and especially deep convolutional neural networks, are now heavily used in medical imaging and computer vision, which opens up the opportunity to design fully automated pipelines for image analysis. This work presents a new automated workflow with three major parts: image processing using consecutive deep convolutional networks, a pixel-grouping step called connected components, and 3D visualization via neuroglancer, in order to achieve a dense three-dimensional reconstruction of neurons from EM image data.
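The pixel-grouping step can be illustrated with a minimal connected-components sketch. It assumes a binary 2D mask and 4-connectivity, both chosen for brevity; the thesis works with EM image data, and its actual connectivity, dimensionality, and implementation are not stated in the abstract. The function name `label_components` is hypothetical.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions of a binary 2D grid.

    Returns (labels, count), where labels has the same shape as mask,
    background cells are 0, and regions are numbered 1..count.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue  # background or already labelled
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            # Breadth-first flood fill of the current region.
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

In a segmentation pipeline, a step of this kind would turn a per-pixel foreground prediction into separate object identifiers; for volumetric data the neighbourhood would be extended to the third axis.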
Biological ammonium oxidation is a central component of the global nitrogen cycle. Given the enormous quantities of anthropogenic nitrogen in the environment, removing reactive nitrogen is in the interest of the environment and public health. This thesis investigates conditions for anaerobic ammonium oxidation with nitrate in an anammox reactor. Two laboratory reactors that contained ammonium and nitrate as the sole electron donors and acceptors were operated and monitored for a total of 116 days. In addition, batch cultures were grown with cells from one reactor, and their gas composition was examined as a function of different properties. A range of analytical quantification methods was used, and it was shown that degradation takes place under these conditions.
Current research on this reaction is sparse, which lends this bachelor thesis its relevance.
A common updating rule for the k-means and Neural Gas algorithms can be obtained by using a generalized Expectation Maximization method. This result is used to derive two variants of these methods. The use of a similarity measure, specifically the Gaussian function, provides another clustering alternative to the aforementioned methods. The main benefit of the Gaussian function is that it inherently looks for a common cluster center for similar data points (depending on the value of the parameter s). Across different experiments, we report similar behaviour of the batch methods and the proposed variants. We also show some useful results for the "alternative" similarity method, specifically when nothing is known about the number of clusters in the data sets.
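One well-known way to write a shared batch update for both algorithms is as a weighted mean, where only the neighborhood weighting differs: a hard winner-takes-all weight for k-means and a rank-based exponential weight for Neural Gas. The sketch below follows that textbook form and is not necessarily the derivation used in the thesis; the function names and the parameter `lam` are assumptions for this example.

```python
import math


def batch_update(data, prototypes, neighborhood):
    """One batch step: each prototype becomes a weighted mean of the data.

    neighborhood(rank) gives the weight of a data point for the prototype
    that is the rank-th closest to it (rank 0 is the winner).
    """
    dim = len(data[0])
    sums = [[0.0] * dim for _ in prototypes]
    weights = [0.0] * len(prototypes)
    for x in data:
        dists = [sum((a - b) ** 2 for a, b in zip(x, w)) for w in prototypes]
        order = sorted(range(len(prototypes)), key=dists.__getitem__)
        for rank, k in enumerate(order):
            h = neighborhood(rank)
            weights[k] += h
            for d in range(dim):
                sums[k][d] += h * x[d]
    # Prototypes that received no weight keep their previous position.
    return [
        [s / weights[k] for s in sums[k]] if weights[k] > 0 else list(prototypes[k])
        for k in range(len(prototypes))
    ]


def kmeans_neighborhood(rank):
    # Only the closest prototype is updated by a data point.
    return 1.0 if rank == 0 else 0.0


def neural_gas_neighborhood(lam):
    # All prototypes are updated, with weights decaying in the rank.
    return lambda rank: math.exp(-rank / lam)
```

Calling `batch_update(data, prototypes, kmeans_neighborhood)` gives a plain batch k-means step, while passing `neural_gas_neighborhood(lam)` gives a batch Neural Gas step; as `lam` shrinks towards zero the second approaches the first.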
The objective of this Bachelor project is to create a tool that supports forensic investigators during IT forensic interventions. It uses Kismet as the base program and adds functionality to it via the plugin interface. The thesis explains how to install the plugin and how it works, and gives a recommendation on how to use it. To convey the underlying basics, an introduction to WLAN and Bluetooth is given. The tests performed with the new plugin are described together with their results. Based on these results, it is briefly discussed why the tool is suitable for locating Wi-Fi devices, especially access points, but not Bluetooth devices. Finally, a few ideas on how to improve the tool and what could be researched further in this area are provided.
Sequences are an important data structure in molecular biology, but unfortunately it is difficult for most machine learning algorithms to handle them, as they rely on vectorial data. Recent approaches include methods that rely on proximity data, such as median and relational Learning Vector Quantization. However, many of them are limited in the size of the data they are able to handle. A standard method to generate vectorial features for sequence data does not exist yet. Consequently, a way to make sequence data accessible to preferably interpretable machine learning algorithms needs to be found. This thesis will therefore investigate a new approach called the Sensor Response Principle, which is being adapted to protein sequences. Accordingly, sequence similarity is measured via pairwise sequence alignments with different sequence alignment algorithms and various substitution matrices. The measurements are then used as input for learning with the Generalized Learning Vector Quantization algorithm. A special focus lies on sequence length variability as it is suspected to affect the sequence alignment score and therefore the discriminative quality of the generated feature vectors. Specific datasets were generated from the Pfam protein family database to address this question. Further, the impact of the number of references and choice of substitution matrices is examined.
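The idea of turning alignment scores against reference sequences into a feature vector can be sketched as follows. This is a simplified illustration only: it uses a plain Needleman-Wunsch global alignment with hypothetical match, mismatch, and gap scores instead of a real substitution matrix, and the function names are assumptions rather than the thesis's implementation.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with linear gap costs."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def response_vector(sequence, references, **scoring):
    """Feature vector: one alignment score per reference sequence."""
    return [global_alignment_score(sequence, ref, **scoring)
            for ref in references]
```

With this scoring, a sequence scores highest against an identical reference, and a length difference lowers the score through gap penalties, which hints at why sequence length variability may influence the resulting features. The vectors produced this way could then be passed to any vector-based classifier.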
In the field of satellites, it is common practice to combine multiple ground stations into one network in order to increase communication times with satellites. This work focuses on TIM, an international academic collaborative project. Important criteria for this project are elaborated and used to evaluate existing ground station networks. The work concludes that no appropriate solution is available for this specific use case and proposes a solution of its own. The proposed ground station network software is then described and evaluated.