We use machine learning for the selection and classification of single-molecule trajectories to replace the commonly used user-dependent sorting algorithms. Measured fluorescence time series of labelled single molecules must be sorted into 'good' and 'bad' molecules before further kinetic and thermodynamic analysis.
Currently, the data are processed, sorted and analysed mainly with laboratory-specific programs.
Although freely available programs exist for processing single-molecule Förster resonance energy transfer (smFRET) data, they either do not offer 'molecular sorting' or implement it purely empirically. Only recently have new approaches emerged that address this problem by means of machine learning (ML). Here, we describe a sound terminology for molecular sorting of smFRET data and present an efficient workflow for manual annotation followed by training of the ML algorithm. Descriptive statistics of our generated dataset are provided and will serve as the basis for supervised ML-based molecular sorting algorithms yet to be developed.