  • An overview of machine learning-based techniques for detecting outliers in data

    Outlier detection is an important area of data research in many fields. This study aims to provide a non-exhaustive overview of outlier detection methods based on three families of machine learning techniques: supervised, unsupervised, and semi-supervised. The article outlines how particular methods are applied, along with their advantages and limitations. It finds that no single outlier detection method suits all kinds of data. The choice of method for a given study should therefore rest on an analysis of that method's advantages and limitations, and must take into account the available computing power and the characteristics of the available data, including whether they are already classified into outliers and normal data, as well as their volume.

    Keywords: outliers, machine learning, outlier detection, data analysis, data mining, big data, principal component analysis, regression, isolation forest, support vector machine