  • Preparing data for event clustering in information security logs

    The article shows that preparing data for subsequent use in algorithms plays an important role and deserves attention. Raw data is often corrupted and unreliable: it may contain out-of-range values (noise), outliers, and gaps (missing values). Data preparation is a very time-consuming, iterative process that takes up to 80% of all resource and time costs in the life cycle and includes the following tasks for processing the initial ("raw") data: data sampling, data cleaning, feature generation, integration, and formatting. Data exploration consists of the following steps: summarizing data, grouping data, and exploring relationships between different attributes. Cluster analysis is a data analysis technique used to find groups that share common attributes (also called grouping). An algorithm for preparing data from information security log events for subsequent clustering is given.

    Keywords: data, data clustering, events, information security log, algorithm, Data Mining, Data Preparation, dataset, Machine Learning
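
The cleaning and formatting tasks named in the abstract can be illustrated with a minimal sketch. It is not the algorithm from the article, which the abstract does not spell out. The event fields (`hour`, `bytes`, `failed_logins`), the sample records, the median imputation, the 1.5 x IQR outlier rule, and the min-max scaling are all assumptions chosen for illustration.

```python
from statistics import median, quantiles

# Hypothetical raw log events; field names and values are illustrative only.
RAW_EVENTS = [
    {"hour": 9, "bytes": 1200, "failed_logins": 0},
    {"hour": 10, "bytes": None, "failed_logins": 1},     # gap (missing value)
    {"hour": 27, "bytes": 900, "failed_logins": 0},      # out-of-range hour (noise)
    {"hour": 11, "bytes": 1500, "failed_logins": 0},
    {"hour": 3, "bytes": 250000, "failed_logins": 12},   # outlier candidate
    {"hour": 14, "bytes": 1100, "failed_logins": None},  # gap (missing value)
    {"hour": 15, "bytes": 1300, "failed_logins": 1},
    {"hour": 16, "bytes": 1000, "failed_logins": 0},
]
FIELDS = ("hour", "bytes", "failed_logins")


def drop_out_of_range(events):
    """Data cleaning: discard events whose hour is outside 0..23."""
    return [e for e in events if e["hour"] is None or 0 <= e["hour"] <= 23]


def impute_missing(events):
    """Data cleaning: fill gaps with the per-field median of observed values."""
    filled = [dict(e) for e in events]
    for field in FIELDS:
        observed = [e[field] for e in events if e[field] is not None]
        fill_value = median(observed)
        for e in filled:
            if e[field] is None:
                e[field] = fill_value
    return filled


def flag_outliers(events, field, k=1.5):
    """Mark events whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([e[field] for e in events], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [dict(e, outlier=not (low <= e[field] <= high)) for e in events]


def min_max_scale(events):
    """Formatting: scale each field to [0, 1] so no field dominates distances."""
    bounds = {f: (min(e[f] for e in events), max(e[f] for e in events))
              for f in FIELDS}
    rows = []
    for e in events:
        row = []
        for f in FIELDS:
            low, high = bounds[f]
            row.append(0.0 if high == low else (e[f] - low) / (high - low))
        rows.append(row)
    return rows


def prepare(events):
    """Run the sketch end to end: clean, flag outliers, scale the rest."""
    cleaned = impute_missing(drop_out_of_range(events))
    flagged = flag_outliers(cleaned, "bytes")
    kept = [e for e in flagged if not e["outlier"]]
    return flagged, min_max_scale(kept)


flagged_events, feature_matrix = prepare(RAW_EVENTS)
```

Here `feature_matrix` is a list of equal-length numeric rows that a clustering routine could consume. Outliers are flagged and set aside rather than deleted, since in security logs an extreme value may be exactly the event worth reviewing; whether to exclude them from clustering is a design choice that depends on the data.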