
Large data deduplication using databases

Abstract

Plotnikova N.P., Kevbrin V.A., Bolohov D.A.

Incoming article date: 22.07.2023

Today, vast amounts of heterogeneous information pass through computing systems. An endless stream of data must be analyzed with limited resources, which requires the information to be structured first. Deduplication is one of the steps in putting data in order. This article discusses a method of removing duplicates using databases and analyzes the results of testing several types of database management systems under different parameter sets.
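
As an illustration of the general idea of database-backed deduplication, the following is a minimal sketch and not the authors' implementation. It assumes text rows, exact-match duplicates, an in-memory SQLite database, and a SHA-256 digest as the uniqueness key; the table name `docs`, its columns, and the function `deduplicate` are hypothetical names chosen for this example.

```python
import hashlib
import sqlite3


def deduplicate(rows):
    """Return the distinct text rows in first-seen order, using SQLite.

    Assumption: two rows are duplicates only if their text is identical.
    """
    conn = sqlite3.connect(":memory:")
    # The UNIQUE constraint on the digest lets the database reject duplicates.
    conn.execute(
        "CREATE TABLE docs ("
        "id INTEGER PRIMARY KEY, digest TEXT UNIQUE, body TEXT)"
    )
    for body in rows:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        # INSERT OR IGNORE silently skips a row whose digest already exists.
        conn.execute(
            "INSERT OR IGNORE INTO docs (digest, body) VALUES (?, ?)",
            (digest, body),
        )
    result = [row[0] for row in conn.execute("SELECT body FROM docs ORDER BY id")]
    conn.close()
    return result
```

Storing a fixed-length digest under a unique index, rather than indexing the full text, keeps the index small for long rows; which key and which database management system perform best is the kind of question the article's testing addresses.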

Keywords: deduplication, database, field, row, text data, artificial neural network, sets, query, software, unstructured data