Search for patent analogues based on a comparison of key phrases

Abstract

Fomenkov S.A., Korobkin D.M., Korobkina V.S.

Incoming article date: 01.10.2024

This study describes approaches to automating full-text keyword search in the field of patent information. Automating search by key phrases (n-grams) is a significantly more difficult task than searching by individual words, and it additionally requires morphological and syntactic analysis of the text. To achieve this goal, the following tasks were solved: (a) the full-text search systems Apache Solr, Elasticsearch and ClickHouse were analyzed; (b) the architectures and basic capabilities of each system were compared; (c) search results were obtained in Apache Solr, Elasticsearch and ClickHouse on the same dataset. The following conclusions were drawn: (a) all the systems considered perform full-text keyword search; (b) Apache Solr is the system with the highest performance and also has very convenient functions; (c) Elasticsearch has a fast and powerful architecture; (d) ClickHouse has a high data processing speed.
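
To make the idea of comparing key phrases concrete, the following is a minimal Python sketch, not the authors' method. It treats key phrases as word n-grams and scores two texts by the Jaccard overlap of their n-gram sets. The function names `word_ngrams` and `phrase_overlap` are hypothetical. Lowercasing stands in for the morphological and syntactic analysis mentioned above, which a real system would need.

```python
import re
from typing import Set, Tuple


def word_ngrams(text: str, n: int = 2) -> Set[Tuple[str, ...]]:
    """Return the set of word n-grams in text.

    Assumption: tokens are lowercase ASCII letter/digit runs; no stemming
    or lemmatization is applied.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def phrase_overlap(a: str, b: str, n: int = 2) -> float:
    """Jaccard similarity between the n-gram sets of two texts."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a and not grams_b:
        return 0.0  # nothing to compare
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    query = "heat pipe cooling device"
    candidate = "cooling device with heat pipe"
    print(phrase_overlap(query, candidate))
```

In this sketch, two patent texts that share more two-word phrases receive a higher score. Systems such as Apache Solr or Elasticsearch do this kind of phrase matching at indexing and query time rather than by pairwise comparison in application code.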

Keywords: search, key phrases, patent, Apache Solr, Elasticsearch, ClickHouse