Commit Graph

1 Commits

Author SHA1 Message Date
Sebastian Sdorra
0a26741ebd One index per type and parallel indexing (#1781)
Before this change the search uses a single index which distinguishes types (repositories, users, etc.) with a field (_type).
But it has turned out that this could lead to problems, in particular if different types have the same field and uses different analyzers for those fields. The following links show even more problems of a combined index:

    https://www.elastic.co/blog/index-vs-type
    https://www.elastic.co/guide/en/elasticsearch/reference/6.0/removal-of-types.html

With this change every type becomes its own index and the SearchEngine gets an api to modify multiple indices at once to remove all documents from all indices, which are related to a specific repository, for example.

The search uses another new api to coordinate the indexing, the central work queue.
The central work queue is able to coordinate long-running or resource intensive tasks. It is able to run tasks in parallel, but can also run tasks which targets the same resources in sequence. The queue is also persistent and can restore queued tasks after restart.

Co-authored-by: Konstantin Schaper <konstantin.schaper@cloudogu.com>
2021-08-25 15:40:11 +02:00