Eduardo J edited this page Jul 29, 2021 · 6 revisions

This page describes, from a developer point of view, how searching is performed.

Indexing data

The definition of the indices can be found under the app/indices/ subdirectory.

We have two indices defined, one for projects and one for packages. Both have the same fields (defined with indexes) and the same attributes (defined with has). This allows us to run searches with the same conditions against both indices and to retrieve mixed result sets containing both projects and packages.

The indexing is done offline via PopulateToSphinxJob, which improves the search service's error recovery.



Text search can be performed over the name, title and description fields. By default, only the name field is searched.


The following attributes are used to perform searches:

  • The attrib_type_ids Sphinx attribute: contains a list of OBS attribute types that this project/package has.
  • The issue_ids Sphinx attribute: contains a list of OBS issues related to this project/package.
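
As a rough illustration of how these two multi-value attributes narrow a result set, here is a minimal plain-Ruby sketch. It does not call Sphinx or Thinking Sphinx. The Doc struct, the DOCS sample data and the filter_docs helper are hypothetical names made up for this example; only attrib_type_ids and issue_ids come from the indices described above.

```ruby
# Hypothetical in-memory stand-in for indexed documents (not the OBS code).
Doc = Struct.new(:name, :attrib_type_ids, :issue_ids, keyword_init: true)

# Made-up sample data mixing projects and packages.
DOCS = [
  Doc.new(name: 'home:alice', attrib_type_ids: [1, 4], issue_ids: []),
  Doc.new(name: 'openSUSE:Factory', attrib_type_ids: [4], issue_ids: [77]),
  Doc.new(name: 'ruby', attrib_type_ids: [], issue_ids: [77, 90])
].freeze

# Keep the documents whose multi-value attribute contains the requested id.
# A nil filter means "do not filter on this attribute".
def filter_docs(docs, attrib_type_id: nil, issue_id: nil)
  docs.select do |doc|
    (attrib_type_id.nil? || doc.attrib_type_ids.include?(attrib_type_id)) &&
      (issue_id.nil? || doc.issue_ids.include?(issue_id))
  end
end

with_attrib = filter_docs(DOCS, attrib_type_id: 4).map(&:name)
with_issue  = filter_docs(DOCS, issue_id: 77).map(&:name)
with_both   = filter_docs(DOCS, attrib_type_id: 4, issue_id: 77).map(&:name)
```

Because both indices expose the same attributes, the same condition applies to projects and packages alike, which is why a single filter can return a mixed list.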

The remaining Sphinx attributes (linked_count, activity_index, linked_projects?, devel_packages?, last_package_updated_at) are used only to weight the results so that they can be presented in sorted order. The weights are defined in the search method of the FullTextSearch model.
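
To make the idea of attribute-based weighting concrete, here is a minimal plain-Ruby sketch of sorting results by a combined score. Everything in it is an assumption for illustration: the WEIGHTS values, the Result struct, the text_score field and the additive formula are invented here and are not taken from FullTextSearch, which remains the authoritative source for the actual weights and sort expression.

```ruby
# Hypothetical weights, for illustration only (see FullTextSearch for real ones).
WEIGHTS = { linked_count: 100, activity_index: 500, devel_package: 1000 }.freeze

# Hypothetical search hit: a text relevance score plus a few attribute values.
Result = Struct.new(:name, :text_score, :linked_count, :activity_index,
                    :devel_package, keyword_init: true)

# Combine the text score with the weighted attributes (assumed additive formula).
def ranking_score(result)
  result.text_score +
    result.linked_count * WEIGHTS[:linked_count] +
    result.activity_index * WEIGHTS[:activity_index] +
    (result.devel_package ? WEIGHTS[:devel_package] : 0)
end

# Highest score first.
def sort_results(results)
  results.sort_by { |result| -ranking_score(result) }
end

results = [
  Result.new(name: 'a', text_score: 50, linked_count: 0, activity_index: 0.0, devel_package: false),
  Result.new(name: 'b', text_score: 10, linked_count: 2, activity_index: 0.5, devel_package: false),
  Result.new(name: 'c', text_score: 10, linked_count: 0, activity_index: 0.0, devel_package: true)
]
sorted_names = sort_results(results).map(&:name)
```

With these made-up numbers, a weaker text match can still rank first if its attributes carry enough weight, which is the effect the weighting attributes are meant to have.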

Rebuild indices

If a rebuild of the indices is necessary, follow these steps:

systemctl stop obs-sphinx.service
run_in_api rake ts:rebuild # This takes long
run_in_api rake ts:stop
systemctl start obs-sphinx.service

After upgrading to Thinking Sphinx 5.2.0 and setting real_time_tidy to true, a rebuild of the indices should no longer be needed to clean up real-time indices.


