
create test or alert if rolling re-indexer isn't keeping up with demand #4700

Open
ndushay opened this issue Feb 28, 2024 · 1 comment
ndushay commented Feb 28, 2024

This ticket might belong in the DSA or infrastructure integration GitHub issues.

It seems useful to check this weekly(?), and we want to check the live production Argo index.

This could be a cron job that sends a weekly email reporting the oldest timestamp in Argo prod's Solr index and the current parallel workers value. Or maybe it runs weekly (with a HB alert if it doesn't run) and only emails when the oldest Solr document is more than ?? 2 weeks ?? old (the threshold should be a configuration setting).

You can determine the oldest timestamp for a document in the Argo index. The oldest document should not be older than _____ (a value to put in settings.yml, such as "one month" or "two weeks").
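One way to get that oldest timestamp is to sort on `timestamp` ascending and ask for a single row. Below is a minimal Python sketch of the idea, not a proposed implementation. It assumes the same host and core as the facet query in this ticket, that `timestamp` is a single-valued date field, and that the response is requested with `wt=json`. The function names and the 14-day default are made up for illustration; the real threshold would come from settings.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: same Solr host and core as the facet query in this ticket.
SOLR_SELECT_URL = "http://sul-solr-prod-h.stanford.edu/solr/argo_prod/select"


def oldest_doc_url(select_url=SOLR_SELECT_URL):
    """Build a query that returns only the document with the smallest timestamp."""
    params = {
        "q": "*:*",
        "sort": "timestamp asc",  # least recently indexed document first
        "rows": 1,
        "fl": "id,timestamp",
        "wt": "json",
    }
    return select_url + "?" + urlencode(params)


def parse_oldest_timestamp(body):
    """Read the timestamp from a Solr JSON response; None if the index is empty."""
    docs = json.loads(body)["response"]["docs"]
    if not docs:
        return None
    # Solr returns UTC dates such as 2024-02-28T12:34:56Z
    return datetime.fromisoformat(docs[0]["timestamp"].replace("Z", "+00:00"))


def is_stale(oldest, max_age, now=None):
    """True when the oldest document was indexed more than max_age ago."""
    now = now or datetime.now(timezone.utc)
    return now - oldest > max_age


def reindexer_is_behind(max_age=timedelta(days=14)):
    """Hypothetical entry point: query Solr, then compare against the threshold."""
    with urlopen(oldest_doc_url(), timeout=30) as resp:
        oldest = parse_oldest_timestamp(resp.read())
    return oldest is not None and is_stale(oldest, max_age)
```

The URL building, parsing, and comparison are kept separate from the network call so each piece can be tested without touching the production index.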

Using a query such as the one below, you can calculate approximately how long it will take to reindex all of the SDR:

http://sul-solr-prod-h.stanford.edu/solr/argo_prod/select?q=*:*&facet.range=timestamp&f.timestamp.facet.range.start=NOW%2FDAY-90DAYS&f.timestamp.facet.range.end=NOW&f.timestamp.facet.range.gap=%2B1DAY&rows=0&facet.field=timestamp&wt=xml&f.timestamp.sort=index

Currently, with "parallel" workers set to 3 in dor-services-app, it takes about 5-6 days.

If there are 2 workers ...

If there is 1 worker ...
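The arithmetic behind that kind of estimate could look roughly like the Python sketch below. It assumes the facet query above is requested with `wt=json` instead of `wt=xml`, and that Solr returns range counts in its default flat JSON layout (`[day, count, day, count, ...]`). The estimate is total documents divided by the mean number of documents indexed per active day. That is only a rough guide: documents reindexed because someone edited them also land in the daily counts, and the current partial day pulls the mean down. The function names are hypothetical.

```python
import json


def daily_counts(body):
    """Return [(day, count), ...] from a facet.range response requested with wt=json."""
    flat = json.loads(body)["facet_counts"]["facet_ranges"]["timestamp"]["counts"]
    # Default JSON layout is a flat list: [day, count, day, count, ...]
    return list(zip(flat[0::2], flat[1::2]))


def estimated_cycle_days(body):
    """Estimate days for one full pass: total docs / mean docs per active day."""
    total = json.loads(body)["response"]["numFound"]
    active = [count for _, count in daily_counts(body) if count > 0]
    if not active:
        return None  # nothing was indexed inside the facet window
    return total / (sum(active) / len(active))
```

If every document was reindexed within the facet window, this comes out close to the number of days that have non-zero counts.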

@jmartin-sul

Seems like another option might be an okcomputer check, if the crux of the check is something like "was the least recently indexed Solr doc indexed more than threshold weeks ago?" We used to have a similar check in pres cat's okcomputer to make sure that everything was being audited in a timely fashion, but that's no longer in the pres cat okcomputer checks. It appears we removed it for DB performance reasons, even though there is/was an index on the date field that check queried. Not sure if Solr might have similar performance issues for that sort of query, in which case maybe the okcomputer approach is a non-starter for this ticket also.
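For what it's worth, the threshold question can also be phrased as a count instead of a sort: ask Solr how many documents have a timestamp older than the cutoff, with `rows=0`. A rough Python sketch of that query follows. Whether it is any cheaper than sorting on `timestamp` is untested here. The function names are made up, and `max_age_days` stands in for whatever the configured threshold ends up being.

```python
import json
from urllib.parse import urlencode


def stale_count_url(select_url, max_age_days):
    """Build a rows=0 query that counts docs last indexed before the cutoff."""
    params = {
        # Solr date math: everything up to max_age_days before now
        "q": f"timestamp:[* TO NOW-{max_age_days}DAYS]",
        "rows": 0,
        "wt": "json",
    }
    return select_url + "?" + urlencode(params)


def has_stale_docs(body):
    """True if the count query matched at least one document."""
    return json.loads(body)["response"]["numFound"] > 0
```

A health check built on this would fail when `has_stale_docs` returns true for the response body.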

If we do go the cron job route, and want to HB alert when the cron fails to run, here's an explicit reminder that we can use Honeybadger's checkin feature for that (probably intended/implied by the description, but saying just in case).
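Roughly, the check-in pattern is: run the job, and only when it finishes without raising, request the check-in URL, so a missed check-in means the cron job did not run or failed. A minimal Python sketch of that shape follows. `CHECK_IN_URL` is a placeholder (Honeybadger supplies the real URL for each configured check-in), and `ping` and `run_and_check_in` are hypothetical names.

```python
from urllib.request import urlopen

# Placeholder: use the URL Honeybadger provides for the configured check-in.
CHECK_IN_URL = "https://example.invalid/check-in"


def ping(url=CHECK_IN_URL):
    """Report a successful run by requesting the check-in URL."""
    with urlopen(url, timeout=10) as resp:
        return resp.status


def run_and_check_in(job, report=ping):
    """Run job(); report only if it completes without raising."""
    result = job()
    report()
    return result
```

Passing the reporter in as an argument keeps the wrapper testable without network access.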

ndushay changed the title from "test that rolling re-indexer is keeping up with demand" to "create test or alert if rolling re-indexer isn't keeping up with demand" on Feb 29, 2024