Remediating cocina data that fails validation

Peter Mangiafico edited this page Sep 2, 2022 · 2 revisions

If you have bad data that fails validation in cocina-models (e.g. an invalid date in the source data), you will get a 500 error (passed through from DSA) when trying to view the object in Argo, and Argo will direct you to the library status page.

There will be a Honeybadger (HB) alert in DSA that looks something like this: https://app.honeybadger.io/projects/50568/faults/87824955

This is coming from validation that is being done in the cocina-models gem: https://github.com/sul-dlss/cocina-models/blob/main/lib/cocina/models/validators/validator.rb

In the example above, it is date validation: https://github.com/sul-dlss/cocina-models/blob/main/lib/cocina/models/validators/date_time_validator.rb

Because the cocina-models gem raises a validation error, any app that consumes the gem will fail to load the cocina. This includes any app that uses dor-services-client, such as sdr-api and Argo. As a result, you cannot load the object in Argo or fetch it via sdr-api, even from the command line or the Rails console.

In order to edit the data, you need to go to DSA, load the cocina manually from the database, alter it as needed to fix the bad data (e.g. the bad date), then save the cocina back to the database. If you have done this in production, you should then version the object in Argo (now that you can load the page again), to ensure the updates are published to PURL and preserved.

Here is an example of fixing a date in production:

ssh dor_services@dor-services-app-prod-b
cd dor_services/current
bundle exec rails c -e p

druid = 'druid:by835gn6234'
dro = Dro.find_by(external_identifier: druid)
dro.description # shows all of the descriptive json

# dig around for what you need to change (helpful to have the bad value you are looking for from the report)
dro.description['event'][0]['date'][0]['structuredValue'][0]
# or 
dro.description.dig('event',0,'date',0,'structuredValue',0)

# change it (ask Arcadia if you need to find out what the valid value should be) and save
dro.description['event'][0]['date'][0]['structuredValue'][0]['value'] = '-0250'
dro.save
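
If the path to the bad value is not obvious, a small recursive search can list every place it occurs in the descriptive JSON. The following is only a sketch: `find_value_paths` is a hypothetical helper (it is not part of DSA or cocina-models), and the sample hash and the bad value `'-250'` are made up for illustration. In the Rails console, `dro.description` would take the place of the sample hash.

```ruby
# Hypothetical helper: recursively collect every path (a list of hash keys
# and array indexes) at which `target` appears as a leaf value.
def find_value_paths(node, target, path = [])
  case node
  when Hash
    node.flat_map { |key, child| find_value_paths(child, target, path + [key]) }
  when Array
    node.each_with_index.flat_map { |child, index| find_value_paths(child, target, path + [index]) }
  else
    node == target ? [path] : []
  end
end

# Made-up stand-in for dro.description, with an assumed bad date value.
description = {
  'title' => [{ 'value' => 'Sample title' }],
  'event' => [
    { 'date' => [{ 'structuredValue' => [{ 'value' => '-250', 'type' => 'start' }] }] }
  ]
}

# Each path can be splatted straight into dig.
paths = find_value_paths(description, '-250')
path = paths.first

# Dig down to the hash that holds the bad value, then replace it.
description.dig(*path[0..-2])[path.last] = '-0250'
```

Each returned path is in the same form as the arguments to `dig` in the example above, so it can be checked by eye before changing anything and calling `dro.save`.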

You should then be able to load the object in Argo as normal, then version it (at least if in production).