Some recently updated images are missing license_url in the meta_data field
#4318
Labels
🗄️ aspect: data
Concerns the data in our catalog and/or databases
🛠 goal: fix
Bug fix
🟨 priority: medium
Not blocking but should be addressed soon
🧱 stack: catalog
Related to the catalog and Airflow DAGs
Description
On 2024-05-08 UTC, the batched_update DAG was triggered [1] to fill the license_url in the meta_data field with its corresponding value for rows WHERE license = 'by' AND license_version = '2.0'. It reported a successful end on 2024-05-09 at 17:00:18 UTC, having updated 746,571 records. However, a run of the add_license_url DAG triggered on 2024-05-10 reported the same number of rows still missing this value, which indicates that some workflows may not be filling this field or are overwriting it.
Flickr is confirmed to be among the providers with rows missing this value. Whether other providers are affected is still to be confirmed. The Flickr DAG is known to have been running on those days, as well as the DAGs for Europeana, the Finnish Museum, Wikimedia Commons, and the Metropolitan Museum.
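To narrow down which providers are affected, the check described above could be sketched roughly as follows. This is only an illustration in plain Python over in-memory rows, not the logic of the batched_update or add_license_url DAGs. The `provider` key, the row-as-dict shape, and both function names are assumptions made for the example; the URL pattern is the public Creative Commons one.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional


def expected_license_url(license_: str, version: str) -> str:
    """Build the Creative Commons URL for a license slug and version."""
    return f"https://creativecommons.org/licenses/{license_}/{version}/"


def count_missing_by_provider(
    rows: Iterable[Mapping],
    license_: str = "by",
    version: str = "2.0",
) -> Counter:
    """Count rows per provider that match the license filter but have
    no license_url inside meta_data (hypothetical row shape)."""
    missing = Counter()
    for row in rows:
        if row["license"] != license_ or row["license_version"] != version:
            continue
        # meta_data may be NULL (None) or lack the key entirely.
        meta_data: Optional[Mapping] = row.get("meta_data")
        if not (meta_data or {}).get("license_url"):
            missing[row["provider"]] += 1
    return missing


# Made-up sample rows, for illustration only.
sample_rows = [
    {"provider": "flickr", "license": "by", "license_version": "2.0",
     "meta_data": None},
    {"provider": "flickr", "license": "by", "license_version": "2.0",
     "meta_data": {"license_url": expected_license_url("by", "2.0")}},
    {"provider": "europeana", "license": "by", "license_version": "2.0",
     "meta_data": {"description": "x"}},
    {"provider": "flickr", "license": "by-sa", "license_version": "2.0",
     "meta_data": None},
]

missing_counts = count_missing_by_provider(sample_rows)
```

Running a per-provider breakdown like this before and after each provider DAG run would help distinguish rows that were never filled from rows that were filled and later overwritten.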
Screenshot of DAG reports on Thursday, May 9th. Time is in VET.
Additional context
Discovered while working on #3885.
Footnotes
1. Link only available to maintainers.