It is possible that the reports generated from this dataset can be created from the "files" extract.
Producer Files Extract (18M rows, 2.5GB -- likely more with enhancements)
Generate a JSON document for each producer file in Merritt. Include collection information, owner information, "Mime Group" information and possibly "inv_ingests" information
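A minimal sketch of what generating one JSON document per producer file could look like, written out as JSON Lines so 18M rows never have to sit in memory at once. Every field and column name here (`pathname`, `mime_group`, `collection_name`, `owner_name`, `ingest`, and so on) is a hypothetical placeholder, not the actual Merritt schema, and `file_row_to_doc` and `write_jsonl` are illustrative helpers only.

```python
import json


def file_row_to_doc(row):
    """Turn one flat extract row (a dict) into a nested JSON-ready document.

    All keys are hypothetical; substitute the real extract column names.
    """
    return {
        "pathname": row["pathname"],
        "mime_type": row["mime_type"],
        "mime_group": row.get("mime_group"),   # optional "Mime Group" label
        "size_bytes": row["size_bytes"],
        "collection": {"name": row["collection_name"]},
        "owner": {"name": row["owner_name"]},
        "ingest": row.get("ingest"),           # optional inv_ingests details
    }


def write_jsonl(rows, out):
    """Write one JSON document per line to the file-like object `out`.

    Rows are consumed one at a time, so `rows` can be a database cursor
    or any other iterator. Returns the number of documents written.
    """
    count = 0
    for row in rows:
        out.write(json.dumps(file_row_to_doc(row)) + "\n")
        count += 1
    return count
```

Streaming line by line keeps memory flat regardless of extract size, and a JSON Lines file can be split or processed in parallel later if the enhanced extract grows past 2.5GB.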
Analysis to be performed
match file types to a database of sustainable/at risk file types
match filenames to patterns
identify metadata sidecar files
identify content files
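The four analysis steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the filename patterns in `SIDECAR_PATTERNS` and the entries in `MIME_RISK` are made-up examples, not the actual pattern list or sustainability database, and the `pathname` / `mime_type` keys are hypothetical field names.

```python
import fnmatch

# Hypothetical filename patterns that suggest a metadata sidecar file.
SIDECAR_PATTERNS = ["*.xml", "*metadata*.json", "mrt-*.txt"]

# Hypothetical stand-in for a database of sustainable / at-risk file types.
MIME_RISK = {
    "application/pdf": "sustainable",
    "image/tiff": "sustainable",
    "application/x-shockwave-flash": "at-risk",
}


def classify_file(record):
    """Label one producer file document with a role and a risk level.

    A file is treated as a metadata sidecar if its base name matches any
    pattern in SIDECAR_PATTERNS; everything else is treated as content.
    MIME types missing from MIME_RISK are reported as "unknown".
    """
    name = record["pathname"].rsplit("/", 1)[-1].lower()
    is_sidecar = any(fnmatch.fnmatchcase(name, p) for p in SIDECAR_PATTERNS)
    return {
        "pathname": record["pathname"],
        "role": "metadata-sidecar" if is_sidecar else "content",
        "risk": MIME_RISK.get(record.get("mime_type"), "unknown"),
    }
```

Treating "content" as the default for anything that does not match a sidecar pattern is a simplification; a real pass would probably need a third bucket for files that match neither list.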
Objects Extract (3.5M rows)
Generate a JSON document for each object in Merritt
Include collection and owner information
Include file information
Include localid information
Analysis to be performed
Find objects with metadata files
Find objects with content files
Find objects with local ids
Find objects with meaningful metadata
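As a rough sketch of the object-level checks listed above, each object document could be reduced to a handful of boolean flags. The document shape (`ark`, `files`, `role`, `localids`, `erc`) is hypothetical, and treating an empty string or `(:unas)` as "not meaningful" is an assumption about how placeholder metadata appears, not a confirmed rule.

```python
# Assumed placeholder values that do not count as meaningful metadata.
PLACEHOLDERS = {"", "(:unas)"}


def summarize_object(obj):
    """Reduce one object document to the flags the analysis asks about.

    Expects (hypothetically) a "files" list whose entries carry a "role"
    of "metadata-sidecar" or "content", a "localids" list, and an "erc"
    dict of object-level metadata values.
    """
    roles = {f.get("role") for f in obj.get("files", [])}
    erc = obj.get("erc", {})
    # Meaningful if at least one metadata value is not a placeholder.
    meaningful = any(str(v).strip() not in PLACEHOLDERS for v in erc.values())
    return {
        "ark": obj.get("ark"),
        "has_metadata_files": "metadata-sidecar" in roles,
        "has_content_files": "content" in roles,
        "has_local_ids": bool(obj.get("localids")),
        "has_meaningful_metadata": meaningful,
    }
```

Filtering on `has_meaningful_metadata` being false would give the inverse report, objects that lack usable object-level metadata.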
@terrywbrady In the same vein of finding objects with meaningful metadata, it would be beneficial to find objects that are missing ERC/object-level metadata.
Design Document
TODO
Billing Database Extract (135K rows, 43MB)
Producer Files Extract (18M rows, 2.5GB -- likely more with enhancements)
Objects Extract (3.5M rows)