need to improve the get_file_list mechanism #338

Open
randytpierce opened this issue Feb 28, 2024 · 1 comment
Labels: bug (Something isn't working), couchbase, Stale, VXingest (issues related to the VXingest project)

Comments

@randytpierce
Contributor

The code uses a data set in Couchbase that is retrieved by this query:

SELECT url,
    mtime
FROM `vxdata`._default.METAR
WHERE subset = 'METAR'
    AND type = 'DF'
    AND fileType = 'grib2'
    AND originType = 'model'
    AND model = 'HRRR_OPS'
    AND url IS NOT MISSING
    AND mtime IS NOT MISSING
ORDER BY url;

This query is turning out to be very inefficient, and it showed up as a big difference in the Capella evaluation tests. There is essentially one document for each file that gets processed, and that is just too many documents. It should probably be something like an array of files per kind of ingest.
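
One way to read the suggestion above, as a rough sketch only: fold the per-file DF documents into a single manifest document for each kind of ingest, with the files held in an array. The field names `url`, `mtime`, `subset`, `type`, `fileType`, `originType` and `model` come from the query; the `consolidate_file_docs` helper, the `files` array layout, and the `id` scheme are assumptions made up for illustration, not anything that exists in VXingest today. The sketch works on plain dicts and does not touch the Couchbase SDK.

```python
from collections import defaultdict

# Fields that identify one "kind of ingest"; taken from the WHERE clause of the query.
GROUP_KEYS = ("subset", "fileType", "originType", "model")


def consolidate_file_docs(file_docs):
    """Fold per-file DF documents into one manifest document per ingest kind.

    Hypothetical helper: the manifest layout (a `files` array of
    {url, mtime} entries) and the id scheme are assumptions.
    """
    grouped = defaultdict(dict)
    for doc in file_docs:
        # Mirror the query's IS NOT MISSING filters.
        if "url" not in doc or "mtime" not in doc:
            continue
        key = tuple(doc.get(k) for k in GROUP_KEYS)
        # If a url appears more than once, keep the newest mtime.
        previous = grouped[key].get(doc["url"])
        if previous is None or doc["mtime"] > previous:
            grouped[key][doc["url"]] = doc["mtime"]

    manifests = []
    for key, files in grouped.items():
        manifest = dict(zip(GROUP_KEYS, key))
        manifest["type"] = "DF"
        # Hypothetical id scheme, one id per ingest kind.
        manifest["id"] = "DF:" + ":".join(str(part) for part in key)
        # Sorted by url, matching the ORDER BY in the query.
        manifest["files"] = [
            {"url": url, "mtime": mtime} for url, mtime in sorted(files.items())
        ]
        manifests.append(manifest)
    return manifests
```

With a layout like this, get_file_list could in principle fetch one document by key for a given ingest kind instead of scanning one document per file, at the cost of rewriting a larger document whenever a file is added. Whether that trade is acceptable depends on how many files a single ingest kind accumulates, which would need to be checked against the real data.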

@randytpierce added the bug, couchbase, and VXingest labels Feb 28, 2024
@randytpierce self-assigned this Feb 28, 2024

This issue is stale because it has been open 90 days with no activity.

@github-actions bot added the Stale label May 30, 2024