views: support for elasticsearch scroll to retrieve all records #29

Open
lnielsen opened this issue Feb 5, 2016 · 4 comments

Comments

@lnielsen
Member

lnielsen commented Feb 5, 2016

In order to be able to use the records REST API to retrieve all records we need to add support for scroll over the elasticsearch results.

Related to #28

@lnielsen
Member Author

lnielsen commented Feb 5, 2016

@nharraud Is this something you would have time to look into?

@jma
Contributor

jma commented Nov 8, 2018

It would be great to integrate this feature, as it is an alternative to the legacy OAI-PMH harvesting.

@slint
Member

slint commented Nov 8, 2018

The logic wouldn't change much compared to what invenio-oaiserver already does. I guess introducing a scroll_id/cursor querystring parameter and doing a similar call is all that's needed. Good extras would be the ability to enable, disable, and rate-limit the feature.
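
A rough sketch of what the scroll_id/cursor handling could look like, just to make the idea concrete. This is not existing invenio-records-rest or invenio-oaiserver code: `fetch_page`, its parameters, and the shape of the returned dict are all made up for illustration, and `client` is only assumed to expose `search` and `scroll` methods in the style of the elasticsearch-py client.

```python
def fetch_page(client, index, scroll_id=None, size=100, keep_alive="1m"):
    """Return one page of hits plus the cursor for the next page.

    Hypothetical helper: `client` is assumed to offer `search` and
    `scroll` in the style of elasticsearch-py.
    """
    if scroll_id is None:
        # First request (no cursor in the querystring): open a scroll context.
        result = client.search(
            index=index,
            body={"query": {"match_all": {}}},
            scroll=keep_alive,
            size=size,
        )
    else:
        # Follow-up request: continue from the cursor the caller sent back.
        result = client.scroll(scroll_id=scroll_id, scroll=keep_alive)

    hits = result["hits"]["hits"]
    # An empty page means the scroll is exhausted, so hand back no cursor.
    next_scroll_id = result.get("_scroll_id") if hits else None
    return {
        "hits": [hit["_source"] for hit in hits],
        "scroll_id": next_scroll_id,
    }
```

A REST view would call this with the `scroll_id` taken from the querystring (or `None` on the first request) and return `scroll_id` in the response so the caller can ask for the next page. The enable/disable switch and rate limiting would sit in front of this call.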

@lnielsen
Member Author

Agreed, it would probably need separate rate-limiting. The alternative to this feature is something like the exporter we have in Zenodo, which can produce a full dump of the records on a regular basis. On Zenodo, for example, a full dump of 400k records is something like 250 MB of BZip2-compressed JSON.
