Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RFC for time series gem downloads #48

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

segiddins
Copy link
Member

@indirect would appreciate your help rounding this off

@segiddins segiddins requested a review from indirect March 27, 2023 14:25
@indirect
Copy link
Member

indirect commented Apr 4, 2023

@segiddins this looks good to me, I think. my only remaining question is how we will ensure the main app doesn't break if e.g. timescale disappears. for the work outlined in this RFC, I think the only thing that might break is the background job that writes to timescale? if so, that means it's mainly a question for the future work that will display download counts and graphs.

@segiddins
Copy link
Member Author

So my thought there is, we can have a cron job that copies all time download numbers to the main postgres database on some cadence?

@segiddins segiddins marked this pull request as ready for review April 4, 2023 05:27
@indirect
Copy link
Member

indirect commented Apr 4, 2023

Oh yeah, that seems good. That plus a circuit-breaker type thing on the requests to the timescaledb so puma can continue to serve other rails requests sounds like it should work.

@segiddins
Copy link
Member Author

That plus a circuit-breaker type thing on the requests to the timescaledb so puma can continue to serve other rails requests sounds like it should work.

I'm not quite sure what you mean by that? My thinking was we'd use stimulus to show download counts on pages, so as to be able to cache download numbers separately, and that would give us a natural place to keep reads from timescale off the critical path

@indirect
Copy link
Member

indirect commented Apr 5, 2023

That's pretty much what I was thinking--something out of the critical path to show the numbers, something out of the critical path to write into timescale. Thanks for writing this up!

- Adding another datastore complicates contributing to the application
- Another datastore means we can't natively write joins against the download data
- Timescale cloud costs money
- There is no terraform provider for timescale cloud (yet) requiring us to write one or manually provision
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- Another datastore means we can't natively write joins against the download data
- Timescale cloud costs money
- There is no terraform provider for timescale cloud (yet) requiring us to write one or manually provision
- This is still high cardinality data, and there's no knowing for sure how well timescale will compress it
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can help testing and benchmarking it. Is there any data available to seed the database? Or if anyone can share any numbers to give me some insights about how big is the cardinality.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just found the weekly backup ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants