
Slow line list download speeds #1107

Closed
sratcliffe118 opened this issue Sep 17, 2020 · 17 comments
Labels
P1: Launch blocker (Needs fixing before we launch, schedule some time to investigate & fix)

Comments

@sratcliffe118

  • Do we have a sense of the speeds users see when they click the download button?
  • I just tested on a 70–75 Mbps connection in the UK with 137k entries, and it was still running after more than 2 minutes.
  • This feels too slow for users.

@Mougk Are there any download speed benchmarks / datasets we can look to?

@sratcliffe118 sratcliffe118 added the Data and Eng ready labels Sep 17, 2020
@sratcliffe118 sratcliffe118 added this to the Public launch milestone Sep 17, 2020
@attwad
Contributor

attwad commented Sep 17, 2020

It took me 1.4 min to start the content download of 130K cases.


That's quite a long time.

After that, the network seems to be the bottleneck; the servers aren't using much CPU or RAM at all:

kubectl top pods
NAME                            CPU(cores)   MEMORY(bytes)
curator-prod-7bd48b496f-tbrtb   3m           53Mi
data-prod-c469b5ccf-rsqtj       18m          44Mi

The download seems to have stopped altogether after 2.2 min, though, with no sign of anything going on other than the spinning widget in the download button.


I'm not sure if we can set some special HTTP headers to start the download immediately in the browser instead of waiting for what seems like the complete file to be sent. Pinging @allysonjp715 in case she has any thoughts.
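For reference, a minimal sketch of what that could look like, assuming an Express handler on the Node backend (the route path and filename here are made up, not our actual code):

```typescript
import express from 'express';

const app = express();

app.get('/api/cases/download', (req, res) => {
  // Tell the browser this is a file download, so it can start writing to
  // disk as bytes arrive instead of buffering the whole response.
  res.setHeader('Content-Type', 'text/csv');
  res.setHeader('Content-Disposition', 'attachment; filename="cases.csv"');
  // Writing chunks without a Content-Length makes Node fall back to chunked
  // transfer encoding, so the download can begin before the file is complete.
  res.write('id,name,date\n');
  // ... write the remaining rows as they are produced ...
  res.end();
});
```

Note this only helps if the browser navigates to the URL directly; if the frontend fetches the data over XHR first, the headers won't trigger a native download.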

@attwad
Contributor

attwad commented Sep 21, 2020

This might be relevant: https://nodejs.org/pt-br/docs/guides/backpressuring-in-streams/

Also, we should probably gzip the CSV before sending it; it would reduce the size of the download and might make it faster.
The Node stream pipeline described on that page looks nice; perhaps we could use it in the curator service to do this as well.
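A rough sketch of combining the two ideas, using Node's built-in stream.pipeline and zlib (the source stream and response objects here are placeholders, not our actual code):

```typescript
import { pipeline, Readable } from 'stream';
import * as zlib from 'zlib';
import { Response } from 'express';

function sendGzippedCsv(csvSource: Readable, res: Response): void {
  res.setHeader('Content-Type', 'text/csv');
  res.setHeader('Content-Encoding', 'gzip');
  // pipeline() wires the streams together, handles backpressure, and tears
  // everything down on error, unlike a bare chain of .pipe() calls.
  pipeline(csvSource, zlib.createGzip(), res, (err) => {
    if (err) console.error('CSV download failed:', err);
  });
}
```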

@allysonjp715
Contributor

Just found this, could be part of the problem: axios/axios#479

@attwad
Contributor

attwad commented Sep 21, 2020

Also, I've looked at gzip, and nginx already does gzip compression of responses for us (visible in the dev tools Network tab), although apparently not for the download; we should be able to configure it so that it does:


https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#gzip-types
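For the record, the change would be along these lines in the controller's ConfigMap. The key names come from the doc above, but the ConfigMap name and namespace below are assumptions about our cluster, and the exact gzip-types list would need checking:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  use-gzip: "true"
  # The default gzip-types list doesn't cover CSV, so add text/csv explicitly.
  gzip-types: "text/csv text/plain application/json"
```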

@attwad
Contributor

attwad commented Sep 21, 2020

> Just found this, could be part of the problem: axios/axios#479

I don't think it applies to us, though? We're using Node on the backend, not in the browser.

@allysonjp715
Contributor

That bug is on axios. So IIUC, when the response goes through the browser, the stream can't be handled while it's coming in, so we can't start the download until all the data has arrived. The slow UI after clicking download makes sense then: axios holds all the data in memory until the full response comes in, and only then does the download happen.

Agreed we should try zipping the data.
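If that's the case, one workaround is to skip axios/XHR entirely and let the browser drive the download itself. A hypothetical frontend sketch (the endpoint path is made up):

```typescript
// Point an anchor at the download endpoint so the browser streams the file
// to disk itself instead of buffering it through axios/XHR.
function downloadCases(): void {
  const a = document.createElement('a');
  a.href = '/api/cases/download?format=csv';
  a.download = 'cases.csv'; // a hint; Content-Disposition on the server also works
  document.body.appendChild(a);
  a.click();
  a.remove();
}
```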

@attwad
Contributor

attwad commented Sep 21, 2020

I've enabled gzipping of CSV data in nginx; will send a PR for that.

@attwad
Contributor

attwad commented Sep 21, 2020

> That bug is on axios. So IIUC, when the response goes through the browser, the stream can't be handled while it's coming in, so we can't start the download until all the data has arrived. The slow UI after clicking download makes sense then: axios holds all the data in memory until the full response comes in, and only then does the download happen.
>
> Agreed we should try zipping the data.

I see, in that case that's why the download doesn't work at all, since we certainly can't hold all the data in memory.

@attwad
Contributor

attwad commented Sep 22, 2020

Reloaded dev; it took 7 s for the download to start, then the browser took over and it worked nicely. Thanks @allysonjp715 for the fix! (I sent you a PR to push to prod as well today.)

@attwad attwad closed this as completed Sep 22, 2020

@allysonjp715
Contributor

Unfortunately this isn't working in prod. For 100,000 cases it takes ~30 s for the download to start. The browser shows the pending-URL bar in the lower left-hand corner, but we should show a loading spinner in the UI for that too.

I've also tried a couple of times now, and the download keeps hanging at ~46 MB. The browser's download object spins continually at that number and doesn't complete or progress after that.

@allysonjp715
Contributor

The download is fully working after that last PR 👍 The browser's download object shows up immediately and begins the download, and the full download succeeds with ~100,000 cases. The full response is streamed, so it should succeed no matter how many cases there are.
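For posterity, the general shape of the end-to-end streaming approach, assuming the Node MongoDB driver and Express (collection and field names here are illustrative, not our schema, and real code needs proper CSV escaping):

```typescript
import { pipeline, Transform } from 'stream';
import { Response } from 'express';
import { Collection } from 'mongodb';

export function streamCasesCsv(cases: Collection, res: Response): void {
  res.setHeader('Content-Type', 'text/csv');
  res.setHeader('Content-Disposition', 'attachment; filename="cases.csv"');
  // Convert each document into one CSV row as it flows through.
  const toCsv = new Transform({
    writableObjectMode: true,
    transform(doc, _enc, callback) {
      callback(null, `${doc._id},${doc.confirmationDate}\n`);
    },
  });
  // Every stage applies backpressure, so memory use stays flat no matter
  // how many cases the cursor yields.
  pipeline(cases.find().stream(), toCsv, res, (err) => {
    if (err) console.error('case download failed:', err);
  });
}
```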

@SandraAdele

The line list download is either slow or doesn't work at all; I get the error "502 Bad Gateway". @calremmel

@SandraAdele SandraAdele reopened this Dec 17, 2020
@Mougk
Contributor

Mougk commented Dec 17, 2020

In addition, the download speed was very slow (>10 minutes for me).

@Mougk Mougk added the P1: Launch blocker label Dec 17, 2020
@Mougk Mougk removed this from the Friends and Family launch milestone Dec 17, 2020
@Mougk Mougk added this to the Marketing Comms launch milestone Dec 17, 2020
@Mougk Mougk removed the Data and Eng ready labels Dec 17, 2020
@calremmel
Contributor

Switching to a cached download for the entire dataset should help address this. Most filtered subsets should be fine, but we should be mindful of large filtered subsets, such as Brazil, which can exceed 100K cases. If many people try to download those, we should think about how to make that more user-friendly.

@calremmel
Contributor

calremmel commented Jan 20, 2021

Next steps:

  • Write a script to export cases as CSV in 100K chunks (rough sketch below)
  • Write a script to process nested arrays and other formatting per chunk
  • Write a script to combine the chunks into a single file and compress it
  • Schedule the pipeline to run nightly at 00:00 UTC
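A loose sketch of the export step, assuming mongoexport is available on the host; the URI, collection, and field names below are placeholders:

```typescript
import { execFileSync } from 'child_process';

const CHUNK = 100_000;
const total = 500_000; // the real script would read the count from the DB

// Export the collection in 100K-row CSV chunks.
for (let skip = 0, i = 0; skip < total; skip += CHUNK, i++) {
  execFileSync('mongoexport', [
    '--uri', process.env.MONGO_URI ?? '',
    '--collection', 'cases',
    '--type', 'csv',
    '--fields', '_id,confirmationDate', // per-chunk formatting happens in step 2
    '--skip', String(skip),
    '--limit', String(CHUNK),
    '--out', `chunk_${i}.csv`,
  ]);
}
// Combining and compressing could then be `cat chunk_*.csv | gzip > cases.csv.gz`
// (with header rows de-duplicated), scheduled via a nightly Kubernetes CronJob.
```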

@calremmel
Contributor

Tracked in #1436
