Zipfulldata.rb crashes #444
Comments
What I figured out so far:
|
Maybe the storage box got disconnected again? In that case remounting it should help. Take a look at |
Yeah, I guess it's because the Zipfulldata job would have started again? :D I restarted the machine now and here's the
Here's the same thing for the
So the storage box itself is available on
But I can't figure out where it comes from? (well, then it's 7am here and I didn't have coffee yet) |
Thank you for having a look :) Not sure how to fix the mounting for now (23:30 here now..), but there are some tricks to get rubyzip to use less memory, for example by fiddling with the buffer settings as in this piece of code: rubyzip/rubyzip#261. We could also try to see what happens if we don't use the gem and call the system zip directly. We are still running the latest version of rubyzip, will give that a try asap |
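For the "call the system zip directly" idea, a rough sketch of what that could look like. This is not what zipfulldata.rb does today; both helper names are made up for illustration, and it assumes the Info-ZIP `zip` binary is installed on the sidekiq machine. The point is only that the archive gets built in a child process, so the compression memory never lands in the Ruby worker:

```ruby
# Hypothetical helper (not part of the project): builds the argument list
# for the Info-ZIP `zip` command line tool.
#   -j  store only the file names, dropping directory components
#   -q  quiet mode, no per-file output
def build_zip_command(zip_path, files)
  ["zip", "-j", "-q", zip_path, *files]
end

# Hypothetical helper: runs `zip` as a child process instead of using the
# rubyzip gem. Passing the arguments as separate strings means no shell is
# involved, so file names with spaces need no quoting.
# Returns true only if `zip` ran and exited successfully.
def zip_with_system(zip_path, files)
  return false if files.empty? # zip has nothing to do without input files

  system(*build_zip_command(zip_path, files)) == true
end
```

Usage would be something like `zip_with_system("/tmp/dump.zip", Dir.glob("/tmp/dump/*.csv"))`, with the paths here being placeholders.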
Isn't |
Actually not, if you look at the
So the host's |
Ah, I see. So, I guess this is simply to ensure that the data in there doesn't get lost between container instances. |
Right, but where do we mount That's the reason we have |
I get that, but I don't know. 😄 |
Haha, great! Maybe @philippbayer has an idea. At least the folders have the |
Doesn't seem like it has been missed. 😅 |
Seems like another feature that could be cut out then. ;) Do we have some documentation of how the storage boxes are mounted? Couldn't find anything on that in the |
Other than |
Ok, so according to Should we do that? |
OK my recent PR hopefully alleviates some of the issue, probably won't fix it currently, the script |
Ack, how big is the folder? If it’s small enough we could briefly shut down the web server, so that no one can upload anything in the meantime, and then transfer it? |
That would probably be the safest. It's only 106MB. |
Ok, then that sounds like a good way to me :) |
Current status: @philippbayer moved the Also: With the fix about where the zipfile is written from #447 the non-picture zips all ran through without an issue again. Let's hope that the combination of all of this helps to fix all of our problems with memory leaks! |
Status update:
Next step:
|
It still fails, but as I wrote on gitter, I think that's because |
That should be done with the next PR (#452). I'll just merge once travis greenlights and will restart the zip job. |
Ha, look what happened now:
As @philippbayer mentioned: The |
Ok, my cheap hack in #453 worked. So we can close this here I think! 👍 |
Related to the crashes mentioned in #443:
I upgraded our sidekiq machine to the next level, which means it now has a nice 8 GB of memory to fulfill its tasks. As a user mentioned via email that the opensnp_datadump.current.zip is super outdated, I manually started the Zipfulldata task and observed the memory consumption. Turns out this is so hungry that even the 8 GB weren't enough; the whole machine crashed once again. 😒
This might easily explain why the newsletters can't be sent at a regular interval either: The zipfulldata.rb is run once a day to create the latest dump. As this just kills the whole machine, the newsletters run out of memory at that point. 😂
I could not fully figure out why the task takes up that much memory. One thing I did notice is that we're heavily over-logging, as every single phenotype/user combination creates a new logging event.
Another thing I noticed: The create_picture_zip produces nothing but errors:
I found a very simple reason for this: The whole path /home/app/snpr/public/system/ is not available in the sidekiq Docker container, while it is available in the web container! Somehow the mounting there doesn't work like it's supposed to. @tsujigiri @philippbayer Help is appreciated 😱 Do you have an idea for .../system folder turns up in the sidekiq machine?
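Until the mount is sorted out, the picture zip step could at least fail loudly once instead of producing an error per file. A minimal sketch, assuming a guard like this would be wrapped around the create_picture_zip call; `storage_available?` and `with_storage` are hypothetical names, not existing methods in the code base:

```ruby
require "logger"

# Hypothetical check: true only if the path exists as a directory and the
# current process may read it, which is exactly what fails when a volume
# is not mounted into the container.
def storage_available?(path)
  File.directory?(path) && File.readable?(path)
end

# Hypothetical wrapper: yields the path to the block when the storage is
# there, otherwise logs a single error and returns nil without running it.
def with_storage(path, logger: Logger.new($stderr))
  unless storage_available?(path)
    logger.error("#{path} is not available in this container, skipping")
    return nil
  end

  yield path
end
```

In the sidekiq container this would be called with the path from above, e.g. `with_storage("/home/app/snpr/public/system/") { |dir| ... }`, with the block doing the actual zipping.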