Speedup seeding #4901

jorg-vr · 2023-08-14T12:02:00Z

This is one of the main reasons for slow tests on GitHub Actions.

We can probably speed this up considerably using the trick described at https://railsnotes.xyz/blog/seed-your-database-with-the-faker-gem#fixing-our-slow-seeds-with-upsert_all-and-activerecord-import

bmesuere · 2023-08-14T12:17:07Z

Note that in addition to seeding, this could also be used in the application itself. For example, when creating an evaluation we do a lot of inserts, which could perhaps be done as a single bulk insert.

jorg-vr · 2023-08-18T14:20:18Z

Speeding up seeding with bulk inserts is a lot less simple than the linked example, which is just a batch of inserts with Faker data.

I profiled the seeding script with stackprof to find the causes of the slowdown:
26% of the time is taken by gitable functions (e.g. repository cloning).
This is mostly file-system related. We could ask ourselves whether we need a 'large activity repo' in the seed.

21% of the time is taken by creating activity statuses.
A lot of that time is spent in validations.
This could potentially be rewritten as a single query, but it would be rather complex to get right.

Next come creating most courses (13%) and the visualisation test (11%).
A significant part of this is creating series, series memberships, course memberships, etc. But since we loop over these to create submissions, much of the speedup of a collective insert_all is lost when we have to query them all again afterwards.
Creating submissions might be a good candidate for a collective insert, but these are also rather complex objects (we would also have to take care of the code and result files written to the file system). Avoiding some of the callbacks here could still provide a speedup: some callbacks I tracked from submission creation add up to at least 6.5% of total runtime.
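
To re-check a breakdown like the one above without a profiler, a coarse per-phase wall-clock timer can be enough. The sketch below is not stackprof and not the script used for the numbers above; it only uses Ruby's standard `Benchmark` module. The `PhaseTimer` class and the phase names are hypothetical, and the `sleep` calls stand in for real seeding work.

```ruby
require "benchmark"

# Hypothetical helper: accumulates wall-clock time per named seeding phase.
class PhaseTimer
  attr_reader :timings

  def initialize
    @timings = Hash.new(0.0)
  end

  # Runs the block, adds its elapsed time to the named phase,
  # and returns the block's own result.
  def measure(name)
    result = nil
    @timings[name] += Benchmark.realtime { result = yield }
    result
  end

  # Share of the total measured time per phase, in percent.
  def percentages
    total = @timings.values.sum
    return {} if total.zero?

    @timings.transform_values { |t| (100.0 * t / total).round(1) }
  end
end

timer = PhaseTimer.new
timer.measure(:repositories) { sleep 0.02 }      # stand-in for repository cloning
timer.measure(:activity_statuses) { sleep 0.01 } # stand-in for creating activity statuses
timer.percentages.each { |phase, pct| puts "#{phase}: #{pct}%" }
```

This only attributes time to the blocks that are wrapped explicitly, so unlike a sampling profiler it will not reveal time spent in validations or callbacks unless those are measured separately.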

I tried replacing student creation with one `insert_all` and one `User.where(permissions: :student)` call, and it caused a slowdown instead of a speedup.
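
For readers unfamiliar with the approach, a bulk insert of this kind could look roughly like the sketch below. It is not the code that was actually tried: the `student_rows` helper and the column names are hypothetical. `insert_all` takes an array of attribute hashes and skips validations and callbacks. Depending on the Rails version, timestamps may have to be supplied explicitly, which is assumed here. The final call is guarded so the snippet also runs outside a Rails app.

```ruby
# Hypothetical helper: builds the attribute hashes for one bulk insert.
# Column names are assumptions, not the application's actual schema.
def student_rows(count, now: Time.now)
  Array.new(count) do |i|
    {
      username: "student#{i}",
      email: "student#{i}@example.com",
      created_at: now, # set explicitly; insert_all skips callbacks
      updated_at: now
    }
  end
end

rows = student_rows(500)

# One INSERT statement instead of 500 separate creates.
# Only executed when a User model with insert_all is loaded (i.e. inside Rails).
User.insert_all(rows) if defined?(User) && User.respond_to?(:insert_all)
```

Because `insert_all` does not return model instances, any later code that needs the records has to query them again, which is the extra round trip mentioned above.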

jorg-vr added the chore Repository/build/dependency maintenance label Aug 14, 2023 — with Slack
