Use bulk_create for related objects with PostgreSQL, MariaDB 10.5+ and SQLite 3.35+ #297

Open
richardebeling opened this issue Mar 20, 2022 · 0 comments

In #206, bulk creation of related objects was introduced. However, for _quantity=N, it will cause N calls to save() of the related class, causing N database queries. This comment claimed this was necessary to retrieve the IDs of the related objects.

According to the current Django documentation, bulk_create sets the primary key on the created objects if the primary key is an AutoField and the database is PostgreSQL, MariaDB 10.5+ or SQLite 3.35+. I think this covers the majority of use cases. It would be nice if users could benefit from the performance of bulk_create in these cases.

To implement this, Django's can_return_rows_from_bulk_insert database feature flag could be used to check whether the backend in use supports it.
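A minimal sketch of that check, not baker's actual implementation. The helper names `supports_pk_from_bulk_create` and `save_related` are hypothetical; `connection` is assumed to be a Django database connection (for example `django.db.connection`) and `manager` a model manager such as `RelatedModel._base_manager`.

```python
def supports_pk_from_bulk_create(connection):
    """Return True if the backend reports that bulk inserts can return rows.

    `connection` is assumed to be a Django database connection object.
    """
    # can_return_rows_from_bulk_insert is the feature flag named above;
    # default to False if a backend does not define it.
    return bool(
        getattr(connection.features, "can_return_rows_from_bulk_insert", False)
    )


def save_related(objects, manager, connection):
    """Persist unsaved related objects with a single query where possible."""
    if supports_pk_from_bulk_create(connection):
        # One INSERT for all objects; the backend fills in the primary keys.
        return manager.bulk_create(objects)
    # Fallback: one save() call, and therefore one query, per object.
    for obj in objects:
        obj.save()
    return objects
```

With this shape, backends that cannot return primary keys keep the current per-object behavior, so only the supported databases change.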

Expected behavior

With

from django.db import models

class RelatedModel(models.Model):
    pass

class MainModel(models.Model):
    related = models.ForeignKey(RelatedModel, on_delete=models.CASCADE)

I'd expect

N = 1000
baker.make(MainModel, _quantity=N, _bulk_create=True)

to execute O(1) instead of O(N) database queries.

Actual behavior

def test_bulk_create_multiple_fk(self):
    with self.assertNumQueries(6):
        baker.make(models.PaymentBill, _quantity=5, _bulk_create=True)
    assert models.PaymentBill.objects.all().count() == 5
    assert models.User.objects.all().count() == 5

The test above asserts that N + 1 = 6 database queries happen for N = 5.


On a side note: The documentation gives this workaround:

If you want to avoid that, you’ll have to perform individual bulk creations per foreign keys as the following example:

from model_bakery import baker

baker.prepare(User, _quantity=5, _bulk_create=True)
user_iter = User.objects.all().iterator()
baker.prepare(Profile, user=user_iter, _quantity=5, _bulk_create=True)
  • I think these calls are supposed to be baker.make instead of baker.prepare (?)
  • If this works (it does for me), doesn't this falsify the claim that the save() calls are needed to retrieve the IDs? Note that it does not specify any requirements on the database or Django version used. Edit: I just saw that it iterates over all objects, not just the ones that were created.
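A sketch of how the documented workaround might look with both points above addressed, under two assumptions: that baker.make with _bulk_create=True returns the instances it created, and that the database sets their primary keys (PostgreSQL, MariaDB 10.5+ or SQLite 3.35+). The function name `make_profiles_for_new_users` is hypothetical; `baker` is the `model_bakery.baker` module, and the model classes are passed in so the sketch does not depend on a particular project.

```python
def make_profiles_for_new_users(baker, user_model, profile_model, quantity):
    """Create `quantity` users and one profile per user using bulk creation."""
    # baker.make (not baker.prepare) persists the objects.
    # Assumption: with _bulk_create=True it returns the created instances.
    users = baker.make(user_model, _quantity=quantity, _bulk_create=True)

    # Iterate only over the users created above, not User.objects.all().
    # Assumption: the backend populated their primary keys during bulk_create.
    user_iter = iter(users)
    return baker.make(
        profile_model, user=user_iter, _quantity=quantity, _bulk_create=True
    )
```

It would be called as `make_profiles_for_new_users(baker, User, Profile, 5)`.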
richardebeling pushed a commit to richardebeling/EvaP that referenced this issue Mar 21, 2022
In some places, we can not use _bulk_create because baker doesn't create
M2M-entries then.

Tests are about 5 seconds (8%) faster with a hotpatched baker regarding
model-bakers/model_bakery#297