Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

please document how to deserialize by natural key #517

Open
johncronan opened this issue Apr 28, 2022 · 5 comments
Open

please document how to deserialize by natural key #517

johncronan opened this issue Apr 28, 2022 · 5 comments

Comments

@johncronan
Copy link

I've created a minimal recreation to verify that this works in Django normally, with model inheritance. You can have a demo/models.py like this:

class FooManager(models.Manager):
    def get_by_natural_key(self, slug):
        return self.get(slug=slug)

class Foo(models.Model):
    slug = models.SlugField(unique=True)
    content = models.CharField(blank=True, max_length=100)
    
    objects = FooManager()
    
    def natural_key(self):
        return (self.slug,)


class Bar(Foo):
    foo = models.OneToOneField(Foo, models.CASCADE, parent_link=True, primary_key=True)
    val = models.IntegerField(default=0)
    
    def natural_key(self):
        return self.foo.natural_key()
    natural_key.dependencies = ['demo.foo']

And then when you create a bar instance: Bar(slug='baz', val=1).save()
We can dump that with ./manage.py dumpdata --natural-primary --natural-foreign demo > dump.json and get a file that works when we run loaddata on it.

Additionally, let's say we want to copy these values, but in a new instance, so run: sed s/baz/qux/g < dump.json > new.json

Then, running loaddata new.json there will be two bar instances, baz and qux. Also, two instances of the base object.

But, repeating the same process, but with Foo changed to a PolymorphicModel and FooManager changed to a PolymorphicManager, I find that the first part, reloading instances that are already in the database, works, while in the second part I get the following error:

Traceback (most recent call last):
  File "...site-packages/django/core/serializers/json.py", line 70, in Deserializer
    yield from PythonDeserializer(objects, **options)
  File "...site-packages/django/core/serializers/python.py", line 174, in Deserializer
    obj = base.build_instance(Model, data, using)
  File "...site-packages/django/core/serializers/base.py", line 332, in build_instance
    natural_key = Model(**data).natural_key()
  File "test/jkc/demo/models.py", line 45, in natural_key
    return self.foo.natural_key()
  File "...site-packages/polymorphic/models.py", line 203, in accessor_function
    attr = objects.get(pk=self.pk)
  File "...site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "...site-packages/django/db/models/query.py", line 496, in get
    raise self.model.DoesNotExist(
demo.models.DoesNotExist: Foo matching query does not exist.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test/jkc/./manage.py", line 22, in <module>
    main()
[... management command stuff]

Any idea why that would occur? Thanks for looking at this.

@johncronan johncronan mentioned this issue Apr 28, 2022
@johncronan
Copy link
Author

johncronan commented Apr 30, 2022

It looks like in this spot

        def create_accessor_function_for_model(model, accessor_name):
            def accessor_function(self):
                objects = getattr(model, "_base_objects", model.objects)
                attr = objects.get(pk=self.pk)
                return attr

            return accessor_function

        subclasses_and_superclasses_accessors = self._get_inheritance_relation_fields_and_models()

        for name, model in subclasses_and_superclasses_accessors.items():
            # Here be dragons.
            orig_accessor = getattr(self.__class__, name, None)
            if issubclass(
                type(orig_accessor),
                (ReverseOneToOneDescriptor, ForwardManyToOneDescriptor),
            ):
                setattr(
                    self.__class__,
                    name,
                    property(create_accessor_function_for_model(model, name)),
                )

there needs to be some special casing for ForwardOneToOneDescriptor, that instantiates the parent model (as long as none of the required fields are deferred), the way that Django does here:

    def get_object(self, instance):
        if self.field.remote_field.parent_link:
            deferred = instance.get_deferred_fields()
            # Because it's a parent link, all the data is available in the
            # instance, so populate the parent model with this data.
            rel_model = self.field.remote_field.model
            fields = [field.attname for field in rel_model._meta.concrete_fields]

            # If any of the related model's fields are deferred, fallback to
            # fetching all fields from the related model. This avoids a query
            # on the related model for every deferred field.
            if not any(field in fields for field in deferred):
                kwargs = {field: getattr(instance, field) for field in fields}
                obj = rel_model(**kwargs)
                obj._state.adding = instance._state.adding
                obj._state.db = instance._state.db
                return obj
        return super().get_object(instance)

I haven't studied this to see exactly what it would imply, though.

@johncronan
Copy link
Author

@vdboor, what do you think? I wonder if you could assist me with the implications part, and what a good fix ought to look like?

I've been planning to implement JSON import and export in my application's admin. That will have to use the natural keys. So, I think, I have to either fix this issue or take the library out. Perhaps if I can figure out how to correct it within django-polymorphic, I can then potentially turn that into a subclass implementation for interim use.

@johncronan
Copy link
Author

544f5ed

At this point, we don't know why this line of code is executed, but we do know it's not consistently executed between Django 1.10 and Django 1.11 due to the addition of ForwardOneToOneDescriptor, a subclass of ForwardManyToOneDescriptor.

As an experiment, I commented out the setattr for create_accessor_function_for_model and ran the tests. It causes a failure in test_foreignkey_field:

======================================================================
FAIL: test_foreignkey_field (polymorphic.tests.test_orm.PolymorphicTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jk/django-polymorphic/polymorphic/tests/test_orm.py", line 460, in test_foreignkey_field
    self.assertEqual(object2a.model2b.__class__, Model2B)
AssertionError: <class 'polymorphic.tests.models.Model2C'> != <class 'polymorphic.tests.models.Model2B'>

----------------------------------------------------------------------
Ran 74 tests in 2.185s

FAILED (failures=1)

It fails here, which is a test that uses an inheritance hierarchy A -> B -> C to check that if A is queried for an object that's also a C, its B instance will be of class B. With the setattr commented out, it is instead a C instance.

I still don't understand.

@johncronan
Copy link
Author

Tracing the original example, I realized that the definition of natural_key on Bar is not used. I had figured that I at least need to declare the dependence, but Django must be handling that already using the inheritance hierarchy.

Correct version of the example code is:

class FooManager(models.Manager):
    def get_by_natural_key(self, slug):
        return self.get(slug=slug)

class Foo(models.Model):
    slug = models.SlugField(unique=True)
    content = models.CharField(blank=True, max_length=100)
    
    objects = FooManager()
    
    def natural_key(self):
        return (self.slug,)


class Bar(Foo):
    foo = models.OneToOneField(Foo, models.CASCADE, parent_link=True, primary_key=True)
    val = models.IntegerField(default=0)

And then (using PolymorphicManager and PolymorphicModel, instead) the presentation of the error changes to:

Traceback (most recent call last):
  File "...site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "...site-packages/django/db/backends/sqlite3/base.py", line 477, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: NOT NULL constraint failed: demo_bar.foo_id

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./manage.py", line 22, in <module>
    main()
  File "./manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "...site-packages/django/core/management/__init__.py", line 446, in execute_from_command_line
    utility.execute()
  File "...site-packages/django/core/management/__init__.py", line 440, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "...site-packages/django/core/management/base.py", line 414, in run_from_argv
    self.execute(*args, **cmd_options)
  File "...site-packages/django/core/management/base.py", line 460, in execute
    output = self.handle(*args, **options)
  File "...site-packages/django/core/management/commands/loaddata.py", line 102, in handle
    self.loaddata(fixture_labels)
  File "...site-packages/django/core/management/commands/loaddata.py", line 163, in loaddata
    self.load_label(fixture_label)
  File "...site-packages/django/core/management/commands/loaddata.py", line 253, in load_label
    if self.save_obj(obj):
  File "...site-packages/django/core/management/commands/loaddata.py", line 209, in save_obj
    obj.save(using=self.using)
  File "...site-packages/django/core/serializers/base.py", line 281, in save
    models.Model.save_base(self.object, using=using, raw=True, **kwargs)
  File "...site-packages/django/db/models/base.py", line 857, in save_base
    updated = self._save_table(
  File "...site-packages/django/db/models/base.py", line 1000, in _save_table
    results = self._do_insert(
  File "...site-packages/django/db/models/base.py", line 1041, in _do_insert
    return manager._insert(
  File "...site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "...site-packages/django/db/models/query.py", line 1434, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "...site-packages/django/db/models/sql/compiler.py", line 1621, in execute_sql
    cursor.execute(sql, params)
  File "...site-packages/django/db/backends/utils.py", line 103, in execute
    return super().execute(sql, params)
  File "...site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
  File "...site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "...site-packages/django/db/backends/utils.py", line 84, in _execute
    with self.db.wrap_database_errors:
  File "...site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "...site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "...site-packages/django/db/backends/sqlite3/base.py", line 477, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: Problem installing fixture 'new.json': Could not load demo.Bar(pk=None): NOT NULL constraint failed: demo_bar.foo_id

And then I noticed that, during the loaddata process, after the Foo instance has been instantiated and saved to the transaction, get_by_natural_key is called twice. With the pure Django test case it's only called once. The first time it's called with the correct value, 'qux', and the second time with an empty string.

There's some stuff in the stack trace that, when I inspect that part of the Django source, appears to involve copying over of manager methods in the inheritance scenario.

I'm closer, now.

@johncronan
Copy link
Author

Ah! So, now that I understand what's going on in the internals: the get_by_natural_key needs to use non_polymorphic or it's just not doing its job.

class FooManager(models.Manager):
    def get_by_natural_key(self, slug):
        return self.non_polymorphic().get(slug=slug)

I'm leaving this open, because it should be mentioned in the managers section of the documentation.

@johncronan johncronan changed the title Polymorphic models are unable to deserialize by natural key when not already in DB please document how to deserialize by natural key May 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant