RuntimeError: dictionary changed size during iteration #685

Closed
pawamoy opened this issue Jun 2, 2022 · 13 comments
Labels
bug Something isn't working

Comments

@pawamoy
Sponsor Contributor

pawamoy commented Jun 2, 2022

Describe the bug
Not sure Ormar will be able to do anything.

To Reproduce

Upon running a query, there's some Pydantic validation happening, and it ends up with a runtime error when deep-copying data.
It seems to happen because one element of the model being deep-copied is a submodel: accessing it, or another attribute of the parent model, loads it, which somehow changes the dictionary size.

Here's the query:

packages = await Package.objects.select_related(["library", "tickets__ticket"]).exclude(version__contains=".x").all()

That query triggers the runtime error mentioned above. If I either remove tickets__ticket, or go down to the submodel with tickets__ticket__team, the error disappears.

packages = await Package.objects.select_related(["library"]).exclude(version__contains=".x").all()  # ok
packages = await Package.objects.select_related(["library", "tickets__ticket__team"]).exclude(version__contains=".x").all()  # ok

Traceback:

$ python sara.py 
Traceback (most recent call last):
  File "sara.py", line 26, in <module>
    asyncio.run(amain())
  File "/home/user/.basher-packages/pyenv/pyenv/versions/3.8.11/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/user/.basher-packages/pyenv/pyenv/versions/3.8.11/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "sara.py", line 9, in amain
    monitoring_ticket_id = await suggest_monitoring()
  File "/home/user/dev/project/src/project/sara/monitoring.py", line 114, in suggest
    libraries = await suggest_monitoring_libraries()
  File "/home/user/dev/project/src/project/suggest.py", line 150, in suggest_monitoring_libraries
    await Package.objects.select_related(["library", "tickets__ticket"]).exclude(version__contains=".x").all()
  File "/home/user/dev/project/__pypackages__/3.8/lib/ormar/queryset/queryset.py", line 1053, in all
    result_rows = self._process_query_result_rows(rows)
  File "/home/user/dev/project/__pypackages__/3.8/lib/ormar/queryset/queryset.py", line 184, in _process_query_result_rows
    result_rows = [
  File "/home/user/dev/project/__pypackages__/3.8/lib/ormar/queryset/queryset.py", line 185, in <listcomp>
    self.model.from_row(
  File "/home/user/dev/project/__pypackages__/3.8/lib/ormar/models/model_row.py", line 84, in from_row
    item = cls._populate_nested_models_from_row(
  File "/home/user/dev/project/__pypackages__/3.8/lib/ormar/models/model_row.py", line 203, in _populate_nested_models_from_row
    child = model_cls.from_row(
  File "/home/user/dev/project/__pypackages__/3.8/lib/ormar/models/model_row.py", line 104, in from_row
    instance = cast("Model", cls(**item))
  File "/home/user/dev/project/__pypackages__/3.8/lib/ormar/models/newbasemodel.py", line 140, in __init__
    values, fields_set, validation_error = pydantic.validate_model(
  File "pydantic/main.py", line 1038, in pydantic.main.validate_model
  File "pydantic/fields.py", line 857, in pydantic.fields.ModelField.validate
  File "pydantic/fields.py", line 1067, in pydantic.fields.ModelField._validate_singleton
  File "pydantic/fields.py", line 857, in pydantic.fields.ModelField.validate
  File "pydantic/fields.py", line 1074, in pydantic.fields.ModelField._validate_singleton
  File "pydantic/fields.py", line 1121, in pydantic.fields.ModelField._apply_validators
  File "pydantic/class_validators.py", line 313, in pydantic.class_validators._generic_validator_basic.lambda12
  File "pydantic/main.py", line 679, in pydantic.main.BaseModel.validate
  File "pydantic/main.py", line 605, in pydantic.main.BaseModel._copy_and_set_values
  File "/home/user/.basher-packages/pyenv/pyenv/versions/3.8.11/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/home/user/.basher-packages/pyenv/pyenv/versions/3.8.11/lib/python3.8/copy.py", line 229, in _deepcopy_dict
    for key, value in x.items():
RuntimeError: dictionary changed size during iteration
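
For readers unfamiliar with this error, here is a minimal sketch of how `copy.deepcopy` can raise it, independent of Ormar and Pydantic. The `LazyChild` class is hypothetical: it only imitates a related object whose access adds a key to the parent's dictionary, which is an assumption about the mechanism, not a confirmed description of what Ormar does.

```python
import copy


class LazyChild:
    """Hypothetical stand-in for a lazily loaded related model."""

    def __init__(self, owner: dict) -> None:
        self.owner = owner

    def __deepcopy__(self, memo: dict) -> "LazyChild":
        # Simulated side effect: copying the child adds a key to the parent
        # dict while copy.deepcopy is still iterating over that dict.
        self.owner["loaded_relation"] = True
        return self


parent: dict = {"id": 1}
parent["child"] = LazyChild(parent)

try:
    copy.deepcopy(parent)
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
```

The loop in `_deepcopy_dict` iterates over `x.items()` directly, so any key added to `x` during the copy is detected on the next iteration step.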

Expected behavior
No runtime error.

Versions (please complete the following information):

  • Database backend used: sqlite
  • Python version: 3.8.11
  • ormar version: 0.11.0
  • pydantic version: 1.9.1
  • if applicable fastapi version: 0.78.0

Additional context
The code that now triggers the error did not always do so. It might be due to a dependency upgrade (which one? I don't know).
Did anyone encounter the same issue?

@pawamoy pawamoy added the bug Something isn't working label Jun 2, 2022
@collerek
Owner

collerek commented Jun 3, 2022

Can you share the models so I can try to reproduce the error?

@pawamoy
Sponsor Contributor Author

pawamoy commented Jun 3, 2022

Yes, thank you 🙂

import os
from enum import Enum as StdEnum
from typing import List, Union

import databases
import ormar
import pydantic
import sqlalchemy

MAX_LIBRARY_NAME_LENGTH: int = 100
MAX_LIBRARY_KIND_LENGTH: int = 16
MAX_LIBRARY_SUMMARY_LENGTH: int = 1000
MAX_LIBRARY_URL_LENGTH: int = 1000
MAX_PACKAGE_VERSION_LENGTH: int = 64
MAX_PROJECT_APP_CODE_LENGTH: int = 16
MAX_PROJECT_NAME_LENGTH: int = 100
MAX_PROJECT_PLATFORM_LENGTH: int = 64
MAX_PROJECT_DEPENDENCY_GIT_BRANCH_LENGTH: int = 128
MAX_PROJECT_DEPENDENCY_USAGE_LENGTH: int = 1000
MAX_TEAM_NAME_LENGTH: int = 16
MAX_TICKET_KIND_LENGTH: int = 16
MAX_TICKET_STATUS_LENGTH: int = 16
MAX_TICKET_PACKAGE_COMMENT_LENGTH: int = 1000
MAX_TICKET_PACKAGE_STATUS_LENGTH: int = 16


class Enum(StdEnum):
    @classmethod
    def values(cls) -> List[Union[str, int]]:
        return [item.value for item in cls]


class LibraryKind(Enum):
    pypi: str = "pypi"
    nuget: str = "nuget"
    maven: str = "maven"
    npm: str = "npm"
    docker: str = "docker"
    helm: str = "helm"
    rpm: str = "rpm"
    generic: str = "generic"


class TicketPackageStatus(Enum):
    in_progress: str = "in progress"
    accepted: str = "accepted"
    rejected: str = "rejected"


class TicketStatus(Enum):
    open: str = "open"
    closed: str = "closed"


class TicketKind(Enum):
    monitoring: str = "monitoring"
    analysis: str = "analysis"
    urbanization: str = "urbanization"
    security: str = "security"


class BaseMeta(ormar.ModelMeta):
    database = databases.Database("sqlite:///db.sqlite", timeout=30)
    metadata = sqlalchemy.MetaData()


class Team(ormar.Model):
    class Meta(BaseMeta):
        tablename = "teams"
        constraints = [ormar.UniqueColumns("name")]

    id: int = ormar.Integer(primary_key=True)
    name: str = ormar.String(max_length=MAX_TEAM_NAME_LENGTH)

    @pydantic.validator("name")
    def _name_must_be_uppercase(cls, value):  # noqa: N805
        return value.upper()


class Library(ormar.Model):
    class Meta(BaseMeta):
        tablename = "libraries"
        constraints = [ormar.UniqueColumns("name", "kind")]

    id: int = ormar.Integer(primary_key=True)
    name: str = ormar.String(max_length=MAX_LIBRARY_NAME_LENGTH)
    kind: str = ormar.String(max_length=MAX_LIBRARY_KIND_LENGTH, choices=LibraryKind.values())
    summary: str = ormar.String(max_length=MAX_LIBRARY_SUMMARY_LENGTH)
    url: str = ormar.String(max_length=MAX_LIBRARY_URL_LENGTH)

    @pydantic.root_validator
    def _normalize_name(cls, values):  # noqa: N805
        kind = values.get("kind")
        if kind is LibraryKind.pypi:
            values["name"] = values["name"].lower().replace("_", "-")
        return values


class Package(ormar.Model):
    class Meta(BaseMeta):
        tablename = "packages"
        constraints = [ormar.UniqueColumns("library", "version")]

    id: int = ormar.Integer(primary_key=True)
    library: Library = ormar.ForeignKey(Library, related_name="packages", ondelete="CASCADE")
    version: str = ormar.String(max_length=MAX_PACKAGE_VERSION_LENGTH)


class Ticket(ormar.Model):
    class Meta(BaseMeta):
        tablename = "tickets"
        constraints = [ormar.UniqueColumns("number")]

    id: int = ormar.Integer(primary_key=True)
    number: int = ormar.Integer()
    kind: TicketKind = ormar.String(max_length=MAX_TICKET_KIND_LENGTH, choices=TicketKind.values())
    status: TicketStatus = ormar.String(max_length=MAX_TICKET_STATUS_LENGTH, choices=TicketStatus.values())
    team: Team = ormar.ForeignKey(Team, related_name="tickets", ondelete="CASCADE")


class TicketPackage(ormar.Model):
    class Meta(BaseMeta):
        tablename = "tickets_packages"
        constraints = [ormar.UniqueColumns("ticket", "package")]

    id: int = ormar.Integer(primary_key=True)
    status: TicketPackageStatus = ormar.String(
        max_length=MAX_TICKET_PACKAGE_STATUS_LENGTH, choices=TicketPackageStatus.values()
    )
    comment: str = ormar.String(max_length=MAX_TICKET_PACKAGE_COMMENT_LENGTH)
    ticket: Ticket = ormar.ForeignKey(Ticket, related_name="packages", ondelete="CASCADE")
    package: Package = ormar.ForeignKey(Package, related_name="tickets", ondelete="CASCADE")
    explicit: bool = ormar.Boolean(default=True)

@pawamoy
Sponsor Contributor Author

pawamoy commented Jun 8, 2022

This is happening on another query as well. Here's the most verbose output I can get with pytest (it still truncates data...):

x = {'id': 1, 'library': {'id': 1}, 'projects_dependencies': [], 'tickets': [], ...}
memo = {139688783992448: {'__dict__': {'id': 1, 'library': <weakproxy at 0x7f0bd4e02450 to Library at 0x7f0bd762cdc0>}, '__fi...ckets': [], 'projects_dependencies': []}), 139688784020160: ['id'], 139688784023168: ['version', 'id', 'library'], ...}
deepcopy = <function deepcopy at 0x7f0be8487040>

    def _deepcopy_dict(x, memo, deepcopy=deepcopy):
        y = {}
        memo[id(x)] = y
>       for key, value in x.items():
            y[deepcopy(key, memo)] = deepcopy(value, memo)
        return y
E       RuntimeError: dictionary changed size during iteration

../../../../.basher-packages/pyenv/pyenv/versions/3.8.11/lib/python3.8/copy.py:229: RuntimeError

We can see that in x we have for example 'library': {'id': 1} while in memo we have 'library': <weakproxy...>. Not sure how/why it triggers a runtime error.

Query is:

await Team.objects.filter(projects__dependencies__package__id=package_id).all()

(I didn't provide these models above)

Fixed with:

await Team.objects.select_related("projects__dependencies__package__library").filter(projects__dependencies__package__id=package_id).all()

@pawamoy
Sponsor Contributor Author

pawamoy commented Jun 8, 2022

I've run a failing snippet with multiple versions of Ormar, downgrading from 0.10.25 to 0.10.0, and the snippet stopped failing at 0.10.23, so the "bug" was probably introduced in 0.10.24, by either Ormar itself or an upgraded version of one of its dependencies 🙂

  • dependencies with 0.10.23:
    • SQLAlchemy-1.4.28
    • aiosqlite-0.17.0
    • databases-0.5.3
    • greenlet-1.1.2
    • pydantic-1.8.2
    • typing_extensions-4.2.0
  • dependencies with 0.10.24:
    • SQLAlchemy-1.4.29 ⚠️ ⬆️
    • aiosqlite-0.17.0
    • databases-0.5.4 ⚠️ ⬆️
    • greenlet-1.1.2
    • pydantic-1.9.1 ⚠️ ⬆️
    • typing_extensions-4.2.0

The most suspicious dependency is of course pydantic, since it appears in the traceback.

I'll try to git bisect between 0.10.24 and 0.10.23.
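
The version hunt described above amounts to a binary search for the first failing release. A rough sketch follows, assuming versions are ordered and that every release after the first bad one also fails. Both `first_bad` and `is_bad` are hypothetical names; here `is_bad` is a stand-in predicate, whereas a real one would install the given version and run the failing snippet.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest version for which is_bad is true.

    Assumes the last version is bad and that badness is monotonic.
    """
    lo, hi = 0, len(versions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # first bad version is at mid or earlier
        else:
            lo = mid + 1  # first bad version is after mid
    return versions[lo]


# Stand-in predicate mirroring the observation above (fails from 0.10.24 on).
def is_bad(version: str) -> bool:
    return int(version.split(".")[-1]) >= 24


versions = [f"0.10.{patch}" for patch in range(26)]
culprit = first_bad(versions, is_bad)
print(culprit)
```

`git bisect` applies the same idea at commit granularity rather than release granularity.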

@pawamoy
Sponsor Contributor Author

pawamoy commented Jun 8, 2022

Possibly related: pydantic/pydantic@952fad2, from issue pydantic/pydantic#3641

@collerek
Owner

collerek commented Jun 8, 2022

Thanks for the investigation!

Yep, it seems it's pydantic related as I didn't introduce anything that I can think of that could cause this error in 0.10.24.
Will try to dig deeper.

@pawamoy
Sponsor Contributor Author

pawamoy commented Jun 8, 2022

Just finished bisecting, and it confirms our suspicions:

aab46de800b5bd26e426ddd3cbd8a16cb7bcdf7a is the first bad commit
commit aab46de800b5bd26e426ddd3cbd8a16cb7bcdf7a
Author: collerek <collerek@gmail.com>
Date:   Mon Jan 3 18:23:22 2022 +0100

    remove date dumping to isoformat, add pydantic 1.9 support

So yeah I think that the change from shallow to deep copy in pydantic broke Ormar 🙂
I have no clue on how that would be "fixed"/supported in Ormar 😅
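
To illustrate why switching from a shallow to a deep copy matters, here is a small sketch using only the standard library. The `Tracked` class is hypothetical and simply counts how often it is deep-copied; it is not taken from Ormar or Pydantic.

```python
import copy


class Tracked:
    """Hypothetical nested object that records deep-copy attempts."""

    def __init__(self) -> None:
        self.deep_copies = 0

    def __deepcopy__(self, memo: dict) -> "Tracked":
        self.deep_copies += 1
        return self


child = Tracked()
data = {"id": 1, "child": child}

shallow = copy.copy(data)       # copies the dict only; values are shared
after_shallow = child.deep_copies

deep = copy.deepcopy(data)      # recurses into every value of the dict
after_deep = child.deep_copies

print(after_shallow, after_deep)
```

A shallow copy never touches the nested values, so any side effect that happens when a nested model is visited can only show up once the copy becomes deep.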

@collerek
Owner

collerek commented Jun 8, 2022

I can see that it also causes additional errors in other places, in PRs opened by Dependabot.

Yeah, I need to check. I have already overridden some pydantic methods (dict, json etc.) in ormar, so I might need to override some more 😅
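
As a general Python illustration only (this is not the change that went into Ormar, which is not shown in this thread), one common way to tolerate a dictionary that mutates during a copy is to iterate over a snapshot of its items. `LazyChild` and `deepcopy_from_snapshot` are hypothetical names.

```python
import copy


class LazyChild:
    """Hypothetical relation that mutates its parent dict when copied."""

    def __init__(self, owner: dict) -> None:
        self.owner = owner

    def __deepcopy__(self, memo: dict) -> "LazyChild":
        self.owner["loaded_relation"] = True  # side effect during the copy
        return self


def deepcopy_from_snapshot(data: dict) -> dict:
    """Deep-copy a dict while ignoring keys added during the copy."""
    memo: dict = {}
    result: dict = {}
    memo[id(data)] = result
    # list(...) freezes the items first, so later insertions into `data`
    # cannot invalidate the loop.
    for key, value in list(data.items()):
        result[copy.deepcopy(key, memo)] = copy.deepcopy(value, memo)
    return result


parent: dict = {"id": 1}
parent["child"] = LazyChild(parent)
copied = deepcopy_from_snapshot(parent)
print(sorted(copied))
```

The trade-off is that keys added while copying are missing from the result, so this only suits cases where those late additions do not need to be copied.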

@collerek
Owner

collerek commented Jun 8, 2022

Should be fixed in 0.11.1.
Please check and let me know. ;)

@collerek collerek closed this as completed Jun 8, 2022
@pawamoy
Sponsor Contributor Author

pawamoy commented Jun 8, 2022

Awesome, thanks a lot! I'll be able to try that tomorrow, I'll let you know 🙂

@pawamoy
Sponsor Contributor Author

pawamoy commented Jun 9, 2022

Yep, everything is working fine again 🙂 Thanks!

@josushiman

thank you

@alexol91

alexol91 commented Aug 22, 2022

Hi, I have the same problem. I was on ormar 0.10.25 and upgraded to 0.11.2 to get the fix, but it keeps crashing.

I think my case is slightly different, or more complex. The relationships from my model to other models work; the problem appears when, at the third level, I need a JOIN back to the same model (the initial one). This code worked a few months ago.

Example that works:
Example: A -> B -> C
A.objects.select_related([ "B", "B__C" ])

Example that fails:
Example: A -> B -> A
A.objects.select_related([ "B", "B__A" ])

Real code:

# Class definitions
from datetime import datetime

import ormar


class Node(ormar.Model):
    class Meta(ormar.ModelMeta):
        tablename = "node"

    id: int = ormar.Integer(primary_key=True)
    name: str = ormar.String(max_length=120)
    type: str = ormar.String(max_length=12, default="FLOW")
    created_at: datetime = ormar.DateTime(timezone=True, default=datetime.now)


class Edge(ormar.Model):
    class Meta(ormar.ModelMeta):
        tablename = "edge"

    id: str = ormar.String(primary_key=True, max_length=12)
    # Two foreign keys to the same model, so Node -> Edge -> Node is possible
    src_node: int = ormar.ForeignKey(Node, related_name="next_edges")
    dst_node: int = ormar.ForeignKey(Node, related_name="previous_edges")

    condition: str = ormar.String(max_length=255, nullable=True)
    order: int = ormar.Integer(default=1)

    created_at: datetime = ormar.DateTime(timezone=True, default=datetime.now)


# Query that fails
active_node = await Node.objects.select_related(['next_edges', 'next_edges__dst_node']).all()

Thanks in advance and congratulations on this awesome ORM!
