How to hook into structuring of a simple dict? #524

Open
kkg-else42 opened this issue Mar 18, 2024 · 6 comments
Labels
more-info-needed More information required.

Comments

@kkg-else42
Contributor

  • cattrs version: 23.2.3
  • Python version: 3.11
  • Operating System: Windows (dev)/Linux (prod)

Hey there!

I have a few attrs classes. Some are members of a (tagged) union, together with dict.
Unstructuring already works. Among other things, datetime objects are converted into strings in a special format (_TIMESTAMP_FORMAT: Final[str] = '%Y%m%d_%H%M%S').

The structuring of datetime attributes in attrs instances works perfectly.
My problem is with the structuring of simple dict objects.
I need to hook into it to convert the strings (which are in that special format) into datetime objects.
But since the simple dict objects can also contain attrs instances, it must be possible to call the structuring recursively again.

I just need a hint on how to call the structuring without creating an endless loop.

@Tinche
Member

Tinche commented Mar 18, 2024

Could you provide a minimal example in code?

Tinche added the more-info-needed label Mar 18, 2024
@kkg-else42
Contributor Author

Sorry for the delay...
I don't know if it is a minimal one, but here is my example:

from datetime import datetime
from typing import Any, Final, Type, TypeVar

import attrs
from cattrs.preconf.json import make_converter
from cattrs.strategies import configure_tagged_union

T = TypeVar('T')


@attrs.frozen
class Sub:
    foo: str = 'bar'
    # and some more fields (incl. other attrs types)


@attrs.frozen
class A:
    some: str = 'a'
    sub: Sub = Sub()


@attrs.frozen
class B:
    some: str = 'b'
    sub: Sub = Sub()


FrameData = dict | A | B


@attrs.frozen
class Frame:
    data: FrameData


_CUSTOMIZED_STRUCTURE_TYPES: Final[set] = {
    datetime,
    dict,
    Frame,
    # and some more...
}

_TIMESTAMP_FORMAT: Final[str] = '%Y%m%d_%H%M%S'


def _structure(data: dict[str, Any] | str, to_type: Type[T]) -> T:
    match to_type:
        case t if t is datetime:
            return datetime.strptime(data, _TIMESTAMP_FORMAT)
        case t if t is dict:
            return _structure_dict(data)
        case t if t is Frame:
            data.pop('to_add', None)
            return conv.structure_attrs_fromdict(data, Frame)
        case _:
            raise NotImplementedError(f'Unsupported type: {str(to_type)}.')


def _structure_dict(data: dict[str, Any]) -> dict[str, Any]:
    structured: dict[str, Any] = data.copy()
    for k, v in structured.items():
        if isinstance(v, str):
            try:
                structured[k] = datetime.strptime(v, _TIMESTAMP_FORMAT)
            except ValueError:
                continue
    # something is needed here to call the converter for structuring the other values of the dict
    return structured


conv = make_converter()

for data_type in _CUSTOMIZED_STRUCTURE_TYPES:
    conv.register_structure_hook(data_type, lambda data, to_type: _structure(data, to_type))

configure_tagged_union(union=FrameData,
                       converter=conv,
                       tag_name='_type',
                       tag_generator=lambda t: t.__name__.casefold(),
                       default=dict)

As a result of this:

f='{"data": {"a": {"some": "a", "sub": {"foo": "bar"}}, "ts": "20240320_010203", "_type": "dict"}}'
print(conv.loads(f, Frame))

I get this output:
Frame(data={'a': {'some': 'a', 'sub': {'foo': 'bar'}}, 'ts': datetime.datetime(2024, 3, 20, 1, 2, 3)})

But what I need is this output:
Frame(data={'a': A(some='a', sub=Sub(foo='bar')), 'ts': datetime.datetime(2024, 3, 20, 1, 2, 3)})

@kkg-else42
Contributor Author

Hi Tin,
Is there anything else I should add or is it just a busy schedule?

@Tinche
Member

Tinche commented Apr 10, 2024

Hey,

Yeah, sorry, I got sidetracked by other things.

But since the simple dict objects can also contain attrs instances, it must be possible to call the structuring recursively again.

This is going to be complicated without modeling this more precisely. How do you know a nested dict is supposed to be converted into a class instance and not left as a dict? If a nested dict always means A | B, then it gets easier.

Frame(data={'a': {'some': 'a', 'sub': {'foo': 'bar'}}, 'ts': datetime.datetime(2024, 3, 20, 1, 2, 3)})

That looks correct given the input. Even if we assume data['a'] is logically typed as FrameData, it has no _type field and so will default to a dict. In other words, how can we tell data['a'] is supposed to be A?
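To make the point about the missing tag concrete, here is a minimal plain-Python sketch of a tag lookup with a default. It only illustrates the routing logic described above and is not how cattrs implements tagged unions; `resolve_member`, `TAG_NAME` and `DEFAULT_MEMBER` are hypothetical names introduced for this illustration.

```python
from typing import Any

TAG_NAME = '_type'       # same tag name as in the example (tag_name='_type')
DEFAULT_MEMBER = 'dict'  # stands in for default=dict in the example


def resolve_member(payload: dict[str, Any]) -> str:
    """Return the name of the union member a payload would be routed to."""
    # A payload without the tag falls back to the default member.
    return str(payload.get(TAG_NAME, DEFAULT_MEMBER))


data = {
    'a': {'some': 'a', 'sub': {'foo': 'bar'}},
    'ts': '20240320_010203',
    '_type': 'dict',
}

outer = resolve_member(data)        # tagged explicitly as 'dict'
nested = resolve_member(data['a'])  # no tag, so the default applies
tagged = resolve_member({**data['a'], '_type': 'a'})  # only a tag could select A
```

Under this sketch, nothing in the nested payload identifies it as A, which is why it stays a plain dict.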

@kkg-else42
Contributor Author

But since the simple dict objects can also contain attrs instances, it must be possible to call the structuring recursively again.

This is going to be complicated without modeling this more precisely. How do you know a nested dict is supposed to be converted into a class instance and not left as a dict? If a nested dict always means A | B, then it gets easier.

I now see that my example is misleading.
My sentence (which you quoted) referred to the following line in the _structure_dict function:

# something is needed here to call the converter for structuring the other values of the dict
return structured

The goal is not to convert an arbitrary dict into an attrs instance. But dict-values, which in turn can be attrs instances (with datetime), should also be converted accordingly. However, they are not recognized as such before the datetime conversion (due to the special format).
To achieve this, the converter would have to be called again within the _structure_dict function. Something like this:

# something is needed here to call the converter for structuring the other values of the dict
return conv.structure(structured, dict)

Of course, this doesn't work because it creates an endless loop.
With attrs classes I use conv.structure_attrs_fromdict to achieve this (as in the case of Frame).

If necessary, I can rework the example. (But that will certainly not be until the week after next.)
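The endless loop described in this comment can be reproduced with a toy dispatcher. This is a sketch under the assumption that a converter picks the hook by target type; `_hooks`, `structure`, `looping_dict_hook` and `loops_forever` are hypothetical stand-ins and not cattrs code.

```python
from typing import Any, Callable

# Toy stand-in for a converter's hook registry (hypothetical, not cattrs code).
_hooks: dict[type, Callable[[Any, type], Any]] = {}


def structure(data: Any, to_type: type) -> Any:
    """Dispatch to the hook registered for `to_type`."""
    return _hooks[to_type](data, to_type)


def looping_dict_hook(data: dict, to_type: type) -> dict:
    # Re-entering the dispatcher with the same target type selects this
    # very hook again, so the call never bottoms out.
    return structure(data, dict)


_hooks[dict] = looping_dict_hook


def loops_forever(data: dict) -> bool:
    """Return True if structuring `data` as dict recurses without end."""
    try:
        structure(data, dict)
    except RecursionError:
        return True
    return False
```

The hook for dict hands the same value back to the dispatcher with the same target type, so each call selects the hook that made it.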

@Tinche
Member

Tinche commented Apr 13, 2024

Yeah, a simplified example would be good.

To achieve this, the converter would have to be called again within the _structure_dict function. Something like this: ... Of course, this doesn't work because it creates an endless loop.

You can just call _structure_dict on each nested dict value yourself, right? You don't even need to jump back into cattrs. It won't create an endless loop, since the recursion stops once there are no nested values left.
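A minimal sketch of that suggestion, using only the standard library. It assumes nested values are plain dicts and strings as in the example input, and it leaves lists and attrs instances untouched; it is one possible reading of the advice, not a tested cattrs recipe.

```python
from datetime import datetime
from typing import Any, Final

_TIMESTAMP_FORMAT: Final[str] = '%Y%m%d_%H%M%S'


def _structure_dict(data: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `data` with timestamp strings parsed at every nesting level."""
    structured: dict[str, Any] = {}
    for k, v in data.items():
        if isinstance(v, str):
            try:
                structured[k] = datetime.strptime(v, _TIMESTAMP_FORMAT)
            except ValueError:
                # Not in the special timestamp format: keep the string as-is.
                structured[k] = v
        elif isinstance(v, dict):
            # Recurse directly instead of going back through the converter,
            # so the dict hook is never re-entered for the same value.
            structured[k] = _structure_dict(v)
        else:
            structured[k] = v
    return structured


raw = {
    'a': {'some': 'a', 'sub': {'foo': 'bar', 'ts': '20240320_010203'}},
    'ts': '20240320_010203',
}
result = _structure_dict(raw)
```

The recursion ends at the innermost dicts because each call only descends into values that are themselves dicts. Building a new dict rather than mutating a shallow copy also keeps the caller's nested input unchanged.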
