Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply func non roundtrippable seq #250

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

ap--
Copy link

@ap-- ap-- commented Mar 29, 2024

Before submitting
  • Was this discussed/agreed via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did all existing and newly added tests pass locally?

What does this PR do?

Fixes #249

Note: I'm not sure if the added overhead in the slow path of apply_to_collection due to the shallow sequence copy and the elementwise comparison is undesirable to support this use case. I could reimplement this to cache the result of can_roundtrip_sequence based on the class.

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.

Cheers,
Andreas 😃


📚 Documentation preview 📚: https://lit-utilities--250.org.readthedocs.build/en/250/

Copy link

codecov bot commented Mar 29, 2024

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️


val = NonRoundtrippableSequence(3)
result = apply_to_collection(val, int, lambda x: x + 1)
assert val == result
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't you expect result == [1, 2, 3]? What's a real example where this should become a no-op?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This came up when using: https://github.com/Bayer-Group/pado/blob/635d7b8b57e527254d6302730100a6dab5a2095f/pado/images/ids.py#L126-L351

Where ImageId instances are tuple subclasses but they don't roundtrip, i.e.:

iid = ImageId("a", "b", "c", site="site-1")

ImageId(list(iid)) <- is not allowed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc: @ap-- ^^

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, it seems my reply was in pending state for a month and I had to still click on submit review 🤷

@Borda Borda requested a review from carmocca May 6, 2024 17:01
def can_roundtrip_sequence(obj: Sequence) -> bool:
"""Check if sequence can be roundtripped."""
try:
return obj == type(obj)(list(obj)) # type: ignore[call-arg]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can you be sure that list(obj) will work for any sequence? Even though this is covered by the try-except block, the and conditions added to L129 will now skip the is_sequence loop for any sequence that this doesn't consider

Given the complexity and that your data structure is very specific to your problem. Could we instead expose an API so that you can have your data structure define whether it should be treated as a sequence or not?

For instance, that is similar to how pytrees are implemented which are much more extensible and performant: https://github.com/pytorch/pytorch/blob/main/torch/utils/_pytree.py#L163. In this case, it could also be an attribute of the data structure

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the complexity and that your data structure is very specific to your problem. Could we instead expose an API so that you can have your data structure define whether it should be treated as a sequence or not?

That sounds like the best solution. An exclude list for types that should just be passed through would be ideal. I can work on a PR in early June.

@carmocca carmocca removed their assignment May 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Non roundtrippable sequence subclasses raise error in apply_to_collection
3 participants