[Discuss] Recent breaking changes in new dill releases #589

Open
tvalentyn opened this issue Apr 12, 2023 · 3 comments

Comments

@tvalentyn

In the project I work on (apache-beam), dill is used on both the client side and the service side. Since pickled bytes created by a newer version of dill cannot be unpickled by an older version of dill, we require a very narrow version band in our requirements. Practically speaking, we have been using dill==0.3.1.1 for a while.
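To make the client/service version constraint concrete, here is a minimal sketch of a version-tagged envelope that rejects a payload written by a newer serializer before trying to unpickle it. This is not how apache-beam or dill handle it; `pack`, `unpack`, `version_tuple`, and `VersionMismatch` are hypothetical names, and stdlib `pickle` stands in for dill so the sketch runs anywhere.

```python
import pickle


class VersionMismatch(RuntimeError):
    """Raised when a payload was written by a newer serializer than the reader has."""


def version_tuple(version):
    """Turn '0.3.1.1' into (0, 3, 1, 1); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def pack(obj, writer_version, dumps=pickle.dumps):
    """Wrap the serialized bytes in an envelope that records the writer's version."""
    # The envelope itself uses stdlib pickle, so any reader can open it.
    return pickle.dumps({"version": writer_version, "payload": dumps(obj)})


def unpack(data, reader_version, loads=pickle.loads):
    """Reject payloads from a newer writer instead of failing mid-unpickle."""
    envelope = pickle.loads(data)
    if version_tuple(envelope["version"]) > version_tuple(reader_version):
        raise VersionMismatch(
            f"payload written with {envelope['version']}, "
            f"reader has {reader_version}"
        )
    return loads(envelope["payload"])


# Hypothetical scenario: the client writes with 0.3.6.
blob = pack({"step": "ParDo"}, writer_version="0.3.6")

# A worker on the same version accepts the payload.
same_version_result = unpack(blob, reader_version="0.3.6")

# A worker still on 0.3.1.1 gets a clear error up front.
try:
    unpack(blob, reader_version="0.3.1.1")
    rejected = False
except VersionMismatch:
    rejected = True
```

With dill installed, one would presumably pass `dill.dumps`, `dill.loads`, and `dill.__version__` in place of the stand-ins. The check only catches the "newer writer, older reader" case; it does nothing for payloads that pickle differently under the same version.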

Recently we received many requests from users asking us to upgrade to newer versions of dill (example: apache/beam#22893). We also had to update the dill version to support Python 3.11.

It appears that updating to a newer version of dill risks breaking many of our customers. I have visibility into a significant user base that uses Beam at my company, and I see that changing dill from 0.3.1.1 to 0.3.6 breaks many users, with various failure modes. Multiple breaking changes have also been reported on dill's issue tracker. It is understandable that bugs sometimes happen and are fixed in follow-up releases; however, it seems that there were some large, non-trivial changes that changed what is being pickled and how. For example, we noticed that dill invocations in some of our test suites started to pickle unpicklable pytest classes from the main session (a new behavior, and unnecessary for that particular test).

The impact of the breaking changes is not immediately obvious without diving into the details of dill's implementation. The release notes (https://github.com/uqfoundation/dill/releases) don't call out breaking changes or mitigations that users can take. It also doesn't seem that the new behavior will be reverted in a follow-up release.

My assumption is that none of the breakages are intentional and that the intent behind the changes is to pickle more and pickle better. However, if giving advantages to one group of users also breaks another group of users, it would be better to enable new behaviors via opt-in, or at least to provide an opt-out option.

My questions to dill maintainers:

  • Can substantially different pickling behaviors be enabled/disabled by dill settings?
  • Can breaking changes and mitigation options be called out in release notes when breaking changes are unavoidable or become known after the release?
  • Any other things we can do to minimize the impact of breaking changes?
@mmckerns
Member

mmckerns commented Apr 13, 2023

I try to ensure full backward compatibility, but sometimes (1) python forces a breakage or (2) I don't catch a subtle omission someone (could be me) makes which causes a breakage. Pickling is a nasty, messy, nasty and messy business. Functions by default try to pickle some version of the global namespace, so that in itself is already a mess. For the helper functions in dill that are used to pickle registered objects, the global namespace is the _dill module... so that has its own complexity. Anyway, you get the idea.

In short:

  • Can substantially different pickling behaviors be enabled/disabled by dill settings?

Yes

  • Can breaking changes and mitigation options be called out in release notes when breaking changes are unavoidable or become known after the release?

This is not done now, but it could be. Breaking changes are generally found after the release. Maybe a pre-release would be a good path forward.

  • Any other things we can do to minimize the impact of breaking changes?

Recently, dill added the concept of a shim... which gives infrastructure to load old pickles across breaking changes. So, to go from 0.3.1.1 to the current version, you should be able to add shims to get you across the breaking changes. It might be good to have easy user function access to a shim_registry (like register with the pickle_registry). @anivegesana

https://github.com/uqfoundation/dill/blob/master/dill/_shims.py

@anivegesana
Contributor

anivegesana commented Apr 13, 2023

Adding on top of that, dill is relatively popular. Because of that, people are bound to pickle objects that were not considered before and that may have been pickled differently or incorrectly. Fixing things that most other users would consider bugs then sometimes causes regressions that are unavoidable.

https://xkcd.com/1172/

That (but of course less extreme) is how I would classify #572. It was caused by a change made to fix #482, but because we cannot change the behavior of how individual objects are pickled, it is impossible to create code that solves both issues. An extensible way to customize how dill pickles objects is still needed.

@tvalentyn
Author

tvalentyn commented Apr 15, 2023

Thanks for the feedback.

Pickling is a nasty, messy, nasty and messy business.

There is no question about it, and I sincerely appreciate your efforts on this front. My goal is to save apache-beam users from this realization as much as possible.

I try to ensure full backward compatibility

Backward compatibility in the sense of being able to unpickle old pickles on a new version is not very relevant for us; the problem is when the same payload that was pickled previously can no longer be pickled, or unpickles into something that causes unexplained behavior downstream. Are shims still relevant for this problem?

regressions that are inevitable to fix things that would be considered bugs for most other users

Well... this is the tricky part. Occasional issues are fine, but in my particular scenario, I see a large number of errors that show different failure modes upon random inspection. The question becomes: should I dive deep to identify, understand, and fix these errors, or is this task basically a non-starter because something fundamental has changed and would require weeks of work to migrate?
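One way to size up that question before diving in is to round-trip a set of representative payloads and group the failures by stage and exception type. The sketch below is only an illustration under assumptions: `triage` and the sample payloads are hypothetical, and stdlib `pickle` is the default serializer so the example runs without dill. Passing `dill.dumps` and `dill.loads` instead should give the picture for a given dill version.

```python
import pickle
from collections import defaultdict


def triage(payloads, dumps=pickle.dumps, loads=pickle.loads):
    """Round-trip each named payload; group failures by (stage, exception type)."""
    failures = defaultdict(list)
    for name, obj in payloads.items():
        try:
            data = dumps(obj)
        except Exception as exc:  # broad on purpose: we are cataloguing failures
            failures[("dumps", type(exc).__name__)].append(name)
            continue
        try:
            loads(data)
        except Exception as exc:
            failures[("loads", type(exc).__name__)].append(name)
    return dict(failures)


# Hypothetical stand-ins for real pipeline payloads.
samples = {
    "plain_dict": {"a": 1},
    "lambda": lambda x: x + 1,
    "generator": (i for i in range(3)),
}

report = triage(samples)
for (stage, error), names in sorted(report.items()):
    print(f"{stage} / {error}: {names}")
```

Running the same payload set under the old and the new dill version and diffing the two reports would show whether the failures cluster into a few root causes or are spread across many unrelated ones. Note that this only detects hard failures, not payloads that unpickle into something subtly different.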

Can substantially different pickling behaviors be enabled/disabled by dill settings?

Yes

This would be great. FWIW, on one of the issues there was a suggestion to use recurse=True, which wasn't sufficient (and led to more failures).
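For readers unfamiliar with the settings being discussed: dill exposes a module-level `dill.settings` dict of defaults, and `dill.dumps` accepts per-call overrides such as `recurse`. The sketch below only shows how those knobs are toggled; `scale` and `FACTOR` are made-up names, and, as noted above, `recurse=True` was not sufficient for the Beam failures, so this is not a fix for them.

```python
import dill

# Process-wide defaults used when dumps()/dump() get no explicit override.
print(sorted(dill.settings))

FACTOR = 3


def scale(x):
    # Refers to a module-level global, which dill has to capture somehow.
    return x * FACTOR


# Per-call override: affects only this dumps() call.
data = dill.dumps(scale, recurse=True)

# Process-wide override; restore the old value so other code is unaffected.
previous = dill.settings["recurse"]
dill.settings["recurse"] = True
try:
    data_via_settings = dill.dumps(scale)
finally:
    dill.settings["recurse"] = previous

restored = dill.loads(data)
```

With `recurse=True`, dill tries to pickle only the globals the function actually refers to rather than the whole global namespace, which is why it is often suggested when unrelated objects from the main session end up in a pickle.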

3 participants