-
-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Discuss] Recent breaking changes in new dill releases #589
Comments
I try to ensure full backward compatibility, but sometimes (1) python forces a breakage or (2) I don't catch a subtle omission someone (could be me) makes which causes a breakage. Pickling is a nasty, messy, nasty and messy business. Functions by default try to pickle some version of the global namespace, so that in itself is already a mess. As helper functions in dill that are used to pickle registered objects, the global namespace for those functions is the _dill module... so that has its own complexity. Anyway, you get the idea. In short:
Yes
This is not done now, but it could be. Breaking changes are generally found after the release. Maybe a pre-release would be a good path forward.
Recently, dill added the concept of a shim... which gives infrastructure to load old pickles across breaking changes. So, to go from 0.3.1.1 to the current version, you should be able to add shims to get you across the breaking changes. It might be good to have easy user function access to a shim_registry (like https://github.com/uqfoundation/dill/blob/master/dill/_shims.py |
Adding on top of that, dill is relatively popular. Because of that, people are bound to pickle objects that were not considered before and may have been pickled differently or incorrectly, sometimes causing regressions that are inevitable to fix things that would be considered bugs for most other users. That (but of course less extreme) is how I would classify #572. It was caused by a change in order to fix #482, but because we cannot change the behavior of how individual objects are pickled, it is impossible to create code that solves both issues. An extensible way to customize how dill pickles objects is still needed. |
Thanks for the feedback.
There is no question about it, and I sincerely appreciate your efforts on this front. My goal is to save apache-beam users from this realization as much as possible.
Backward compatibility in the sense of being able to unpickle old pickles on new version is not very relevant for us; the problem is when the same payload that was pickled previously is no longer able to pickle, or unpickles into something that causes unexplained behavior downstream. Are shims still relevant for this problem?
Well... this is the tricky part. Occasional issues are fine, but in my particular scenario, I see a large number of errors that show different failure modes on upon random inspection. The question becomes, should I dive deep to to identify understand and fix these errors, or this task is basically a non-starter because something fundamental has changed and would require weeks of work to migrate.
This would be great. FWIW, on one of the issues there was a suggestion to use |
In the project I work on (apache-beam), dill is used both on client-side and service side. Since pickled bytes created by newer version of dill cannot be unpickled by an older version of dill, we require a very narrow version band in our requirements. Practically speaking, we have been using
dill==0.3.1.1
for a while.Recently we received many requests from users asking us to upgrade to newer versions of dill (example: apache/beam#22893). We also had to update dill version to support Python 3.11.
It appears that updating to the newer version of dill risks breaking many of our customers. I have visibility into a significant user-base who use Beam at my company and I see that changing dill from 0.3.1.1 to 0.3.6 dill breaks many users, with various failure modes. Multiple breaking changes have also been reported on dill's issue tracker. It is understandable that sometimes bugs happen and are fixed in follow up releases, however it seems that there were some large non-trivial changes that changed what is being pickled and how. For example, we noticed that some our tests suites dill invocations started to pickle unpickleable pytest classes from main session (unnecessary for that particular test and a new behavior).
The impact of the breaking changes is not immediately obvious without diving into the details of dill implementation. The release notes (https://github.com/uqfoundation/dill/releases) don't call out breaking changes and/or mitigations for the breaking changes that users can take. It doesn't seem that new behavior may be reverted in a followup releases.
My assumptions that none of the breakages are intentional and the intent behind the changes is to pickle more and pickle better. However, if giving advantages to some group users also breaks another group of users, it would be better to enable new behaviors via opt-in or at least provide an opt-out option.
My questions to dill maintainers:
The text was updated successfully, but these errors were encountered: