Yank recent dask (distributed) releases? #238

Open
fjetter opened this issue Apr 12, 2022 · 18 comments

Comments

@fjetter
Member

fjetter commented Apr 12, 2022

We've observed a bunch of severe stability issues with the most recent versions.

It appears that the January version 2022.1.1 is the most recent stable version. Should we yank the subsequent releases? Is there an equivalent for conda we can do?

@fjetter
Member Author

fjetter commented Apr 12, 2022

Obviously, we also shouldn't release anymore until we're reasonably confident that the new versions are actually stable.

@martindurant
Member

Can you please link any issues about stability, for reference? Is it just distributed we are talking about?

@fjetter
Member Author

fjetter commented Apr 12, 2022

I opened dask/distributed#6110, which is still very vague, but I haven't yet had the time to debug or reduce the problem further. It's again about deadlocks 🎉

> Is it just distributed we are talking about?

Yes

@gjoseph92

@hayesgb

hayesgb commented Apr 12, 2022

I vote yes

@jakirkham
Member

There's at least one fix in flight ( dask/distributed#6112 ). Would suggest holding off here for the moment.

@gjoseph92

Whether we will have a new release that fixes some issues isn't relevant to whether we yank existing releases that are known to be unstable.

@jakirkham
Member

This will impact deployments that have version constraints tied to the yanked versions. RAPIDS would be negatively impacted, for example.

@martindurant
Member

For a PyPI "yank", you can still get a specific version when explicitly specified, but it isn't included in a free install. That's something like putting a label on a conda package. You can also change the conda-forge metadata via PR quite easily, but I don't know how to get the exact same behaviour.
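The yank behaviour described above can be illustrated with a minimal sketch. It is a simplified model of the selection rule suggested by PEP 592, not pip's actual resolver: `Release`, `parse`, and `select` are hypothetical names, only purely numeric dotted versions are handled, and the `yanked=True` flag on 2022.03.0 is an assumption for illustration (nothing has been yanked in this thread).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Release:
    """Hypothetical stand-in for one entry in a package index."""
    version: Tuple[int, ...]  # e.g. (2022, 3, 0)
    yanked: bool = False


def parse(version: str) -> Tuple[int, ...]:
    """Parse a purely numeric dotted version such as '2022.03.0'."""
    return tuple(int(part) for part in version.split("."))


def select(releases: List[Release], pin: Optional[str] = None) -> Optional[Release]:
    """Pick a release roughly the way a yank-aware installer would.

    pin=None models an unconstrained install; pin='X.Y.Z' models '==X.Y.Z'.
    """
    if pin is not None:
        wanted = parse(pin)
        # An exact pin may still resolve to a yanked release.
        for release in releases:
            if release.version == wanted:
                return release
        return None
    # Without a pin, yanked releases are skipped and the newest remaining one wins.
    candidates = [r for r in releases if not r.yanked]
    return max(candidates, key=lambda r: r.version, default=None)


# Assumed index state: 2022.1.1 available, 2022.03.0 hypothetically yanked.
index = [Release(parse("2022.1.1")), Release(parse("2022.03.0"), yanked=True)]

print(select(index).version)                   # (2022, 1, 1)
print(select(index, pin="2022.03.0").version)  # (2022, 3, 0)
```

Under this model, a deployment that pins an exact version keeps resolving it after a yank, while an unpinned install falls back to the newest release that has not been yanked. Range constraints (for example `>=`) are not modelled here.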

@jakirkham
Member

Both have concepts of pulling old releases, though those actions have their own set of tradeoffs.

Given that the issues listed above have in some cases been known about and engaged with for a couple of months, I am confused as to why they are now driving pulling packages but were not considered release blockers at the time.

@mrocklin
Member

An alternative is to see what we can resolve in the next week, and plan to do a release on Friday.

I plan to focus on fixing these issues this week regardless.

@mrocklin
Member

First PR (mostly just doing what Gabe and Florian suggested) is up here: dask/distributed#6112

@gjoseph92

> but were not considered release blockers at the time

Because they weren't known to be issues at the time (inadequate testing). If any one of them had been known to cause a deadlock, it would have been a release blocker. We only recognize this in hindsight.

@quasiben
Member

Yanking (I think) 4 releases, which is what I understand is being proposed here, seems fairly extreme. As John said, RAPIDS just released 22.04, which relies on 2022.03.0, and yanking this out would be very disruptive even though there are known bugs.

@mrocklin
Member

Second PR (around work stealing) dask/distributed#6115

This isn't actually about deadlocks, but it was easy to investigate.

@hammer

hammer commented Apr 21, 2022

Any progress to report on these severe stability issues?

I'm also eager to follow any discussions happening elsewhere about how to limit severe stability issues in Dask releases moving forward. I remain supportive of adding release candidates (#94) to the release process, and for designating a subset of releases as targets for long-term support (#101).

@martindurant
Member

See dask/distributed#6110

@mrocklin
Member

Hi All, weighing in here. I'll include a quick status update, longer-term plans, and some general thoughts here.

Quick status update

The latest release pushed out last Friday identified and resolved two known deadlocks. We've identified a third in the issue that Martin just pointed to and have a fix for it in a PR (although that issue seems a bit more fringe). That will go out next Friday.

I encourage everyone to try the latest release.

Personal plans

I'm shifting my time away from administrative work to spend around 50% of my time on dask.distributed. This has been going on for the last couple of weeks now. I plan to keep this up for the next quarter.

Aside from this topic there is a lot of good work to do. There are many things to clean up. I'm having a good time and I encourage a positive outlook. I apologize to anyone as I deprecate your favorite old feature :)

Thoughts on yanking

I'm open to yanking releases, but it's not my first choice. I would rather we focus on fixing things and getting out a solid release than on retreating. I think that with some focused effort we can resolve these problems in a couple more weeks (recent progress has been good, thanks everyone, let's keep it up).

I don't think that there is a very solid platform to retreat to anyway, and I don't want us to get into that habit, at least not when considering going far back in history.

LTS

I like Dask's rapid and regular release cycle. I think innovation is still more valuable to Dask than API stability, at least for the community development cycle.

<heads-up, about to mention coiled things>

However, I also appreciate the desire for a more steady base, especially in a more enterprise or institutional setting. FWIW Coiled needed something like this for our customers (we need to have a thing to target for support components of contracts) and we published a decently simple meta-package at https://github.com/coiled/coiled-runtime . There's also a nascent effort there to do larger scale integration testing. We've found this metapackage useful and plan to maintain it. It's free to use (on conda-forge) although we don't promise to provide free advice/support around it. That might provide a bit more of a guided experience to folks looking for something that will be stable for a longer period of time.

Release candidates

I'm fine with that if folks would actually opt-in to test and provide feedback. My sense is that most of the larger downstream packages have us in an upstream CI build. This seems to be as-or-more effective than what I've seen of release candidates in the PyData stack, but I'm totally open to this if folks want to start engaging.

Question here: who would actively try things out and report back feedback? If you would, what lead time would you need?


8 participants