RLS: pandas 1.5.2 #49194

Closed
datapythonista opened this issue Oct 20, 2022 · 23 comments

@datapythonista
Member

Aiming to release pandas 1.5.2 at some point in November, unless something urgent needs to be fixed and released sooner.

@datapythonista
Member Author

@pandas-dev/pandas-core can you please add a comment with any issue/PR you want fixed before the 1.5.2 release? If we're not waiting for anything in particular, I'll release at the beginning of next week. If there is something pending that we want to wait for, I'll postpone to the beginning of the week after. Thanks!

@MarcoGorelli
Member

MarcoGorelli commented Nov 11, 2022

Would like to get a FutureWarning for #49497 in (as suggested in the last dev meeting) if possible, I'll raise a PR today

PR: #49640

@MarcoGorelli
Member

There's still some disagreement on the above PR, I'd suggest we wait until 1.5.3 (which I presume will likely happen before 2.0.0?) to get that in

So, not a blocker, no objections on my end to release on Monday

@jreback
Contributor

jreback commented Nov 13, 2022

agree - do not block on the above

@rhshadrach
Member

Would be good to get #49676 in (it's all set I think); definitely not a blocker.

@jbrockmendel
Member

There's a spurious warning produced by DataFrame.update that we should fix or at least suppress

@datapythonista
Member Author

> There's a spurious warning produced by DataFrame.update that we should fix or at least suppress

Is there any work on this, or an issue to track? Planning to start the release shortly.

@datapythonista
Member Author

datapythonista commented Nov 15, 2022

I created the 1.5.3 milestone and rolled all 1.5.2 issues and PRs over to it. I'm waiting for the CI of #49705 to complete and for more info from @jbrockmendel on the mentioned warning; after that I'm ready to start the release of pandas 1.5.2.

@jbrockmendel
Member

> Is there any work on this, or an issue to track?

#49720

@phofl
Member

phofl commented Nov 16, 2022

This is merged now, so we are good to go?

@Dr-Irv
Contributor

Dr-Irv commented Nov 16, 2022

Can we add #49736 ?

@datapythonista
Member Author

> Can we add #49736 ?

Fine by me. I'm planning to release on Monday. I don't want to release at the end of the week or over the weekend, in case the new version breaks anything.

@mroeschke
Member

+1 to release Monday. Whatever doesn't make it can be released in a 1.5.3

@jreback
Contributor

jreback commented Nov 19, 2022

yes let's release - delaying doesn't help anything

tbh weekend or not doesn't make any difference

@datapythonista
Member Author

Starting the release of pandas 1.5.2

@datapythonista
Member Author

The release tests seem to be failing because they consume too much memory. It started happening in the 1.5.x series, where one of the tests failed, but now both the pip and conda tests crash because of the error (on a machine with 16 GB of RAM, with no more than a couple of GB used by other things).

I'm not sure if we're running the high memory tests in the CI, but the CI doesn't seem to be affected. If there are no objections, I'll continue with the release of 1.5.2 as planned in a few hours and ignore the errors for now. I'll check pandas memory usage before 1.5.3, where we can try to get back to the previous usage.

CC: @pandas-dev/pandas-core

@mroeschke
Member

> now both the pip and conda tests crash because of the error

Do those tests run with pytest-xdist?

> I'm not sure if we're running the high memory tests in the CI

IIRC I merged the "high memory" pytest marker into "slow", which we're not running. Generally, CI tests all run with not slow and not network and not single_cpu (but there's a build that runs single_cpu).
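
The marker selection described in this comment can be sketched as a pytest `-m` expression. This is only an illustration: the marker names (`slow`, `network`, `single_cpu`) are the ones quoted above, while the helper `exclude_markers` is hypothetical and not part of pandas or pytest.

```python
def exclude_markers(*markers: str) -> str:
    """Build a pytest ``-m`` expression that deselects every given marker."""
    return " and ".join(f"not {name}" for name in markers)


# Marker names come from the comment above; the helper itself is illustrative.
expression = exclude_markers("slow", "network", "single_cpu")
pytest_args = ["-m", expression]
print(pytest_args)
```

A list like `pytest_args` could then be handed to `pytest.main(...)` or passed as `extra_args` to `pandas.test(...)`. Whether the CI builds its options this way is not stated in the thread.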

@datapythonista
Member Author

We are using pytest-xdist in those tests, yes.

They run with:

python3 -c "import pandas; pandas.test(extra_args=['-m not clipboard', '--skip-slow', '--skip-network', '--skip-db', '-n=2'])"

I'll move forward. Before the next release I'll try to take care of:

  • Check the memory tests: find out whether pandas did increase memory usage significantly and by how much, then try to identify and fix the cause.
  • Move the release tests to the CI. In theory, my understanding is that we just build the wheels/conda package and run the tests against the packages instead of directly from the source code. Not sure if anything else is being tested. But I think all of it can be done in the CI for every commit, rather than just before the release.
  • Maybe automate most of the release in the CI once that is done, which should be pretty straightforward.
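
For the first item, one possible starting point is to compare the peak memory of a suspect operation across pandas versions. The sketch below uses the standard-library `tracemalloc` module. The helper `peak_memory` and the `workload` function are hypothetical stand-ins, not what the release tests use. Note also that `tracemalloc` only sees allocations made through Python's allocators, so it will not match the whole-process usage that crashed the release machine.

```python
import tracemalloc
from typing import Any, Callable, Tuple


def peak_memory(func: Callable[[], Any]) -> Tuple[Any, int]:
    """Run ``func`` and return its result plus the peak traced memory in bytes."""
    tracemalloc.start()
    try:
        result = func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# Hypothetical workload standing in for a pandas operation under investigation.
def workload() -> int:
    buffer = bytearray(10_000_000)
    return len(buffer)


size, peak = peak_memory(workload)
print(f"allocated {size} bytes, peak traced memory {peak} bytes")
```

Running the same workload under two installed pandas versions and comparing the reported peaks would be one way to narrow down where usage grew.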

@datapythonista
Member Author

Seems like for the Python 3.9 ARM builds, we're using NumPy 1.19, which is only compatible with Python 3.7 and 3.8. I'll be updating to NumPy 1.20 here unless there is a better idea.

Also, I see we're not using the latest patch releases for the build. For example, for Python 3.8 we're using NumPy 1.19.1 while 1.19.5 is available. Is there any reason for that, other than that the latest was used at some point and it wasn't upgraded when new releases became available? Would it be worth simply specifying 1.19 so that the latest patch release is used?

CC: @lithomas1

@datapythonista
Member Author

I bumped the NumPy version of the failing build to the minimum NumPy release supporting Python 3.9 (1.19.3), and CI is green now. I just merged the MacPython PR. I also merged the conda-forge PR, so conda-forge packages should be available soon.

It'll be late where I am when the wheels are ready, so I'll upload them to PyPI in the morning and make the announcements then.
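
The kind of mismatch fixed here (a NumPy pin older than the first release supporting the target Python) could be caught with a small check. This is a minimal sketch under stated assumptions: the table `MIN_NUMPY` and both helper functions are hypothetical, only the Python 3.9 entry (1.19.3) is taken from this thread, and the parser handles plain dotted release numbers only.

```python
from typing import Dict, Tuple

Version = Tuple[int, ...]

# Only the 3.9 entry comes from the thread; add other entries after verifying them.
MIN_NUMPY: Dict[Version, Version] = {(3, 9): (1, 19, 3)}


def parse_version(text: str) -> Version:
    """Turn a plain dotted release string such as '1.19.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def pin_is_supported(python: str, numpy_pin: str) -> bool:
    """Return True if ``numpy_pin`` is at least the minimum known for ``python``."""
    minimum = MIN_NUMPY[parse_version(python)]
    return parse_version(numpy_pin) >= minimum


print(pin_is_supported("3.9", "1.19.1"))  # False
print(pin_is_supported("3.9", "1.19.3"))  # True
```

Tuple comparison keeps the sketch dependency-free. A real check would more likely rely on a proper version parser to handle pre-releases and other suffixes.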

@datapythonista
Member Author

Release complete, I'll be making the official announcements shortly.

@simonjayhawkins
Member

> • Move the release tests to the CI. In theory, my understanding is that we just build the wheels/conda package and run the tests with the packages instead of directly from the source code.

I have a workflow that I have been using as a release-readiness test just prior to each release: https://github.com/simonjayhawkins/pandas-release/actions?query=workflow%3A%22Tag+Release%22.

See, for example, #47485 (comment)

> Not sure if there is anything else being tested.

It also checks that the release notes are in sync, to potentially catch missed backports or release notes in the wrong location.

> But I think all can be done in the CI for every commit, and not before the release process.

IIRC we discussed this before and decided to only port across the sdist build to pandas CI.

@datapythonista
Member Author

Thanks @simonjayhawkins this is useful. I moved the procedure and checklist to the docs: https://pandas.pydata.org/docs/dev/development/maintaining.html#release-process I think it matches what you've got (let me know if you see anything missing).

We've got an sdist build in the CI now. Adding the tagging and the testing of pip and conda to our runs probably makes sense. I'd also like to better understand what needs to be done to build the wheels in our repo instead of MacPython. I think it makes more sense to test all the OS/architectures in every PR, instead of finding out that something is wrong during the release process in MacPython. And that would let us automate the whole release, I think. Maybe we can eventually have a call with you and @lithomas1 to discuss?
