Fix pandas 1.5+ FutureWarning in .str.split(..., expand=True) #9704

Merged: 2 commits into dask:main from fix-str-split-warning, Nov 30, 2022

JacobHayes
Contributor

@JacobHayes JacobHayes commented Nov 30, 2022

I confirmed manually that this removed the warning with the repro code in #9703, but didn't add an explicit test for it.

@GPUtester
Collaborator

Can one of the admins verify this patch?

Admins can comment ok to test to allow this one PR to run or add to allowlist to allow all future PRs from the same author to run.

@JacobHayes JacobHayes changed the title Fix pandas 1.5+ FutureWarning in .str.split(..., expand=True) Fix pandas 1.5+ FutureWarning in .str.split(..., expand=True) Nov 30, 2022
@phobson phobson self-assigned this Nov 30, 2022
@ncclementi
Member

add to allowlist

@phobson
Contributor

phobson commented Nov 30, 2022

@JacobHayes this looks good and fixes the issue on my system. Do you mind if I push a commit that adds a test to the PR?

@JacobHayes
Contributor Author

@phobson

> @JacobHayes this looks good and fixes the issue on my system. Do you mind if I push a commit that adds a test to the PR?

That'd be great, thanks!

@phobson
Contributor

phobson commented Nov 30, 2022

@JacobHayes turns out I don't have the credentials to do that.

Right here, can you add:

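# Relies on the test module's existing imports: pytest, pandas as pd,
# dask.dataframe as dd, and assert_eq from dask.dataframe.utils.
@pytest.mark.parametrize("index", [None, [0]], ids=["range_index", "other index"])
def test_str_split_no_warning(index):
    # One row whose value contains the separator, with either the default
    # RangeIndex or an explicit integer index.
    df = pd.DataFrame({"a": ["a\nb"]}, index=index)
    ddf = dd.from_pandas(df, npartitions=1)

    # Run the same split in pandas and in dask.
    pd_a = df["a"].str.split("\n", n=1, expand=True)
    dd_a = ddf["a"].str.split("\n", n=1, expand=True)

    # The dask result must match the pandas result.
    assert_eq(dd_a, pd_a)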
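
For readers less familiar with the call under test, here is a minimal pandas-only sketch of what `.str.split("\n", n=1, expand=True)` returns. The sample string is made up for illustration and dask is not involved, so this does not reproduce the warning itself.

```python
import pandas as pd

# Illustrative data (not from the PR): one string containing two newlines.
s = pd.Series(["a\nb\nc"])

# n=1 limits the split to the first separator; expand=True returns a
# DataFrame with one column per piece instead of a Series of lists.
parts = s.str.split("\n", n=1, expand=True)

# Unpack the single row into its two pieces.
first, rest = parts.iloc[0].tolist()
print(repr(first), repr(rest))
```

Because of `n=1`, the second column keeps the remainder of the string unsplit, which is why the test above only needs a single separator to get a two-column result.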

Co-authored-by: Paul Hobson <pmhobson@gmail.com>
@JacobHayes
Contributor Author

Ah, great - I should have expected filterwarnings=error:::pandas[.*] in the pytest setup! Added the test, thanks for walking me through it. :)
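
As a rough illustration of why that setting lets the test catch the regression, the sketch below applies the same module pattern with the standard library's `warnings` module rather than pytest. It assumes pytest treats the module field as a regular expression, as `warnings.filterwarnings` does. The helper name, message text, and module names are hypothetical, and which module a real pandas warning is attributed to depends on how pandas emits it.

```python
import warnings


def raised_as_error(module_name: str) -> bool:
    """Return True if a FutureWarning attributed to module_name is raised."""
    with warnings.catch_warnings(record=True):
        # Baseline: record every warning instead of raising it.
        warnings.simplefilter("always")
        # Same module pattern as the pytest setting; inserted ahead of the
        # baseline filter, so it takes precedence when the module matches.
        warnings.filterwarnings("error", module=r"pandas[.*]")
        try:
            warnings.warn_explicit(
                "hypothetical deprecation message",
                FutureWarning,
                filename="example.py",
                lineno=1,
                module=module_name,
            )
        except FutureWarning:
            return True
    return False


print(raised_as_error("pandas.core.strings"))
print(raised_as_error("dask.dataframe"))
```

With a filter like this active, a matching warning surfaces as an exception, so a test that merely exercises the code path fails without needing an explicit check for the warning.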

@jrbourbeau jrbourbeau changed the title Fix pandas 1.5+ FutureWarning in .str.split(..., expand=True) Fix pandas 1.5+ FutureWarning in .str.split(..., expand=True) Nov 30, 2022
Member

@jrbourbeau jrbourbeau left a comment

Thanks @JacobHayes, this looks great!

Note that the test failure here is unrelated to the changes in this PR and is being handled in #9701.

Also, I noticed this is your first code contribution to this repository. Welcome!

@jrbourbeau jrbourbeau merged commit 99123bd into dask:main Nov 30, 2022
@JacobHayes JacobHayes deleted the fix-str-split-warning branch November 30, 2022 21:12

Successfully merging this pull request may close these issues.

.str.split(..., expand=True) with Int64Index triggers pandas FutureWarning