BUG: Identity checking NA in map is incorrect #57390

Open
mroeschke opened this issue Feb 13, 2024 · 3 comments · May be fixed by #58392
Labels:
Arrow (pyarrow functionality)
NA - MaskedArrays (Related to pd.NA and nullable extension arrays)
Regression (Functionality that used to work in a prior pandas version)

Comments

@mroeschke
Member

In [2]: pd.Series([pd.NA], dtype="Int64").map(lambda x: 1 if x is pd.NA else 2)
Out[2]: 
0    2
dtype: int64

In pandas 2.1, the same call returned 1:

In [2]: pd.Series([pd.NA], dtype="Int64").map(lambda x: 1 if x is pd.NA  else 2)
Out[2]: 
0    1

This is probably because we call to_numpy before going through map_array, so the function receives np.nan rather than pd.NA.
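The suspected mechanism can be sketched outside of `Series.map`. This is an illustration only, not the actual pandas internals: it converts an `Int64` array to NumPy with explicitly chosen dtypes (the explicit `dtype` and `na_value` arguments are assumptions made here so the behavior does not depend on version-specific defaults) and applies the same identity check to each element. The name `udf` is hypothetical.

```python
import numpy as np
import pandas as pd

arr = pd.array([pd.NA], dtype="Int64")

# Converting to object dtype keeps the pd.NA singleton itself.
as_object = arr.to_numpy(dtype=object, na_value=pd.NA)

# Converting to float64 has to represent the missing value as np.nan.
as_float = arr.to_numpy(dtype="float64", na_value=np.nan)


def udf(x):
    # Same identity check as the lambda in the report (hypothetical name).
    return 1 if x is pd.NA else 2


object_results = [udf(x) for x in as_object]
float_results = [udf(x) for x in as_float]

print(object_results)  # identity check succeeds on the object array
print(float_results)   # identity check fails once NA has become nan
```

If the conversion inside `map` produces a float array, the UDF never sees `pd.NA`, which would be consistent with the result reported above.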

@mroeschke added the Regression, NA - MaskedArrays, and Arrow labels on Feb 13, 2024
@rohanjain101
Contributor

I hit the same issue in 2.2.0. In #56606 (comment), it was mentioned that this was the expected behavior going forward. Is that no longer the case?

@mroeschke
Member Author

mroeschke commented Feb 15, 2024

Ah, thanks @rohanjain101, I didn't realize you opened #56606.

I would say that, in an ideal world, pd.NA still shouldn't get coerced to np.nan when evaluating a UDF (and without going through object dtype).
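Until the coercion is resolved, one possible way to write a UDF that does not depend on whether it receives `pd.NA` or `np.nan` is to test with `pd.isna` instead of an identity check. This is a sketch of a workaround, not something proposed in this thread; `flag_missing` is a hypothetical name.

```python
import pandas as pd


def flag_missing(x):
    # pd.isna is True for both pd.NA and np.nan, so the result does not
    # depend on which missing-value sentinel map passes to the function.
    return 1 if pd.isna(x) else 2


ser = pd.Series([pd.NA, 1], dtype="Int64")
result = ser.map(flag_missing)
print(result.tolist())
```

This sidesteps the identity check rather than fixing it, so code that must distinguish `pd.NA` from a genuine `np.nan` would still be affected.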

rapids-bot bot pushed a commit to rapidsai/cudf that referenced this issue Feb 20, 2024
Due to a change in pandas 2.2 with how NA is handled (incorrectly) in UDFs pandas-dev/pandas#57390

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #15071
@droussea2001
Contributor

take

@droussea2001 droussea2001 linked a pull request Apr 23, 2024 that will close this issue