allow np.uint64 to be used in indexing. Support numpy 1.24.1 #510

Merged
merged 1 commit into from Jan 12, 2023

Collaborator

@Dr-Irv Dr-Irv commented Jan 12, 2023

It turns out that np.uint64 is not a subclass of np.int64, so I changed the indexing type to np.integer, which is a common base class of both np.int64 and np.uint64. The error only occurred with numpy 1.24.1, so I also adjusted the project so that we can test against that version.
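
The class relationship described above can be checked at runtime. This is a minimal sketch using only NumPy's public scalar types; the variable names are illustrative and not part of the PR:

```python
import numpy as np

# np.uint64 and np.int64 are siblings in the scalar hierarchy, not parent and child.
uint_is_int64 = issubclass(np.uint64, np.int64)

# Both derive from np.integer (via np.unsignedinteger and np.signedinteger).
uint_is_integer = issubclass(np.uint64, np.integer)
int_is_integer = issubclass(np.int64, np.integer)

print(uint_is_int64, uint_is_integer, int_is_integer)  # False True True
```

An annotation written in terms of np.int64 therefore rejects unsigned keys, while one written in terms of np.integer accepts both.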

@@ -1705,7 +1706,7 @@ def test_pivot_table() -> None:
        ),
        pd.DataFrame,
    )
    with pytest.warns(np.VisibleDeprecationWarning):
    if Version(np.__version__) <= Version("1.23.5"):
Member

I would be fine removing tests for older versions

Collaborator Author

We can remove this test once the bug is fixed in pandas. There is a PR for that now at pandas-dev/pandas#50682
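
For readers unfamiliar with the version gate in the diff, here is a minimal sketch of how such a comparison behaves. It assumes the `packaging` library is installed; the helper name `expects_deprecation_warning` is hypothetical and does not appear in the PR:

```python
from packaging.version import Version


def expects_deprecation_warning(numpy_version: str) -> bool:
    """Hypothetical helper mirroring the gate in the diff: True up to 1.23.5."""
    return Version(numpy_version) <= Version("1.23.5")


old_release = expects_deprecation_warning("1.23.5")  # inside the gated range
new_release = expects_deprecation_warning("1.24.1")  # outside the gated range

# Version objects compare release components numerically, whereas plain
# strings compare character by character and mis-order "1.9.0" vs "1.23.5".
string_compare = "1.9.0" <= "1.23.5"
version_compare = Version("1.9.0") <= Version("1.23.5")

print(old_release, new_release, string_compare, version_compare)
```

Comparing `Version` objects rather than raw strings is what keeps a gate like this correct once a version component reaches two digits.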

    df = pd.DataFrame(dict(x=[1, 2, 3]), index=np.array([10, 20, 30], dtype="uint64"))

    def get_NDArray(df: pd.DataFrame, key: npt.NDArray[np.uint64]) -> pd.DataFrame:
        df2 = df.loc[key]
Member

Wouldn't any npt.NDArray work (not just integer arrays) as long as the DataFrame index has the same dtype?

Member

We could probably only enforce a tight dtype match if Index were generic.

Collaborator Author

> Wouldn't any npt.NDArray work (not just integer arrays) as long as the DataFrame index has the same dtype?

Probably, but since we can't track the dtype of an Index in a DataFrame, I'm limiting this for now to the issue that was reported. I think most people use arrays of int or arrays of str (which we probably could add), but I'd rather be incremental in adding support for all the possible types.
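
The reported case can be sketched at runtime as follows. The DataFrame is the one from the test in this PR; the function name `select_rows` and the `npt.NDArray[np.integer]` annotation are illustrative assumptions about how a caller might type such a key, not the stub definition itself:

```python
import numpy as np
import numpy.typing as npt
import pandas as pd

# Same frame as in the PR's test: an unsigned 64-bit integer index.
df = pd.DataFrame(dict(x=[1, 2, 3]), index=np.array([10, 20, 30], dtype="uint64"))


def select_rows(df: pd.DataFrame, key: npt.NDArray[np.integer]) -> pd.DataFrame:
    """Hypothetical helper: label-based selection with an integer array key."""
    return df.loc[key]


# np.integer covers signed and unsigned keys, so a uint64 array fits the annotation.
unsigned_key = np.array([10, 30], dtype="uint64")
picked = select_rows(df, unsigned_key)
print(picked)
```

Arrays of other dtypes, such as strings, are deliberately left out of this sketch, in line with the incremental approach described above.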

Collaborator Author

> could probably only enforce a tight dtype match if index was generic

That would work if we knew the dtype of the underlying index, but we don't.
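
To illustrate what "generic" would mean here, the following is a purely hypothetical sketch. `TypedFrame`, `LabelT`, and `select` are invented for illustration and are not pandas or pandas-stubs APIs; the point is only that a type parameter carried by the container lets a type checker tie the key dtype to the index dtype:

```python
from __future__ import annotations

from typing import Generic, TypeVar

import numpy as np
import numpy.typing as npt

# Hypothetical type variable standing in for the index dtype.
LabelT = TypeVar("LabelT", bound=np.generic)


class TypedFrame(Generic[LabelT]):
    """Toy container whose type parameter records the dtype of its index."""

    def __init__(self, index: npt.NDArray[LabelT], values: list[int]) -> None:
        self.index = index
        self.values = values

    def select(self, key: npt.NDArray[LabelT]) -> list[int]:
        # A type checker can require the key dtype to match the index dtype.
        positions = {label: pos for pos, label in enumerate(self.index.tolist())}
        return [self.values[positions[label]] for label in key.tolist()]


frame = TypedFrame(np.array([10, 20, 30], dtype=np.uint64), [1, 2, 3])
selected = frame.select(np.array([30, 10], dtype=np.uint64))
print(selected)  # [3, 1]
```

Because a DataFrame does not carry such a parameter for its index, the stubs cannot express this constraint, which is why the annotation is widened to np.integer instead.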

@twoertwein twoertwein merged commit bfa107b into pandas-dev:main Jan 12, 2023
@twoertwein
Member

Thanks @Dr-Irv!

@Dr-Irv Dr-Irv deleted the issue508 branch February 4, 2023 17:03
twoertwein pushed a commit to twoertwein/pandas-stubs that referenced this pull request Apr 1, 2023
Successfully merging this pull request may close these issues.

Add npt.NDArray[np.uint64] to IndexType?