Always call PyNumber_Index when casting from Python to a C++ integral type, also pre-3.8 #2801

Merged

YannickJadoul
Collaborator

Description

See the remaining decision to be made in #2698: #2698 (comment)

In one sentence, this makes the casting from Python objects to C++ integer types consistent across Python versions, "backporting" Python 3.8 behavior on handling __index__ to pre-3.8 Pythons.

As @bstaletic pointed out to me, we have done this before, most often backporting Python 3 behavior to Python 2 (unicode, for example), but also in #2616, "backporting" 3.8 behavior.

Given the original issue fixed by #2698, and especially the discussion following that ("everything with __index__ is an integer type"), I think this is the logical choice to minimize surprises when developing pybind11 libraries. See also the changed test demonstrating how things are more consistent across versions.

I'm cheekily adding this to 2.6.2, because in my eyes, it's still part of #2698, and I'd like to see a decision made ("no, because ..." is also fine, btw; then we close this PR! But it doesn't make sense to me to delay this simple decision on policy.). This PR mainly makes it easier to judge the changes (after soon merging #2698, at least).

Suggested changelog entry:

When casting to a C++ integer, `__index__` is always called and not considered as conversion, consistent with Python 3.8+.
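
As a rough Python-level illustration of the rule in this changelog entry: `operator.index` is the Python counterpart of `PyNumber_Index` and only accepts objects that define `__index__`. The class and function names below (`OnlyIndex`, `OnlyInt`, `is_integer_like`) are hypothetical, and this is a sketch of the semantics under that assumption, not pybind11 code.

```python
import operator


class OnlyIndex:
    """Hypothetical integer-like type: it defines __index__."""

    def __index__(self):
        return 42


class OnlyInt:
    """Hypothetical type that is merely convertible to int via __int__."""

    def __int__(self):
        return 42


def is_integer_like(obj):
    """Return True if operator.index() accepts obj without any conversion."""
    try:
        operator.index(obj)
    except TypeError:
        return False
    return True


index_value = operator.index(OnlyIndex())  # goes through __index__
int_value = int(OnlyInt())                 # explicit conversion through __int__
```

Under this reading, `OnlyIndex` counts as an integer even in no-convert mode, while `OnlyInt` needs an explicit or implicit conversion step.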

@YannickJadoul
Collaborator Author

YannickJadoul commented Jan 17, 2021

Aaaaand ICC segfaults. Why!?
It's been slightly over a day since we merged that PR...

Valgrind seems perfectly happy, btw.

Resolved in #2801; thanks, @henryiii!

Comment on lines 1046 to 1060
        if (!tmp) {
            PyErr_Clear();
            return false;
        }
        do_decref = true;
        obj = tmp;
    }
#endif
    if (std::is_unsigned<py_type>::value) {
        py_value = as_unsigned<py_type>(obj.ptr());
Collaborator Author

This might seem like a reasonably big change, but after this PR, I want to fix #2786, which involves a minor refactoring of casting to C++ integer types (to ensure future consistency with py::int_::operator int()), so keep that in mind when reviewing, please ;-)

If we think that consistency between Python < 3.8 and >= 3.8 is a nice thing to have, then I personally don't think this is too high an implementation price to pay.

Collaborator Author

(Also, #2786's fix shouldn't be complex either, so if you're able to wait 1 or 2 more days, it can also still be a fix to go into 2.6.2. But we need to draw a line somewhere, ofc.)

@henryiii
Collaborator

There was an issue in setuptools 51.3.0 fixed in 51.3.1, I've restarted the build. pypa/setuptools#2535

@YannickJadoul
Collaborator Author

> There was an issue in setuptools 51.3.0 fixed in 51.3.1, I've restarted the build. pypa/setuptools#2535

Thanks :-) I'd noticed setuptools had released suspiciously recently, but I didn't manage to figure out what was going on anymore, at 2 am.

@henryiii
Collaborator

I'm mildly in favor. We don't want to rush a release out; after I get the PRs that were blocked by the GitHub API issue & CMake merged, I'd like @rwgk to run a global test before we pull the trigger on a release. So there's a bit more time.

@YannickJadoul
Collaborator Author

In principle, this is ready. Surely things can still be cleaned up, but let's first make a decision on what we want?

#2802 (fixing #2786) is a bit more of a mess, though. I had hoped I could still get it in, together with this, but ... well, I don't know, right now.

@rwgk
Collaborator

rwgk commented Jan 19, 2021

OOPS, this is #2801; I mistakenly thought it was #2698. I just deleted my previous comment. Sorry!

@rwgk
Collaborator

rwgk commented Jan 19, 2021

> I'd like @rwgk to run a global test before we pull the trigger on a release

Will do. (I'm pretty neutral on this PR yes/no for v2.6.2.)

include/pybind11/cast.h (outdated, resolved)
        PyErr_Clear();
        return false;
    }
    do_decref = true;
Collaborator

index_owner = reinterpret_steal<object>(tmp);

That way you don't need the second #if PY_VERSION_HEX < 0x03080000 below and this code becomes exception safe.

I'd also use idx (or similar) instead of tmp, to be more descriptive.

Collaborator Author

I thought of/tried that, but didn't want to incur refcounting overhead on Python >= 3.8, and this is also what CPython does.

But wait, maybe you mean something else, that doesn't need this! I'll give this a shot :-)

Collaborator Author

Yesss, that does work out beautifully! Thanks!

Collaborator Author

Still seeing if this could easily be refactored out.

Collaborator Author

OK, it's quite hard to refactor into a separate private function without incurring an additional inc_ref/dec_ref, it seems. It's already cleaner than before, though, so is it fine to leave like this for now?

@rwgk
Collaborator

rwgk commented Jan 20, 2021

Thanks @YannickJadoul, that looks great! I'll run this through our big testing system asap (probably tonight).

@YannickJadoul
Collaborator Author

> Thanks @YannickJadoul, that looks great! I'll run this through our big testing system asap (probably tonight).

Good, thanks! I'm pretty confident this PR does as it says (given our own tests and how they caught some corner cases), but it's good to have an idea how this "backported behavior" interacts in a larger context.

Collaborator

@bstaletic bstaletic left a comment

This looks fine to me.

@rwgk
Collaborator

rwgk commented Jan 22, 2021

I'm seeing 2 test failures that look related to this PR.
We're still on Python 3.6, so the #if PY_VERSION_HEX < 0x03080000 branch kicks in.
Debugging.

@rwgk
Collaborator

rwgk commented Jan 22, 2021

In both failing tests a NumPy array with one float was passed for an int arg. I mailed fixes, boiling down to int(arr[0]), with comment:

A NumPy array was passed instead of an integer. The implicit conversion is disabled by pybind11 PR #2801, which backports a behavior change introduced by Python 3.8, for consistency.

@YannickJadoul, is that description accurate?
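
The call-site fix mentioned above, `int(arr[0])`, can be sketched without NumPy. In this minimal sketch a plain list stands in for the one-element float array, and `takes_int` is a hypothetical stand-in for a bound function with a strict integer parameter; neither is taken from the affected code.

```python
import operator


def takes_int(n):
    """Hypothetical strict integer parameter: accepts only __index__ types."""
    return operator.index(n)


arr = [3.0]  # stand-in for a one-element float array

try:
    takes_int(arr[0])
    accepted_implicitly = True
except TypeError:
    # A float does not define __index__, so it is not accepted as-is.
    accepted_implicitly = False

# Converting explicitly at the call site makes the intent visible.
result = takes_int(int(arr[0]))
```

An explicit `int(...)` works regardless of how permissive the callee's implicit conversion turns out to be.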

@henryiii
Collaborator

I think the current (in this PR) behavior is the correct one; a NumPy float should not be automatically converted to an int. If they want to support floats, there should be a float function/method/constructor that does the conversion. NumPy no longer allows floats in indexing, either.

@YannickJadoul
Collaborator Author

Huh, this is weird though. I thought this PR would only make things more permissive! Let me try out a few things.

@rwgk
Collaborator

rwgk commented Jan 22, 2021

I agree with @henryiii and I'm OK if you want to merge for 2.6.2, although strictly speaking it's bending the rules to introduce this behavior change in a minor release, now that we know there are things that will break.

@henryiii
Collaborator

Are you sure that current master without this PR doesn't also trigger that? Assuming it's the other PR that caused this?

@henryiii
Collaborator

It's breaking something that shouldn't have worked, though, so it's a bit of a grey area.

@rwgk
Collaborator

rwgk commented Jan 22, 2021 via email

@YannickJadoul
Collaborator Author

$ python3.6
Python 3.6.9 (default, Oct  8 2020, 12:12:24) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> x = np.array(42)
>>> int(x)
42
>>> list(range(100))[x]
42
>>> y = np.array(3.14159)
>>> int(y)
3
>>> list(range(100))[y]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: only integer scalar arrays can be converted to a scalar index

Yeah, so int(some_float_array) works (also in 3.8), so I would think that we still expect to convert this on noconvert(false), right?
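
The expectation stated here can be modelled in pure Python: try `__index__` first, and only fall back to `int()` when conversion is allowed. This is a toy sketch, not the caster in cast.h; `FloatScalarArray` and `load_int` are hypothetical names, and the class merely imitates the behavior of `np.array(3.14159)` shown in the session above.

```python
import operator


class FloatScalarArray:
    """Hypothetical stand-in for np.array(3.14159): int() works, indexing fails."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return int(self.value)

    def __index__(self):
        raise TypeError("only integer scalar arrays can be converted to a scalar index")


def load_int(obj, convert):
    """Toy model: __index__ first, then int() only if convert is True."""
    if isinstance(obj, float):
        # Plain Python floats are never accepted for an integer argument.
        raise TypeError("float is not accepted as an integer")
    try:
        return operator.index(obj)
    except TypeError:
        if not convert:
            raise
    # Conversion is allowed, so fall back to the __int__ path.
    return int(obj)


y = FloatScalarArray(3.14159)
converted = load_int(y, convert=True)
```

With `convert=False` the same call raises `TypeError`, which matches the idea that only the no-convert path should reject such an object.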

@rwgk
Collaborator

rwgk commented Jan 22, 2021

Thanks @YannickJadoul, I'll run this PR through our big old testing mill again, later tonight.

FYI: In the meantime one team already merged my fix for them. (I'm still waiting for feedback on the second fix.)

@YannickJadoul
Collaborator Author

> Thanks @YannickJadoul, I'll run this PR through our big old testing mill again, later tonight.
>
> FYI: In the meantime one team already merged my fix for them. (I'm still waiting for feedback on the second fix.)

Oh, OK. I don't think they're even to blame, actually, since things did just work on 3.8. Thanks for finding this!

I still need to figure out how to get 2.7 to at least pass tests. (Maybe not worth putting too much effort into 2.7.)

@rwgk
Collaborator

rwgk commented Jan 22, 2021

The failures were actually useful. I don't think the code was working as intended by the original authors, only scraping by.
But I agree with your approach to revising this PR, to fully match Python 3.8 behavior.
Python 2.7 is on its last legs as far as I can see, not worth a significant effort.

@YannickJadoul
Collaborator Author

Ah, good to know it was at least worth the effort, then!
For the record, I believe that some of these kinds of conversions start producing deprecation warnings from 3.8 onwards (I had to silence them). Python just has a really weird approach to deprecation warnings and tends to hide them (though it's already better in more recent versions, I believe).

@YannickJadoul
Collaborator Author

YannickJadoul commented Jan 22, 2021

OK, got it. Python 2's long vs. int. Sigh

@YannickJadoul
Collaborator Author

I'll get back to this/fix this tomorrow, btw.
I'm still confused, since it seems that both PyLong_AsLong as well as PyNumber_Long call __int__? The docs are slightly confusing, so I'll check out CPython's source, but I need some sleep and a fresh perspective on this.

Collaborator Author

@YannickJadoul YannickJadoul left a comment

OK, this patches things up, adding some more duct tape here and there.

Seems like refactoring/restructuring the float/int type caster is overdue, as well, but I propose to not cram that into this PR anymore.

Also, yes, this could use another test round, @rwgk. Who knows what I missed, this time.
Next to that test round, it's also worth checking: is this series of tests the behavior we want?

@@ -286,14 +300,20 @@ def cant_convert(v):
assert noconvert(7) == 7
cant_convert(3.14159)
assert convert(DeepThought()) == 42
require_implicit(DeepThought())
requires_conversion(DeepThought())
assert convert(DoubleThought()) == 0 # Fishy; `int(DoubleThought)` == 42
Collaborator Author

This is not great, but kind of a consequence of us saying "everything with __index__ is already an int, so don't try converting".
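
The "fishy" case can be reproduced in plain Python. The sketch below assumes a class shaped like the DoubleThought/IntAndIndex test class discussed in this thread, with the two hooks deliberately returning different values; it only shows what Python itself does, not what the pybind11 caster does internally.

```python
import operator


class IntAndIndex:
    """Defines both hooks, with deliberately different results."""

    def __int__(self):
        return 42

    def __index__(self):
        return 0


obj = IntAndIndex()
via_index = operator.index(obj)  # what an "__index__ first" rule observes
via_int = int(obj)               # Python's int() prefers __int__
```

So a caster that treats anything with `__index__` as already being an integer returns 0 here, while `int(obj)` gives 42.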

    py_value = as_unsigned<py_type>(src_or_index.ptr());
} else { // signed integer:
    py_value = sizeof(T) <= sizeof(long)
        ? (py_type) PyLong_AsLong(src_or_index.ptr())
Collaborator Author

Oh, another note: this results in warnings, and I think that's correct. Because it's not just getting out the long from the PyLong object, but it's also trying to do the conversion.

Yes, the C API is quite messy here. And it's only made worse by the structure of this caster.

@rwgk
Collaborator

rwgk commented Jan 24, 2021

FYI: I tried running this through our global testing last night but something went wrong, the tests only ran with a previous version of this PR. I'll try again.

@rwgk
Collaborator

rwgk commented Jan 24, 2021

To keep track of an observation, below is the error I saw in the first global testing run. With the current version of this PR the same test passes.

For completeness: the dreamplace code used here is ~1 year behind github.
The non-matching args were: tensor(3.), tensor(3.)

  File "dreamplace/ops/electric_potential/electric_overflow.py", line 240, in forward
    self.num_threads
TypeError: fixed_density_map(): incompatible function arguments. The following argument types are supported:
    1. (arg0: at::Tensor, arg1: at::Tensor, arg2: at::Tensor, arg3: at::Tensor, arg4: at::Tensor, arg5: float, arg6: float, arg7: float, arg8: float, arg9: float, arg10: float, arg11: int, arg12: int, arg13: int, arg14: int, arg15: int, arg16: int, arg17: int) -> at::Tensor

Invoked with: tensor([ 124.0602,    0.0000,  499.0000,    0.0000,  315.0000,   85.0000,
         124.0602,  100.0000,  499.0000,  101.0000,   65.0000,  105.0000]), tensor([   1.8796,    0.0000,    0.0000,    0.0000,  120.0000,   80.0000]), tensor([   1.8796,    0.0000,    0.0000,    0.0000,  120.0000,   40.0000]), tensor([  31.2500,   93.7500,  156.2500,  218.7500,  281.2500,  343.7500,
         406.2500,  468.7500]), tensor([  31.2500,   93.7500,  156.2500,  218.7500,  281.2500,  343.7500,
         406.2500,  468.7500]), 0.0, 0.0, 500.0, 500.0, tensor(62.5000), tensor(62.5000), 1, 5, 8, 8, tensor(3.), tensor(3.), 8

@YannickJadoul
Collaborator Author

> To keep track of an observation, below is the error I saw in the first global testing run. With the current version of this PR the same test passes.

This should now also be covered by our own tests. I added the TypeErrorThought (yes, this naming joke got out of hand quite quickly) in caa5382, which demonstrated the observed failure. And of course, the current version of the PR passes this new test.

Collaborator

@rwgk rwgk left a comment

This PR passed the Google global testing now.
Thanks @YannickJadoul!

@@ -256,6 +256,13 @@ class DeepThought(object):
        def __int__(self):
            return 42

    class DoubleThought(object):
Collaborator

Double meaning it has index and int?

Collaborator Author

Yes, that's what I meant :-) As admitted yesterday:

> yes, this naming joke got out of hand quite quickly

Should I still pick better names?

Collaborator

Yes. IntIndexThought, at least. :)

Collaborator Author

Done so. This should make things more clear? More boring as well, but ...

7fd2db5 only changes our own tests, so if these pass and look good/better to you, and you're happy with the way DoubleThought/IntAndIndex is handled, then we can merge this, @henryiii

Collaborator Author

> Yes. IntIndexThought, at least. :)

I liked the mix of HHGG and 1984, but yes ;-)

@henryiii
Collaborator

LGTM!

@YannickJadoul YannickJadoul merged commit 0bb8ca2 into pybind:master Jan 25, 2021
@YannickJadoul
Collaborator Author

Thanks, all! Another tiny bit of progress :-)

@YannickJadoul YannickJadoul deleted the noconvert-int-index-pre-3.8 branch January 25, 2021 20:05
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Jan 25, 2021
@henryiii henryiii removed the needs changelog Possibly needs a changelog entry label Jan 25, 2021