Backwards incompatible ideas for a major release
Matti Picus edited this page Nov 23, 2022 · 26 revisions
This is a collection of ideas for changes that would break backwards compatibility and be inappropriate for anything but a major release. If/when we do make a major release, we can then go through this list and see what can be taken along.
That does not mean that all of these ideas necessarily strictly require a major release.
It would be nice to add your name to a change, which means you will do the work to implement it.
- Make the default integer type int64, at least on Python 3, no matter what `long` is on the system (from @seberg). (Making the default integer equal to `intp` may be the simpler option that is at least predictable/easier to reason about.)
- Overhaul casting rules to avoid things like `uint64 + int64 -> float64`. Perhaps use "C-like" casting instead. See https://github.com/numpy/numpy/issues/12525.
- Casting rules for arithmetic are value-dependent for scalars (https://github.com/numpy/numpy/issues/6240).
- `result_type(int, str)` should be `object`, not `str`. (seberg: Or shouldn't it simply raise an error, i.e. never implicitly go to object? In which case, is simple deprecation viable?)
- Require explicitly writing `dtype=object` to get an object dtype array.
- Find a solution to issues created by `PyArray_Return`: most numpy functions, importantly ufuncs, convert 0-D array results to scalars when returning. This could be a breaking change that always returns arrays, or a more complex solution. Possible steps forward that do not require breakage (immediately) are discussed in https://github.com/numpy/numpy/issues/13105.
- Make the ufunc `out` argument force a higher precision loop (maybe possible without a major version increase?). https://mail.python.org/pipermail/numpy-discussion/2019-September/080106.html
- Some APIs (probably only loadtxt/genfromtxt) default to "byte string" behaviour for backwards compatibility. It would be nice to remove that and switch the default.
- Get rid of `long double`/`float96`/`float128` completely. This is a very cumbersome alias on macOS, Windows and Linux-aarch64. And on Linux-x86 it's 80-bit. The time spent on long doubles is not worth it.
- `np.logical_or.reduce()`, etc. (but more specifically, and maybe less controversially, just `any`/`all`) should probably return booleans by default. I.e. by default, do not try to imitate Python's logical `or` operator.
- Clean up the namespaces, exposed underscore-prefixed functions, and aliases.
- Extend the ndarray struct in order to speed up and clean up buffer handling. We did this already.
- Implement the `bf_releasebuffer` on ndarray. This was never done, because it breaks compatibility with the `"s"`, etc. parsing codes for the `PyArg_Parse*` API. However, maybe this break is more acceptable now and easier with a major release. Further, scalars have their own code paths now, so the amount of affected code may be smaller.
- Delete the sigint header and related functions (technically an ABI and API break, but a loud one, and probably nobody will notice).
- Removing `NPY_CHAR` (see https://github.com/numpy/numpy/issues/2801 and linked PRs/issues)
- Dtype cleanup ideas (see https://github.com/numpy/numpy/issues/2899)
- Make the `PyArray_Descr` and `PyUfunc_Object` structs opaque like we did with `PyArray_Object`, extracting `PyArray_Descr_Fields` etc. This allows us to make API changes more easily later.
- Modify `NPY_SORTKIND` to allow different sorting algorithms (timsort, radixsort). This requires a change in the size of `PyArray_ArrFuncs`. See https://github.com/numpy/numpy/pull/12586.
- Increase NPY_MAXARGS to more than 32, see https://github.com/numpy/numpy/issues/4398.
- Implement radixsort once the sort ABI can be changed, see https://github.com/numpy/numpy/pull/12586.
- Remove promise to handle `NULL` as `Py_None` for object arrays (we do not use this, and it crashes hard, so could probably do it without a major release as well). The probably necessary exception is uninitialized or cleared data. To simplify buffer initialization and clearing, it is easier to NULL it initially (and NULL it again upon clearing). That means writing to an object buffer/array must use `Py_XSETREF`. However, reading from an array/buffer would be allowed to assume `NULL` is not possible. A buffer has to be cleared (to avoid double clearing) using `Py_XSETREF(ptr, NULL)`.
- The `elsize` slot of `PyArray_Descr` should be `npy_intp` or `ssize_t` and not `int`.
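
Several of the Python-level items in the list above can be observed directly. The following is a minimal sketch, assuming a reasonably recent NumPy is installed. The variable names are illustrative only, and it shows behaviour that exists today, not any proposed replacement.

```python
import numpy as np

# Casting rules: uint64 and int64 have no common integer type,
# so promotion falls back to float64.
mixed = np.result_type(np.uint64, np.int64)
print("result_type(uint64, int64) ->", mixed)

# Implicit object arrays: no dtype=object was written here,
# yet the result is an object array.
implicit_obj = np.array([1, None])
print("np.array([1, None]).dtype ->", implicit_obj.dtype)

# PyArray_Return: a ufunc called on 0-D arrays hands back a scalar,
# not a 0-D ndarray.
zero_d_sum = np.add(np.array(1), np.array(2))
print("type of 0-D ufunc result ->", type(zero_d_sum).__name__)

# The `out` argument does not select the loop: float32 inputs are added
# in float32, and the already-rounded result is then cast into `out`.
a = np.array([1e8], dtype=np.float32)
b = np.array([1.0], dtype=np.float32)

out_default = np.empty(1, dtype=np.float64)
np.add(a, b, out=out_default)

# Requesting the float64 loop explicitly keeps the extra precision.
out_forced = np.empty(1, dtype=np.float64)
np.add(a, b, out=out_forced, dtype=np.float64)

print("out only:        ", out_default[0])
print("out + dtype=f64: ", out_forced[0])

# long double: the storage size depends on the platform.
print("longdouble itemsize ->", np.dtype(np.longdouble).itemsize)
```

The `out` comparison relies on float32 being unable to represent 100000001 (adjacent float32 values near 1e8 are 8 apart), so the two outputs differ by exactly the precision lost in the float32 loop.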