Refactor assertions in `test_creation.py` #32

honno · 2021-10-22T18:27:39Z

In #30 I'm experimenting with the ph.assert_dtype() assertion util. As well as those promoted dtype checks, I realise a lot of dtype assertions relate to the default int and float, so I wanted to experiment with refactoring those too... I didn't get round to that today, seeing as how tricky just understanding test_arange was (I think it covers more now at least).


honno · 2021-10-27T17:58:31Z

I unexpectedly spent a lot of time extending and modifying the tests, namely those for arange and linspace. I believe they now cover more and run much faster, as I've managed to avoid filtering out or discarding Hypothesis examples. I'm especially interested in whether you have thoughts on the specified_kwargs() strategy I wrote in the spirit of #22. A rough array equals is something I could get round to as well.

In terms of refactoring assertions, I've actually found the process useful in making sure to test all the appropriate things in each test method. It also saves time in creating user-friendly error messages, especially if you want to fix/modify the message. I was thinking of moving these into, say, pytest_helpers.py and using them in test_elementwise.py, as I'm certain I missed things in those methods, and generally I'll use them going forward.

Tomorrow I'll have to fix a bug I just caught (see comment), but otherwise I'm happy for a review.
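To illustrate the kind of assertion helper discussed above, here is a minimal sketch. The name `check_dtype` and its signature are hypothetical and are not the actual `ph.assert_dtype()` API; the message format simply mirrors the `"..., but should be ... [func()]"` style used elsewhere in this PR.

```python
def check_dtype(func_name, out_dtype, expected, *, repr_name="out.dtype"):
    """Assert a dtype matches, with a message naming the function under test.

    Hypothetical helper for illustration only.
    """
    msg = f"{repr_name}={out_dtype}, but should be {expected} [{func_name}()]"
    assert out_dtype == expected, msg


# Plain strings stand in for real dtype objects so the sketch runs anywhere.
check_dtype("linspace", "float64", "float64")  # passes silently

try:
    check_dtype("arange", "int32", "int64")
except AssertionError as e:
    message = str(e)
```

Centralising the message in one helper means a wording fix only has to be made once, which is the time saving described above.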

asmeurer · 2021-10-27T20:32:55Z

Is the point of specified_kwargs that you can't use kwargs because the tests use data?

> A rough array equals is something I could get round to as well.

I expect this to be pretty thorny. The spec doesn't require any level of precision for output values, and different libraries may return different results if they use different implementations. There have been some ideas (e.g., #7), but none of it is simple. I'd personally rather see at least basic coverage of all the functions before we start to worry about testing floating-point correctness. Right now there are still too many functions that aren't tested at all, or are only tested to a bare minimum in test_signatures.

asmeurer · 2021-10-27T20:57:21Z

Also, you probably figured this out already, but we don't need to generalize the argument handling of arange too much. arange's signature is quite unique in the specification. It's the only function with a signature as complicated as it is.


honno · 2021-10-28T09:24:36Z

> Is the point of specified_kwargs that you can't use kwargs because the tests use data?
>
> Also, you probably figured this out already, but we don't need to generalize the argument handling of arange too much. arange's signature is quite unique in the specification. It's the only function with a signature as complicated as it is.

Yep, although specified_kwargs() was helpful for test_linspace too. Fairly simple implementation, and I imagine it might come up again when test method signatures cannot realistically mimic the corresponding method signatures.

honno · 2021-10-28T17:41:43Z

Also meshgrid sometimes kills pytest runs, like the test suite has now. Found this tricky to debug and thought it was just an error in my implementation (like when I was extending test_arange), but it's come up again so clearly not. Can explore NumPy's issue tracker tomorrow, otherwise try and identify the issue.

asmeurer · 2021-10-28T21:29:06Z

Looking at the code for test_meshgrid, my guess is that you are not limiting the number of input arrays, so the result is huge. This causes NumPy to try to allocate an array that doesn't fit in memory and it crashes Python. Note that the size of each result array from meshgrid is the product of the input sizes. You should filter any input where the resulting output would be larger than MAX_ARRAY_SIZE. It's also reasonable to not send more than some smallish number of total inputs. This is another example where the input strategy is complicated, so we want to just test the promotion in the normal test. If anything, I'm surprised this doesn't crash every time.

Also meshgrid should only really be supporting 1-d inputs. That looks like a bug in the numpy implementation.
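The size check suggested above can be sketched as follows. This is a rough illustration, not the test suite's code: the value given to `MAX_ARRAY_SIZE` and the `max_arrays` cap are assumptions, since the thread does not state them at this point.

```python
import math

# Hypothetical limit; the suite's real hh.MAX_ARRAY_SIZE value is not shown here.
MAX_ARRAY_SIZE = 10_000


def meshgrid_output_size(input_sizes):
    """Number of elements in each array returned by meshgrid for 1-D inputs."""
    return math.prod(input_sizes)


def is_safe(input_sizes, max_arrays=5):
    """True if there are few enough inputs and each output is small enough."""
    return (
        len(input_sizes) <= max_arrays
        and meshgrid_output_size(input_sizes) <= MAX_ARRAY_SIZE
    )
```

For example, three inputs of 100 elements each already produce output arrays of 1,000,000 elements, which is why unconstrained inputs can exhaust memory.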

asmeurer · 2021-10-28T21:38:50Z

array_api_tests/test_creation_functions.py

+            ("dtype", dtype, None),
+        ),
+        label="kw",
+    )


Maybe I'm missing a reason why this wouldn't work, but can all the drawing stuff in this test be factored into a @composite test strategy? The logic here is complicated enough that it is probably a good idea to have some meta tests for it, to make sure it doesn't accidentally omit things it shouldn't.

honno · 2021-10-29

I've now moved specified_kwargs() to hypothesis_helpers.py and written a test method for it. I also introduced a named tuple which I think helps to clarify what the arguments mean.
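The idea behind specified_kwargs() can be sketched without Hypothesis. This is an illustration under assumptions, not the code in hypothesis_helpers.py: the `KVD` named tuple and the `include_default` callable are hypothetical names, and the callable stands in for drawing a boolean from a Hypothesis strategy so that the sketch runs with the standard library alone.

```python
from typing import Any, Callable, NamedTuple


class KVD(NamedTuple):
    """Hypothetical (keyword, value, default) triple, e.g. ("dtype", dtype, None)."""
    keyword: str
    value: Any
    default: Any


def specified_kwargs(include_default: Callable[[], bool], *kvds: KVD) -> dict:
    """Build a kwargs dict, sometimes omitting arguments left at their default."""
    kw = {}
    for keyword, value, default in kvds:
        # A non-default value must always be passed. A default value is passed
        # only some of the time, so both call styles get exercised.
        # The identity check suits sentinel defaults such as None and True.
        if value is not default or include_default():
            kw[keyword] = value
    return kw


args = (KVD("dtype", None, None), KVD("endpoint", False, True))
kw_always = specified_kwargs(lambda: True, *args)
kw_never = specified_kwargs(lambda: False, *args)
```

With `include_default` always true, every keyword is passed explicitly; with it always false, only the non-default `endpoint=False` survives. A test method for the real strategy could check the same two properties.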

asmeurer · 2021-10-28T21:40:31Z

array_api_tests/test_creation_functions.py

-            assert list(a) == list(r), "arange() produced incorrect values"
+        assert out.dtype == dtype
+    assert out.ndim == 1, f"{out.ndim=}, but should be 1 [linspace()]"
+    f_func = f"[linspace({start=}, {stop=}, {step=})]"


Suggested change

-    f_func = f"[linspace({start=}, {stop=}, {step=})]"
+    f_func = f"[arange({start=}, {stop=}, {step=})]"

This will be confusing (or wrong, really) if only a single argument is passed, because in arange(n) the n fills the "start" parameter (the first argument) but is actually the stop. In fact, that's why start is positional-only, so that arange(start=n) is impossible. In general I would avoid writing x=val in error messages if x is a positional-only argument, because that makes the message not copy-pastable and potentially confusing if a library doesn't use the same positional-only name as the spec.
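One way to follow this advice is to render the call exactly as it was made, so positional arguments appear without names. A minimal sketch, where `fmt_call` is a hypothetical helper and not part of the test suite:

```python
def fmt_call(func_name, *args, **kwargs):
    """Render a call the way a user could paste it back into a REPL."""
    # Positional arguments are shown bare; only real keywords get a name.
    parts = [repr(a) for a in args]
    parts += [f"{k}={v!r}" for k, v in kwargs.items()]
    return f"{func_name}({', '.join(parts)})"


single = fmt_call("arange", 5)
mixed = fmt_call("arange", 0, 10, step=2)
```

Here `single` is `arange(5)` rather than `arange(start=5)`, so the message neither misleads about which parameter received the value nor depends on the library's positional-only parameter names.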


asmeurer · 2021-10-28T21:53:49Z

array_api_tests/test_creation_functions.py

+        size <= hh.MAX_ARRAY_SIZE
+    ), f"{size=} should be no more than {hh.MAX_ARRAY_SIZE}"  # sanity check
+
+    kw = data.draw(


I think we should not test stop and step as keyword-arguments. arange is a little special. We really want it to be a function with three signatures

arange(stop, /, dtype=None)
arange(start, stop, /, dtype=None)
arange(start, stop, step, /, dtype=None)

(cf. help(range)), but there's no way to represent that as a single signature in Python. The closest we can do is to make two of the arguments optional by making them keyword arguments. The complication here is, first, that if only one argument is passed, it is actually the stop (even though it is still called start), and second, that things like arange(n, step=k) and arange(n, stop=None) are meaningless. IMO we should make this clearer in the spec, but really arange only ought to support start, stop, and step as positional, i.e., as if it were the above 3 signatures. This was discussed some in data-apis/array-api#107 and data-apis/array-api#85.
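The three-signature behaviour described above can be emulated with a small dispatcher on the number of positional arguments. This is only a sketch of the idea, with `normalize_arange_args` a hypothetical name; the defaults of 0 for start and 1 for step follow the built-in range.

```python
def normalize_arange_args(*args):
    """Map 1, 2, or 3 positional arguments to (start, stop, step), as range does."""
    if len(args) == 1:
        # A lone argument is the stop, even though it fills the first slot.
        (stop,) = args
        return 0, stop, 1
    if len(args) == 2:
        start, stop = args
        return start, stop, 1
    if len(args) == 3:
        return args
    raise TypeError(f"expected 1 to 3 positional arguments, got {len(args)}")


r = range(5)
same_as_range = (r.start, r.stop, r.step) == normalize_arange_args(5)
```

The single-argument case is the one that makes a keyword-style message like `start=5` misleading, since the value is semantically the stop.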

honno · 2021-10-29

I see what you mean. So now I test the following argument combinations:

(start,) (only if stop is None)

(start, stop)

(start, stop, step)

Note that step must be passed as a keyword argument when the argument combination isn't (start, stop, step), and test_arange does pass it that way (except, sometimes, when step == 1).


honno · 2021-10-29T14:51:09Z

> Looking at the code for test_meshgrid, my guess is that you are not limiting the number of input arrays, so the result is huge. This causes NumPy to try to allocate an array that doesn't fit in memory and it crashes Python. Note that the size of each result array from meshgrid is the product of the input sizes. You should filter any input where the resulting output would be larger than MAX_ARRAY_SIZE. It's also reasonable to not send more than some smallish number of total inputs. This is another example where the input strategy is complicated, so we want to just test the promotion in the normal test. If anything, I'm surprised this doesn't crash every time.
>
> Also meshgrid should only really be supporting 1-d inputs. That looks like a bug in the numpy implementation.

Fixed every issue you identified, thanks! I'm not familiar enough with meshgrid() and memory things generally to dynamically limit the inputs, so I've just hardcoded them in, i.e. no more than 5 arrays and array sizes cannot exceed 5. Not ideal for a "fully-fledged" meshgrid() test, but probably covers enough with regard to testing type promotion.
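As a quick check of what the hardcoded limits imply, the worst case can be worked out directly. The constant names below are illustrative; only the two values of 5 come from the comment above.

```python
MAX_ARRAYS = 5  # at most 5 input arrays
MAX_SIDE = 5    # each 1-D input has at most 5 elements

# Each output array has one axis per input, so its size is the product of
# the input sizes; the worst case is every input at the maximum length.
worst_case_per_array = MAX_SIDE ** MAX_ARRAYS

# meshgrid returns one output array per input.
worst_case_total = worst_case_per_array * MAX_ARRAYS
```

That gives at most 3125 elements per output array and 15625 elements across all outputs, which is small enough to avoid the memory errors seen earlier.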

asmeurer · 2021-10-29T21:05:29Z

I've merged this. We might still need to do some work on the arange test, but for now I think it's OK. The meshgrid change is fine. We can expand it when we convert it into the actual meshgrid test.

asmeurer reviewed Oct 22, 2021


honno force-pushed the creation-refactor branch 2 times, most recently from 3992a6e to 1b30535 Compare October 27, 2021 11:08

honno marked this pull request as ready for review October 27, 2021 17:33

honno added 19 commits October 28, 2021 10:21

Rudimentary re-implementation of test_arange

4e536d6

Greater variation of step argument in test_arange

ea08680

Draw stop directly in test_arange

235118d

Rough namespacing of *_helpers

bf74cc3

I will be refactoring assertion messages anyway

Assert array elements with integer dtype in test_arange

a768c74

Refactor default and kw dtype assertions

450378e

Re-implemented test_eye

157dc99

Refactor shape assertions with assert_shape

dda7a95

Rudimentary re-implementation of test_linspace

a51010e

Refactor fill assertions

c031908

Style changes

7b4900a

Use dtype bounds in test_arange

d36c8c4

Fix tolerance issues with test_arange

23fb8e3

Nicer assertion messages in test_eye

a80b55b

Use pos-only args for assert helpers with kwargs

8e581b0

specified_kwargs() strategy to test default kwargs

ebb6c36

frange class to mimic range behaviour for floats

fa50249

Improve int_stops() scope and performance

f6ea2c1

Clearer size/shape assertions

f9b679f

honno added 2 commits October 28, 2021 10:21

More forgiving size assertion in test_arange

3a90690

More flexible size assertions for int arrays in test_arange

a4a0a35

honno force-pushed the creation-refactor branch from abafee1 to a4a0a35 Compare October 28, 2021 09:21

honno requested a review from asmeurer October 28, 2021 10:31

honno added 3 commits October 28, 2021 13:13

Move creation assert helpers to pytest_helpers

a3e5a01

Remove old custom strategies for test_elementwise.py

e446a18

Add shape assertions to more test cases

ec081d7

asmeurer reviewed Oct 28, 2021


honno and others added 4 commits October 29, 2021 12:01

Fix func name in error messages

460034a

Co-authored-by: Aaron Meurer <asmeurer@gmail.com>

Specify hh.specified_kwargs() arguments as named tuple, test it

a5f294e

Avoid keywords for pos-only args in error messages

f7fc94b

Test more valid argument signatures in test_arange

aaf0a7d

honno force-pushed the creation-refactor branch from 794ef30 to aaf0a7d Compare October 29, 2021 13:41

Only generate 1D arrays in test_meshgrid, prevent memory errors

ca2ef81

honno force-pushed the creation-refactor branch from 4574441 to ca2ef81 Compare October 29, 2021 14:46

honno requested a review from asmeurer October 29, 2021 14:51

honno mentioned this pull request Oct 29, 2021

Operator tests #35

Merged

asmeurer merged commit 035e3f3 into data-apis:master Oct 29, 2021

honno deleted the creation-refactor branch February 8, 2022 10:05
