Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strategies for complex numbers #1076

Closed
ghost opened this issue Jan 16, 2018 · 24 comments
Closed

Strategies for complex numbers #1076

ghost opened this issue Jan 16, 2018 · 24 comments
Labels
enhancement it's not broken, but we want it to be better

Comments

@ghost
Copy link

ghost commented Jan 16, 2018

I've just started using Hypothesis to check that some polynomial classes from my MVPoly package satisfy the ring axioms. All very smooth, great library. But I'm a bit puzzled that the complex-numbers strategy is so weak (in terms of customisation) when compared to the float strategy. Perhaps I'm missing something.

Of course, one can define one's own strategy, as done here. I wonder if there would be any interest in a tiny PR which ports this into Hypothesis proper?

@DRMacIver
Copy link
Member

I'm a bit puzzled that the complex-numbers strategy is so weak (in terms of customisation) when compared to the float strategy. Perhaps I'm missing something.

Nope, not missing anything, it's probably just got neglected because none of us were actively using it!

We'd be delighted to accept a PR. We might have to bikeshed a little bit about the API bit though. e.g. I don't think it makes sense to have a min_value and max_value parameter for complex numbers (maybe min and max magnitude instead?), which just passing the args through to floats would add.

@ghost
Copy link
Author

ghost commented Jan 16, 2018

Bikeshedding's good, and I was thinking the same thing (in terms of min/max magnitude) since the ring axioms for polynomials are mostly inherited from the coefficients, and for floating point numbers they only apply (even approximately) away from over/underflow.

I'll raise the PR in a couple of days.

@Zac-HD
Copy link
Member

Zac-HD commented Jan 17, 2018

I think we have some actual design work to do, not bikeshedding! My main question is what we're doing beyond

def complex_numbers(real=floats(), imag=floats()):
    return builds(complex, real, imag)

I mean, that's strictly speaking an improvement from the zero-argument form we have now - but we could trivially suggest that in the docs for when independent real and imaginary parts are desired. So:

  • it seems obvious to me that we want to be able to generate a complex float with non-independent parts, for example by bounding the magnitude
  • In the limit, how do we generate complex numbers with a single allowable magnitude (eg 1.0)? Filtering won't do here! How do we handle very narrow magnitude bounds? Single-ended bounds?
  • What about real=None, imag=None, min_mag=None, max_mag=None, allow_infinity=None, allow_nan=None) as an argspec? It does everything, but I think with too many interacting arguments - so what will we give up?

My proposal: implement complex_numbers(min_mag=None, max_mag=None, allow_infinity=None, allow_nan=None) and suggest the above form with real, imag in the docstring.

@DRMacIver
Copy link
Member

In the limit, how do we generate complex numbers with a single allowable magnitude (eg 1.0)? Filtering won't do here! How do we handle very narrow magnitude bounds? Single-ended bounds?

I think it might make sense to generate complex numbers by drawing a magnitude and an angle rather than drawing the two coordinates independently and filtering. This makes narrow magnitudes work essentially automatically.

Your proposal is essentially what I was thinking as the natural interface, yes. I don't think having the version that is trivially replaceable with builds makes much sense.

@ghost
Copy link
Author

ghost commented Jan 17, 2018

I've been thinking about magnitude constraints. It seems natural mathematically to draw magnitude and argument from user-specified float ranges then use Euler to convert those values to real-imaginary, but

  • the Euler step is not-trivial (in terms of floating point operations), might we introduce artificial problems arising from that?
  • The argument value is "rotationally symmetric", there is no good reason why we should have lots of samples (and the threat of underflow) near zero, nor sparse sample near pi. Zero is not a privileged angle. So use an integer type for angle? (say from INT_MIN to INT_MAX then scaled to [-pi, pi] for the sin/cos calls).
  • complex underflow/overflow occurs at the corresponding component values (Re z small/large or Im z small/large), so if we want to use constraints to keep away from those regions, we'll need to allow the user to constrain real and imaginary parts, regardless of whether constraining magnitudes is implemented.

@ghost
Copy link
Author

ghost commented Jan 17, 2018

How about this. Draw the real & imaginary parts from floats(), but with the real & imaginary constraints accounting for the magnitude constraints as much as is possible; geometrically the disc (magnitude constraint) is covered by a square (weak magnitude constraint) and that intersected by the rectangle of the real & imaginary constraints. Samples from this domain may not actually satisfy the requested constraints (e.g., the corners of the square) but we can project them onto the circle by a simple scaling. The scale is mag_max/magnitude and magnitude is hypot() in the C standard library. Lots of details to work out on this ...

Thoughts?

@Zac-HD Zac-HD added the enhancement it's not broken, but we want it to be better label Jan 24, 2018
@Zac-HD
Copy link
Member

Zac-HD commented Jan 24, 2018

Sounds at least reasonable to me - I think at this point it would be good to discuss a prototype, so we can look at the shrinking properties in more detail. Are you interested in opening a pull with this more complex (😜) proposal? I'd be happy to help out but don't have the time to write it myself at the moment.

@ghost
Copy link
Author

ghost commented Jan 24, 2018

Sure, I'll have a crack at it. I suggest that in the first instance only the mag_max is implemented since this gives a single rectangle whose corners need to be projected; adding mag_min makes a rectangle with a (origin centred) square removed, leading to a non-convex (possibly L-shaped) region; but that can be covered exactly by at most 4 rectangles which can be handled in the same way.

@ghost
Copy link
Author

ghost commented Feb 6, 2018

I've run into some issues on starting this, I did try IRC but got no responses, so perhaps I can ask in this, more asynchronous channel.

  • Most importantly, I can't for the life of me work out how to run a single test; anything from the Makefile seems to want to download and compile pythons and packages and then run a full 30 minute suite, any pointers?
  • In the tests, I see several which have a call to the library but make no assertion on the result (here for example), is this a python variant on rspec's skip? an oversight?
  • The tests on complex numbers are assertion free, and I'm baffled by both of the lines -- why would minimal(complex_numbers(), lambda x: x.imag != 0) == 0j when the proc explicitly excludes zero, and the shrinking moves float values to zero, right? so minimal(complex_numbers(), lambda x: x.real != 0) == 1 ?? (I'd expect 0 + float_min*j and float_min + 0j respectively)

@DRMacIver
Copy link
Member

DRMacIver commented Feb 6, 2018

I did try IRC but got no responses,

Sorry, IRC that time of day is a bit of a mixed bag! I'm likely to be on it on UK work hours and sometimes at other times. Others are present off and on.

Most importantly, I can't for the life of me work out how to run a single test; anything from the Makefile seems to want to download and compile pythons and packages and then run a full 30 minute suite, any pointers?

#1100 is still a work in progress, but has some pointers. Short version is that if you've got the dependencies installed and src on the path then you can use pytest to run tests.

In the tests, I see several which have a call to the library but make no assertion on the result (here for example), is this a python variant on rspec's skip? an oversight?

Some tests that don't have assertions are probably just asserting that they can find the thing at all, as the functions will raise an error if they don't (which should be find_any rather than minimal, but that's a more recent addition)

But...

The tests on complex numbers are assertion free, and I'm baffled by both of the lines -- why would minimal(complex_numbers(), lambda x: x.imag != 0) == 0j when the proc explicitly excludes zero, and the shrinking moves float values to zero, right? so minimal(complex_numbers(), lambda x: x.real != 0) == 1 ?? (I'd expect 0 + float_min*j and float_min + 0j respectively)

Uh, you know, I have absolutely no idea what that test is meant to be doing. It looks wrong on several fronts! Certainly if you added an assert to it like I'm pretty sure was intended it should fail.

One thing to note is that floating point numbers don't minimize towards 0. Roughly speaking they minimize towards "fractionally simpler" - first by their integer part closer to zero, then by the lowering the number of places after the . in their binary expansion. So that should be 1j and 1 respectively.

@ghost
Copy link
Author

ghost commented Feb 6, 2018

In reverse order

  • Ah that make sense
  • so rspec's expect { whatever }.to_not raise_error, understood
  • still no joy: with
$ echo $PYTHONPATH
/home/jjg/jjg/src/apps/hypothesis-python/src

and stock Ubuntu python

$ python --version
Python 2.7.12

and pytest installed from pip

$ python -m pytest tests/cover/test_simple_numbers.py
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/pytest.py", line 74, in <module>
    raise SystemExit(pytest.main())
  File "/usr/local/lib/python2.7/dist-packages/_pytest/config.py", line 50, in main
    config = _prepareconfig(args, plugins)
  File "/usr/local/lib/python2.7/dist-packages/_pytest/config.py", line 161, in _prepareconfig
    pluginmanager=pluginmanager, args=args)
  File "/usr/local/lib/python2.7/dist-packages/pluggy/__init__.py", line 617, in __call__
    return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pluggy/__init__.py", line 222, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pluggy/__init__.py", line 216, in <lambda>
    firstresult=hook.spec_opts.get('firstresult'),
  File "/usr/local/lib/python2.7/dist-packages/pluggy/callers.py", line 196, in _multicall
    gen.send(outcome)
  File "/usr/local/lib/python2.7/dist-packages/_pytest/helpconfig.py", line 68, in pytest_cmdline_parse
    config = outcome.get_result()
  File "/usr/local/lib/python2.7/dist-packages/pluggy/callers.py", line 77, in get_result
    _reraise(*ex)  # noqa
  File "/usr/local/lib/python2.7/dist-packages/pluggy/callers.py", line 180, in _multicall
    res = hook_impl.function(*args)
  File "/usr/local/lib/python2.7/dist-packages/_pytest/config.py", line 944, in pytest_cmdline_parse
    self.parse(args)
  File "/usr/local/lib/python2.7/dist-packages/_pytest/config.py", line 1109, in parse
    self._preparse(args, addopts=addopts)
  File "/usr/local/lib/python2.7/dist-packages/_pytest/config.py", line 1071, in _preparse
    self.pluginmanager.consider_preparse(args)
  File "/usr/local/lib/python2.7/dist-packages/_pytest/config.py", line 390, in consider_preparse
    self.consider_pluginarg(opt2)
  File "/usr/local/lib/python2.7/dist-packages/_pytest/config.py", line 399, in consider_pluginarg
    self.import_plugin(arg)
  File "/usr/local/lib/python2.7/dist-packages/_pytest/config.py", line 430, in import_plugin
    __import__(importspec)
  File "/usr/local/lib/python2.7/dist-packages/_pytest/assertion/rewrite.py", line 97, in find_module
    if not self._should_rewrite(name, fn_pypath, state):
  File "/usr/local/lib/python2.7/dist-packages/_pytest/assertion/rewrite.py", line 165, in _should_rewrite
    if fn_pypath.fnmatch(pat):
  File "/usr/lib/python2.7/dist-packages/py/_path/common.py", line 209, in fnmatch
    return FNMatcher(pattern)(self)
  File "/usr/lib/python2.7/dist-packages/py/_path/common.py", line 402, in __call__
    return py.std.fnmatch.fnmatch(name, pattern)
  File "/usr/lib/python2.7/dist-packages/py/_apipkg.py", line 125, in __makeattr
    result = importobj(modpath, attrname)
  File "/usr/lib/python2.7/dist-packages/py/_apipkg.py", line 48, in importobj
    module = __import__(modpath, None, None, ['__doc__'])
  File "/usr/local/lib/python2.7/dist-packages/_pytest/assertion/rewrite.py", line 97, in find_module
    if not self._should_rewrite(name, fn_pypath, state):
  File "/usr/local/lib/python2.7/dist-packages/_pytest/assertion/rewrite.py", line 165, in _should_rewrite
    if fn_pypath.fnmatch(pat):
  File "/usr/lib/python2.7/dist-packages/py/_path/common.py", line 209, in fnmatch
    return FNMatcher(pattern)(self)
  File "/usr/lib/python2.7/dist-packages/py/_path/common.py", line 402, in __call__
  :
 RuntimeError: maximum recursion depth exceeded while calling a Python object

Hmm, possibly a missing package? but not clear from the backtrace which. I have attrs 17.4 and coverage 4.5.

Trying the stock pytest:

$ sudo apt-get install python-pytest
  :

no -n option

$ python -m pytest tests/cover/test_simple_numbers.py
usage: pytest.py [options] [file_or_dir] [file_or_dir] [...]
pytest.py: error: unrecognized arguments: -n
  inifile: /home/jjg/jjg/src/apps/hypothesis-python/tox.ini
  rootdir: /home/jjg/jjg/src/apps/hypothesis-python

possibly too old? Seems not since setup.py requests 2.8.0 or later and

$ apt-cache show python-pytest | grep Version
Version: 2.8.7-4

@Zac-HD
Copy link
Member

Zac-HD commented Feb 6, 2018

no -n option

Ah, you're missing pytest-xdist. Try pip install -r requirements/test.in to get the test dependencies (or .txt for pinned versions).

@ghost
Copy link
Author

ghost commented Feb 6, 2018

Thanks @Zac-HD !

Installing via pip requirements/test.in (or test.txt) still has the strange recursion but the stock version works:

$ sudo apt-get install python-pytest python-pytest-xdist
  :
$ python -m pytest tests/cover/test_simple_numbers.py
============= test session starts ===========================
platform linux2 -- Python 2.7.12, pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /home/jjg/jjg/src/apps/hypothesis-python, inifile: tox.ini
plugins: xdist-1.8, hypothesis-3.44.24, flaky-3.4.0, forked-0.2
gw0 [188] / gw1 [188]
scheduling tests via LoadScheduling
 :
======== 188 passed, 2 pytest-warnings in 16.33 seconds ========

mr-burns-excellent-1

Will now set to coding -- thanks again 😄

@ghost
Copy link
Author

ghost commented Feb 7, 2018

More design bike-shedding requested. So, suggested above was that the real and imaginary parts be specified along the lines of

def complex_numbers(real=floats(), imag=floats(), ...):
    yada_yada(...)
    return builds(complex, real, imag)

where ... is for later. I've pondered this a bit and find myself increasingly against it

  • floats() is in a sense "atomic", one can specify a min/max nan/no-nan, inf/no-inf, but that's as far as it goes; to create more complex strategies of floats make unions with one_of, filter on particular conditions and so-on. By allowing a strategy as argument for complex_numbers, we break symmetry, it is a "composite" strategy, contrary the fact that, mathematically, complex numbers are simpler than real.
  • If we do have a strategy as argument, must it be a floats() strategy? Why not composite strategy if it provides all of the methods that the complex_numbers() strategy needs; that the Python way after all. But to implement the modulus constraint in the way that I describe above, I would need to extract the min/max values from the strategies; could be done for pure float() strategies (right?) but for composites? I doubt it. What if the user sets real=integers(), or strings()? It seems unkind to give the caller so many way to shoot him/herself in the foot. So restrict to float() strategy arguments by inspection? very un-Pythonic!

But the alternative (an "atomic" complex_numbers() with only non-strategy arguments) is not terribly attractive either, we'd need all of the arguments for floats() for each of the real and imaginary parts
(and for consistency they'd have longish names, imag_max_value etc) , possibly have unprefixed versions of all which apply to both real and imaginary, and we've not even started on the modulus constraint argument(s), so 12+ keywords, not nice ☹️.

Still, the latter seems less bad to me than the former. Any thoughts (especially alternatives) on this?

@Zac-HD
Copy link
Member

Zac-HD commented Feb 8, 2018

Here's my suggestion:

def complex_numbers(min_magnitude=0, max_magnitude=None,
                    allow_infinity=None, allow_nan=None):
    """Returns a strategy that generates complex numbers.

    This strategy views a complex number as a vector in the complex plane
    with some magnitude (length) and argument (angle).  If you would prefer
    to define complex numbers in terms of real and imaginary parts, you can
    do so with::

        my_complex_numbers = builds(complex, floats(), floats())

    ``min_magnitude`` and ``max_magnitude`` must be positive real numbers,
    or None for the upper limit. 
    TODO: describe inf and nan arguments here once behaviour is set

    Examples from this strategy shrink by shrinking their magnitude as for
    `~hypothesis.strategies.floats`, and simplifying the ratio between the
    real and imaginary parts.

    """
    # Check that both magnitude bounds are positive
    # Check that both bounds are instances of numbers.Real  (see issue #814)
    # Check that low bound is lower than high bound
    # Check that not max_magnitude is not None and allow_infinity and isfinite(max_magnitude)
    # If allow_nan is None: allow_nan = max_magnitude is None
    
    # Draw magnitude from floats(min_magnitude, max_magnitude, 
    #                            allow_infinity=allow_infinity, allow_nan=False)
    # Draw argument.  I have a cute scheme for this, but in practise you're
    # probably better off drawing from floats(0, 1).map(lambda f: f * math.pi)
    # (note: multiply by pi instead of drawing from range for better shrinking)
    
    # Construct and return complex number.  Yay trig!
    pass

Basically, we just document how to implement a strategy in terms of real and imag parts, because it's trivial. We provide the above, defined in terms of magnitude, and just leave more specific requirements to the user - after all, until now there have been no options whatsoever, and you're the first person ever to raise an issue 😄

(oh, and if we use this the whole ComplexStrategy class can be deleted!)

@ghost
Copy link
Author

ghost commented Feb 8, 2018

Ah, so avoid the bulk of keywords by not supporting them, I like it 😄. The mag/arg intermediate representation I want to avoid, since it cannot represent a wide range of values (for large magnitudes, the granularity of floating point theta is amplified). The minimal-bounding/maximal-enclosed squares projected to the annulus in the native representation approach outlined above is easier without the min/max real/imag constraints obviously.

@ghost
Copy link
Author

ghost commented Feb 12, 2018

A quickie, almost certainly me being dozy. With the complex_numbers strategy modified to have exactly the signature as above, and new test following that for floats:

def test_complex_numbers_magnitude_range_nan():
    with pytest.raises(InvalidArgument):
        complex_numbers(min_magnitude=float(u'nan')).example()

I get TypeError: complex_numbers() takes no arguments (1 given), the classic beginner's error on missing self on an instance method. Almost as if the signature is defined elsewhere (or cached somewhere), very strange. Any pointers?

@DRMacIver
Copy link
Member

Any pointers?

Hard to say without seeing the specific code, but is it possible it's an installed version confusion? e.g. are you accidentally using an installed version of Hypothesis (where complex_numbers indeed takes no arguments) rather than your local checkout?

@ghost
Copy link
Author

ghost commented Feb 12, 2018

Oh of course it is, I had manually added the hypothesis src dir to my PYTHONPATH on last session, and meant to add it to my .bashrc for persistence, but forgot; apologies for the noise.

@Zac-HD
Copy link
Member

Zac-HD commented Feb 12, 2018

Oh of course it is, I had manually added the hypothesis src dir to my PYTHONPATH on last session, and meant to add it to my .bashrc for persistence, but forgot; apologies for the noise.

I'd suggest pip install -e . instead, but whatever works 😄

@ghost
Copy link
Author

ghost commented Mar 7, 2018

Slowly working through this, and thought that I sort-of understood the ideas. But I've hit a behaviour that I can't seem to grok. In my implementation, I project "out of magnitude" values onto the annulus radially, so the draws from the real and imaginary parts will always result in a complex value satisfying the magnitude constraints.

For this test of minimal, the (magnitude-) unconstrained version (uncommented) correctly finds the simple value 1+0j, after trying the sequence

(0.0, 0.0)
(1.7976931348623157e+308, 2.00001)
(1.7976931348623157e+308, 2.00001)
(0.0, 2.00001)
(1.7976931348623157e+308, 2.00001)
(1.7976931348623157e+308, 0.0)
(1.0, 0.0)
(0.0, 1.0)
(-0.99999, inf)
(-1.7976931348623157e+308, -1.192092896e-07)
(1.1263679760248008e+16, nan)
(0.0, nan)
(1.1263679760248008e+16, nan)
(1.1263679760248008e+16, 0.0)
(nan, -0.99999)
(-inf, 0.0)
(-3.895339568724221e+16, 1.0927360291110735e+189)
(-1.1, 2.220446049250313e-16)
(inf, nan)
(0.0, nan)
(inf, nan)
(inf, 0.0)
(1.0, 0.0)

but for the constrained version (commented), for which 1+0j is a valid value, the found result is not simple at all, 0.493+0j, although it does satisfy the constraints. What puzzles me is the draw sequence is just

(0.0, 0.0)
(0.0, 0.0)
(0.0, 0.0)

(those each being projected to (0.493, 0) for the output). It seems that the projection is causing the minimisation to "not try that hard" to simplify the result, but I don't really understand why ... any pointers?

@Zac-HD
Copy link
Member

Zac-HD commented Mar 8, 2018

Simple once you know: Hypothesis minimises at the level of an underlying bytestream, not the output values. floats() is designed to make these minimisations equivalent (for a certain ordering), but the Conjecture engine knows nothing about your projection.

I think we therefore need a different technique to improve shrinking: using @st.composite to support dependencies between the parts in interactive draws. We'd then simply delete the ComplexStrategy class.

  • Draw the complex part from floats(-max_mag, max_mag, ...). This ensures that we will shrink towards a complex part of 0j according to the normal shrink ordering for floats().
  • Draw the real part from floats with scaled bounds, and draw a bool to decide the sign (False=positive). This will shrink according to the normal floats ordering within the constraints imposed by the annulus for the specific value of the complex part.
  • Optional enhancement: draw a boolean, and if True transpose the real and complex parts. This would explore exactly the same space, but might be more likely to try diverse examples or stumble across other things while shrinking.

Also: I'd suggest rebasing onto the current master, and opening a pull request - the review tools are pretty handy. On a related note, when testing that some combination of arguments does or does not raise InvalidArgument, you can just add them to the list in tests/cover/test_direct_strategies.py (I don't know why it's called that either!).

@ghost
Copy link
Author

ghost commented Mar 8, 2018

OK, I think I get why the "not try that hard" behaviour happens: since the simplicity of the input drives the shrinking and the input is (0, 0) which is as simple as you get, there's no point in shrinking further. I can see that we will need a proper complex shrinking strategy, but I'm not sure I understand the suggested approach, the second point is a typo surely (invalid arguments) ?

In any case, I've moved the InvalidArgument tests into the test_direct_strategies file as suggested and opened a PR #1154, shall we continue this discussion there?

@Zac-HD
Copy link
Member

Zac-HD commented Apr 4, 2018

Closed by #1154.

@Zac-HD Zac-HD closed this as completed Apr 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement it's not broken, but we want it to be better
Projects
None yet
Development

No branches or pull requests

2 participants