
MAINT: Refactor optimization methods to use scipy.stats.qmc #13469

Merged
26 commits, merged Apr 19, 2021

Conversation

@tupui (Member) commented Jan 31, 2021

This removes legacy QMC code in favour of scipy.stats.qmc functions introduced in #10844.

A LHS implementation is present in scipy/optimize/_differentialevolution.py, and Sobol' is implemented in pure Python and used in scipy/optimize/_shgo.py.

Also, the shgo documentation currently uses an incorrect number of samples for Sobol', as it is not a power of 2.
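As a minimal sketch of what the refactor moves to, the scipy.stats.qmc engines can generate the samples the legacy code produced by hand; shown here for Sobol', whose balance properties require a power-of-2 sample size:

```python
from scipy.stats import qmc

# Unscrambled Sobol' points are only balanced when n is a power of 2,
# hence the dedicated random_base2(m) method (n = 2**m).
sobol = qmc.Sobol(d=2, scramble=False)
sample = sobol.random_base2(m=6)  # 64 points in [0, 1)^2
print(sample.shape)  # (64, 2)
```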

@andyfaff (Contributor) commented:

@tupui, what benefits are there to changing the LHS initialisation in differential evolution from what was there to the QMC implementation? As is, the code should be distributing the initial guess according to a Latin hypercube. The QMC version won't make diffev work any faster, because initialisation is a minuscule part of the total work. Moreover, we will also lose a degree of reproducibility against previous versions of SciPy, because the initialisation will be different for the same seed.

@tupui (Member Author) commented Feb 1, 2021

@tupui, what benefits are there to changing the LHS initialisation in differential evolution from what was there to the QMC implementation? As is, the code should be distributing the initial guess according to a Latin hypercube. The QMC version won't make diffev work any faster, because initialisation is a minuscule part of the total work. Moreover, we will also lose a degree of reproducibility against previous versions of SciPy, because the initialisation will be different for the same seed.

The underlying implementation is roughly the same, but this change would allow us to easily switch to any QMC method for initialisation in the future. Or maybe I should already do that here, i.e. add other options for the init parameter? Going in that direction, I would also suggest defaulting to Sobol', which has better properties than LHS (which can be really bad).

@andyfaff (Contributor) commented Feb 1, 2021

@rgommers, @ev-br, do we have a SciPy policy on strict backward compatibility for functions that use random numbers? Changing the implementation of the latinhypercube initialisation in this PR means that a differential_evolution minimisation run with different SciPy versions will have different outputs for the same seed. There is no bug in the existing implementation that would have justified the changes in this PR.

@andyfaff (Contributor) commented Feb 1, 2021

@tupui, if Sobol' provides a well-distributed set of initial population guesses, then it might be nice to have a sobol option for init. But I wouldn't make it the default option to start with, for the same reasons I mentioned in my previous comment.
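For illustration, here is how an opt-in init option could look from the user side, assuming a 'sobol' value is added to differential_evolution as discussed (the exact keyword spelling is an assumption in this sketch):

```python
from scipy.optimize import differential_evolution

def rosenbrock(x):
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

bounds = [(-5, 5), (-5, 5)]
# 'sobol' as an opt-in initialisation; 'latinhypercube' stays the default.
result = differential_evolution(rosenbrock, bounds, init='sobol', seed=1)
print(result.x)  # should be close to [1, 1]
```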

@ev-br (Member) commented Feb 2, 2021

Over at scipy.stats, we do change random streams once in a while without any backward-compatibility notices.
I have no informed opinion on how severe this change would be in that regard.

@rgommers (Member) commented Feb 2, 2021

I think we should document what the policy is. I'd be in favor of adopting the same policy as the new numpy.random one (for details, see NEP 19): we try hard not to change results for a given seed, but there is no guarantee of exact reproducibility between different SciPy versions; we reserve the right to make changes to fix bugs, and for significant improvements in performance or in the quality of random variates.

@tupui (Member Author) commented Feb 2, 2021

With the last commit I propose something toward adding other methods for initialization. If all this is OK, I will add tests and doc. Otherwise I can of course revert it. Just let me know what you think is best 😃

@andyfaff (Contributor) commented Feb 3, 2021

@tupui, after reading through the link that Ralf sent, I think the best thing to do here is to revert the change to the differential_evolution init='latinhypercube' keyword. The existing behaviour does not have bugs, the changes in this PR probably don't bring a significant benefit, and they degrade reproducibility by giving different output for the same seed. That is, the code path for init='latinhypercube' should give the same output before and after this PR.
However, 'sobol' and 'halton' initialisation options would be welcome. These should both be documented in the docstrings for differential_evolution and DifferentialEvolutionSolver. Note that the total population size is normally calculated as popsize * len(x); this conflicts with the 2**m behaviour for 'sobol'. Some thought should go into choosing the correct behaviour here. Should it be the closest power of 2? Note that the population should have at least 5 members; m = 2 is too small.

@tupui (Member Author) commented Feb 3, 2021

@andyfaff ok, I reverted the original LHS code and added some doc about Halton and Sobol'. To deal with the n = 2**m requirement, I round the population size up to the closest power of 2 (using ceil, and taking the minimum of 5 members into account).
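The rounding described above can be sketched as follows; round_up_pow2 is a hypothetical helper mirroring the idea, not the actual scipy code:

```python
import math

def round_up_pow2(n, n_min=5):
    # Hypothetical helper: round the requested population size up to the
    # next power of 2, enforcing a minimum of n_min members first.
    n = max(n, n_min)
    return 2 ** math.ceil(math.log2(n))

print(round_up_pow2(15))  # 16
print(round_up_pow2(3))   # minimum of 5 applies, then rounds up to 8
print(round_up_pow2(64))  # 64, already a power of 2
```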

As for leaving LHS as the default for compatibility, that makes sense. But what about deprecating it, or using a warning to advise the user to move to another method?

@tupui (Member Author) commented Feb 4, 2021

@Stefan-Endres what do you think about these changes? I tried not to touch anything other than the number of samples for Sobol', which should be a power of 2.

Actually, I have to look into this more, because when iterating we no longer respect the 2**m condition.

@tupui tupui added maintenance Items related to regular maintenance tasks scipy.optimize labels Feb 6, 2021
@tupui (Member Author) commented Feb 10, 2021

  • I am still fighting with the number of samples. I noticed something odd in iterate_delaunay: is it normal that self.nc += self.n came before the sampling? To me it should come after; otherwise, on the first iteration we sample with more points than requested. @Stefan-Endres, maybe you can help here?

  • With the last commits, I updated what I could in terms of the number of samples for Sobol'. At a new iteration, instead of using self.nc it checks the internal state of the Sobol' generator to ask for the next 2**m points (I still need to change self.n so that self.n and self.nc are correct, but that sounds hackish). It does not make much sense to use Sobol' and sample anything other than 2**m points. If we must do this, then I would suggest we use Halton instead.

  • Another possibility would be to use the scrambled version of Sobol' (which in any case we should: I am just keeping the base sequence here in case there is any theoretical reason I am not aware of) so that we don't add points exponentially.

Please advise me on what you think I should do here @andyfaff @rgommers
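To make the trade-off concrete, here is a sketch (assuming scrambling is enabled) of drawing successive fixed-size batches from a scrambled Sobol' engine, which is what allows a constant increase in the number of points:

```python
from scipy.stats import qmc

# A scrambled Sobol' engine keeps its internal state, so successive
# fixed-size batches continue the same randomized low-discrepancy stream.
engine = qmc.Sobol(d=2, scramble=True, seed=12345)
batch1 = engine.random_base2(m=4)  # first 2**4 = 16 points
batch2 = engine.random_base2(m=4)  # next 16 points (total 32, still 2**m)
print(batch1.shape, batch2.shape)
```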

Resolved review thread (outdated): scipy/optimize/_shgo.py
@@ -12,10 +12,11 @@
from scipy.optimize._constraints import (Bounds, new_bounds_to_old,
NonlinearConstraint, LinearConstraint)
from scipy.sparse import issparse

from scipy.stats import qmc
Review comment (Member):

This makes optimize depend on stats, which wasn't the case before. stats did already depend on optimize though. Apparently it doesn't create an import cycle, but it is slightly unfortunate. In other such situations, we've had to move the code that both submodules depend on to scipy/_lib/.

Something to look at a little more closely before merging.

@rgommers (Member) commented:

With these kinds of changes, there is always the question of whether there is a noticeable difference in performance (accuracy being harder to assess than speed). We have accuracy benchmarks in benchmarks/benchmarks/optimize.py for global optimizers, but unfortunately shgo isn't included there.

It's probably fine, since the change is not to the default method and is relatively straightforward. But I'm not quite sure; @andyfaff, what do you think?

@tupui (Member Author) commented Feb 12, 2021

With these kinds of changes, there is always the question of whether there is a noticeable difference in performance (accuracy being harder to assess than speed). We have accuracy benchmarks in benchmarks/benchmarks/optimize.py for global optimizers, but unfortunately shgo isn't included there.

It's probably fine, since the change is not to the default method and is relatively straightforward. But I'm not quite sure; @andyfaff, what do you think?

I can run some benchmarks on my side to check the speed.

What I am really concerned about is the way sobol is used. Currently it is used as a normal random generator. If we want minimal changes, using Halton instead would actually be a good option.

For Sobol', I am currently adding an exponentially growing number of samples. This is not wanted, I guess, so I will go ahead and use the scrambled version. This way we can add a constant number of points at each step, with just the 2**m constraint. Let me know @Stefan-Endres if this goes against the theory.

Using the scrambled version and enforcing 2**m will induce some changes on the user side, but right now the way it is used is not correct.

@tupui (Member Author) commented Feb 12, 2021

I am getting terrible results with the test function from the docstrings using the scrambled version. This is very interesting...

What I can do instead is to use the generator and fast-forward it before any sampling. This way I can "emulate" randomization, so that we can ask for a constant number of samples and keep the nice properties.

But this is puzzling. If someone knows about what is happening here that could help.
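The fast-forward idea can be sketched as follows: a continuing engine implicitly skips the points it has already produced, which is equivalent to resetting the engine and fast-forwarding it manually (a sketch, not shgo's actual code):

```python
import numpy as np
from scipy.stats import qmc

n = 16  # fixed batch size per iteration, a power of 2

# A continuing engine: the second batch starts where the first ended.
cont = qmc.Sobol(d=2, scramble=False)
first = cont.random(n)
second = cont.random(n)

# Resetting and fast-forwarding past the first n points is equivalent.
fresh = qmc.Sobol(d=2, scramble=False)
fresh.fast_forward(n)
assert np.allclose(second, fresh.random(n))
```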

@tupui (Member Author) commented Feb 13, 2021

With the last commits, I fixed the theoretical and practical issues with Sobol'.
So now we have a constant increase of points while having similar performance and respecting the theory behind the method.

To have independent samples from the unscrambled version, I skip self.iters_done * n points (with n = 2**m) between samplings. [EDIT: just realised this is a special case: since n is constant here, we don't need to reset the engine and fast-forward it manually; this is implicitly done at each sampling.]

Also, as a reminder, I moved the update of the number of points: when specifying 64 points and 1 iteration, it was actually sampling 64*2 points. This explains why I sometimes had to increase (mostly double) the number of points in some tests.

@Stefan-Endres (Contributor) commented Feb 13, 2021

@tupui thank you for this initiative; I think it would be great to refactor the random sampling for the optimization package.

Actually, I have to look into this more, because when iterating we no longer respect the 2**m condition.

Unfortunately the typical use of SHGO is to repeat iterations (with fixed or variable sampling points generated per iteration) until a satisfactory solution is found. Therefore the 2**m recommendation will rarely be satisfied (and is usually not practical in higher dimensions). The other problem is that SHGO should also be able to handle non-linear constraints, so even though the sampling points are generated in a (stretched) hypercube, often the majority of these points will be discarded so in this case many of the low-discrepancy guarantees are void anyway.

I am still fighting with the number of samples. I noticed something odd in iterate_delaunay: is it normal that self.nc += self.n came before the sampling? To me it should come after; otherwise, on the first iteration we sample with more points than requested. @Stefan-Endres, maybe you can help here?

This is a known bug that resulted in an extra iteration. We have fixed this in the upstream repository, and I will open a PR to scipy very soon (all that's left is to update the documentation). In the latest version we have streamlined the code a lot, so that the sampling points are counted in the same way as in the simplicial method (and therefore you do not need to make any changes, because any sampling method will work correctly in the new version).

It does not make much sense to use Sobol' and sample anything other than 2**m points. If we must do this, then I would suggest we use Halton instead.

I think you are absolutely correct from a statistical perspective; from an optimisation perspective, however, this is not practical in higher dimensions because of intermittent convergence criteria. I think that adding Halton in addition to Sobol is a good idea. Any low-discrepancy sampling method should work. My idea was to support any sequence that can be generated, since some use cases have particular needs.

Another possibility would be to use the scrambled version of Sobol' (which in any case we should: I am just keeping the base sequence here in case there is any theoretical reason I am not aware of) so that we don't add points exponentially.

This would be preferable. My question is: how different would this be from the current Sobol sequence generated by SHGO? I often get very different performance when using different seeds for the sequence.

@rgommers

With these kinds of changes, there is always the question of whether there is a noticeable difference in performance (accuracy being harder to assess than speed). We have accuracy benchmarks in benchmarks/benchmarks/optimize.py for global optimizers, but unfortunately shgo isn't included there.

I have this in my offline library and just need to clean the code, I will try to open a separate PR as soon as possible.

@tupui (Member Author) commented Feb 13, 2021

Thank you for the clarifications @Stefan-Endres

The main difference with the new implementation of Sobol' is that it is radically faster (Cython), can be scrambled, and we have checked its convergence. Still, I also noticed these differences in results with the seed. I guess it's really a matter of the function and the "luck" we had with the bare unscrambled sequence. That's why I was asking whether you need some specific properties from the base sequence, like having the point (0, ..., 0).

I am not sure I understand what you mean when you say that you can have a varying number of points. When setting an n, I could see it being used all the time. The way I understand the algorithm, it uses batches of points, and at least within the generation of a batch we have coherent sampling. Good coverage here seems desirable, meaning we could benefit from QMC.

So from what you say, I could adapt the code to also accept Halton, and then use the scrambled version in both cases. I will just need to adapt the number of points in the test cases due to this seeding variation.

As you wanted to work on this, shall I add you to my fork so that you could commit changes there? It might be easier if we both work on the same branch. What do you think?

@Stefan-Endres (Contributor) commented:

The main difference with the new implementation of Sobol' is that it is radically faster (Cython), can be scrambled, and we have checked its convergence. Still, I also noticed these differences in results with the seed. I guess it's really a matter of the function and the "luck" we had with the bare unscrambled sequence. That's why I was asking whether you need some specific properties from the base sequence, like having the point (0, ..., 0).

If possible, sampling the centre of the hypercube (0.5, 0.5, ..., 0.5) usually produces better performance (especially when the optimisation bounds are infinite or unknown), but I can connect this point manually in shgo later, as long as the unit tests are passing. I think in general there won't be any major change in benchmark performance. As long as the unit tests pass, it would be ideal to merge this code into scipy before the next PR for shgo. Since we fixed the issue with an extra iteration run, the overall performance of the algorithm should increase anyway if we coordinate to release this in the same scipy version.

I am not sure I understand what you mean when you say that you can have a varying number of points.

There is also an option (options={'infty_constraints': False}) with Sobol sampling to keep generating sampling points until the requested number of feasible points is found. For example, if the algorithm generated n=64 points in a new iteration and 10 points are infeasible, then shgo will generate another 10 points from QMC, and this loop repeats until there are 64 feasible points. This is vital in cases where the user needs to keep memory usage low by not adding these points to the triangulation.
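The loop described here can be sketched roughly like this; the circle constraint and both helper names are hypothetical, and shgo's actual implementation differs:

```python
import numpy as np
from scipy.stats import qmc

def feasible(points):
    # Hypothetical constraint: keep only points inside a circle of
    # radius 0.5 centred at (0.5, 0.5).
    return points[np.linalg.norm(points - 0.5, axis=1) <= 0.5]

def sample_feasible(n, d=2, seed=0):
    # Keep drawing from the QMC engine until n feasible points are found.
    engine = qmc.Halton(d=d, scramble=True, seed=seed)
    kept = np.empty((0, d))
    while len(kept) < n:
        kept = np.vstack([kept, feasible(engine.random(n - len(kept)))])
    return kept[:n]

pts = sample_feasible(64)
print(pts.shape)  # (64, 2)
```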

When setting an n, I could see it being used all the time. The way I understand the algorithm, it uses batches of points, and at least within the generation of a batch we have coherent sampling. Good coverage here seems desirable, meaning we could benefit from QMC.

If you do not specify any constraints and the objective function is well behaved, then it will work as you expect, with n = 2**m generated with every call to QMC. The default behaviour of shgo is not to discard infeasible points, and then the calls to QMC should again always be n = 2**m.

So from what you say, I could adapt the code to also accept Halton, and then use the scrambled version in both cases. I will just need to adapt the number of points in the test cases due to this seeding variation.

Yes, I suspect that Halton would deliver better performance in general anyway, and ideally we can keep sobol for backward compatibility, together with the recommendation in the documentation to use 2**m sampling points. Since sobol is not the default sampling method anyway, we would not need to make any API changes.

As you wanted to work on this, shall I add you to my fork so that you could commit changes there? It might be easier if we both work on the same branch. What do you think?

Yeah, I think the best approach is to implement Halton in this fork; then I will work on making sobol backward compatible to ensure all the tests pass. There will be a bit of a merge conflict with the newer shgo code, and I think it will unfortunately take a while to review that PR, so it will be best to keep the new routine separate and deal with that merge after this one has been merged.

@tupui (Member Author) commented Feb 13, 2021

OK, perfect. Then I will add Halton and keep Sobol' for backward compatibility. I will update the doc as well. I will do this early next week, and hopefully we can move forward with this.

@tupui (Member Author) commented Feb 15, 2021

@Stefan-Endres I have a first version if you would like to review. Thank you! I left Sobol' unscrambled for now so that the tests pass. I will let you assess this as you know about how this should behave.

@Stefan-Endres (Contributor) commented:

@tupui thank you for this; it looks perfect and everything is running. I will pull this into the upstream library as well. I look forward to posting the benchmarks using Halton sampling.

@tupui tupui requested a review from andyfaff February 15, 2021 21:07
@tupui (Member Author) commented Mar 23, 2021

@rgommers @mdhaber @andyfaff If you are OK with it and there are no more comments, I suggest I merge this before the weekend.

@mdhaber (Contributor) commented Mar 23, 2021

Sorry, I can't comment on the rest of this PR. I can only confirm that it addresses the concern in gh-13441.

@andyfaff (Contributor) left a review:

I've gone through all the files except _shgo. I'm not an expert on that file.

Review threads (resolved): scipy/optimize/_differentialevolution.py (4), scipy/optimize/_shgo.py (3), scipy/optimize/tests/test__shgo.py (2), scipy/stats/_hypotests.py (1)
@andyfaff (Contributor) commented:

@tupui, in general it is good practice for PRs to be merged by someone other than the person who created them.

I don't believe it's a hard-and-fast rule; the exception is when a PR is fairly simple and straightforward. This PR is a bit more complicated.

@tupui (Member Author) commented Mar 24, 2021

I've gone through all the files except _shgo. I'm not an expert on that file.

Thanks @andyfaff for the review. I addressed all the points. For shgo itself, @Stefan-Endres approved earlier in the discussion. Apparently, he has some other improvements that he plans to make in a follow-up PR.

Review threads (outdated, resolved): scipy/optimize/_differentialevolution.py (3), scipy/optimize/_shgo.py (2)
@tupui (Member Author) commented Mar 24, 2021

@andyfaff Thanks again. I made sure the docstrings match between the class and the function for what I changed.

@tupui (Member Author) commented Mar 29, 2021

@andyfaff if you have time, could you have another look at it?

@tupui (Member Author) commented Apr 2, 2021

Friendly reminder @andyfaff. Thanks in advance.

@tupui tupui mentioned this pull request Apr 6, 2021
11 tasks
@tupui (Member Author) commented Apr 7, 2021

@rgommers if you have some time, could you have a look? It should be ready as it has been reviewed.

@tupui tupui requested a review from ev-br April 12, 2021 13:22
@tupui (Member Author) commented Apr 13, 2021

@larsoner if you have some time, could you have a look?

@tupui (Member Author) commented Apr 14, 2021

@mdhaber this one of mine is still pending, if you have some time to help ;)

@mdhaber (Contributor) commented Apr 15, 2021

@tupui these days (until the summer probably) I would prefer to stick to stats. How about I take a look at gh-13471, instead?

@tupui (Member Author) commented Apr 15, 2021

@tupui these days (until the summer probably) I would prefer to stick to stats. How about I take a look at gh-13471, instead?

Sure thanks @mdhaber!

@larsoner (Member) left a review:

@larsoner if you have some time, could you have a look?

Not really in my domain (more signals/linalg) but given that @andyfaff looked, his comments seem to have been addressed, and from what I can tell it makes sense, I think we can merge this. Let's give folks who have been pinged a few more days -- @tupui can you ping me on Monday to merge this if I haven't already?

@tupui (Member Author) commented Apr 15, 2021

@larsoner if you have some time, could you have a look?

Not really in my domain (more signals/linalg) but given that @andyfaff looked, his comments seem to have been addressed, and from what I can tell it makes sense, I think we can merge this. Let's give folks who have been pinged a few more days -- @tupui can you ping me on Monday to merge this if I haven't already?

Thank you 😃 . Sure I will ping you if this is not merged by then.

@tupui (Member Author) commented Apr 19, 2021

Friendly reminder @larsoner 😄 Thanks.

@larsoner larsoner merged commit 71b4bc0 into scipy:master Apr 19, 2021
@larsoner (Member) commented:

Thanks @tupui !

@tupui tupui deleted the qmc_optim branch April 19, 2021 12:36
@tupui (Member Author) commented Apr 19, 2021

Thanks Eric!

@Stefan-Endres it's merged, you can now propose your enhancements! Feel free to ping me.

@tylerjereddy tylerjereddy added this to the 1.7.0 milestone Apr 20, 2021