
MAINT: deal with cases in minimize for `(bounds.lb == bounds.ub).any()` #12889

Closed
wants to merge 11 commits

Conversation

andyfaff
Contributor

@andyfaff andyfaff commented Sep 26, 2020

Reference issue

#12502

What does this implement/fix?

Additional information

@@ -491,32 +491,43 @@ def _dense_difference(fun, x0, f0, h, use_one_sided, method):
h_vecs = np.diag(h)

for i in range(h.size):
df = np.zeros_like(f0)
Contributor

It looks like df will be all zeros if any of the dx are zero. Don't we want the elements of df corresponding with nonzero dx to be evaluated as usual?

Contributor Author

dx can be either a scalar or an array of shape (N,).

If all of dx are zero then there don't need to be any evaluations of fun(x); df should be 0 (or, more precisely, np.zeros_like(f0)), because the step is 0. This has the effect of reducing the overall number of function evaluations.

If there are any non-zero dx then fun(x) is evaluated to calculate df as usual.
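
A minimal sketch of the behaviour described above (illustrative only, not the exact diff in _numdiff.py; the helper name finite_diff_column is made up). The point is that np.any(dx) works whether dx is a scalar or an array, and that a zero step means the function does not need to be evaluated:

import numpy as np

def finite_diff_column(fun, x0, f0, h_vec, i):
    # forward-difference estimate of column i of the Jacobian
    x1 = x0 + h_vec
    dx = x1[i] - x0[i]          # 0.0 when bounds pin x[i] (lb[i] == ub[i])
    if np.any(dx):              # works for scalar or array dx
        df = fun(x1) - f0       # the usual evaluation
        return df / dx
    # zero step: skip the evaluation entirely and return a zero column
    return np.zeros_like(f0)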

Contributor

@mdhaber mdhaber Sep 27, 2020

<face-palm> I was tired.

Contributor

@mdhaber mdhaber Aug 2, 2021

I wonder if the tests are hitting the array dx case. I think you will get a:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

How about np.any(dx), which will work for both scalar and array?
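
For illustration (not part of the PR), the difference between a bare truth test and np.any on scalar versus array steps:

import numpy as np

dx_scalar = 0.0
dx_array = np.array([0.0, 1e-8])

print(np.any(dx_scalar))   # False -- np.any also accepts plain scalars
print(np.any(dx_array))    # True  -- any non-zero element counts

# bool(dx_array) would raise:
# ValueError: The truth value of an array with more than one element is
# ambiguous. Use a.any() or a.all()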

Contributor Author

For future reference, dx is an array of size > 1 when evaluating a Hessian.

@andyfaff
Contributor Author

The last commit checks that ScalarFunction rejects functions that don't return scalars.
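
For illustration, a hedged sketch of the behaviour that check is meant to provide (the exact exception type and message are assumptions, and this routes through minimize rather than constructing ScalarFunction directly):

import numpy as np
from scipy.optimize import minimize

def vector_valued(x):
    # returns an array rather than a scalar -- not a valid objective
    return np.array([x[0]**2, x[1]**2])

try:
    minimize(vector_valued, x0=np.array([1.0, 2.0]), method='BFGS')
except (TypeError, ValueError) as exc:
    print('non-scalar objective rejected:', exc)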

@andyfaff
Contributor Author

@mdhaber I think the failing tests are due to CI not liking the latest pytest.

@andyfaff
Contributor Author

Rebased to make the tests run again.
@mdhaber, ping for merge?

@mdhaber
Contributor

mdhaber commented Oct 11, 2020

I'm afraid I can't check this weekend : / I'm swamped. Is in time for 1.6.0 OK?
Tell you what, I'll trade you for taking a look at gh-12043 : P (kidding)

@andyfaff
Contributor Author

I'm afraid I can't check this weekend : / I'm swamped. Is in time for 1.6.0 OK?
Tell you what, I'll trade you for taking a look at gh-12043 : P (kidding)

1.6.0 is fine, I was just trying to clear low hanging fruit. In all seriousness I would review your PR, but I'm not an expert in the subject matter.

@andyfaff andyfaff changed the title Equal bounds MAINT: deal with cases in minimize for `(bounds.lb == bounds.ub).any()` Oct 20, 2020
@andyfaff
Contributor Author

@mdhaber I was fiddling around with the idea of factorising out equal bounds (reducing the dimensionality of the problem), but it's not only the objective function one has to worry about. One also has to deal with Jacobians and Hessians, and I'm not sure if it's possible to do that in a clean way.

@mdhaber
Contributor

mdhaber commented Oct 21, 2020

User provided functions for those, right? Why can't we eliminate the rows and columns of those outputs that correspond with the fixed variables?

@andyfaff
Contributor Author

Ahhh, filter afterwards rather than before.

@mdhaber
Contributor

mdhaber commented Oct 21, 2020

Yeah, there's nothing we can do to stop them from being evaluated, but we can remove them.

I am imagining that minimize identifies equal bounds, removes those variables from the decision variable vector before calling the underlying solver. To the solver it passes not the user provided functions but wrappers that inject the fixed variables into the input, call the user provided function, and remove any derivatives corresponding with the fixed variables from the output.

I suppose it would be a pain to fix the separate fmin solvers. But... we could. Or we could leave them broken and recommend that new code use minimize.
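
A rough Python-side sketch of that wrapping idea (names such as wrap_for_fixed_variables and inject are illustrative assumptions, not the eventual implementation in gh-13096):

import numpy as np

def wrap_for_fixed_variables(fun, jac, lb, ub):
    lb, ub = np.asarray(lb, dtype=float), np.asarray(ub, dtype=float)
    fixed = lb == ub          # mask of variables pinned by equal bounds
    free = ~fixed
    x_fixed = lb[fixed]       # their (constant) values

    def inject(z):
        # rebuild the full-length x from the reduced decision vector z
        x = np.empty(lb.shape)
        x[fixed] = x_fixed
        x[free] = z
        return x

    def fun_reduced(z):
        return fun(inject(z))

    def jac_reduced(z):
        # drop derivative entries corresponding to the fixed variables
        return np.asarray(jac(inject(z)))[free]

    return fun_reduced, jac_reduced, inject

The underlying solver would then be called with fun_reduced, jac_reduced and the reduced bounds lb[free], ub[free], and the returned solution expanded back to full length with inject.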

Member

@larsoner larsoner left a comment

Just a couple of minor testing ideas -- +1 for merge with or without these suggestions

scipy/optimize/tests/test_optimize.py
scipy/optimize/tests/test_differentiable_functions.py (outdated)
def update_grad():
    self._update_fun()
    self.ngev += 1
    self.g = approx_derivative(fun_wrapped, self.x, f0=self.f,
                               **finite_diff_options)
    # when the lower and upper bounds are equal then set the
    # gradient to 0.
    self.g[equal_bounds] = 0.0
Contributor

This is the core of the PR: set the derivatives of fixed parameters (which were NaNs) to 0. I wish it were back-ported to the 1.5 series.

Contributor

@nmayorov nmayorov Nov 8, 2020

It seems that with the change in approx_derivative, self.g[equal_bounds] will be filled with zeros by approx_derivative. Filling it explicitly is fine too.

Edit: most likely it will be filled with nans (in newer Python and numpy).

Contributor

A more important question: could someone explain qualitatively why this change is fundamentally correct?

I don't want to crash into this PR all of a sudden and destroy all the work, but I have some concerns.

  1. From a logical/mathematical point of view, the fact that lb[i] == ub[i] doesn't impose any constraints on the gradient at lb[i]. So it's very unclear what setting elements of g to zero achieves.
  2. What is going to happen if we run the same problem now with analytical and numerical differentiation and some bounds being equal? In principle we must get close and correct results (including x_opt[i] == lb[i] == ub[i]), but I don't see how it is guaranteed with the current code.
  3. It seems to me that the ability of a solver to handle lb[i] == ub[i] correctly with all possible settings requires more considerations and testing.

Contributor Author

I'm less concerned about losing the work in this PR, and more concerned that we get the behaviour right here. It'd be great to have more input from people familiar with optimisation, to make sure that happens. @nmayorov I'd appreciate any further input you're able to give.

The changes in this PR are by no means perfect.

  1. Logically, setting lb[i] == ub[i] means that you know, with 100% certainty, that x[i] has to be a constant value. Should a function have a gradient for x[i] if it's not allowed to take any other value? A graph of f vs x would show a single dot, not a line. In such a case the gradient should probably be undefined (nan), even for an 'analytic' derivative (or should it be zero?).

  2. An analytic Jacobian function (derived for lb[i] < ub[i]) may be able to calculate the gradient at x[i] == lb[i] with lb[i] == ub[i]. However, estimating the gradient with finite differences at that point while respecting bounds is impossible. If bounds are ignored during numeric differentiation, then it is possible (although you're then also saying that you're not 100% sure that lb[i] == ub[i]).
    There were two sets of reported issues that led to this PR:

  • (a) estimating grad via finite differences should respect bounds. This led to the use of approx_derivative, which does so.
  • (b) once the bounds-respecting change was made for the previous issue, it led to problems in numerical optimizers such as SLSQP or L-BFGS-B, which don't work when the Jacobian contains nan (df/dx = 0.0/0.0).

From testing, SLSQP and L-BFGS-B do work when those NaNs are replaced by zeros, so that was the approach I took here. A different approach is to recognise problems with equal bounds and factor out the fixed parameters, thereby reducing the dimensionality of the problem. I think this factorisation would increase code complexity and would not be straightforward to implement across all the solvers.
Yet another approach could be to get the actual iterative code in SLSQP and L-BFGS-B to behave properly for lb[i] == ub[i], but I don't think anyone has the appetite for altering the Fortran code.

I don't think that this PR, or an alternative that factored out fixed parameters, would lead to the analytic and numerical differentiation approaches giving the same answer unless we rolled back the changes from 2(a).

TLDR: even though it's not a great solution, it may be the best of a bad lot.

Contributor Author

If one of the stumbling blocks is the alteration of nan-->0 in _differentiable_functions, then that could be done in a wrapper function at a higher level.
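
A sketch of what such a higher-level wrapper might look like (illustrative; the name clip_fixed_gradient and where it would be called from are assumptions, not the approach taken in #13069):

import numpy as np

def clip_fixed_gradient(grad, lb, ub):
    # wrap a gradient callable so that entries belonging to fixed
    # parameters (lb == ub) are forced to 0 instead of the NaN that a
    # bounds-respecting zero step produces
    fixed = np.asarray(lb) == np.asarray(ub)

    def wrapped(x, *args):
        g = np.array(grad(x, *args), dtype=float)
        g[fixed] = 0.0
        return g

    return wrapped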

Contributor

Not 100% sure, but conceptually it looks feasible using Python-side wrappers. The problem must be completely reformulated to remove the fixed variables, then solved by the optimizer, and then the result restored to contain the fixed variables.

Contributor Author

@nmayorov, the elimination solution is likely to take some time, and I think it's vital to get at least a temporary fix in for the next release. At the moment this is either going to be this PR, or #13069. The alternate PR lifts the issue from the numdiff code up into the solver method code.

Contributor

@nmayorov nmayorov Nov 15, 2020

Are you sure it's vital to get such an ad hoc fix in? I mean, if it fails with an error, that's expected behavior and the error message is correct in this case. The case of equal bounds seems to be an "undefined area" in the current state, to be honest.

Well, I'm not actively involved in the development as of late, so I don't want to block anything. The second PR looks preferable to me.

As for _approx_derivative, maybe it should even raise an error with equal bounds, because it's an incorrect setting considering the logic involved. But that won't play well with your changes, so it is just a thought.

Contributor Author

The reported issues (+ their volume) and the documentation for minimize indicate that setting equal bounds has historically been acceptable/valid.
However, the last release made the numerical differentiation obey any bounds, so backwards compatibility with this historical behaviour was broken (because the core of SLSQP and L-BFGS-B can't cope with NaN in the gradient). Squash one bug, and another pops up.
Given that the alternate approach in #13069 might be preferable, would you have time to review that PR? It would be so, so helpful. I agree that changing NaN entries in the gradient to zero isn't 100% desirable. However, the modifications worked for the test example I constructed, and they also work for the various users posting these issues.

Contributor

@mdhaber mdhaber Nov 18, 2020

@andyfaff @nmayorov do you see problems with the approach in gh-13096 (different from gh-13069)? Even if it's not ready for the next release, is that what we'd like to head towards?

@kif
Contributor

kif commented Nov 2, 2020

For me, the main reason to use this "feature" (bounds.ub == bounds.lb) to fix a parameter is that re-writing the function (and its derivative) is probably more complicated & error-prone.

This PR basically enforces the derivative of the function along fixed parameters to be zero instead of 0/0 = NaN.

Besides this, it looks good to me.

@andyfaff
Contributor Author

andyfaff commented Nov 3, 2020

Test failures are unrelated.

@andyfaff
Contributor Author

andyfaff commented Nov 3, 2020

@larsoner rebased to fix merge conflict.

@andyfaff
Contributor Author

andyfaff commented Nov 8, 2020

@larsoner ping for a merge?

scipy/optimize/_numdiff.py (outdated)
@andyfaff
Contributor Author

@nmayorov @larsoner, I've made an alternate version, andyfaff#21. In that version the replacement of nan-->0 occurs within a wrapper in _minimize_*, and not in approx_derivative. @nmayorov, would this be preferred to this PR?

@tylerjereddy
Contributor

Milestone bumped; if this really is important for the 1.6.x series you may still have a tiny bit of time, but I'm trying to branch soon.

@tylerjereddy tylerjereddy modified the milestones: 1.7.0, 1.8.0 May 25, 2021
@andyfaff
Contributor Author

@mdhaber, @nmayorov, @larsoner on the whole I prefer this option over Matt's alternative in #13096, because it seems simpler and more maintainable.

Merging this PR would recover the ability to have lower bounds equal to upper bounds for several solvers (a long-standing issue for L-BFGS-B and SLSQP).
Remaining issues for discussion:

  • the behaviour for analytical and finite-difference gradients would be different. This PR sets the gradient entries for equal bounds to 0; the corresponding entries for analytic gradients would be different. For me, a gradient of zero for a fixed parameter seems logical. When using these kinds of minimisers I'd certainly expect an inverse Hessian (cf. a covariance matrix) to have zero entries for those parameters, because they're fixed.

@andyfaff andyfaff marked this pull request as ready for review July 30, 2021 04:37


@pytest.mark.parametrize('method', ['Powell', 'L-BFGS-B', 'SLSQP',
                                    'trust-constr'])
Contributor

I think we'll need more comprehensive tests, as this currently passes in master for Powell and trust-constr.
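
One way to make such a test more discriminating (a sketch only, with a made-up objective; not the test in this PR) is to record every x seen by the objective and assert that the pinned variable is never perturbed:

import numpy as np
import pytest
from scipy.optimize import Bounds, minimize

@pytest.mark.parametrize('method', ['L-BFGS-B', 'SLSQP'])
def test_equal_bounds_variable_not_perturbed(method):
    seen = []                       # x[1] values seen by the objective

    def fun(x):
        seen.append(x[1])
        return (x[0] - 3.0)**2 + (x[1] - 1.0)**2

    # x[1] is pinned at 2.0 by equal lower and upper bounds
    bounds = Bounds(lb=[-10.0, 2.0], ub=[10.0, 2.0])
    res = minimize(fun, x0=[0.0, 2.0], method=method, bounds=bounds)

    assert res.success
    assert np.allclose(res.x[1], 2.0)
    # the fixed variable should stay at its bound in every evaluation
    assert np.allclose(seen, 2.0)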

Contributor Author

I can't remember from the original issue whether Powell and trust-constr were sensitive to this. I can only remember that L-BFGS-B and SLSQP were.

@andyfaff andyfaff closed this Oct 15, 2021
@andyfaff
Contributor Author

This functionality was added in #13096

Labels: maintenance, scipy.optimize
8 participants