
ENH: scipy.optimize.minimize with method=SLSQP returns KKT multipliers TAKE 2 #15641

Open · wants to merge 38 commits into base: main
Conversation


@andyfaff andyfaff commented Feb 23, 2022

@msparapa this is an updated version of your PR #9839 to return the KKT multipliers from SLSQP. (To save git hassles).

For reviewers:

  1. This needs tests. This PR naively returns the values that SLSQP said should be returned (https://github.com/scipy/scipy/blob/main/scipy/optimize/slsqp/slsqp_optmz.f#L154). EDIT: I've added some simple tests.
  2. A dict containing the KKT multipliers is returned. The 'eq' key contains a list of the multipliers for the equality constraints, in the order in which they appear in the constraints list. A similar approach is used for the inequality constraints.
  3. HOWEVER, if new-style NonlinearConstraint or LinearConstraint are used, then it's possible for a single new-style constraint to simultaneously contain both inequality and equality constraints. The first step in SLSQP is to separate these mixed new-style constraints into separated old-style constraints. This means that if there is mixing within a single constraint, then the returned list of multipliers is going to appear to be of a different length to the original new-style constraints. Hopefully this doesn't lead to confusion.
  4. The original PR said that it seems to be necessary to call slsqp again to force the multipliers to be updated. This won't cause any more function evaluations, but it will add extra overhead. I don't particularly want to dive into the fortran code to see if this can be 'fixed'.
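To illustrate point 3, a single new-style constraint can mix equality and inequality rows. A minimal sketch (this toy problem is my own, not from the PR, and `res.kkt` is the attribute this PR proposes, not released SciPy API):

```python
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

def f(x):
    return (x[0] - 2)**2 + (x[1] - 2)**2

# One NonlinearConstraint whose first row is an equality (lb == ub) and whose
# second row is an inequality. SLSQP internally splits this into one old-style
# 'eq' and one old-style 'ineq' constraint before solving.
con = NonlinearConstraint(lambda x: [x[0] + x[1], x[0] - x[1]],
                          lb=[1, 0], ub=[1, np.inf])

res = minimize(f, [2, 0], method='SLSQP', constraints=con)
# With this PR, res.kkt['eq'] and res.kkt['ineq'] would each hold one
# multiplier, even though only a single constraint object was passed in.
```

The optimizer lands at `x = [0.5, 0.5]`, where both rows are active.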

@tylerjereddy tylerjereddy added scipy.optimize enhancement A new feature or improvement labels Feb 25, 2022
print(
f"{majiter:5} {sf.nfev:5} {fx:16.6E} "
f"{linalg.norm(g):16.6E}"
)
Contributor

Maybe a "good first issue" type thing would be for someone to add a test for this print; codecov says it doesn't get hit, which isn't hugely surprising. Something like the pytest capsys fixture is pretty good for capturing stdout and asserting on it.
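A sketch of what such a test could look like. This version uses `redirect_stdout` so it runs standalone; in the test suite the pytest `capsys` fixture would replace the manual capture. My reading of the wrapper is that `disp=True` enables the exit summary and the per-iteration line quoted above is printed at `iprint >= 2`, but that's worth double-checking:

```python
import io
from contextlib import redirect_stdout
from scipy.optimize import minimize, rosen

# Capture SLSQP's printed output and assert on it (sketch, not the real test).
buf = io.StringIO()
with redirect_stdout(buf):
    minimize(rosen, [1.3, 0.7], method='SLSQP',
             options={'disp': True, 'iprint': 2})
out = buf.getvalue()

# The exit summary is printed when iprint >= 1; iteration lines need iprint >= 2.
assert 'Optimization terminated successfully' in out
```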

@andyfaff

andyfaff commented Mar 1, 2022

@mdhaber do you have time to review this?

@mdhaber

mdhaber commented Mar 3, 2022

Can't guarantee a complete review, but I can write some tests to see if it's doing the right thing. Maybe this weekend, but I have a lot of work to do for the SciPy conference right now.

@mdhaber

mdhaber commented Mar 4, 2022

Started playing with this:

From https://www.bauer.uh.edu/rsusmel/phd/MR-11.pdf Slide 39

import numpy as np
from scipy.optimize import minimize

def f(x):
    x1, x2 = x
    return (x1 - 4)**2 + (x2 - 4)**2

def con(x):
    x1, x2 = x
    c1 = 2*x1 + 3*x2 - 6
    c2 = 12 - 3*x1 - 2*x2
    return [c1, c2]

bounds = [(0, None), (0, None)]

x0 = [3, 4]
res = minimize(f, x0, method='slsqp', constraints=[{"type": "ineq", "fun": con}])
res.kkt  # kkt: {'eq': [], 'ineq': [array([0.        , 1.23076922])]}

Checks out!

@andyfaff could you write a general test that the necessary conditions are satisfied? They don't look too hard to express in code, and then we could test with any arbitrary problem, not just ones that we have a reference for. For example, we have some tests like this for linprog here. (Sufficient conditions would be a plus, but are harder.)

I'm not sure how deeply you wanted to dive into this, but we thought quite a bit in gh-13893 about how we would present related information in linprog.
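The necessary-conditions test suggested above can be sketched as follows, applied to the slide-39 problem. Here `check_kkt` is a hypothetical helper (not SciPy API), and since released SciPy has no `res.kkt`, the multipliers reported above (apparently exactly `[0, 16/13]`) are passed in by hand. Convention: minimize $f$ subject to $g_i(x) \geq 0$.

```python
import numpy as np
from scipy.optimize import minimize, approx_fprime

def f(x):
    return (x[0] - 4)**2 + (x[1] - 4)**2

gs = [lambda x: 2*x[0] + 3*x[1] - 6,
      lambda x: 12 - 3*x[0] - 2*x[1]]

def check_kkt(x, mus, f, gs, atol=1e-4):
    h = np.sqrt(np.finfo(float).eps)
    grad = approx_fprime(x, f, h)              # gradient of the objective
    for mu, g in zip(mus, gs):
        assert mu >= -atol                     # dual feasibility
        assert abs(mu * g(x)) <= atol          # complementary slackness
        grad = grad - mu * approx_fprime(x, g, h)
    assert np.allclose(grad, 0, atol=atol)     # stationarity

res = minimize(f, [3, 4], method='SLSQP',
               constraints=[{'type': 'ineq', 'fun': g} for g in gs])
check_kkt(res.x, [0.0, 16/13], f, gs)          # multipliers reported above
```

The same checks work for any problem once the multipliers come from `res.kkt` directly.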

@msparapa

msparapa commented Mar 5, 2022

Thank you for doing this. I no longer have the capacity to contribute to this problem like I did back in 2019, but I'm still very interested in this feature!

@jaharris85

You mentioned the additional overhead of having to call slsqp again to improve the accuracy of the multipliers. There might be two relatively easy ways of avoiding this:

  1. The riskier way is to modify the multiplier update rule in slsqp_optmz.f. On line 415 we have
h3 = ABS(r(j))
mu(j) = MAX(h3,(mu(j)+h3)/two)

you can change this to

mu(j) = h3

The multipliers then seem to be correct for the few examples I have tested against. But you risk changing the algorithm and all the problems that might create with reliability etc.

  2. You can alternatively read r directly from the working space w once the algorithm has finished. The indices are given by line 252 in slsqp_optmz.f:
      im = 1
      il = im + MAX(1,m)
      il = im + la
      ix = il + n1*n/2 + 1
      ir = ix + n

Take abs(w(ir)) to abs(w(ir + m)) as your Lagrange multipliers.

Hope that is helpful. Thanks.

@andyfaff

andyfaff commented Mar 11, 2022

9929b39 makes changes as suggested by @jaharris85, thank you. I checked that the entries in _kkt_mult were correct by retaining the extra slsqp call and comparing _kkt_mult to w[:m] using an np.testing.assert_allclose in _minimize_slsqp. I then ran the test suite in test_slsqp (meaning all the SLSQP tests would've examined the assertion). There were no failures. Following this I removed the extra slsqp call and the assert.

@andyfaff

@mdhaber, I think the PR is probably good as-is, in that it's returning the multipliers that SLSQP works out. If I understand correctly, the generalised test you suggested would examine whether SLSQP is doing its job correctly?

The only other point might be to consider whether the kkt dict on the result object has the right kind of structure. There are multiple ways people can pass constraints to slsqp (mixing ineq with eq, etc.), and the return needs to be understandable. Hopefully the note I added makes sense.

@mdhaber

mdhaber commented Mar 11, 2022

If I understand the generalised test you suggested would be examining if SLSQP is doing its job correctly?

It would, but that would not be the purpose. It would allow us to test more general problems, e.g. if a new-style constraint contained mixed inequality and equality constraints, to see whether the information from SLSQP is being interpreted correctly in non-trivial cases. If you can add a more substantial problem otherwise, that could work, too.

'fun': con_fun,
'jac': con_jac}

cs = np.linspace(0, 3, 51)
@mdhaber mdhaber Mar 11, 2022

I think hitting the three cases (c < 1, 1 < c < 2, and c > 2) once each would suffice.

il = im + la
ix = il + (n1*n)//2 + 1
ir = ix + n - 1
_kkt_mult = np.abs(w[ir: ir + m])

Hi. I've been testing this code out in the context of an optimal control problem where I need to extract the KKT multipliers. The code has been very helpful, but taking the absolute value here gives the wrong results. Is there a reason this is done?

Contributor Author

Do you have (minimal) test cases for which you know the code gives incorrect output? It'd be great to include those to see where remaining gaps in the implementation lie.


Unfortunately nothing small and easy to check, the optimal control problem requires a whole mess of other code to set up.

I guess I would first ask why the absolute value is needed? If it is just to match the previous implementation where SQP is called twice then it is likely that the previous implementation wasn't quite correct. If I remember correctly, the KKT multipliers for equality constraints can be positive or negative, but those for inequality constraints have to be positive (or negative, depending on how you set the problem up).

Contributor

One way to see that taking the abs can't be right is to interpret the multipliers as the sensitivity of the objective function w.r.t. changes in the right-hand side of the constraints.

That is, suppose the equality constraint function is $g(x) = 0$. Suppose we find the optimal value of the objective function $f^\ast $, and the KKT multiplier corresponding with this constraint is $\lambda$. If we were to perturb the constraint to $g(x) = \epsilon$, where $\epsilon$ is small, then we would expect the new optimal value of the objective function to be $f_h^\ast \approx f^\ast + \epsilon \lambda $.

The problem is that we can write the same constraints as either $g(x) = 0$ or $-g(x) = 0$. When we perturb the two RHSs by $+\epsilon$, the change in the objective function will be in different directions - that is, the signs of $\lambda$ should be opposite. I'll illustrate this in the example below.

The reason why the Lagrange multipliers corresponding with the inequality constraints must be positive is that if we are solving a minimization problem with an inequality constraint $h(x) \geq 0$, increasing the right hand side by $\epsilon$ can only make the objective function value worse, because we are tightening the constraint (making it more restrictive). For a minimization problem, making it worse means making it more positive, so if it was $f^\ast$ before the perturbation, it becomes $f_h^\ast \approx f^\ast + \epsilon \mu$ with positive KKT multiplier $\mu$. (We can't multiply both sides of $h(x) \geq 0$ by $-1$ as we did with the equality constraint, because that would also change the direction of the inequality sign, and our problem is restricted to constraints expressed using a $\geq$ sign.)
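The sign-flip argument for equality constraints can be checked numerically. A toy sketch (this problem is my own, not from the PR): minimize $(x-1)^2$ subject to $g(x) = x = 0$. Stationarity of $L = f - \lambda g$ gives $2(x^\ast - 1) - \lambda = 0$, so at $x^\ast = 0$ the multiplier is $\lambda = -2$.

```python
from scipy.optimize import minimize

def f(x):
    return (x[0] - 1)**2

eps = 1e-3
base = minimize(f, [0.5], method='SLSQP',
                constraints={'type': 'eq', 'fun': lambda x: x[0]})
pert = minimize(f, [0.5], method='SLSQP',
                constraints={'type': 'eq', 'fun': lambda x: x[0] - eps})

lam = -2.0
print(pert.fun - base.fun, eps * lam)   # both approximately -2e-3

# Rewriting the constraint as -x = 0 gives lambda = +2: perturbing -x = eps
# moves x* to -eps, and f* increases by about 2e-3 -- the opposite direction.
```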

Contributor

@jaharris85 thanks for your help finding the indices of the KKT multipliers. Could you tell us why you recommended taking the absolute value after getting the values from w? They seem to be correct without the absolute value.


The intention was to line it up with the original multiplier values returned by the Fortran code. Line 415 contains the multiplier update:

h3 = ABS(r(j))
mu(j) = MAX(h3,(mu(j)+h3)/two) 

where r is the more accurate version of the multipliers. So mu(j) must be positive since it is the max of something positive. I don't understand the intention of the second line though since that is what introduces the error in the multipliers.

Has anyone confirmed whether the original multipliers were always positive?

@mdhaber mdhaber Jul 28, 2022

From the perspective of the algorithm, the multipliers associated with inequality constraints must be positive to satisfy the first order necessary conditions. See 2.3.1 of A Software Package for Sequential Quadratic Programming. (But were you asking about whether that is actually the case in the code? All I know is it should be.)

@mdhaber

mdhaber commented May 22, 2022

@andyfaff I've had my hands full with stats, but I'll come back to this over the summer (maybe late June). I can write the tests I suggested (to smoke out potential problems like #15641 (comment)) then, and I think then I could merge this.

@andyfaff

@Tenavi, @mdhaber this PR is in a state that is close to merging. However, it really needs someone with domain knowledge of how KKT multipliers work to complete it (I don't have that knowledge).

viz: what should the sign of the multipliers be for a given type of constraint, as mentioned by @Tenavi? I can supply the coding knowledge, but I can't supply the maths knowledge.

@mdhaber

mdhaber commented Jul 20, 2022

what should the sign of the multipliers be for a given type of constraint, as mentioned by @Tenavi?

Based on some research, I think the multipliers on equality constraints $\lambda_i$ can be of either sign, and multipliers on inequality constraints $\mu_i$ will always be positive. I don't think that means we need to add an abs on the $\mu_i$. Unless SLSQP is doing something unconventional, it should indicate positive $\mu_i$ on its own (although I suppose you could clip to be above 0 in case they dip slightly below 0 due to numerical issues).

Most online resources (e.g. [1], [2], [3]) show inequality constraints for minimization problems in the form $g(x) \leq 0$ and show dual feasibility conditions $\mu_i \geq 0$.

A few references ([4], [5]) show inequality constraints for minimization problems of the form $g(x) \geq 0$, but still $\mu_i \geq 0$. What changes is the sign of some terms in the KKT stationarity condition.
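For reference, with SciPy's 'ineq' convention $g_j(x) \geq 0$ and equality constraints $h_i(x) = 0$, the first-order (KKT) necessary conditions for a minimum read:

$$\nabla f(x^\ast) - \sum_i \lambda_i \nabla h_i(x^\ast) - \sum_j \mu_j \nabla g_j(x^\ast) = 0, \qquad \mu_j \geq 0, \qquad \mu_j \, g_j(x^\ast) = 0.$$

With the $g_j(x) \leq 0$ convention, the sign of the $\mu_j$ term in the stationarity equation flips while $\mu_j \geq 0$ is unchanged, which is exactly the difference between the two families of references noted above.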

andyfaff#25 adds a test from [4] that fails with the abs left in and passes with the abs removed.

@mdhaber

mdhaber commented Oct 9, 2022

@andyfaff I think this is close. It just needs the multipliers associated with the simple bounds and documentation of how the multipliers correspond with the constraints only if the user passes in old-style constraints.

@Tenavi

Tenavi commented Mar 9, 2023

@andyfaff I think this is close. It just needs the multipliers associated with the simple bounds and documentation of how the multipliers correspond with the constraints only if the user passes in old-style constraints.

Sorry for my lack of activity on this issue. @mdhaber, by this comment do you mean that

  1. We don't yet have multipliers corresponding to constraints like $x > 3$, and that these should be separated from 'eq' and 'ineq' multipliers in the dict?
  2. The documentation should be written assuming the user passes old-style constraints?

@mdhaber

mdhaber commented Mar 9, 2023

  1. Yes, exactly. Currently, this PR does not return the Lagrange multipliers corresponding with bounds, but it easily could.
  2. Yes. If the user passes new-style constraints, it's very challenging to succinctly describe how those objects get turned into constraints and in what order their corresponding Lagrange multipliers will appear.

Tenavi and others added 2 commits March 13, 2023 13:30
* ENH: KKT multipliers from SLSQP

* MAINT: some streamlining

* MAINT: remove enumerate

* MAINT: correct output from slsqp

* TST: kkt multiplier return

* STY and use some f-strings

* DOC: add KKT docstring to slsqp

* TST: kkt for equality

* MAINT: remove extra call to slsqp

* TST: optimize.minimize: add KKT necessary conditions check

* MAINT: optimize.minimize: remove abs on KKT multipliers; correct test


---------

Co-authored-by: Andrew Nelson <andyfaff@gmail.com>
Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>
@Tenavi

Tenavi commented Mar 15, 2023

@mdhaber I made the requested modifications, but the git version history may get a little ugly. The changes are in https://github.com/Tenavi/scipy/tree/slsqp_kkt, a branch on my fork based on the most recent scipy version. I tried to fork from @andyfaff's repo, but I couldn't build scipy for some reason.

I can try to create a pull request into https://github.com/andyfaff/scipy/tree/kkt, but this will also need to incorporate all the changes made to main since @andyfaff's last pull. Alternatively, I can create a new pull request to scipy main. What's the best course of action?

@mdhaber

mdhaber commented Mar 15, 2023

Let me take a look at your branch and I'll let you know.

@mdhaber

mdhaber commented Mar 15, 2023

I just merged main here for now. @Tenavi, the merge conflicts with your branch weren't bad, so I just pushed those here. If they look good to @andyfaff, I think we can continue working with this PR rather than creating a new one. @andyfaff what do you think? Here is the relevant diff.

@@ -24,6 +26,44 @@ def __call__(self, x):
self.ncalls += 1


def _check_kkt(res, constraints):
@mdhaber mdhaber Mar 15, 2023

@Tenavi could you add the conditions for simple bounds to these checks? It looks like the relevant information would just need to be added into mus and gs. Also, ideally, we'd add an example where some of the bounds are active at the solution. My guess is that such a problem would have failed _check_kkt because it neglects the (nonzero) multipliers corresponding with these bounds, but considering the KKT multipliers, it should pass.
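A simple bound $lb \leq x_i \leq ub$ can be folded into the same checks by treating it as up to two inequality constraints, $x_i - lb \geq 0$ and $ub - x_i \geq 0$. A hypothetical sketch (the helper name `bounds_as_ineq` is made up; `gs` echoes the variable in the test helper):

```python
import numpy as np

def bounds_as_ineq(bounds):
    """Convert a list of (lb, ub) pairs into g(x) >= 0 callables."""
    gs = []
    for i, (lb, ub) in enumerate(bounds):
        if lb is not None and np.isfinite(lb):
            gs.append(lambda x, i=i, lb=lb: x[i] - lb)   # x_i >= lb
        if ub is not None and np.isfinite(ub):
            gs.append(lambda x, i=i, ub=ub: ub - x[i])   # x_i <= ub
    return gs

gs = bounds_as_ineq([(0, None), (0, 5)])
x = np.array([1.0, 2.0])
print([g(x) for g in gs])   # [1.0, 2.0, 3.0]
```

These callables (and the corresponding bound multipliers) would then be appended to `gs` and `mus` before running the existing stationarity check.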

It would also be terrific if you'd be willing to make a small modification to the slsqp example in the minimize documentation to show how the kkt multipliers can be accessed and used.

For instance, the code is:

from scipy.optimize import minimize, rosen, rosen_der

fun = lambda x: (x[0] - 1)**2 + (x[1] - 2.5)**2
cons = ({'type': 'ineq', 'fun': lambda x:  x[0] - 2 * x[1] + 2},
        {'type': 'ineq', 'fun': lambda x: -x[0] - 2 * x[1] + 6},
        {'type': 'ineq', 'fun': lambda x: -x[0] + 2 * x[1] + 2})

bnds = ((0, None), (0, None))

res = minimize(fun, (2, 0), method='SLSQP', bounds=bnds,
               constraints=cons)

At the solution, the first constraint is active:

cons[0]['fun'](res.x)  # 5.329070518200751e-15

and has a nonzero KKT multiplier:

res.kkt['ineq'][0]  # array([0.79599999])

which is the local sensitivity of the optimal value of the objective function with respect to changes in the first constraint. If we adjust the constraint by a small amount eps:

eps = 0.01
cons[0]['fun'] = lambda x:  x[0] - 2 * x[1] + 2 - eps

we would expect the change in the optimal value of the objective function to be approximately eps * res.kkt['ineq'][0].

f0 = res.fun  # previous optimal value of the objective function
res = minimize(fun, (2, 0), method='SLSQP', bounds=bnds,
               constraints=cons)
f1 = res.fun  # new optimal value of the objective function
f1 - f0  # 0.008019998807906381
eps * res.kkt['ineq'][0]  # array([0.00804])


@mdhaber @andyfaff I made some updates to fix the bound multipliers, modified _check_kkt to test for these, and also added a specific test to make sure I was getting the indexing right. Please take a look at the PR when you can: andyfaff#42

I still would need to update the documentation for minimize as you suggested. I think your writeup looks good as is... I could push these changes as well, or were you looking for something a little more?

@mdhaber mdhaber Mar 22, 2023

I still would need to update the documentation for minimize as you suggested. I think your writeup looks good as is...

You mean the example I asked for? You are welcome to use the example I suggested if it looks good to you. You don't necessarily need more.


@mdhaber I added to the minimize documentation (thanks for your example) and made a test to verify the behavior.

Nakamura-Zimmerer, Tenavi and others added 2 commits March 23, 2023 17:27
@Tenavi

Tenavi commented Mar 29, 2023

How do things look @mdhaber ? Is there anything else I can contribute at this point?

@mdhaber

mdhaber commented Mar 29, 2023

According to the plan proposed in andyfaff#42 (comment), there is nothing for you to do at this time. Thanks for your patience.

@mdhaber

mdhaber commented Mar 29, 2023

Actually @Tenavi please address the CI failures while you're waiting for @andyfaff to review.

Non-deprecated objects in __all__: 69
Objects in refguide: 69

scipy.optimize.minimize
-----------------------

File "build/testenv/lib/python3.9/site-packages/scipy/optimize/_minimize.py", line 544, in minimize
Failed example:
    eps = 0.01
    cons[0]['fun'] = lambda x: x[0] - 2 * x[1] + 2 - eps
Exception raised:
    Traceback (most recent call last):
      File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/doctest.py", line 1334, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest minimize[14]>", line 1
        eps = 0.01
                  ^
    SyntaxError: multiple statements found while compiling a single statement


File "build/testenv/lib/python3.9/site-packages/scipy/optimize/_minimize.py", line 552, in minimize
Failed example:
    f0 = res.fun # Keep track of the previous optimal value
    res = minimize(fun, (2, 0), method='SLSQP', bounds=bnds,
                   constraints=cons)
    f1 = res.fun # New optimal value
    f1 - f0
Exception raised:
    Traceback (most recent call last):
      File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/doctest.py", line 1334, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest minimize[16]>", line 1
        f0 = res.fun # Keep track of the previous optimal value
                     ^
    SyntaxError: multiple statements found while compiling a single statement

Also:

=================================== FAILURES ===================================
______________________ TestSLSQP.test_infeasible_initial _______________________
[gw1] linux -- Python 3.9.5 /usr/bin/python3.9
/usr/local/lib/python3.9/dist-packages/scipy-1.11.0.dev0+unknown.unknown-py3.9-linux-i686.egg/scipy/optimize/tests/test_slsqp.py:574: in test_infeasible_initial
    _check_kkt(sol, constraints=cons_u)
        cons_l     = [{'fun': <function TestSLSQP.test_infeasible_initial.<locals>.<lambda> at 0xdf9551d8>, 'type': 'ineq'}]
        cons_u     = [{'fun': <function TestSLSQP.test_infeasible_initial.<locals>.<lambda> at 0xdf955220>, 'type': 'ineq'}]
        cons_ul    = [{'fun': <function TestSLSQP.test_infeasible_initial.<locals>.<lambda> at 0xdf9550b8>, 'type': 'ineq'}, {'fun': <function TestSLSQP.test_infeasible_initial.<locals>.<lambda> at 0xdf955388>, 'type': 'ineq'}]
        f          = <function TestSLSQP.test_infeasible_initial.<locals>.f at 0xdf9d9fa0>
        self       = <scipy.optimize.tests.test_slsqp.TestSLSQP object at 0xe912aeb0>
        sol        =  message: Optimization terminated successfully
 success: True
  status: 0
     fun: 0.9999999999999538
       x: [ 2.3...   eq: []
            ineq: [array([ 2.000e+00])]
          bounds: lb: [ 0.000e+00]
                  ub: [ 0.000e+00]
/usr/local/lib/python3.9/dist-packages/scipy-1.11.0.dev0+unknown.unknown-py3.9-linux-i686.egg/scipy/optimize/tests/test_slsqp.py:84: in _check_kkt
    assert np.all(g_eval >= -1e-14)
E   assert False
E    +  where False = <function all at 0xf2f5a658>(array([-2.30926389e-14]) >= -1e-14)
E    +    where <function all at 0xf2f5a658> = np.all
        DgTmu      = []
        atol       = 1e-06
        bound_funs = []
        bounds     = []
        constraints = [{'fun': <function TestSLSQP.test_infeasible_initial.<locals>.<lambda> at 0xdf955220>, 'type': 'ineq'}]
        g          = <function TestSLSQP.test_infeasible_initial.<locals>.<lambda> at 0xdf955220>
        g_eval     = array([-2.30926389e-14])
        gradf      = array([-1.99999999])
        gs         = [<function TestSLSQP.test_infeasible_initial.<locals>.<lambda> at 0xdf955220>]
        hs         = []
        lams       = []
        mu         = array([1.99999999])
        mus        = [array([1.99999999])]
        res        =  message: Optimization terminated successfully
 success: True
  status: 0
     fun: 0.9999999999999538
       x: [ 2.3...   eq: []
            ineq: [array([ 2.000e+00])]
          bounds: lb: [ 0.000e+00]
                  ub: [ 0.000e+00]
        x          = array([2.30926389e-14])

@Tenavi Tenavi mentioned this pull request Mar 29, 2023
@Tenavi

Tenavi commented Mar 29, 2023

@mdhaber: this PR addresses the CI failures you found. Please let me know if there's anything else I can do.
