[GSoC] ENH: New least-squares algorithms #5044
Conversation
Force-pushed 4ca0edb to 7f4e942
    J = J.toarray()
elif not issparse(J):
    J = np.atleast_2d(J)
So, this checks sparsity and warns on each iteration. Better to move the checks, warnings, and construction of an appropriate Jacobian out of jac_wrapped. Same for the else branch below.
The warning will appear only once. I thought about it, and this approach seemed the clearest to me. That's basically the reason to introduce jac_wrapped: to handle all the trivia in it.
If the Jacobian is known to be dense, why check issparse(J) repeatedly?
Because it does no harm and takes the least amount of code (most readable) :). But, as I suggested above, let's just remove these conversions and checks altogether (if a user wants to convert to dense, it's trivial to do so in their own code).
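The refactoring being discussed — validate the Jacobian format once, then keep the per-iteration wrapper trivial — could look roughly like this. `make_jac_wrapped` and its signature are hypothetical illustrations, not the PR's actual code:

```python
import numpy as np
from scipy.sparse import issparse, csr_matrix


def make_jac_wrapped(jac, x0):
    """Decide the Jacobian conversion once, outside the iteration loop.

    Hypothetical sketch: one evaluation at x0 fixes how J is
    normalized, so no sparsity checks or warnings run per iteration.
    """
    if issparse(jac(x0)):
        convert = csr_matrix       # keep sparse, in one canonical format
    else:
        convert = np.atleast_2d    # ensure a dense 2-D array

    def jac_wrapped(x):
        return convert(jac(x))

    return jac_wrapped
```

The trade-off against the inline checks is one extra Jacobian evaluation up front, in exchange for a branch-free inner loop.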
Force-pushed bc58791 to 36a0db4
n = x0.size

if diff_step is None:
    epsfcn = np.finfo(float).eps
Nitpick: the call to finfo(float) is repeated at least four times, twice at module level. Better to do it once, store the result, and import it.
Sure, better to import it from ._lsq_common.
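The suggestion can be sketched as a module-level constant. The `EPS` name and the `_lsq_common` home follow the comment above, but the exact layout (and the helper below) are illustrative assumptions:

```python
import numpy as np

# Evaluated once at import time instead of on every call; other
# modules would then do ``from ._lsq_common import EPS``.
EPS = np.finfo(float).eps


def default_diff_step(x):
    """Default relative step for 2-point finite differences (~sqrt(EPS))."""
    return EPS ** 0.5 * np.maximum(1.0, np.abs(x))
```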
Force-pushed 36a0db4 to 789075d
if jac in ['2-point', '3-point']:
    if jac_sparsity is not None:
        if method == 'lm':
            warn("`jac_sparsity` is ignored for method='lm', dense "
I'd make this an error.
Motivation? To avoid heavy computations? I think you might be right; better to stop here.
On the other hand, there are a lot of similar places: converting a sparse Jacobian to dense is also asking for trouble. Maybe leave it to the user to choose the correct options and only warn if something is happening.
LM and sparsity do not play together; this can never be a useful thing.
Generally, with this many mysterious switches, I think it makes sense to weed out obviously wrong things.
Yes, I think you are right! Let's also disallow the sparse->dense conversion, i.e., be more strict and explicit. Agree?
I'm interested to know your opinion.
Agreed.
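The stricter behaviour agreed on here — reject useless combinations up front instead of warning and converting — could be sketched as below. The function and argument names are illustrative, not the final scipy API:

```python
from scipy.sparse import issparse


def check_jac_options(method, jac_sparsity, J):
    """Reject option combinations that can never be useful (a sketch)."""
    if method == 'lm':
        if jac_sparsity is not None:
            # Raising replaces the silent "ignored" warning discussed above.
            raise ValueError("method='lm' does not support `jac_sparsity`.")
        if J is not None and issparse(J):
            # No implicit sparse->dense conversion either.
            raise ValueError("method='lm' works only with dense Jacobians; "
                             "convert explicitly if that is intended.")
```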
return groups


def group_sparse(int m, int n, int [:] indices, int [:] indptr):
Is using int for the indices overflow-safe here? It's unlikely anyone is using matrices this large, but still.
Potentially it might not be, but np.int32 is the type for sparse matrix indices and indptr, so it can be changed only at a much more fundamental level.
I think this link is again relevant: https://github.com/scikit-learn/scikit-learn/wiki/C-integer-types:-the-missing-manual
You should be careful about this. As of SciPy 0.14, sparse matrices use int32 unless the number of nonzero elements exceeds 2^31, at which point int64 is used. So this is probably fine given the application here, but it is possible to get 64-bit ints as indices from a sparse matrix.
Are both indices and indptr affected by that? Should I then cast the arrays to int64?
On the other hand, arrays with more than 2^31 elements are already too big for the problem.
I think the clean solution would be to use Cython's fused types here, but it looks like slicing a huge matrix to obtain a small one brings indices.dtype back to int32, so small matrices can be expected not to have int64 indices.
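The dtype behaviour described in this thread is easy to check directly. A small demonstration (the int32 results assume a SciPy following the 0.14 rule quoted above):

```python
import numpy as np
from scipy.sparse import csr_matrix

# A small matrix: far fewer than 2**31 nonzeros, so SciPy chooses
# 32-bit index arrays for both ``indices`` and ``indptr``.
A = csr_matrix(np.eye(4))
assert A.indices.dtype == np.int32
assert A.indptr.dtype == np.int32

# Slicing down to a small matrix also yields int32 index arrays,
# which is why typing Cython arguments as ``int`` is safe for
# realistically sized problems.
B = A[:2, :2]
assert B.indices.dtype == np.int32
```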
Two random comments:

I've done some refactoring of Jacobian validation code. I think it's better now.

Running

Re validation/wrapping of Jacobian: I still prefer to only do this validation once: nmayorov#1
Force-pushed 3d9370e to 7aabb06
def solve_lsq_trust_region(n, m, uf, s, V, Delta, initial_alpha=None,
                           rtol=0.01, max_iter=10):
    """Solve a trust-region problem arising in least-squares minimization by
The summary lines should be kept < 80 characters total. So for this, maybe something like
"""Use MINPACK method to solve a trust-region least squares problem
blah, blah
"""
Or some such, depending on what the function actually does, which is not clear; "problem" could refer to almost anything.
Where does "< 80 characters total" come from? I can find several methods in scipy which have a multiline "short summary", and it looks really good on the site. And these functions are not actually public.
It says "trust-region problem", which is quite well known and specific in optimization.
It is inherited from the Python docstring convention, https://www.python.org/dev/peps/pep-0257/, in combination with the PEP 8 line-length limitation.
It is hard to write short summaries, but the multiline ones are definitely a style violation. Travis doesn't check that, though.
Agree that we should respect the PEP style recommendations (maybe not always completely). I will try to figure out one-liners.
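For concreteness, a one-line summary in the PEP 257 style applied to the function under discussion (the wording and the stub body are just an illustration):

```python
def solve_lsq_trust_region(n, m, uf, s, V, Delta,
                           initial_alpha=None, rtol=0.01, max_iter=10):
    """Solve a trust-region subproblem for least squares (MINPACK style).

    The summary above stays on a single line under 80 characters, as
    PEP 257 recommends; further detail goes here after a blank line.
    """
```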
I want to rename

About the last example: maybe keep it? Yes, the docstring is long, but you don't need to read it all (if you don't want to). On the other hand, we now have examples covering all the features. Matlab doc pages are very long too. My general plan:
Your plan sounds good :-). Re the last example: it does feel a little out of place. It reads a little more narrative --- which is good!, but more appropriate for a tutorial. So maybe move it over there, or to #5233 (we need to advertise tutorials more!). One little knot to tie is #5044 (comment). It's likely not a serious issue, but it'd be good not to leave holes where reasonable. I have to admit I've no idea. Maybe @ewmoore or @larsmans can weigh in on this?
About huge sparse matrices: I just couldn't test it on my machine; as I said, it is unlikely to create trouble (time / out-of-RAM issues will come first). Updated: I meant I couldn't test it on my machine. Sorry about that.
@@ -0,0 +1,7 @@
"""Module contains last-squares algorithms."""
Typo: last-squares. Also "This module".
Yeah, it's quite hard to believe it'll cause much trouble. FWIW, I'd punt on this. So, it seems the only thing holding up this PR is a docstring example :-)
OK, I keep maintaining that the last docstring example should be moved to the tutorial, but this can be done in a follow-up PR for item 3 of #5044 (comment).
Time to merge this, I'd think. (Further review is always welcome, of course.) Thank you Nikolay, it's a great one.
[GSoC] ENH: New least-squares algorithms
Sorry, Evgeni, that I was so slow making the final changes. But great to see it merged!
Yay!
Now that the curve_fit PR is in, the plan #5044 (comment) is done save item 3 "docs/tutorial" :-).
Hi, Evgeni! Can you explain to me the current status of using IPython Notebooks as examples in the scipy documentation?
Nice to see you back, Nikolay :-). The discussion is going on in #5233: AFAIU, the last thing to agree upon is whether to host them in a separate repository or add them to the main scipy repo. I think the best way to get the ball rolling is to send a WIP PR with your tweaks to the tutorial and links to your notebooks with examples.
fun : callable
    Function which computes the vector of residuals with the signature
    ``fun(x, *args, **kwargs)``, i.e., the minimization proceeds with
    respect to it's first argument. The argument ``x`` passed to this
typo: "it's" -> "its"
Thanks Antony --- would you care to send a PR with the fix, please?
Any corrections to the text would be helpful. Likely there are more mistakes and generally bad usage of English.
Hi! This is an extension of my previous PR #5019.
I moved some commits to the previous PR; now the first new commit is 2710986.
It adds special adjustments to the algorithms which allow solving large problems when the Jacobian matrix is sparse (only matrix-vector products are computed). Sparse Jacobians can also be estimated quickly by finite differencing when each row contains only a few non-zero elements.
UPDATE:
I think that's about everything I wanted to implement in the least_squares function. I would appreciate more active feedback and code review.
Here are two examples of usage:
I see that, conceptually, an example with an interesting use of bounds is missing. Your suggestions in this direction are most welcome.