Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: sparse: Fix LIL full-matrix assignment #18211

Merged
merged 3 commits into from
Apr 4, 2023

Conversation

perimosocordiae
Copy link
Member

Reference issue

Fixes gh-18202.

What does this implement/fix?

Previously, LIL's __setitem__ would dispatch through IndexMixin.__setitem__, which doesn't currently handle slice-based keys in an efficient way: it converts slices to index arrays. This means that the eventual call to LIL's overload of _set_arrayXarray_sparse would not be able to detect the full-matrix assignment, and would thus densify the sparse matrix every time.

Until we improve the IndexMixin side of things, this solves the issue by moving the fast-path detection higher up in the call stack, before the slices are converted.

@perimosocordiae perimosocordiae added scipy.sparse defect A clear bug or issue that prevents SciPy from being installed or used as expected labels Mar 29, 2023
@tylerjereddy tylerjereddy added the backport-candidate This fix should be ported by a maintainer to previous SciPy versions. label Mar 29, 2023
# Fast path for full-matrix sparse assignment.
if (isinstance(row, slice) and isinstance(col, slice) and
row == slice(None) and col == slice(None) and
isspmatrix(x)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case it helps, this patch seems to allow the full suite to pass for me locally:

--- a/scipy/sparse/_lil.py
+++ b/scipy/sparse/_lil.py
@@ -322,7 +322,7 @@ class lil_matrix(spmatrix, IndexMixin):
             # Fast path for full-matrix sparse assignment.
             if (isinstance(row, slice) and isinstance(col, slice) and
                     row == slice(None) and col == slice(None) and
-                    isspmatrix(x)):
+                    isspmatrix(x) and self.shape == x.shape):
                 x = self._lil_container(x, dtype=self.dtype)
                 self.rows = x.rows
                 self.data = x.data

I confirmed that the original reproducer gives MemoryError on main, while this modification still allows it to be avoided. I don't know if there are "broadcasting" fast path considerations you'd like to make though vs. a hard shape match requirement, though the hard assignment of the data to self seems to imply a pretty strict requirement for shape similarity I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, very good point. This sorts the issue for now, and we can deal with broadcasting later.

@tylerjereddy tylerjereddy added this to the 1.11.0 milestone Apr 4, 2023
@tylerjereddy tylerjereddy merged commit e168cff into scipy:main Apr 4, 2023
@tylerjereddy
Copy link
Contributor

Ok, CI was passing and I believe the regression test/reproducer in the original issue was too slow to be practical when I tried it locally.

@tylerjereddy
Copy link
Contributor

I suppose we could follow up with an xslow test, though prio on that may be low.

doronbehar added a commit to doronbehar/scipy that referenced this pull request Apr 8, 2023
* main: (51 commits)
  ENH: stats.ttest_ind: add degrees of freedom and confidence interval (scipy#18210)
  DOC fix link in release notes v1.7 (scipy#18258)
  BLD: fix missing build dependency on cython signature .txt files
  DOC: Clarify executable tutorials and MyST/Jupytext
  ENH: stats: Add relativistic Breit-Wigner Distribution (scipy#17505)
  MAINT: setup.sh as executable file
  DOC: remove content related to `setup.py` usage from the docs (scipy#18245)
  DOC: orthogonal_procrustes fix date of reference paper and DOI (scipy#18251)
  BLD: implement version check for minimum Cython version
  ci: touch up wheel build action
  DOC: link to info about Gitpod.
  ENH: ndimage.gaussian_filter: add `axes` keyword-only argument (scipy#18016)
  BUG: sparse: Fix LIL full-matrix assignment (scipy#18211)
  STY: Remove unnecessary semicolon
  DOC: Pin docutils and filter additional sphinx warnings
  BUG: vq.kmeans() compares signed diff to a threshold. (scipy#8727)
  MAINT: stats: consistently return NumPy numbers (scipy#18217)
  TST: stats.dunnett: fix seed and filter warnings in test_shapes
  DOC: add info to use codespaces.
  MAINT: add devcontainer configuration
  ...
@tylerjereddy tylerjereddy modified the milestones: 1.11.0, 1.10.2 Apr 16, 2023
tylerjereddy added a commit to tylerjereddy/scipy that referenced this pull request Apr 16, 2023
* also indentation fixes and add required shape check

Co-authored-by: Tyler Reddy <tyler.je.reddy@gmail.com>
@tylerjereddy tylerjereddy removed the backport-candidate This fix should be ported by a maintainer to previous SciPy versions. label Jun 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.sparse
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: MemoryError when using "+=" operator with scipy.sparse.lil_matrix
2 participants