Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calculations done with sparse matrices may yield different results than when done with dense matrices #1401

Closed
cleydyr opened this issue Feb 12, 2019 · 7 comments

Comments

@cleydyr
Copy link

cleydyr commented Feb 12, 2019

Using "mathjs": "^5.4.2"

function fun(horario, carga, disponibilidade) {
  const [d, t] = disponibilidade.size();
  return math.chain(horario)
    .multiply(carga.map(v => math.sign(v)))
    .bitNot()
    .bitOr(math.transpose(disponibilidade))
    .bitNot()
    .sum()
    .done();
}

const H3 = math.matrix([
  [0, 0, 1, 0, 0, 1, 1, 0, 0],
  [0, 1, 0, 0, 1, 0, 0, 0, 1],
  [1, 0, 0, 1, 0, 0, 0, 0, 1],
  [1, 0, 0, 1, 0, 0, 0, 1, 0]
]);

const H3_ = math.matrix([
  [0, 0, 1, 0, 0, 1, 1, 0, 0],
  [0, 1, 0, 0, 1, 0, 0, 0, 1],
  [1, 0, 0, 1, 0, 0, 0, 0, 1],
  [1, 0, 0, 1, 0, 0, 0, 1, 0]
], 'sparse');

const K = math.matrix(
  [  [ 2, 0, 0, 0, 0 ],
     [ 0, 0, 1, 0, 0 ],
     [ 0, 0, 0, 1, 0 ],
     [ 0, 0, 2, 0, 0 ],
     [ 0, 1, 0, 0, 0 ],
     [ 0, 0, 1, 0, 0 ],
     [ 1, 0, 0, 0, 0 ],
     [ 0, 0, 0, 0, 1 ],
     [ 0, 0, 0, 2, 0 ] ],
     'sparse'
);

const D = [
  [1, 1, 1, 0],
  [0, 1, 0, 0],
  [1, 1, 1, 1],
  [1, 1, 1, 1],
  [0, 1, 0, 1]
]

assert.equal(fun(H3, K, D), fun(H3_, K, D))

Results in an assertion error.

@cleydyr cleydyr changed the title Calculations done with sparse matrices may yield different results Calculations done with sparse matrices may yield different results than when done with dense matrices Feb 12, 2019
@josdejong josdejong added the bug label Feb 12, 2019
@josdejong
Copy link
Owner

Thanks for reporting @cleydyr , looks like a bug in SparseMatrix.

I've done a bit of digging already. The following shows something interesting going in when either multiply of two sparse matrices, or in the SparseMatrix.map which is called by bitNot:

const math = require('mathjs')

const a = math.matrix([
  [0, 0, 1, 0, 0, 1, 1, 0, 0],
  [0, 1, 0, 0, 1, 0, 0, 0, 1],
  [1, 0, 0, 1, 0, 0, 0, 0, 1],
  [1, 0, 0, 1, 0, 0, 0, 1, 0]
]);

const b = math.sparse([
  [1, 0, 0, 0, 0],
  [0, 0, 1, 0, 0],
  [0, 0, 0, 1, 0],
  [0, 0, 1, 0, 0],
  [0, 1, 0, 0, 0],
  [0, 0, 1, 0, 0],
  [1, 0, 0, 0, 0],
  [0, 0, 0, 0, 1],
  [0, 0, 0, 1, 0]
])

console.log('m = multiply(a, b)')

const m = math.multiply(a, b)
const m_sparse = math.multiply(math.sparse(a), b)

// notice that the _index property of the two sparce matrices is not equal
// but the result of .toArray is.
console.log('m =', m)
/*
m = Matrix {
  _values: [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ],
  _index: [ 0, 2, 3, 1, 0, 1, 2, 3, 0, 1, 2, 3 ],
  _ptr: [ 0, 3, 4, 8, 11, 12 ],
  _size: [ 4, 5 ],
  _datatype: undefined }
*/
console.log('m_sparse =', m_sparse)
/*
m_sparse = Matrix {
  _values: [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ],
  _index: [ 2, 3, 0, 1, 1, 2, 3, 0, 0, 1, 2, 3 ],
  _ptr: [ 0, 3, 4, 8, 11, 12 ],
  _size: [ 4, 5 ],
  _datatype: undefined }
*/

console.log('m =       ', JSON.stringify(m.toArray()))
console.log('m_sparse =', JSON.stringify(m_sparse.toArray()))
// m =        [[1,0,1,1,0],[0,1,1,1,0],[1,0,1,1,0],[1,0,1,0,1]]
// m_sparse = [[1,0,1,1,0],[0,1,1,1,0],[1,0,1,1,0],[1,0,1,0,1]]

console.log()
console.log('b = bitNot(m)')

const n = math.bitNot(m)
const n_sparse = math.bitNot(m_sparse)

console.log('n =       ', JSON.stringify(n.toArray()))
console.log('n_sparse =', JSON.stringify(n_sparse.toArray()))
// n =        [[-2,-1,-2,-2,-1],[-1,-2,-2,-2,-1],[-2,-1,-2,-2,-1],[-2,-1,-2,-1,-2]]
// n_sparse = [[-2,-1,-2,-2,-1],[-1,-2,-1,-2,-1],[-1,-1,-1,-2,-1],[-1,-1,-1,-1,-2]]

Help in finding the underlying cause would be very welcome!

@josdejong
Copy link
Owner

I've found the issue causing this behavior: the values in a SparseMatrix can be out of order, but the map and forEach methods assumed the values are in order.

Fixed via 69acb4f. @rjbaucells would be great if you can have a look at this fix to see whether I didn't overlook anything here, and whether the solution makes sense.

@josdejong
Copy link
Owner

@cleydyr this issue is fixed now in v5.5.0, can you give that a try?

@rjbaucells
Copy link
Collaborator

@josdejong, if I remember correctly the Compressed column storage format requires the index to be ordered (not 100% sure since it has been a while I do not work with sparse matrices), if this is the case the problem should in in the sparse multiplication that is returning an invalid matrix and not in the actual matrix code. Your fix will fix an invalid matrix but has a high computation penalty since you are rebuilding the matrix, imagine a big matrix...

@josdejong
Copy link
Owner

Thanks for your feedback Rogelio, would be great if you can double check.

When the indexes in a SparseMatrix are guarenteed to beordered, that allows many algorithms to work much more simple and efficient than the "brute force" solution I implemented to fix this issue. That would be great.

The reason I thought the indexes can be unordered is after reading your comments here: #450 (comment).

@rjbaucells
Copy link
Collaborator

@josdejong, yes you are correct, it has been a while since I worked on this implementation. SparseMatrix indices are not guaranteed to be ordered within a given row, the bug is related to filling the missing values in the SparseMatrix with zeros. Since indices are not ordered we cannot safely fill the row values from the index[k] to index[k+1]. Your implementation is correct!

I noticed there is some code to support sparse matrices with the string data type, redefining the zero value and eq operator:

// equal signature to use
let eq = equalScalar
// zero value
let zero = 0
if (isString(matrix._datatype)) {
// find signature that matches (datatype, datatype)
eq = typed.find(equalScalar, [matrix._datatype, matrix._datatype]) || equalScalar
// convert 0 to the same datatype
zero = typed.convert(0, matrix._datatype)
}

I think the zero filling should use the redefined value (replacing 0 with the redefined value zero).

Just consider the following change for performance reasons (not sure if it really has a penalty!):

const value = (i in values) ? values[i] : 0

changed by:

const value = values[i] || zero

This avoids checking the map twice for existing values.

@josdejong
Copy link
Owner

@rjbaucells thanks for double checking! Makes sense.

it is possible that const value = values[i] || 0 would be faster though I think we should do a benchmark then and only change it if there is a faster way. Browser engines optimize these sort of constructs aggressively, and often they can do their work better if you're more explicit in what you expect For example if (someObj === undefined) { ...} can be faster then if (someObj) { ... }. I'm no expert in these kind of optimizations though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants