Performance drop after upgrading 5 => 6 on populate with large volume of data #12865
Comments
Hmm. Object.values() builds a new array containing every value of the object, and that array is then passed to .find(). The problem is that we probably do not use it in a performant way. We only want to determine whether the result has at least one array attribute. If the object has a lot of attributes and the first attribute is an array, we should exit fast; if only the last attribute is an array, we exit slowly. So the runtime of the search is O(n) in the worst case. The third parameter of the .find() callback is the array being traversed, which means the JavaScript runtime has to make the complete array available to .find() even though we do not need it here. So instead of checking each value step by step, Object.values() has to generate and hold the whole array in memory first. That is a potential memory issue. I don't know whether V8 can detect that the full array is not needed for the find operation, and to be honest I am too lazy to look it up, but it would be my first guess as to why this operation is slow. So I would probably change it to

let hasResultArrays = false;
for (const value of Object.values(resultOrder)) {
  if (Array.isArray(value)) {
    hasResultArrays = true;
    break;
  }
}

or

let hasResultArrays = false;
for (const key in resultOrder) {
  if (Array.isArray(resultOrder[key])) {
    hasResultArrays = true;
    break;
  }
}

Not tested. |
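The early-exit idea from the comment above can be sketched as a small standalone helper. This is only an illustration: `hasArrayValue` is a hypothetical name, not a Mongoose function, and the `hasOwnProperty` guard is an addition, since `for...in` also visits inherited enumerable properties while `Object.values()` only looks at own ones.

```javascript
// Hypothetical helper (not part of Mongoose): returns true as soon as one
// own enumerable property of `obj` holds an array, without first building
// an array of all values.
function hasArrayValue(obj) {
  for (const key in obj) {
    // Guard added for this sketch: skip inherited enumerable properties
    // so the result matches what Object.values() would have inspected.
    if (Object.prototype.hasOwnProperty.call(obj, key) && Array.isArray(obj[key])) {
      return true; // early exit on the first array found
    }
  }
  return false;
}

console.log(hasArrayValue({ a: [1, 2], b: 'x', c: 3 })); // true
console.log(hasArrayValue({ a: 1, b: 'x' })); // false
```

Whether this is actually faster than `Object.values(...).find(...)` for a given workload would still need to be measured.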
Yes, I understand. I think you should probably use … Was the previous version not good? |
.some() also receives the array as the third parameter of its callback, so we probably have the same memory issue again. See:
I personally won't invest time in investigating the changes between versions 5 and 6. Also … So how could that even be a solution? But you are free to propose a PR. |
Yes, I understand, of course. I think a little trick can be done with a simple condition.
Something like this?
For .find() vs .some(), you can find a small benchmark here: Thank you! |
Your benchmark is, in my opinion, not testing the issue. The issue is that the array is generated on the fly with Object.values(). Every call to Object.values() builds a complete new Array, and only then can find() or some() run on it. Your benchmark creates an Array once and then calls find() and some() on it, which is of course fast, but it does not take into account that we have to generate the Array on each call. I think this benchmark illustrates the performance issue better: |
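The benchmark linked in the comment above is not preserved in this thread. As a rough sketch of the kind of comparison being described, the following times both approaches while calling `Object.values()` on every iteration, as the library code does. All names (`viaFind`, `viaForIn`, `timeMs`) and the object size are assumptions made for illustration, and the numbers will vary by engine, object shape, and machine.

```javascript
// Hypothetical micro-benchmark; names and sizes are illustrative only.

// Mirrors the line from the issue: builds the full values array on each call.
function viaFind(obj) {
  return Object.values(obj).find(v => Array.isArray(v)) !== undefined;
}

// Loop-with-break variant discussed in the thread.
function viaForIn(obj) {
  for (const key in obj) {
    if (Array.isArray(obj[key])) return true;
  }
  return false;
}

// Runs `fn(obj)` `iterations` times and returns the elapsed milliseconds.
function timeMs(fn, obj, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn(obj);
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// An object whose first property is an array, followed by many scalars.
const big = { first: [1] };
for (let i = 0; i < 10000; i++) big['k' + i] = i;

console.log('Object.values().find():', timeMs(viaFind, big, 200).toFixed(2), 'ms');
console.log('for...in with break:   ', timeMs(viaForIn, big, 200).toFixed(2), 'ms');
```

A micro-benchmark like this only hints at the effect; the populate workload from the issue remains the measurement that matters.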
To your first claim: yes, calculate it only if relevant. But also keep in mind that recursed is set to true inside the function itself to indicate that assignRawDocsToIdStructure called itself. This means it would be calculated on every call where recursed is true. You could make use of the fact that options is an Object and is therefore passed by reference as a function parameter in JavaScript. So you could do

if (sorting && recursed && options.hasArray === undefined) {
  options.hasArray = false;
  for (const key in resultOrder) {
    if (Array.isArray(resultOrder[key])) {
      options.hasArray = true;
      break;
    }
  }
}

and then instead of … And so you truly calculate it only once. |
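To illustrate why storing the flag on a shared object means the scan runs only once, here is a minimal sketch. `hasArrayCached`, `walk`, and the `scans` counter are hypothetical stand-ins; the real assignRawDocsToIdStructure has a different signature and body, and this sketch assumes every call sharing one `options` object inspects the same `resultOrder`.

```javascript
// Hypothetical stand-ins for illustration only; not Mongoose code.
let scans = 0; // counts how often the property scan actually runs

function hasArrayCached(resultOrder, options) {
  // The flag lives on `options`, which every recursive call receives by
  // reference, so only the first call pays for the scan.
  if (options.hasArray === undefined) {
    scans++;
    options.hasArray = false;
    for (const key in resultOrder) {
      if (Array.isArray(resultOrder[key])) {
        options.hasArray = true;
        break;
      }
    }
  }
  return options.hasArray;
}

// Simulates a function that calls itself while passing `options` along.
function walk(depth, resultOrder, options) {
  const hasArray = hasArrayCached(resultOrder, options);
  if (depth > 0) walk(depth - 1, resultOrder, options);
  return hasArray;
}

const options = {};
walk(5, { a: 1, b: [2] }, options);
console.log(scans); // 1
console.log(options.hasArray); // true
```

The trade-off is that the cached value is only valid as long as the shared `options` object is not reused for a different `resultOrder`.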
Yes! You are right, it's better like this! That's perfect! Do you know when this fix could be added? Thank you so much! |
I would prefer that you create a PR and test if the performance really improves for your case. |
Yes, it works perfectly. I was creating the PR. Do I need to do anything else, or will it go into the next build? |
So you can confirm that the performance improved? |
Just tested with your branch, and it works perfectly! |
If you want, you can also write in the PR that you can confirm the performance change. I guess it will then be reviewed, merged, and released as usual. Releases are made about every 1-2 weeks, so you will probably have to wait a little until it is released. |
Prerequisites
Mongoose version
6.8.2
Node.js version
16.17.0
MongoDB server version
4.2.23
Typescript version (if applicable)
No response
Description
Hi ✋
I noticed a very big performance drop when I switched from Mongoose 5 to 6.
This drop only occurs on very large populate calls (bulk of 500 documents, with thousands of items inside each). It was fine in v5.
After digging, it seems that it is caused by this line in
assignRawDocsToIdStructure.js
const hasResultArrays = Object.values(resultOrder).find(o => Array.isArray(o));
If I switch back to the previous condition (from Mongoose 5), it works fine!
Array.isArray(_resultOrder) && Array.isArray(doc) && _resultOrder.length === doc.length
Can I put the old condition back while waiting for a fix?
Thank you for your reply
Steps to Reproduce
Expected Behavior
No response