eachAsync: Cursor already in use #8235
const stream = DbModel.find({
  createdAt: { $lt: new Date() },
})
  .lean()
  .cursor({ batchSize: 1500 });
await stream.eachAsync(myObject => asyncDataProcessing(myObject), { parallel: 5 });
One more thing: sometimes, even when no error is thrown, the script finishes early without having processed all documents. Sometimes it misses a lot (e.g. it was supposed to iterate over 3500 documents but finished after only around 700).
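One way to confirm the early-finish symptom described above is to compare the number of documents the handler actually completed against the count the query should return. The following is a minimal sketch, not part of Mongoose: `withCounter` is a hypothetical helper name, and the commented usage assumes the `DbModel` and `asyncDataProcessing` names from the snippet in this thread.

```javascript
'use strict';

// Hypothetical helper (an assumption, not a Mongoose API): wraps an async
// handler and records how many calls were started and how many completed
// without throwing.
function withCounter(handler) {
  const stats = { started: 0, finished: 0 };
  const wrapped = async (...args) => {
    ++stats.started;
    const result = await handler(...args);
    ++stats.finished; // only reached if the handler did not throw
    return result;
  };
  return { wrapped, stats };
}

// Usage sketch (requires a live connection, so it is left commented out):
//
//   const filter = { createdAt: { $lt: new Date() } };
//   const expected = await DbModel.countDocuments(filter);
//   const { wrapped, stats } = withCounter(asyncDataProcessing);
//   await DbModel.find(filter)
//     .lean()
//     .cursor({ batchSize: 1500 })
//     .eachAsync(wrapped, { parallel: 5 });
//   if (stats.finished !== expected) {
//     console.warn(`Processed ${stats.finished} of ${expected} documents`);
//   }
```

If documents are inserted while the cursor is open, the two numbers can legitimately differ, so the comparison is most meaningful on a collection that is not being written to.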
Backtrace:
It turns out this issue is even more critical. Details of my setup: I have 2 shards with 1 replica and 1 arbiter each. I believe the problems only happen on sharded collections, but I couldn't verify this yet. Anyway, BEWARE: the current version of mongoose / the mongodb driver does not work correctly!
After running some tests I've found that this is definitely an issue with
For reference, this is my test function:
model
  .find()
  .read('sp')
  .batchSize(1000)
  .cursor()
  .eachAsync(async item => console.log(item._id), { parallel: 8 })
Perhaps most interesting is that
I doubt the 8 has too much significance, seeing as the original thread was using 5, but it's incredibly consistent for the data set I'm testing on. Mongoose v5.7.5
@AndrewBarba @simllll thanks for pointing out this issue and providing details on how to repro it. Turns out some recent refactoring to
'use strict';

const mongoose = require('mongoose');
mongoose.set('debug', true);

const Schema = mongoose.Schema;

run().catch(err => console.log(err));

async function run() {
  await mongoose.connect('mongodb://localhost:27017/test', { useNewUrlParser: true, useUnifiedTopology: true });

  const Model = mongoose.model('Test', new Schema({ name: String }));

  // Seed the collection up to numDocs documents if it has fewer
  const count = await Model.countDocuments();
  const numDocs = 50000;
  if (count < numDocs) {
    const docs = [];
    for (let i = count; i < numDocs; ++i) {
      docs.push({ name: `doc${i}` });
    }
    await Model.create(docs);
  }

  console.log('Executing query...');

  // Count how many documents the handler actually sees
  let numProcessed = 0;
  await Model.
    find().
    batchSize(1000).
    cursor().
    eachAsync(async item => {
      console.log(item.name);
      ++numProcessed;
    }, { parallel: 9 });

  console.log('NumProcessed:', numProcessed);
}
Thanks for the quick fix, really appreciate it.
Thanks also from me, great that you found the issue so quickly +1 Looking forward to 5.7.6 :)
I think the original issue is solved, but somehow my executions now take way longer. E.g. I iterate over a big collection without any conditions ({}) and I "only" get around 250 entries/second. With mongoose 5.6 I got around 2000-3000 per second. I tried playing around with batchSize, but it does not seem to make any difference. I'm unsure how to debug this correctly, but could it be that batchSize no longer applies?
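To put numbers on the slowdown described above, one option is to wrap the handler passed to eachAsync and record how many documents complete per second, then repeat the run with different batchSize values or Mongoose versions. This is only a sketch: `createThroughputMeter` and its `now` parameter are hypothetical names introduced here for illustration, not a Mongoose API. It measures how fast the handler completes and does not by itself show whether batchSize reaches the driver.

```javascript
'use strict';

// Hypothetical helper (an assumption, not a Mongoose API): wraps an async
// handler and tracks how many calls completed and how long it took.
// `now` is injectable so the clock can be replaced in tests.
function createThroughputMeter(handler, now = Date.now) {
  let count = 0;
  let startedAt = null;

  const wrapped = async (...args) => {
    if (startedAt === null) startedAt = now(); // start the clock on first call
    const result = await handler(...args);
    ++count;
    return result;
  };

  const report = () => {
    const elapsedMs = startedAt === null ? 0 : now() - startedAt;
    const perSecond = elapsedMs > 0 ? count / (elapsedMs / 1000) : 0;
    return { count, elapsedMs, perSecond };
  };

  return { wrapped, report };
}

// Usage sketch (requires a live connection, so it is left commented out):
//
//   const meter = createThroughputMeter(async item => { /* work */ });
//   await Model.find().batchSize(1000).cursor()
//     .eachAsync(meter.wrapped, { parallel: 8 });
//   console.log(meter.report());
```

Enabling `mongoose.set('debug', true)`, as the repro script in this thread does, may also help check which options are sent along with the query.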
Do you want to request a feature or report a bug?
bug
What is the current behavior?
Since the last update of mongoose I have been seeing a lot of "Cursor already in use" errors in long-running queries where a cursor is used in combination with eachAsync().
If the current behavior is a bug, please provide the steps to reproduce.
Run a long-running query and use eachAsync to iterate over it. It's not 100% reproducible, but it happens quite often.
What is the expected behavior?
No cursor already in use error.
What are the versions of Node.js, Mongoose and MongoDB you are using? Note that "latest" is not a version.