Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(NODE-6010): Avoid triple-emitting close events #4031

Closed
wants to merge 1 commit into from

Conversation

vkarpov15
Copy link
Contributor

Description

If you kill and then restart the underlying replica set in the following script, you'll see that the close event is emitted 3 times:

'use strict';
  
const { MongoClient } = require('mongodb');

void async function main() {
  const uri = 'mongodb://127.0.0.1:27017,127.0.0.1:27018/mongoose_test';
  const client = new MongoClient(uri, { useNewUrlParser: true, useUnifiedTopology: true });
  await client.connect();

  const db = client.db();
  const collection = db.collection('people');

  let resumeAfter = null;
  let changeStream = collection.watch();
  addChangeStreamListeners(changeStream);

  function addChangeStreamListeners(changeStream) {
    changeStream.on('change', data => console.log(new Date(), data));
    changeStream.on('resumeTokenChanged', data => {
      console.log(new Date(), 'resumeTokenChanged', data);
      resumeAfter = data._data;
    });
    changeStream.on('error', data => console.log(new Date(), 'error', data));
    changeStream.on('close', async data => {
      console.log(new Date(), 'close', data, new Error().stack);
      changeStream = collection.watch([], { resumeAfter });
      addChangeStreamListeners(changeStream);
    });
  }

  while (true) {
    await new Promise(resolve => setTimeout(resolve, 8000));
    // Insert a doc, will trigger the change stream handler above
    console.log(new Date(), 'Inserting doc');
    await collection.insertOne({ name: 'Axl Rose' });
    console.log(new Date(), 'Inserted doc');
  }
}();

First time because abstract cursor next() cleans up the cursor:

if (cursor.isDead) {
// if the cursor is dead, we clean it up
// cleanupCursorAsync should never throw, but if it does it indicates a bug in the driver
// and we should surface the error
await cleanupCursor(cursor, {});
return null;
}
, which emits 'close' here:
cursor.emit(AbstractCursor.CLOSE);

2nd time because ReadableCursorStream calls destroy():

return this.destroy(err);
, which calls close() on the already closed cursor:
this._cursor.close().then(

3rd time because destroy triggers an 'error' event for reasons that I haven't quite been able to figure out

This PR makes it so that only 1 'close' event is emitted, and it looks like all it does is prevent an extra unnecessary close() call?

What is changing?

Is there new documentation needed for these changes?

What is the motivation for this change?

Release Highlight

Fill in title or leave empty for no highlight

Double check the following

  • Ran npm run check:lint script
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: type(NODE-xxxx)[!]: description
    • Example: feat(NODE-1234)!: rewriting everything in coffeescript
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket

@baileympearson baileympearson self-assigned this Mar 12, 2024
@baileympearson
Copy link
Contributor

Hey @vkarpov15 - is there a specific bug you're trying to fix here or a specific issue this is addressing? It seems like you're only touching cursor behavior in this PR, but the reproduction uses change streams so I'm curious about the actual bug reported.

I haven't been able to reproduce this behavior when testing locally. How are you restarting the cluster when testing?

3rd time because destroy triggers an 'error' event for reasons that I haven't quite been able to figure out

Our cursor stream is a Nodejs stream, which emits an error on destroy(): https://nodejs.org/api/stream.html#writabledestroyerror

@vkarpov15
Copy link
Contributor Author

is there a specific bug you're trying to fix here or a specific issue this is addressing? Node.js readable stream docs say that "the close event indicates that no more events will be emitted, and no further computation will occur". Emitting further events after emitting "close" is incorrect and can cause issues when using libs that integrate with readable streams.

It seems like you're only touching cursor behavior in this PR, but the reproduction uses change streams so I'm curious about the actual bug reported. the root cause of the issue seems to be that abstract cursor calls cleanupCursor twice: first time because next() calls cleanupCursor() if the cursor is dead, and 2nd time because the next() callback calls destroy() even if cleanupCursor() was already called. This seems unnecessary. I just noticed this issue when working with change streams, but it seems to affect all cursors.

I haven't been able to reproduce this behavior when testing locally. How are you restarting the cluster when testing? I used run-rs to start a replica set using the following command:

run-rs --mongod ~/Workspace/libs/mongodb-linux-x86_64-enterprise-ubuntu2004-5.0.7/bin/mongod

Then killed the whole replica set with Ctrl+C, waited a bit, and restarted the replica set using the same command. Below is the output.

$ node gh-14419-2.js 
(node:31632) [MONGODB DRIVER] Warning: useNewUrlParser is a deprecated option: useNewUrlParser has no effect since Node.js Driver version 4.0.0 and will be removed in the next major version
(Use `node --trace-warnings ...` to show where the warning was created)
(node:31632) [MONGODB DRIVER] Warning: useUnifiedTopology is a deprecated option: useUnifiedTopology has no effect since Node.js Driver version 4.0.0 and will be removed in the next major version
2024-03-22T15:18:40.430Z resumeTokenChanged { _data: '8265FDA14D000000082B0229296E04' }
2024-03-22T15:18:41.433Z resumeTokenChanged { _data: '8265FDA14D000000082B0229296E04' }
2024-03-22T15:18:42.437Z resumeTokenChanged { _data: '8265FDA14D000000082B0229296E04' }
2024-03-22T15:18:43.440Z resumeTokenChanged { _data: '8265FDA14D000000082B0229296E04' }
2024-03-22T15:18:44.444Z resumeTokenChanged { _data: '8265FDA14D000000082B0229296E04' }
2024-03-22T15:18:45.447Z resumeTokenChanged { _data: '8265FDA14D000000082B0229296E04' }
2024-03-22T15:18:46.450Z resumeTokenChanged { _data: '8265FDA14D000000082B0229296E04' }
2024-03-22T15:18:47.453Z resumeTokenChanged { _data: '8265FDA14D000000082B0229296E04' }
2024-03-22T15:18:48.405Z Inserting doc
2024-03-22T15:18:48.448Z Inserted doc
2024-03-22T15:18:48.448Z resumeTokenChanged {
  _data: '8265FDA158000000022B022C0100296E5A1004C276126FBC214B50942DB962A3FB6B6A46645F6964006465FDA158CAE998631883171D0004'
}
2024-03-22T15:18:48.449Z {
  _id: {
    _data: '8265FDA158000000022B022C0100296E5A1004C276126FBC214B50942DB962A3FB6B6A46645F6964006465FDA158CAE998631883171D0004'
  },
  operationType: 'insert',
  clusterTime: new Timestamp({ t: 1711120728, i: 2 }),
  fullDocument: { _id: new ObjectId('65fda158cae998631883171d'), name: 'Axl Rose' },
  ns: { db: 'mongoose_test', coll: 'people' },
  documentKey: { _id: new ObjectId('65fda158cae998631883171d') }
}
2024-03-22T15:18:49.451Z resumeTokenChanged {
  _data: '8265FDA158000000022B022C0100296E5A1004C276126FBC214B50942DB962A3FB6B6A46645F6964006465FDA158CAE998631883171D0004'
}
2024-03-22T15:18:50.454Z resumeTokenChanged {
  _data: '8265FDA158000000022B022C0100296E5A1004C276126FBC214B50942DB962A3FB6B6A46645F6964006465FDA158CAE998631883171D0004'
}
2024-03-22T15:18:51.456Z resumeTokenChanged {
  _data: '8265FDA158000000022B022C0100296E5A1004C276126FBC214B50942DB962A3FB6B6A46645F6964006465FDA158CAE998631883171D0004'
}
2024-03-22T15:18:56.448Z Inserting doc
2024-03-22T15:19:00.537Z close undefined Error
    at ChangeStream.<anonymous> (/troubleshoot-mongoose/gh-14419-2.js:25:46)
    at ChangeStream.emit (node:events:514:28)
    at ChangeStreamCursor.<anonymous> (/troubleshoot-mongoose/node_modules/mongodb/lib/change_stream.js:267:52)
    at ChangeStreamCursor.emit (node:events:514:28)
    at completeCleanup (/troubleshoot-mongoose/node_modules/mongodb/lib/cursor/abstract_cursor.js:613:28)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async cleanupCursor (/troubleshoot-mongoose/node_modules/mongodb/lib/cursor/abstract_cursor.js:632:9)
    at async next (/troubleshoot-mongoose/node_modules/mongodb/lib/cursor/abstract_cursor.js:555:13)
2024-03-22T15:19:00.538Z close undefined Error
    at ChangeStream.<anonymous> (/troubleshoot-mongoose/gh-14419-2.js:25:46)
    at ChangeStream.emit (node:events:514:28)
    at ChangeStreamCursor.<anonymous> (/troubleshoot-mongoose/node_modules/mongodb/lib/change_stream.js:267:52)
    at ChangeStreamCursor.emit (node:events:514:28)
    at completeCleanup (/troubleshoot-mongoose/node_modules/mongodb/lib/cursor/abstract_cursor.js:613:28)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2024-03-22T15:19:00.539Z close undefined Error
    at ChangeStream.<anonymous> (/troubleshoot-mongoose/gh-14419-2.js:25:46)
    at ChangeStream.emit (node:events:514:28)
    at ChangeStreamCursor.<anonymous> (/troubleshoot-mongoose/node_modules/mongodb/lib/change_stream.js:267:52)
    at ChangeStreamCursor.emit (node:events:514:28)
    at completeCleanup (/troubleshoot-mongoose/node_modules/mongodb/lib/cursor/abstract_cursor.js:613:28)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2024-03-22T15:19:12.217Z resumeTokenChanged { _data: '8265FDA165000000012B0229296E04' }
2024-03-22T15:19:12.218Z resumeTokenChanged { _data: '8265FDA165000000012B0229296E04' }

@nbbeeken nbbeeken changed the title fix(cursor): Avoid triple-emitting close events fix(NODE-6010): Avoid triple-emitting close events Apr 4, 2024
@W-A-James
Copy link
Contributor

Hi @vkarpov15, apologies for the delay in response. Thanks for reporting this bug. I've reproduced the results you reported and have run your patch against our CI. While your patch does fix the immediate issue you reported, it does lead to a number of other test failures as we have certain functionality that depends on how our change streams emit close events.
Fixing this issue will take a more concerted effort to refactor our cursors which we're not able to undertake at this time. I'll update the ticket (NODE-6010) with the details you've provided here and we'll keep track of it for future consideration.

@W-A-James W-A-James closed this Apr 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants