Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(NODE-4385): add cmap pool pausing functionality #3321

Merged
merged 55 commits into from Sep 7, 2022

Conversation

dariakp
Copy link
Contributor

@dariakp dariakp commented Jul 13, 2022

Description

NODE-4385 / NODE-2994

What is changing?

CMAP implementation updates

  • implementation of ready() method
  • implementation of PoolClearedError
    • There is follow up work noted in two tickets:
      • one to propagate the original error, which requires sdam-awareness
      • another to make sure the error is fully retryable, which involves a test from a separate DRIVERS ticket
        • this is partially, and possibly fully, addressed in this PR, but I left the TODO ticket in order to verify with the follow up retryability testing
  • checkout failure is now exclusively handled by processWaitQueue, as is clearing the wait queue
    • This consolidation was done so that pool paused and pool closed checkout errors could be propagated uniformly to all connection checkout events regardless of whether they were already in the wait queue when the clear or close was called or happened afterwards
  • createConnection now always calls the provided callback so that we make sure members on top of the waitqueue awaiting connection establishment don't get left in limbo when the pool is cleared or closed
  • pending connections are now exclusively tracked inside the createConnection method for easier handling
  • clearing a closed pool should be a no-op

SDAM implementation updates

  • Server description transition to data-bearing or direct connection marks pool as ready
  • New ResetPool error label is used to clear the pool on server update, when applicable, in order to synchronize pool clearing logic with server marked unknown (per the spec requirement) - this eliminates potential race conditions where a server with a paused pool is still selectable
  • Removes the resetConnectionPool event emission because the pool reset is now handled synchronously with server reset update
  • Removes duplicate call to resetServer that is already performed by the original failure handler
  • Server/topology errors can be any errors: updated logic to wrap incoming error in MongoError so that we can safely check for labels; the original error is stored in a cause property, consistent with the new cause standard on errors in general (available in Node >16.9, so implemented manually for now, for compatibility reasons)

Test updates

  • CMAP
    • Unskipping all CMAP tests
  • SDAM
    • New prose test
    • Unskipping minPoolsize-error automated (unified) test
  • Unified valid-pass
    • Unskipping command tests that are working now
Is there new documentation needed for these changes?

Possibly, depends on what we currently say about the inner workings of the connection pool

What is the motivation for this change?

Avoiding connection storms

Double check the following

  • Ran npm run check:lint script
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: <type>(NODE-xxxx)<!>: <description>
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket

@dariakp dariakp added the wip label Jul 13, 2022
Base automatically changed from NODE-4415/refactor-cmap to main July 14, 2022 14:13
@dariakp dariakp force-pushed the NODE-4385/implement-pool-pausing branch from 17764c1 to 4ce5cf0 Compare July 20, 2022 21:00
@dariakp dariakp force-pushed the NODE-4385/implement-pool-pausing branch from 1605637 to fe8417f Compare July 27, 2022 21:30
@dariakp dariakp force-pushed the NODE-4385/implement-pool-pausing branch 2 times, most recently from 3e9715e to b696513 Compare August 26, 2022 21:59
@dariakp dariakp force-pushed the NODE-4385/implement-pool-pausing branch from 9334235 to 83d2a21 Compare August 29, 2022 22:23
@dariakp dariakp force-pushed the NODE-4385/implement-pool-pausing branch from 75d48d3 to 0206e84 Compare August 31, 2022 19:39
@dariakp dariakp changed the title feat: add cmap pool pausing functionality feat(NODE-4385): add cmap pool pausing functionality Sep 1, 2022
@dariakp dariakp added Primary Review In Review with primary reviewer, not yet ready for team's eyes and removed wip labels Sep 1, 2022
Copy link
Contributor Author

@dariakp dariakp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to re-review everything here and update the PR description, but it's ready for a primary review

src/cmap/connection_pool.ts Outdated Show resolved Hide resolved
src/sdam/monitor.ts Show resolved Hide resolved
src/sdam/monitor.ts Outdated Show resolved Hide resolved
src/sdam/server.ts Show resolved Hide resolved
src/sdam/server.ts Show resolved Hide resolved
src/sdam/server_description.ts Outdated Show resolved Hide resolved
@dariakp dariakp marked this pull request as ready for review September 2, 2022 21:27
@nbbeeken nbbeeken added Team Review Needs review from team and removed Primary Review In Review with primary reviewer, not yet ready for team's eyes labels Sep 2, 2022
@nbbeeken nbbeeken merged commit 335ee55 into main Sep 7, 2022
@nbbeeken nbbeeken deleted the NODE-4385/implement-pool-pausing branch September 7, 2022 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team Review Needs review from team
Projects
None yet
2 participants