Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(spanner): update PDML to take sessions from pool #2736

Merged
merged 4 commits into from Aug 24, 2020

Conversation

hengfengli
Copy link
Contributor

@hengfengli hengfengli commented Aug 17, 2020

Fixes #1951

The changes to BatchReadOnlyTransaction are reverted because the cleanup of a checked-out session across multiple clients is uncertain and may lead to a session leaking issue.

@hengfengli hengfengli added the api: spanner Issues related to the Spanner API. label Aug 17, 2020
@hengfengli hengfengli self-assigned this Aug 17, 2020
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Aug 17, 2020

// Create session.
s, err = c.sc.createSession(ctx)
sh, err := c.idleSessions.take(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this session ever returned to the pool? Or is there a specific reason why that is not necessary in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BatchReadOnlyTransaction returns a transaction. If we recycle the session inside this function, then the transaction would not work after returning, right?

I thought BatchReadOnlyTransaction.Close() or BatchReadOnlyTransaction.Cleanup() should recycle the session. Do you think I should add it to Close() or Cleanup()? It looks like .Close() is the right place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused about the purpose of Close() and Cleanup().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BatchReadOnlyTransactions are intended to be used across multiple clients. One client should create the transaction, and all other clients should reference this same transaction. Example:

  1. Client 1 creates a BatchReadOnlyTransaction by calling Client.BatchReadOnlyTransaction(ctx context.Context, tb TimestampBound). This step used to create a new session, but should now take one from the session pool of Client 1.
  2. Client 2 references the above transaction by calling Client.BatchReadOnlyTransactionFromID(tid BatchReadOnlyTransactionID). This means that Client 2 will technically be referencing a session from the session pool of Client 1.
  3. Client 1 is the one responsible for the clean up once all clients have finished using the batch transaction. That means that Client 1 should call BatchReadOnlyTransaction.Cleanup, and that method should return the session to the pool.
  4. Client 2 (and any other clients referencing the transaction) should call BatchReadOnlyTransaction.Close. That does not really do anything other than marking the transaction as no longer in use by that client.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also update the comment on Client.BatchReadOnlyTransaction as it specifically states that the method creates a new session.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the detailed explanation! 💯 I also updated the doc in Cleanup(), which was confusing as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to think that we might need to split the two changes in this PR, as the change for BatchReadOnlyTransaction has some more impact than we first thought. Prior to this change, the way to use BatchReadOnlyTransaction was the following:

  1. Create a BatchReadOnlyTransaction on a client and pass the transaction id to any other client that should also use that transaction to read data.
  2. Once a client is done reading data, the client should call BatchReadOnlyTransaction.Close.
  3. Once all clients are done reading data, any client may optionally call BatchReadOnlyTransaction.Cleanup. Failing to do so will not really have any consequences.

With the changes in this PR the behavior changes to:

  1. Create a BatchReadOnlyTransaction on a client and pass the transaction id to any other client that should also use that transaction to read data.
  2. Once a client is done reading data, the client should call BatchReadOnlyTransaction.Close, unless it is the client that created the BatchReadOnlyTransaction.
  3. Once all clients are done reading data, only the client that created the transaction must call BatchReadOnlyTransaction.Cleanup. Failing to do so will cause a session to leak from the pool of the client that created the transaction.

I think the above might also be a good reason to reconsider whether we should make this change at all, and if we do, it should be a separate change so it can be clearly marked in the release notes as a (at least somewhat) breaking change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 2, any client can call BatchReadOnlyTransaction.Close, which code looks like:

// Close marks the txn as closed.
func (t *BatchReadOnlyTransaction) Close() {
t.mu.Lock()
defer t.mu.Unlock()
t.state = txClosed
}

Why do you say ", unless it is the client that created the BatchReadOnlyTransaction"?

For 3, "only the client that created the transaction must call BatchReadOnlyTransaction.Cleanup". I think all clients can call Cleanup, which is very similar to the previous code that I just replaced the DeleteSession request to sh.recycle().

Yes, if it fails to call Cleanup, which would cause a session leak from the pool of the client that created the transaction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how Java handles this issue then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, all clients could call both Cleanup and Close without any problems. The only real change here is that the client that created the transaction should call Cleanup, while that is optional in the current implementation.

(The Java client has not been changed to use a session from the pool yet)
The Java client does not have separate Close and Cleanup methods, instead it only has a Close method. The Close method works very much in the same way as the Cleanup method in Go and always deletes the session that is used, so it should only be called once all clients have finished reading. So for the Java client we will have a similar problem, as the current API allows Close to be called from any client, while if we choose to use a session from the pool it will change to a situation where Close should at least be also be called on the client that created the transaction.

@hengfengli hengfengli changed the title fix(spanner): update PDML and BatchROTxn to take sessions from pool fix(spanner): update PDML to take sessions from pool Aug 23, 2020
@hengfengli
Copy link
Contributor Author

@olavloite Please take a look again. Thanks.

Copy link
Contributor

@olavloite olavloite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hengfengli hengfengli merged commit e658318 into googleapis:master Aug 24, 2020
@hengfengli hengfengli deleted the take-session-from-pool branch August 24, 2020 06:27
tritone pushed a commit to tritone/google-cloud-go that referenced this pull request Aug 25, 2020
* fix(spanner): update PDML to take sessions from pool
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner Issues related to the Spanner API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

spanner: PDML and BatchReadOnlyTransaction create new sessions instead of using one from the session pool
2 participants