Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connector: don't return ErrBadConn when failing to connect #1020

Merged
merged 1 commit into from Nov 8, 2019

Conversation

bouk
Copy link
Contributor

@bouk bouk commented Oct 25, 2019

ErrBadConn should only be returned for an already established connection, not when creating a new one.

This reverts #867

Description

Fixes #1019, see description there.

Checklist

  • Code compiles correctly
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Added myself / the copyright holder to the AUTHORS file

Copy link
Contributor

@tjenkinson tjenkinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a complete revert is the correct thing here because the points in #867 are still valid, and this will change the behaviour for people who are depending on the changes in #867 now.

I understand that you want to get the real error on the initial connection, and it was a concern I had initially in #867 also

One concern with this change is that users might be expecting and handling a net.Error from dial(), which with this PR would only be logged now.

However by reverting that PR any transient errors (including ones that are caused by an attempted query long after sql.Open() succeeded) will no longer be automatically retried by database/sql

I wonder if it's possible to detect if the call in the driver is from the initial database/sql.Open() call? If this is the case maybe we could not return ErrBadConn in this case, but still return it from automated retries

@bouk
Copy link
Contributor Author

bouk commented Oct 26, 2019

The initial Open call does not initiate a database connection, that is only done when a connection is first requested. The built in SQL package does not have a retry mechanism for connecting to the database, it has a retry mechanism for when an existing connection is in a bad state (this is why it's called ErrBadConn). PR #867 just ‘abuses’ this mechanism to retry initial connections.

If you want to have connection retries you could supply your own dialer or wrap the call site where queries are used. This is not the right way to do it. #867 also broke the existing API, so this PR reverting that (and changing the behavior again) should be done.

@tjenkinson
Copy link
Contributor

tjenkinson commented Oct 26, 2019

The initial Open call does not initiate a database connection

Sorry I meant db.Conn

The issue we were having a while back is that new connections to the pool were sometimes failing and the retry fixed it, and given it was classed as a temporary error it would make sense for it to be handled automatically, given it’s possible for anyone to get temporary errors.

My thinking at the time was to use a custom dial function that would retry internally, but I opened the pr thinking it felt more like something that should be the default behaviour, and ErrBadConn appeared to make sense

Where does it mention that ErrBadConn should be for errors during an existing connection only?

This is what I went off in the docs

// ErrBadConn should be returned by a driver to signal to the sql
// package that a driver.Conn is in a bad state (such as the server
// having earlier closed the connection) and the sql package should
// retry on a new connection.
//
// To prevent duplicate operations, ErrBadConn should NOT be returned
// if there's a possibility that the database server might have
// performed the operation. Even if the server sends back an error,
// you shouldn't return ErrBadConn.

@bouk
Copy link
Contributor Author

bouk commented Oct 26, 2019

“ a driver.Conn is in a bad state” implies that the driver.Conn already exists, which it doesn’t in the case of the Open call.

@tjenkinson
Copy link
Contributor

I'm still not sure which is correct here

I see things like this in another driver:

	// Handle any panics during connection initialization.  Note that we
	// specifically do *not* want to use errRecover(), as that would turn any
	// connection errors into ErrBadConns, hiding the real error message from
	// the user.
	defer errRecoverNoErrBadConn(&err)

https://github.com/lib/pq/blob/3427c32cb71afc948325f299f040e53c1dd78979/conn.go#L280-L284

but then I'm confused why database/sql handles the error in Conn() if it shouldn't be returned from there.

Here it is explicitly checking for the error in Conn(): https://github.com/golang/go/blob/843fec1c7d75cac3f76620e79f1680d8f058c501/src/database/sql/sql.go#L1752 which eventually calls driver.Open()

@bouk
Copy link
Contributor Author

bouk commented Oct 26, 2019

Conn handles it, because it can also return a pre-existing connection, it doesn’t need to open a new one. And note that the conn (lowercase c) method can also return ErrBadConn if the lifetime of an already open connection has expired. If the Open method of the driver doesn’t return ErrBadConn, then there’s no way to ever get that error when calling a method on DB. It’s not supposed to leak out to the user of DB.

I agree with the committer from the Postgres driver: lib/pq@04c77ed that returning ErrBadConn is the wrong thing to do.

@tjenkinson
Copy link
Contributor

Right. Yeh I think I’m swaying more towards this being the right approach now.

It would be good to flag somewhere users might start seeing more errors though without registering a custom dial function that includes retries.

@tjenkinson
Copy link
Contributor

If the Open method of the driver doesn’t return ErrBadConn, then there’s no way to ever get that error when calling a method on DB

Right. I wish this was a bit clearer from the code. Maybe Connect could replace it with something else and include a warning or something

tjenkinson added a commit to tjenkinson/go that referenced this pull request Oct 26, 2019
because it seems this error is only meant to be internal, and if `ErrBadConn` is returned from here it can also be returned from the public api

See some confusion about this here: go-sql-driver/mysql#1020

And a fix in the postgres driver here: lib/pq@04c77ed
Copy link
Member

@methane methane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

ErrBadConn should only be returned for an already established
connection, not when creating a new one.
@bouk
Copy link
Contributor Author

bouk commented Nov 8, 2019

@methane fixed the test

@methane
Copy link
Member

methane commented Nov 8, 2019

Oh, I will drop Go 1.9 support. The PR (#1017) is waiting review.

@bouk
Copy link
Contributor Author

bouk commented Nov 8, 2019

It will work either way, so it's fine to merge 🙂

@methane methane merged commit b57978c into go-sql-driver:master Nov 8, 2019
@julienschmidt julienschmidt added this to the v1.5.0 milestone Jan 3, 2020
tz70s pushed a commit to tz70s/mysql that referenced this pull request Sep 5, 2020
…iver#1020)

ErrBadConn should only be returned for an already established
connection, not when creating a new one.
tz70s pushed a commit to tz70s/mysql that referenced this pull request Sep 5, 2020
…iver#1020)

ErrBadConn should only be returned for an already established
connection, not when creating a new one.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failing to connect leads to ErrBadConn
4 participants