Custom dial func causes pgx to hang indefinitely #1413

Closed
enocom opened this issue Dec 5, 2022 · 2 comments

@enocom
Contributor

enocom commented Dec 5, 2022

The Cloud SQL Go Connector works with pgx by providing an implementation of pgconn.DialFunc that creates a TLS 1.3 connection in a seamless way for callers.

The connector wraps the resulting TLS connection in part to support a MySQL health check by adding an implementation of syscall.SyscallConn. The full details are here, but in short, the MySQL driver uses syscall.SyscallConn to do a zero-byte read that checks whether the connection is still alive, and recycles the connection if it is not. Even though the Go standard library advises that reading and writing to the underlying connection will corrupt the TLS session, this works because no reads or writes are actually performed.

Meanwhile, pgx v5 introduces a new NetConn type that uses the syscall.SyscallConn interface to perform non-blocking reads and writes.

When I try to upgrade the Go Connector to pgx v5, though, the reads and writes hang indefinitely, presumably because pgx is trying to use the syscall.SyscallConn interface, which does not work here because the connection is actually encrypted.

The options for fixing this in the Go Connector seem to be:

  1. Drop support for the MySQL health check
  2. Don't use pgx v5
  3. Have pgx add a configuration value for opting out of the non-blocking reads and writes

None of the above are ideal. Are there any other options here?

@jackc
Owner

jackc commented Dec 6, 2022

It is really messy working with non-blocking IO, timeouts, and TLS together. The ultimate solution would be for non-blocking IO to be directly supported by net.Conn and tls.Conn (something like golang/go#15735 or golang/go#36973). I'm not happy with pgx's NetConn -- it's just the only way I could figure out.

But that doesn't solve anything now.

> The Cloud SQL Go Connector works with pgx by providing an implementation of pgconn.DialFunc that creates a TLS 1.3 connection in a seamless way for callers.

Could this be wrapped with something that doesn't implement the syscall.SyscallConn interface? e.g.

		// Tell the driver to use the Cloud SQL Go Connector to create connections
		config.ConnConfig.DialFunc = func(ctx context.Context, _ string, instance string) (net.Conn, error) {
			conn, err := d.Dial(ctx, "project:region:instance")
			if err != nil {
				return nil, err
			}
			return NetConnWrapperThatDoesNotImplementSyscallConn(conn), nil
		}
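
A minimal sketch of what such a wrapper could look like, assuming pgx decides whether to use non-blocking IO through a type assertion on syscall.Conn. The names plainConn, hideSyscallConn, implementsSyscallConn, and checkWrapper are hypothetical and are not part of pgx or the Cloud SQL Go Connector. Embedding the net.Conn interface (rather than the concrete type) promotes only the net.Conn methods, so SyscallConn is no longer visible on the wrapper.

```go
package main

import (
	"net"
	"syscall"
)

// plainConn embeds the net.Conn interface value. Only the methods declared
// by net.Conn are promoted, so any extra method on the concrete type, such
// as SyscallConn, is hidden from type assertions on the wrapper.
type plainConn struct {
	net.Conn
}

// hideSyscallConn (hypothetical helper) wraps c so that the result
// no longer satisfies syscall.Conn.
func hideSyscallConn(c net.Conn) net.Conn {
	return plainConn{Conn: c}
}

// implementsSyscallConn reports whether c satisfies syscall.Conn.
func implementsSyscallConn(c net.Conn) bool {
	_, ok := c.(syscall.Conn)
	return ok
}

// checkWrapper compares a connection type that exposes SyscallConn
// (*net.TCPConn) before and after wrapping. The zero-value TCPConn is used
// only for the type check; nothing is read from or written to it.
func checkWrapper() (before, after bool) {
	var raw net.Conn = &net.TCPConn{}
	return implementsSyscallConn(raw), implementsSyscallConn(hideSyscallConn(raw))
}
```

In the DialFunc above, the return statement would then become something like `return hideSyscallConn(conn), nil`. The trade-off is that any other optional interface implemented by the underlying connection is hidden as well.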

> 3. Have pgx add a configuration value for opting out of the non-blocking reads and writes

This wouldn't be too bad. Non-blocking IO is not supported on Windows, so there is already an entire code path with it disabled. It would just be a bit of plumbing to expose that setting. Or for that matter, the syscall.SyscallConn disabling wrapper would also be a reasonable approach for pgx to recommend if non-blocking IO was a problem.


I know this doesn't help now, but I really think that the Go net and tls packages are the only places this can be solved cleanly and completely correctly.

@enocom
Contributor Author

enocom commented Dec 10, 2022

After thinking about this for a few days, I realized we can isolate the SyscallConn behavior to just MySQL, so we don't need any changes here. Thanks for your help.
