Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORACLEDB: Updated handling of connection errors for disposal #2608

Merged
merged 12 commits into from Jun 19, 2019

Conversation

cemremengu
Copy link
Contributor

@cemremengu cemremengu commented May 15, 2018

Solves #2514 and #2603. I tested it on my-machine 😄 I am thinking that simulating this will be tricky and probably take about 15-20 minutes for oracle (or maybe I am exaggerating)

In addition to updating connection error list, we also need to update oracledb to check for connection errors.

Disclaimer: I am not good with vanillajs (prototypes etc.) so it would be great if someone reviews the syntax/approach.

@cemremengu cemremengu changed the title Updated handling of connection errors to dispose connections correctly ORACLEDB: Updated handling of connection errors to dispose connections correctly May 15, 2018
@cemremengu cemremengu changed the title ORACLEDB: Updated handling of connection errors to dispose connections correctly ORACLEDB: Updated handling of connection errors for disposal May 15, 2018
@elhigu
Copy link
Member

elhigu commented May 16, 2018

To me code looks pretty good, I would need to checkout the connection.__knex__disposed setting if that is the thing that should be done there (probably it is correct though). @atiertant how does tihs looks to you?

@cemremengu
Copy link
Contributor Author

cemremengu commented May 16, 2018

One thing I want to add is that the connections are disposed on the knex side but the database is never notified about this due to communication error. The inactive sessions/connections stay on the database side. Not sure if the database eventually kills them though depending on the settings. This flow can be represented as:

Client:
1) grabbed a connection 
2) hit an exception 
3) propagated over and out of the scope of the connection handle from (a) 
4) so it can never give it back - it lost the track of connection 

Is there any way you know that can help this issue or any generally related experience?

@@ -135,13 +135,19 @@ Client_Oracledb.prototype.acquireRawConnection = function() {
if (options.resultSet) {
connection.execute(sql, bindParams || [], options, function(err, result) {
if (err) {
if (Client_Oracle.prototype.isConnectionError(err)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't client.isConnectionError reference the same fonction ?

@atiertant
Copy link
Contributor

@cemremengu oracle database has a dead connection detection mechanism but we should call connection.close before set the connection.__knex__disposed.
@elhigu looks great, but didn't tested it

@cemremengu
Copy link
Contributor Author

@atiertant calling connection.close seems to throw an error though so I am not sure if it achieves anything

 ERROR 8672 nodejs.unhandledRejectionError: NJS-003: invalid connection

@atiertant
Copy link
Contributor

@cemremengu see node-oracledb examples
https://github.com/oracle/node-oracledb/blob/master/examples/select1.js#L73
should ask @cjbj on a node-oracledb issue which errors needs to close the connection and which other need only to remove the connection object from pool ...
can't find this kind of mechanism in the node-oracledb driver pool, does it has the same bug ?

@cemremengu
Copy link
Contributor Author

Created #913

@cemremengu
Copy link
Contributor Author

cemremengu commented May 19, 2018

Added connection.close() to conform what @cjbj stated

Closing the connection is the way to remove it from the connection pool, which is important so the pool slot can be reused by a new valid connection. You should always call conn.close(). What you may want to do is ignore errors on the close.


// If the error is any of these, we'll assume we need to
// mark the connection as failed
isConnectionError(err) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://github.com/oracle/odpi/blob/master/src/dpiError.c#L65 for a bigger error list used by the node-oracledb connection pool. Node-oracledb also uses an OCI call at https://github.com/oracle/odpi/blob/29968fc2085941c40fce8f56a42c9920305d2e29/src/dpiConn.c#L204 which does most of the same checks anyway.

'NJS-040',
'NJS-024',
'NJS-003',
'NJS-024',
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NJS-024 is duplicated a few lines above. And arguably shouldn't be on the list.

@cjbj
Copy link

cjbj commented May 21, 2018

The issue heading mentions oracledb but the files changed are for dialect/oracle, not dialect/oracledb. Is this what you intended?

Also, my comments about connection pooling are for the node-oracledb connection pool, which isn't used in dialects/oracledb

@elhigu
Copy link
Member

elhigu commented May 21, 2018

@cjbj looks like oracledb is actually inherited from oracle, so that might explain the changes in the oracle directory.

@kibertoad
Copy link
Collaborator

@cjbj Does that mean that currently used list of errors that should close connections is invalid in case of not using oracledb connection pool?

@cjbj
Copy link

cjbj commented Jun 5, 2018

@kibertoad the ORA errors that indicate a bad connection are pretty much going to be the same for pooled & non-pooled connections. You could check for the same numbers that ODPI-C uses, see my earlier link.

@kibertoad
Copy link
Collaborator

@cemremengu Could you please provide me write permissions to your repo so that I could fix the PR accordingly to cjbj's comments without forking and resubmitting the pr?

@kibertoad
Copy link
Collaborator

@cjbj Thanks!

@cemremengu
Copy link
Contributor Author

cemremengu commented Jun 6, 2018

@kibertoad Sorry, snowed under work lately. Updated the error list accordingly

Would be great if you test and give feedback on this patch!

@kibertoad
Copy link
Collaborator

@cemremengu looks like there are random failures, can you rerun tests by closing and opening pr?

@cemremengu cemremengu closed this Jun 6, 2018
@cemremengu cemremengu reopened this Jun 6, 2018
@cemremengu
Copy link
Contributor Author

@kibertoad done. Also given you permission in case you want to tweak stuff

@kibertoad
Copy link
Collaborator

awesome, thank you! also tests passed, yay :)

@elhigu
Copy link
Member

elhigu commented Jun 22, 2018

I think this could be tested by mocking/monkeypatching client.isConnectionError method and returning the original implementation after tests.

@kibertoad
Copy link
Collaborator

@elhigu What should such test assert?

@elhigu
Copy link
Member

elhigu commented Jul 2, 2018

@kibertoad that connection state is set to disposed in all of the changed code paths (test should fail before this change for some paths).

@alanchristensen
Copy link

What's the status of this PR?

@kibertoad
Copy link
Collaborator

@alanchristensen I could resolve the merge conflict, but would really appreciate it if someone else wrote the test :)

@pablopen
Copy link
Contributor

Hi, any updates on this?
Looks like it could solve the issues when oracle connections are consuming one pool's connection and failing each request that fall in that connection.

@kibertoad
Copy link
Collaborator

@pablopen Any chance you would volunteer to write at least some test for it?

@pablopen
Copy link
Contributor

pablopen commented Mar 6, 2019

@kibertoad I could try. I'll check elhigu suggestion of mocking the errors.

@kibertoad
Copy link
Collaborator

@pablopen You are my hero :)

@kibertoad kibertoad merged commit 6e5e296 into knex:master Jun 19, 2019
@kibertoad kibertoad deleted the oracle-connection-error branch June 19, 2019 19:26
@elhigu
Copy link
Member

elhigu commented Jun 20, 2019

I would still have preferred having some tests for this, now this is waiting for regression to happen if anyone ever touches that code.

@kibertoad
Copy link
Collaborator

@elhigu In ideal world, yes. However, year passed, and noone did, and I lost all hope to ever get to it myself; and this change is both straightforward enough and useful (as evidenced by already reported problems solved by it).

@elhigu
Copy link
Member

elhigu commented Jun 20, 2019

In this particular case I considered also that benefits were bigger than risks of it causing problems later on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants