Initializing one Connection cost ~300MB heap utilization and it reaches ~600+MB when I have 10 connections #16

sssinghsyr · 2018-06-26T22:32:29Z

I was comparing blocking JDBC with this async-pgsql2 API.

Design:
Blocking JDBC - 1000 threads making connection with the DbServer. In this case, heap utilization was <100 MB.

Async-pgsql2 - 1000 async sql-submission using 10 different connections. Heap utilization reaches >600MB.

Any suggestion for the improvement in the design.
I guess, whole pgsql2 will work efficiently when we will re-use connection, do we have ConnectionPool feature available here? Are we going to have?

How can I delete connection? Each connection initialize multiple containers which eats huge heap space.

davecramer · 2018-06-26T22:35:12Z

can you elaborate on reuse connection? I presume you are aware that connections cannot be shared ?

sssinghsyr · 2018-06-26T22:50:47Z

I have Database connector service build using this async-pgsql2 API which connects with the DbServer.
1000 Clients make connections with the connector service and it routes sql to the DbServer.
Having more connections eat lot of heap space.

Connections cannot be shared mean? I can use same connection to submit different type of submission (different sql query).

alexanderkjall · 2018-06-27T07:12:01Z

I think I will need to do a bit of profiling in order to have an informed opinion on how to do this better. Do you have any code to share on how you did your test?

My uninformed opinion is that the ByteBuffer usage in the network communication layer have potential to be optimized, we don't clear and reuse them.

Regarding connection pools, we don't have that available, but it's clear that it's needed in order to use the driver efficiently, same as the normal jdbc driver. But it should maybe be it's own project? Building a connection pool have it's own challenges and I guess that it should be database agnostic

davecramer · 2018-06-27T12:50:12Z

Regarding connection pools, we don't have that available, but it's clear that it's needed in order to use the driver efficiently, same as the normal jdbc driver. But it should maybe be it's own project? Building a connection pool have it's own challenges and I guess that it should be database agnostic.

Wow, this is not a light undertaking. There are many good poolers out there and they are significant projects in their own right. I would advise strongly that this is way out of scope

alexanderkjall · 2018-06-27T12:55:30Z

I totally agree, I tend to insert maybe's when I'm still thinking stuff through, will try to communicate more clearly.

sssinghsyr · 2018-06-27T19:25:47Z

@alexanderkjall
This is my project: async-db-connector.
I am testing with 1000 clients sending queries in parallel. Async-db-connector uses IOmultiplexing to receive all queries with single thread and call async-pgsql2 method multipleRowOperation using newly created CONNECTION.
Problem: Each client query will require new CONNECTION and it will be added into DataSource.connections LinkedQueue. There is no method to remove it from the list and delete its resources once CONNECTION is used. All the connections keep being added and there will be unnecessary iteration of all those.
This restrict me to re-use each connection to limit extra overhead of their creation.

Please correct me with the understanding of CONNECTION.
Also, why I cannot re-use CONNECTION? @davecramer

davecramer · 2018-06-27T22:17:42Z

@sssinghsyr you can re-use connections. What you can't do is use the same connection across threads concurrently

alexanderkjall · 2018-06-29T15:15:09Z

@sssinghsyr Thanks, I'll look into it and run it through a profiler, it sounds to me like there might be a lingering reference somewhere that doesn't get cleaned up when the connection is closed.

sssinghsyr · 2018-07-03T16:56:16Z

@alexanderkjall
Could you please help me to setup this project's test suite? Cannot I run the testcases without docker?
Any setup link will be preferable!

alexanderkjall · 2018-07-03T17:11:27Z

@sssinghsyr how to install docker greatly depends on what OS you are running, but maybe this can be a start: https://docs.docker.com/install/

Regarding the profiler it turns out to be a bit trickier to run visualvm with java 10 than it was with java 8, I haven't had time to set that up yet.

sagenschneider · 2018-08-01T08:22:33Z

I would like to get in #18 to see if this is still an issue. The current thread running 100% creates a ByteBuffer each loop of 1Kb. This can quickly fill the heap space.

Question: are there OutOfMemoryExceptions? if not likely just a growing heap size due to this.

sssinghsyr · 2018-08-09T21:02:52Z

@sagenschneider There was no OutOfMemoryExceptions. I was comparing application memory usage between JDBC and ADBA.

sagenschneider · 2018-08-10T04:33:22Z

@sssinghsyr Yes, likely cause. I'm near finishing the Selector work. It is running for my NioLoop tests. I'm just trying to get it running for remaining tests before merging in to complete #19 PR.

Early look is available here https://github.com/sagenschneider/pgsql2/tree/NetworkLayer/src/main/java/org/postgresql/sql2/communication/network (Note: I'm looking to read/write to ByteBuffers to avoid unnecessary copies of data and object creation).

With Selector and reading from re-used Direct ByteBuffers, it will consume a little more memory than the JDBC driver (though mostly in direct memory space and not heap). This, however, should not be in the 100's of megabytes (and potentially can be shared with HTTP handling ByteBuffers to reduce unused buffers in pools).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initializing one Connection cost ~300MB heap utilization and it reaches ~600+MB when I have 10 connections #16

Initializing one Connection cost ~300MB heap utilization and it reaches ~600+MB when I have 10 connections #16

sssinghsyr commented Jun 26, 2018

davecramer commented Jun 26, 2018

sssinghsyr commented Jun 26, 2018

alexanderkjall commented Jun 27, 2018

davecramer commented Jun 27, 2018

alexanderkjall commented Jun 27, 2018

sssinghsyr commented Jun 27, 2018

davecramer commented Jun 27, 2018

alexanderkjall commented Jun 29, 2018

sssinghsyr commented Jul 3, 2018

alexanderkjall commented Jul 3, 2018

sagenschneider commented Aug 1, 2018

sssinghsyr commented Aug 9, 2018

sagenschneider commented Aug 10, 2018

Initializing one Connection cost ~300MB heap utilization and it reaches ~600+MB when I have 10 connections #16

Initializing one Connection cost ~300MB heap utilization and it reaches ~600+MB when I have 10 connections #16

Comments

sssinghsyr commented Jun 26, 2018

davecramer commented Jun 26, 2018

sssinghsyr commented Jun 26, 2018

alexanderkjall commented Jun 27, 2018

davecramer commented Jun 27, 2018

alexanderkjall commented Jun 27, 2018

sssinghsyr commented Jun 27, 2018

davecramer commented Jun 27, 2018

alexanderkjall commented Jun 29, 2018

sssinghsyr commented Jul 3, 2018

alexanderkjall commented Jul 3, 2018

sagenschneider commented Aug 1, 2018

sssinghsyr commented Aug 9, 2018

sagenschneider commented Aug 10, 2018