Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DBZ-2027 Upgrading Postgres JDBC driver to 42.2.12 #1452

Merged
merged 3 commits into from Apr 29, 2020

Conversation

gunnarmorling
Copy link
Member

@gunnarmorling
Copy link
Member Author

java.util.NoSuchElementException: No value present
	at java.util.Optional.get(Optional.java:135)
	at io.debezium.connector.postgresql.PostgresValueConverter.numericSchema(PostgresValueConverter.java:301)
	at io.debezium.connector.postgresql.PostgresValueConverter.schemaBuilder(PostgresValueConverter.java:192)

That one is interesting, it indicates a behavioral change around NUMERIC columns without length and scale. They used to no scale and length set to -1, whereas now with that driver update length is set to 131089. // CC @davecramer, was that an intentional change? Just curiosity on my side, I think we can handle it.

@gunnarmorling gunnarmorling force-pushed the DBZ-2027 branch 3 times, most recently from ca73ffb to d3038f6 Compare April 28, 2020 13:25
@gunnarmorling
Copy link
Member Author

@jpechane, can you check out this one? Let's see how the next CI run goes, but I think remaining failures are unrelated timing issues in the test suite.

@gunnarmorling
Copy link
Member Author

The change related to scale was introduced by pgjdbc/pgjdbc#1767; see the first commit message in this PR for details.

@gunnarmorling gunnarmorling force-pushed the DBZ-2027 branch 2 times, most recently from 373092d to 7151aab Compare April 28, 2020 14:49
@gunnarmorling
Copy link
Member Author

And another behavioural change in the driver (see pgjdbc/pgjdbc#1708).

@@ -434,7 +435,7 @@ public void shouldGenerateSnapshotsForPartitionedTables() throws Exception {

// verify each topic contains exactly the number of input records
assertEquals(1, topicCounts.get("test_server.public.first_table").intValue());
assertEquals(30, topicCounts.get("test_server.public.partitioned").intValue());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpechane, WDYT about this one? I think it's ok to include this change in a minor; I'd argue it makes more sense that way actually. But it also settles that we shouldn't add this in 1.1.x IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gunnarmorling That's definiteyl breaking compatibility change so no way adding it to 1.1.x. IIUC this now aligns Postres with MySQL and we must warn the user to use ByLogicalTableRouter to get the functionality back.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, hum. So is MySQL indeed creating one topic per partition? Wondering whether it'd not make more sense to write all partitions to a single topic actually; is the information on partitioning relevant for downstream consumers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gunnarmorling I stand corrected. MySQL partitioning is about spreading the table into different physical files. POstgresSQL partitioning is about spreading to different tables. MySQL equivalent is sharding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so what number of topics do we get for a partitioned table in MySQL? Is it the same as for Postgres with the driver update?

Copy link
Contributor

@jpechane jpechane Apr 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gunnarmorling If I understand this change then for n paritions and one aggregate table we get

  1. n tables for PostgreSQL partitioned table
  2. 1 table for MySQL partitioned table
  3. n table for MySQL sharded table

As PostgreSQL partitioning seems to be equivalent to MySQL sharding

@gunnarmorling
Copy link
Member Author

Sigh, something still is broken. @jpechane, should you have any idea regarding the "Unknown type named timestamp without time zone requested" issue in the logs, I'll be all ears :)

@Naros
Copy link
Member

Naros commented Apr 28, 2020

@gunnarmorling I'm wondering if the driver change is now giving us some variant column types that we didn't get previously and perhaps TypeRegistry#get(String) needs to account for this now. What if you changed the specific line:

PostgresType r = nameToType.get(name);

to

PostgresType r = nameToType.get(normalizeTypeName(name));

It looks like that would be backward compatible and might properly resolve the types.

@davecramer
Copy link
Contributor

are you using 42.2.11 or 42.2.12? This makes sense for .11, but not .12

@jpechane
Copy link
Contributor

@davecramer We are using 11 in this PR. Should we use 12 and the issue will be fixed?

@jpechane
Copy link
Contributor

@gunnarmorling I am quite unhappy with the current state of affairs. IMHO with things as they are now we should introduce compatibility matrix in the docs not only for database versions but also for JDBC driver versions.

The driver upgrade mitigates some issues with using this connector with
Postgres on Azure. It comes with some behavioural changes, though:

* column metadata for DECIMAL without scale is returned differently by
the (see pgjdbc/pgjdbc#1767): while it used
to be returned as 0, it's now returned as null. This should be
transparent to DBZ consumers
* snapshots for partitioned tables only export change events on the
partition topics now due to pgjdbc/pgjdbc#1708;
this has an impact on consumers, but I think it's more reasonable than
exporting all change events twice, one partition table and main table
topics
* Using Awaitility so we can use 100ms looping intervals; also it's more concise
* Avoiding creation of one temporary connection
@gunnarmorling
Copy link
Member Author

Should we use 12 and the issue will be fixed?

Yes, we should use 42.2.12; 42.2.11 is borked as per Dave. And in fact I was using 42.2.12, but then went back (temporarily, or so I hoped) to the earlier one to understand when the partition change came in. Just rebased and force-pushed to use 42.2.12 again.

@gunnarmorling
Copy link
Member Author

we should introduce compatibility matrix in the docs not only for database versions but also for JDBC driver versions.

@jpechane, yes, sounds like a good idea to add the driver versions. Can you log an issue?

@jpechane jpechane merged commit 3d0606b into debezium:master Apr 29, 2020
@jpechane
Copy link
Contributor

@gunnarmorling Applied, thanks for the thorough analysis.

@gunnarmorling gunnarmorling deleted the DBZ-2027 branch April 30, 2020 06:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants