Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DBZ-6709 STORE_ONLY_CAPTURED_DATABASES_DDL should default to false #4875

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ani-sha
Copy link
Member

@ani-sha ani-sha commented Sep 20, 2023

@jpechane
Copy link
Contributor

@ani-sha Thanks, it seems the test failures are related.

@ani-sha
Copy link
Member Author

ani-sha commented Sep 21, 2023

@jpechane working on the test!

@ani-sha ani-sha force-pushed the DBZ-6709 branch 2 times, most recently from 5560871 to 72284ad Compare September 27, 2023 11:43
@github-actions
Copy link

Hi @ani-sha, thanks for your contribution. Please prefix the commit message(s) with the DBZ-xxx JIRA issue key.

@ani-sha
Copy link
Member Author

ani-sha commented Sep 29, 2023

@Naros Could you please take a look on why the snapshot tests are failing? I have fixed the remaining tests. Thanks.

@Naros
Copy link
Member

Naros commented Oct 3, 2023

Hi @ani-sha, so this is somewhat of a nasty bug with how we handle default values and resolving of those values with the Configuration and Field classes. Lets look at the differences before/with your PR.

The two test failures before the PR had the following state set:

  • RelationalTableFilters determined that neither configuration was set, used the "else" block.
  • MySqlDatabaseSchema determined storeOnlyCapturedDatabases was set to true.

So essentially the schema changes that were emitted were filtered.

With the PR changes, we see the following:

  • RelationalTableFilters determines that the store.only.captured.databases is set, so uses the "else if" block.
  • MySqlDatabaseschema determined storeOnlyCapturedDatabases was set to true.

When the key is set to a non-default value, the outcome is correct; however, the issue is when the configuration property is set to its default or not provided, there was a deviation in behavior. By aligning the default value for MySQL with that of the super-type, we should be back to a common slate.

I would suggest the following changes:

  1. Modify the SnapshotParallelSourceIT and SnapshotSourceIT tests either based on using the explicit non-default of true or overriding it back to its default of false in these two tests. In either case, the test assertions will need to be adjusted accordingly as the tests took advantage of the fuzzy case described above.
  2. Consider removing the extended field from MySQL entirely and don't exclude the super-type in the MySQL config class so that there is no reference to MySqlConnectorConfig.STORE_ONLY_CAPTURED_DATABASES_DDL.

@jpechane would you concur?

@jpechane
Copy link
Contributor

jpechane commented Oct 4, 2023

Hi, in fact the option 2 makes sense as now the default is the same as the referred field so it is completely suprefluous. I'd also recommend to check description of io.debezium.relational.history.SchemaHistory.STORE_ONLY_CAPTURED_DATABASES_DDL as it seems it is not consistent with default value.

@ani-sha
Copy link
Member Author

ani-sha commented Oct 4, 2023

@jpechane seeing the description of SchemaHistory.STORE_ONLY_CAPTURED_DATABASES_DDL should the description be changed or the default behaviour?

@ani-sha
Copy link
Member Author

ani-sha commented Oct 4, 2023

Why do we have this separate config MySqlConnectorConfig .STORE_ONLY_CAPTURED_DATABASES_DDL in the first place? Since it is using the same behaviour as the SchemaHistory.STORE_ONLY_CAPTURED_DATABASES_DDL? So what was the intention behind this?

@Naros
Copy link
Member

Naros commented Oct 4, 2023

Why do we have this separate config MySqlConnectorConfig .STORE_ONLY_CAPTURED_DATABASES_DDL in the first place? Since it is using the same behaviour as the SchemaHistory.STORE_ONLY_CAPTURED_DATABASES_DDL? So what was the intention behind this?

Hi @ani-sha, the main reason was that the initial implementation of this option was that for non-MYSQL we would always capture all database-related DDL changes by default, but for MYSQL we limited this to only the captured databases likely due to the potential volume of changes that could be captured with MYSQL supporting multiple independent databases by instance install.

Now that we've unified the behavior of MYSQL with the other relational connectors, the MYSQL-specific setting no longer makes sense.

@@ -213,7 +213,7 @@ else if (Objects.equals(previousSnapshotSourceField, "last_in_data_collection"))
.isEqualTo(schemaEventsCount);
assertThat(schemaChanges.ddlRecordsForDatabaseOrEmpty("").size()
+ schemaChanges.ddlRecordsForDatabaseOrEmpty(OTHER_DATABASE.getDatabaseName()).size())
.isEqualTo(useGlobalLock ? 1 : 5);
.isEqualTo(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ani-sha Have you tried to investigate why the change is necessary? Why the test is not behaving in the same way as before?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No @jpechane, I haven't checked that. Let me try it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants