[CDCSDK][PG Parity] Dynamic addition of table: Error encountered due to Catalog Version Mismatch: Unable to Open Relation with OID 16913 #22398
shamanthchandra-yb added the `area/cdcsdk` (CDC SDK) and `status/awaiting-triage` (Issue awaiting triage) labels on May 15, 2024
yugabyte-ci added the `kind/bug` (This issue is a bug) and `priority/medium` (Medium priority issue) labels on May 15, 2024
yugabyte-ci changed the title on May 15, 2024 to fix a typo ("Dynamic addition pf table" → "Dynamic addition of table")
yugabyte-ci added the `priority/high` (High Priority) label and removed the `priority/medium` (Medium priority issue) and `status/awaiting-triage` (Issue awaiting triage) labels on May 15, 2024
siddharth2411 added a commit that referenced this issue on May 29, 2024:
Summary: In the Walsender, when the first DML record arrives, we had special startup logic that set `yb_read_time` to the value stored in the `record_id_commit_time` field instead of using the record's commit_time. This causes a problem when there is a restart and the first record comes from a table that was created after stream creation. Consider the following scenario:

1. Create a table named BEFORE.
2. Create a slot.
3. Create a table named AFTER.
4. Wait for the publication refresh to happen.
5. Insert into AFTER. When the Walsender receives this record, `yb_read_time` has been reset from the pub_refresh_time to the consistent_snapshot_time, so the as-of query tries to find table AFTER as of the consistent_snapshot_time and hits the "could not open relation" error.

To fix this, we removed the startup logic and now always use the record's commit_time to perform a cache refresh. Additionally, after the initVirtualWAL RPC call from the Walsender, we set `yb_read_time` to the `record_id_commit_time` field from the cdc_state entry of the replication slot. Also added some debug logs while shipping the RELATION message from the Walsender and while updating `yb_read_time`.

Jira: DB-11300

Test Plan:
Jenkins: test regex: .*ReplicationSlot.*
./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot#testDynamicTableAdditionForTablesCreatedAfterStreamCreation'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot#testDDLWithDynamicTableAddition'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot#testDDLWithRestart'

Reviewers: stiwary, asrinivasan, skumar, sumukh.phalgaonkar
Reviewed By: asrinivasan
Subscribers: yql, ycdcxcluster, stiwary
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D35228
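The read-time behavior the commit describes can be modeled in a few lines. This is a hypothetical Python sketch, not YugabyteDB code: all names (`table_visible`, `old_read_time`, `new_read_time`) and the numeric hybrid times are illustrative. It shows why pinning `yb_read_time` to the consistent snapshot time makes a table created after stream creation invisible to the as-of catalog lookup, while using the record's own commit_time does not.

```python
# Hypothetical model of the read-time selection described above.
# Times are toy hybrid-time values chosen for illustration only.
CONSISTENT_SNAPSHOT_TIME = 100  # time the replication slot was created
TABLE_AFTER_CREATED_AT = 150    # table AFTER created after stream creation

def table_visible(read_time, created_at):
    """An as-of catalog lookup only sees tables created at or before read_time."""
    return read_time >= created_at

def old_read_time(record_commit_time, record_id_commit_time):
    # Old startup logic: ignore the record's commit_time and fall back to
    # record_id_commit_time (the consistent snapshot time after a restart),
    # which is what triggered the "could not open relation" error.
    return record_id_commit_time

def new_read_time(record_commit_time, record_id_commit_time):
    # Fix: always use the record's own commit_time for the cache refresh.
    return record_commit_time

commit_time = 170  # first DML after restart is an insert into table AFTER

# Old logic: reads as of time 100, so table AFTER (created at 150) is missing.
assert not table_visible(old_read_time(commit_time, CONSISTENT_SNAPSHOT_TIME),
                         TABLE_AFTER_CREATED_AT)
# New logic: reads as of time 170, so table AFTER is visible.
assert table_visible(new_read_time(commit_time, CONSISTENT_SNAPSHOT_TIME),
                     TABLE_AFTER_CREATED_AT)
```

The sketch only captures the visibility rule; the actual fix also repositions where `yb_read_time` is initialized (after the initVirtualWAL RPC), which the model does not attempt to show.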
siddharth2411 added a commit that referenced this issue on May 30, 2024:
…che refresh

Summary:

Backport description: Had compilation failures while creating a slot in the Java UTs, since this diff (https://phorge.dev.yugabyte.com/D35189), which renamed the slot-creation method in the Java UTs, has not been backported yet. Fixed it by using the slot-creation method present in 2024.1.

Original description: Original commit: ec7f2ef / D35228. In the Walsender, when the first DML record arrives, we had special startup logic that set `yb_read_time` to the value stored in the `record_id_commit_time` field instead of using the record's commit_time. This causes a problem when there is a restart and the first record comes from a table that was created after stream creation. Consider the following scenario:

1. Create a table named BEFORE.
2. Create a slot.
3. Create a table named AFTER.
4. Wait for the publication refresh to happen.
5. Insert into AFTER. When the Walsender receives this record, `yb_read_time` has been reset from the pub_refresh_time to the consistent_snapshot_time, so the as-of query tries to find table AFTER as of the consistent_snapshot_time and hits the "could not open relation" error.

To fix this, we removed the startup logic and now always use the record's commit_time to perform a cache refresh. Additionally, after the initVirtualWAL RPC call from the Walsender, we set `yb_read_time` to the `record_id_commit_time` field from the cdc_state entry of the replication slot. Also added some debug logs while shipping the RELATION message from the Walsender and while updating `yb_read_time`.

Jira: DB-11300

Test Plan:
Jenkins: test regex: .*ReplicationSlot.*
./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot#testDynamicTableAdditionForTablesCreatedAfterStreamCreation'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot#testDDLWithDynamicTableAddition'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot#testDDLWithRestart'

Reviewers: stiwary, asrinivasan, skumar, sumukh.phalgaonkar
Reviewed By: asrinivasan
Subscribers: stiwary, ycdcxcluster, yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D35393
Jira Link: DB-11300
Description
Please find the stress report in JIRA.
Test steps:
The following error was observed in the connector log:
Source connector version
fourpointfour/ybdb-debezium:0.6
Connector configuration
YugabyteDB version
2.23.0.0-b325
Issue Type
kind/bug