Skip to content

mtarnovan/ruby_pg_cdc

Repository files navigation

Ruby Postgres CDC (Change data capture)

An experiment in implementing a CDC system using Postgres' replication protocol.

You need a WAL decoder plugin installed on the server, the defaults use wal2json.

Usage

The PgReplicationClient class can be passed a config (see below for defaults) and a callback which will be called with each received message of type XLogData. The message will be a Decoder::XLogData struct, see that module and the Postgres docs for details.

The default config is:

  {
    host: 'localhost',
    port: 5432,
    user: 'postgres',
    password: nil,
    dbname: 'postgres',
    slotname: 'pg_logical_test', # replication slot name
    status_interval: 10, # interval to sent status updates to server, in seconds
    plugin: 'wal2json', # server-side WAL decoder plugin
    plugin_opts: %q(("include-types" 'false', "pretty-print" 'true')),
    create_slot: true # create slot on startup
  }

Run ruby test.rb then make some queries (inserts, deletes etc.) on the specified database.

Example:

From a psql console:

$psql postgres
psql (10.5)
Type "help" for help.
postgres=# \d foo
                                      Table "public.foo"
 Column |            Type             | Collation | Nullable |            Default
--------+-----------------------------+-----------+----------+--------------------------------
 a      | integer                     |           | not null | nextval('foo_a_seq'::regclass)
 b      | character varying(30)       |           |          |
 c      | timestamp without time zone |           | not null |
Indexes:
    "foo_pkey" PRIMARY KEY, btree (a, c)

postgres=# insert into foo(b, c) values('blabla', now());
INSERT 0 1
postgres=# delete from foo;
DELETE 1
$ ruby test.rb
I, [2018-10-03T18:09:49.774895 #75649]  INFO -- : host=localhost dbname=postgres port=5432 user=postgres slotname=pg_logical_test status_interval=10 (seconds)
W, [2018-10-03T18:09:49.779846 #75649]  WARN -- : pg-logical: tried to create replication slot pg_logical_test, but it already exists
Received: {
  "change": [
    {
      "kind": "insert",
      "schema": "public",
      "table": "foo",
      "columnnames": [
        "a",
        "b",
        "c"
      ],
      "columnvalues": [
        1,
        "blabla",
        "2018-10-03 18:09:58.859627"
      ]
    }
  ]
}
Received: {
  "change": [
    {
      "kind": "delete",
      "schema": "public",
      "table": "foo",
      "oldkeys": {
        "keynames": [
          "a",
          "c"
        ],
        "keyvalues": [
          1,
          "2018-10-03 18:09:58.859627"
        ]
      }
    }
  ]
}

How it works

After connecting we create a replication slot (if config.create_slot is true), then execute IDENTIFY_SYSTEM to get the current XLog position and start replication from that XLog position. After starting replication we add a Concurrent::TimerTask that will periodically send Standby status update packets to the server (every config.status_interval), so we keep the connection alive.

Then we monitor the socket underlying the connection object and when we receive something we decode it and if it's a XLogData packet we call the callback with it. If it's a Keepalive with the requires_reply we reply immediatly (from the main thread).

To track the current lsn, which is required when replying, we keep a ReplicationState object and update it as needed each time we receive something from the server.

Set the log level to Logger::DEBUG for a more verbose output, including keepalives and output from an observer of the Status update task (StatusUpdateObserver).

Releases

No releases published

Packages

No packages published

Languages