
Setting up Replication for a new Customer


One of our major and unique services is a replicated database of all our public data. This page describes how to set this up.

There are five major steps:

  1. Creating the database
  2. Setting up the networking for the database
  3. Setting up replication
  4. Setting up alarms for the replication
  5. Monitoring the replication

Prelude

Back when PACER Santa was sending us tons of content, I went a little crazy trying to make our replicas keep up. They couldn't because subscribers only use a single CPU. In Postgres 16, this is fixed. The release notes state:

Allow parallel application of logical replication

The CREATE SUBSCRIPTION STREAMING option now supports parallel to enable application of large transactions by parallel workers. The number of parallel workers is controlled by the new server variable max_parallel_apply_workers_per_subscription. Wait events LogicalParallelApplyMain, LogicalParallelApplyStateChange, and LogicalApplySendData were also added. Column leader_pid was added to system view pg_stat_subscription to track parallel activity.

We should start using Postgres 16 as a matter of standard practice, and if we have another PACER Santa, now we know the solution. I think.
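
If we do hit this again on a Postgres 16 replica, the relevant knobs are the subscription's streaming option and the new server variable. A sketch, with an illustrative subscription name and worker count (on RDS the variable goes in the instance's parameter group rather than through ALTER SYSTEM):

    -- On the subscriber (Postgres 16+): apply large in-progress transactions
    -- with parallel workers instead of the single apply worker.
    ALTER SUBSCRIPTION some_subscription SET (streaming = parallel);

    -- Parameter group / postgresql.conf setting; the default is 2.
    -- max_parallel_apply_workers_per_subscription = 4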

Create a new AWS server & database

  1. Create an AWS server with 1.2TB of general purpose gp3 storage space and as much RAM and CPU as needed. The recommended setup is an instance running PostgreSQL 16 in the Oregon region (us-west-2).

  2. The server needs a database configured as follows:

    • Single DB instance
    • Instance name (a spice codename; these used to be customer names, but no longer)
    • Master username: postgres (we'll create the django user later)
    • create and use a password
    • instance type: t3.xlarge
    • Storage:
      • 1200 GB gp2 storage, autoscale to 1500.
    • Connections:
      • Don't connect to EC2
      • VPC of vpc-0d1020b9cea79a3e3
      • Subnet group of default-vpc-0d102...
      • publicly accessible
      • Security group of rds-launch-wizard, and replication-sg-2
      • Availability zone of us-west-2b
      • No RDS proxy
      • Certificate authority: default
      • port: 5432
    • Password auth
    • Monitoring:
      • Performance insights on
      • No DevOps Guru
      • Enhanced monitoring 60s.
    • Additional config:
      • dbname: postgres (we'll create the courtlistener DB ourselves)
      • replica-16 parameter group
      • Automated backups of three days (see #1530)
      • Enable encryption
  3. Allow ingress to the server:

    • From your laptop, ping the RDS domain name for the new server to get its public IP.
    • Add that IP to the inbound rules of the security group.

    In AWS this is done through your VPC's security group. The one to edit is: sg-044759e3ba310566a.

  4. Test the connection:

     psql --host replica-for-documentation2.xxx.rds.amazonaws.com --username postgres --dbname postgres
    

Networking

Generally this part goes smoothly. The overall architecture is:

  • Route 53 connecting a domain name to ...
  • Elastic Load Balancer (ELB) connected to...
  • ELB Target Groups, routing data to...
  • An EC2 proxy running...
  • HAProxy, which forwards traffic to...
  • The RDS instance

The trick is to set this all up from the bottom of that list to the top so that it all works at the end. You have to do it that way so each piece can connect to the piece you built before it.

EC2 & HAProxy

This runs on an EC2 micro instance built from a saved AMI. Just launch that AMI.


Once launched, ssh into the instance...

ssh -i .ssh/aws-replica-keys.pem ubuntu@xyz

...and tweak the proxy settings to point to the RDS instance. (see bash history in the AMI for details)
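
For reference, the forwarding rules in haproxy.cfg look roughly like this; the hostname and timeouts are illustrative, and the real config saved in the AMI is authoritative:

    # Illustrative excerpt of /etc/haproxy/haproxy.cfg: pass raw TCP straight
    # through to the customer's RDS instance.
    listen replica_5432
        bind *:5432
        mode tcp
        option tcplog
        timeout connect 10s
        timeout client 1h
        timeout server 1h
        server rds replica-for-documentation2.xxx.rds.amazonaws.com:5432 check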

Restart HAProxy for good measure and check its logs. They should say almost nothing.

Check that you can use psql to get into RDS from the CL server through the webhook proxy:

apk add postgresql-client
psql --host ec2-xxx.compute.amazonaws.com

ELB Target Groups

Create target groups (one per port) and register the EC2 instance as a target in each (a rough CLI equivalent is sketched after this list):

  • Target type: Instance
  • Name $spice-{5432,5433}-tcp
  • Protocol: tcp
  • Port: 5432, 5433
  • free-law-vpc
  • Health check protocol: tcp
  • Select EC2 instance as target
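
If you prefer the CLI, the console steps above amount to roughly the following; the names, VPC ID, ARN, and instance ID are placeholders:

    # Sketch only -- create one target group per port (5432 shown, repeat for 5433).
    aws elbv2 create-target-group \
        --name spice-5432-tcp \
        --protocol TCP \
        --port 5432 \
        --vpc-id vpc-xxxxxxxxxxxxxxxxx \
        --target-type instance \
        --health-check-protocol TCP

    # Register the HAProxy EC2 instance in the new group.
    aws elbv2 register-targets \
        --target-group-arn <target-group-arn-from-above> \
        --targets Id=i-xxxxxxxxxxxxxxxxx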

Elastic Load Balancer

  • Network load balancer
  • Name: {spice}-replica-elb
  • Scheme: Internet-facing
  • free-law-vpc
  • us-west-2d
  • No security groups (it'll give a warning, that's OK)
  • Add listeners from above
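
Again, a rough CLI equivalent, with placeholder subnet and ARNs:

    # Sketch only -- create the network load balancer...
    aws elbv2 create-load-balancer \
        --name spice-replica-elb \
        --type network \
        --scheme internet-facing \
        --subnets subnet-xxxxxxxxxxxxxxxxx

    # ...then add one TCP listener per target group (5432 shown, repeat for 5433).
    aws elbv2 create-listener \
        --load-balancer-arn <elb-arn-from-above> \
        --protocol TCP \
        --port 5432 \
        --default-actions Type=forward,TargetGroupArn=<target-group-arn>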

It may take a moment to provision (note the "Status" field). Even once it's live, it can take a minute, but eventually you'll be able to use psql to connect directly to the DNS name.

Route 53

This part is easy. Just set up an A record as an alias to the ELB. Remember that subdomains aren't private, so use codenames here if needed.

Once it's done, you should be able to psql directly to something like spicey.courtlistener.com!

If you get an error like:

↪ psql -h spice.courtlistener.com
psql: error: could not translate host name "spice.courtlistener.com" to address: No address associated with hostname

That's OK. Just wait a few minutes for DNS to propagate.

Set up Replication

Replication Ingress and Egress

There are two steps. First, allow the customer's server IP address for both inbound and outbound traffic: get their IP from them and add rules for it.

Second, you have to allow inbound traffic from the external IP address of the new client RDS server. This should already be done if you did the process above. You can get that IP address by pinging it from your laptop. Something like:

ping some-rds-server-us-west-2.rds.amazonaws.com

The above is done in the Security Group settings.
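
For reference, adding a single-IP rule from the CLI looks roughly like this; the security group ID and IP are placeholders:

    # Sketch only -- allow the customer's server to reach Postgres on 5432.
    aws ec2 authorize-security-group-ingress \
        --group-id sg-xxxxxxxxxxxxxxxxx \
        --protocol tcp \
        --port 5432 \
        --cidr 203.0.113.45/32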

Set up replication

  1. On the subscriber server, log in as the postgres user and create the django user:

    CREATE USER django CREATEDB NOSUPERUSER;
    \password django
    
  2. Create the latest version of the database schema. Run the following on the maintenance server:

     TABLES=$(
       psql --host $DB_HOST -p 5432 -d courtlistener -U django  --tuples-only -c \
         "SELECT CONCAT(table_schema, '.', table_name) \
          FROM information_schema.tables \
          WHERE table_type='BASE TABLE' AND table_schema IN ('public','auth');" | \
       xargs -I{} echo -n " -t {}"
     )
    

    That makes a nice $TABLES variable. Do echo $TABLES to check it, then:

     pg_dump --host $DB_HOST \
       --username django \
       --create \
       --schema-only \
       --no-privileges \
       --no-publications \
       --no-subscriptions courtlistener \
       $TABLES \
       --password | grep -v -i 'trigger' > cl_schema.some-date.sql
    

    Copy the resulting cl_schema.some-date.sql file to your laptop and open it in an editor. Audit it for references to TRIGGER, plpgsql, GRANT, PUBLICATION, SUBSCRIPTION, EXTENSION, or REPLICA (you should not see any of these). Delete the locale_provider parameter until all DBs are on version 15.

  3. Import the schema into the new server:

      psql --host spicy.courtlistener.com \
           --port=5432 --user django --dbname postgres < cl_schema.2019-02-27.sql
    

    Note that the above connects as the new django user, not the postgres admin user.

  4. Create a publication on cl-replica for the user to subscribe to:

    • Log into the publisher with django user:

      ./manage.py dbshell --database replica

    • List existing publications:

      select * from pg_publication;
      
    • Make a new publication:

      For search, people DB, and disclosures (2023-10-20):

      CREATE PUBLICATION $SPICE_publication FOR TABLE ONLY
          --Audio
          audio_audio, audio_audio_panel, 
          --Disclosures
          disclosures_agreement, disclosures_debt, disclosures_financialdisclosure, disclosures_gift, disclosures_investment, disclosures_noninvestmentincome, disclosures_position, disclosures_reimbursement, disclosures_spouseincome,
          --People
          people_db_abarating, people_db_attorney, people_db_attorneyorganization, people_db_attorneyorganizationassociation, people_db_criminalcomplaint, people_db_criminalcount, people_db_education, people_db_party, people_db_partytype, people_db_person, people_db_person_race, people_db_politicalaffiliation, people_db_position, people_db_race, people_db_role, people_db_school, people_db_source,
          --FJC
          recap_fjcintegrateddatabase,
          --Search
          search_bankruptcyinformation, search_citation, search_claim, search_claim_tags, search_claimhistory, search_court, search_court_appeals_to, search_courthouse, search_docket, search_docket_panel, search_docket_tags, search_docketentry, search_docketentry_tags, search_opinion, search_opinion_joined_by, search_opinioncluster, search_opinioncluster_non_participating_judges, search_opinioncluster_panel, search_opinionscited, search_opinionscitedbyrecapdocument, search_originatingcourtinformation, search_parenthetical, search_parentheticalgroup, search_recapdocument, search_recapdocument_tags,  search_tag
          ;
      

      To create a new list of tables, use \pset pager off, then \dt, and copy/paste the output into a spreadsheet. There are certainly better ways to do this; one is sketched below.
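
      One such better way is to have Postgres build the comma-separated list itself. This is a sketch; it lists every base table in the public and auth schemas, so you still have to prune it down to the tables you actually want to publish:

        SELECT string_agg(table_name, ', ' ORDER BY table_name)
        FROM information_schema.tables
        WHERE table_type = 'BASE TABLE'
          AND table_schema IN ('public', 'auth');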

  5. Create a new user for the subscriber to subscribe with on cl-replica.

    You can see existing roles with:

     SELECT rolname FROM pg_roles;
    

    On cl-replica do:

     CREATE ROLE $SPICEY WITH LOGIN;
     GRANT rds_replication to $SPICEY;
    

    Set the password (note that really long passwords will fail when you try to create the subscription, so be careful about that):

     \password some-user
    

    And grant them select on the needed tables:

     GRANT SELECT on table
         -- ALL OF THE TABLES ABOVE, USE PSQL 
         -- HISTORY BY PRESSING UP
     TO XXXXX;
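
    Since the publication already exists at this point, you can also generate the GRANT statements from it instead of retyping the table list. A sketch, with placeholder publication and role names; psql's \gexec runs the generated statements:

     SELECT format('GRANT SELECT ON TABLE %I.%I TO some_user', schemaname, tablename)
     FROM pg_publication_tables
     WHERE pubname = 'some_publication'
     \gexec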
    
  6. Set up the subscription on the new server:

    • On subscriber, connect to courtlistener DB as postgres user.

    • Run:

      CREATE SUBSCRIPTION $SPICE_subscription CONNECTION 'host=cl-replica.xxx.amazonaws.com port=5432 user=$spice password=xxx dbname=courtlistener' PUBLICATION $SPICE_publication;
      

    Subscriptions can be viewed with:

     select * from pg_subscription;
    

Statistics

Once the data has synced, you'll want to log in as the django user and run VACUUM ANALYZE. This cleans things up and updates the statistics used by the query planner so that it can plan efficient queries for our clients. It takes a few minutes to run. (See #3954 for details.)
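
To check whether the initial sync has finished before vacuuming, you can query pg_subscription_rel on the subscriber; every relation should eventually reach state 'r' (ready). The hostname below is the usual placeholder:

    -- On the subscriber: 'r' = ready, anything else is still syncing.
    SELECT srrelid::regclass AS table_name, srsubstate AS state
    FROM pg_subscription_rel
    ORDER BY srsubstate;

Then the vacuum itself is just:

    psql --host spicy.courtlistener.com --user django --dbname courtlistener \
        -c "VACUUM ANALYZE;"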

Alarms

We need to monitor replication storage, because if it runs out, databases get very sad. To do this, create a new alarm for the new database, then add that alarm to the composite alarm.

  1. To create the new alarm, use this link and put the new instance's name into the instance ID field (a rough CLI equivalent is sketched after this list).

  2. To update the composite alarm, go here and add the alarm you created in step one.
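
The prefilled links do most of the work, but for reference, the alarm boils down to something like the following. The metric, threshold, and SNS topic here are illustrative assumptions, not the exact values the link uses:

    # Sketch only -- alarm when the new instance's free storage drops too low.
    aws cloudwatch put-metric-alarm \
        --alarm-name spice-replica-free-storage \
        --namespace AWS/RDS \
        --metric-name FreeStorageSpace \
        --dimensions Name=DBInstanceIdentifier,Value=spice \
        --statistic Minimum \
        --period 300 \
        --evaluation-periods 3 \
        --threshold 100000000000 \
        --comparison-operator LessThanThreshold \
        --alarm-actions arn:aws:sns:us-west-2:123456789012:replication-alerts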

Monitoring

There are a few tools:

  • You should look at the logs for the publisher and subscriber. These are good-looking logs on a subscriber:

    2023-09-09 19:15:22 UTC::@:[24225]:LOG: logical replication apply worker for subscription "basil_subscription" has started
    2023-09-09 19:15:22 UTC::@:[24226]:LOG: logical replication table synchronization worker for subscription "basil_subscription", table "search_opinioncluster_non_participating_judges" has started
    2023-09-09 19:15:22 UTC::@:[24227]:LOG: logical replication table synchronization worker for subscription "basil_subscription", table "people_db_school" has started
    2023-09-09 19:15:22 UTC::@:[24226]:LOG: logical replication table synchronization worker for subscription "basil_subscription", table "search_opinioncluster_non_participating_judges" has finished
    2023-09-09 19:15:22 UTC::@:[24228]:LOG: logical replication table synchronization worker for subscription "basil_subscription", table "search_opinioncluster_panel" has started
    2023-09-09 19:15:22 UTC::@:[24227]:LOG: logical replication table synchronization worker for subscription "basil_subscription", table "people_db_school" has finished
    2023-09-09 19:15:22 UTC::@:[24229]:LOG: logical replication table synchronization worker for subscription "basil_subscription", table "search_opinion_joined_by" has started
    2023-09-09 19:15:22 UTC::@:[24229]:LOG: logical replication table synchronization worker for subscription "basil_subscription", table "search_opinion_joined_by" has finished
    2023-09-09 19:15:22 UTC::@:[24230]:LOG: logical replication table synchronization worker for subscription "basil_subscription", table "search_docketentry_tags" has started
    2023-09-09 19:15:31 UTC::@:[24228]:LOG: logical replication table synchronization worker for subscription "basil_subscription", table "search_opinioncluster_panel" has finished
    2023-09-09 19:15:31 UTC::@:[24542]:LOG: logical replication table synchronization worker for subscription "basil_subscription", table "search_opinionscited" has started
    2023-09-09 19:15:55 UTC::@:[391]:LOG: checkpoint starting: time
    2023-09-09 19:16:34 UTC::@:[391]:LOG: checkpoint complete: wrote 78074 buffers (15.6%); 0 WAL file(s) added, 29 removed, 33 recycled; write=37.844 s, sync=0.074 s, total=38.743 s; sync files=58, longest=0.030 s, average=0.002 s; distance=2111986 kB, estimate=2111986 kB
    2023-09-09 19:16:37 UTC::@:[391]:LOG: checkpoint starting: wal
    2023-09-09 19:17:30 UTC::@:[391]:LOG: checkpoint complete: wrote 92775 buffers (18.5%); 0 WAL file(s) added, 0 removed, 32 recycled; write=52.285 s, sync=0.097 s, total=52.539 s; sync files=14, longest=0.045 s, average=0.007 s; distance=2152682 kB, estimate=2152682 kB
    2023-09-09 19:17:34 UTC::@:[391]:LOG: checkpoint starting: wal
    2023-09-09 19:18:18 UTC::@:[391]:LOG: checkpoint complete: wrote 108679 buffers (21.7%); 0 WAL file(s) added, 0 removed, 33 recycled; write=43.611 s, sync=0.148 s, total=44.028 s; sync files=14, longest=0.069 s, average=0.011 s; distance=2161386 kB, estimate=2161386 kB
    2023-09-09 19:18:23 UTC::@:[391]:LOG: checkpoint starting: wal
    2023-09-09 19:19:06 UTC::@:[391]:LOG: checkpoint complete: wrote 112159 buffers (22.4%); 0 WAL file(s) added, 0 removed, 33 recycled; write=42.995 s, sync=0.058 s, total=43.228 s; sync files=13, longest=0.032 s, average=0.005 s; distance=2170398 kB, estimate=2170398 kB
    2023-09-09 19:19:10 UTC::@:[391]:LOG: checkpoint starting: wal
    2023-09-09 19:19:52 UTC::@:[391]:LOG: checkpoint complete: wrote 117611 buffers (23.5%); 0 WAL file(s) added, 1 removed, 33 recycled; write=41.823 s, sync=0.099 s, total=42.290 s; sync files=15, longest=0.042 s, average=0.007 s; distance=2160626 kB, estimate=2169420 kB
    2023-09-09 19:19:56 UTC::@:[391]:LOG: checkpoint starting: wal
    2023-09-09 19:20:45 UTC::@:[391]:LOG: checkpoint complete: wrote 114389 buffers (22.9%); 0 WAL file(s) added, 0 removed, 32 recycled; write=49.108 s, sync=0.023 s, total=49.286 s; sync files=15, longest=0.012 s, average=0.002 s; distance=2160740 kB, estimate=2168552 kB
    2023-09-09 19:20:51 UTC::@:[391]:LOG: checkpoint starting: wal
    2023-09-09 19:21:43 UTC::@:[391]:LOG: checkpoint complete: wrote 216523 buffers (43.3%); 0 WAL file(s) added, 0 removed, 33 recycled; write=51.979 s, sync=0.072 s, total=52.248 s; sync files=14, longest=0.032 s, average=0.006 s; distance=2163518 kB, estimate=2168049 kB
    2023-09-09 19:21:48 UTC::@:[391]:LOG: checkpoint starting: wal
    2023-09-09 19:22:37 UTC::@:[391]:LOG: checkpoint complete: wrote 152179 buffers (30.4%); 0 WAL file(s) added, 0 removed, 33 recycled; write=49.360 s, sync=0.072 s, total=49.635 s; sync files=17, longest=0.038 s, average=0.005 s; distance=2160177 kB, estimate=2167262 kB
    

    The publisher is harder to analyze because it publishes to lots of subscribers, but this is a good-looking publisher log:

    2023-09-09 19:23:48 UTC:54.191.250.243(53690):basil@courtlistener:[28109]:STATEMENT: START_REPLICATION SLOT "pg_19905_sync_17216_7264776336224564190" LOGICAL EC21/3F0BB3F0 (proto_version '2', publication_names '"basil_publication"')
    2023-09-09 19:23:48 UTC:54.191.250.243(53698):basil@courtlistener:[28111]:LOG: logical decoding found consistent point at EC21/3F0BBB40
    2023-09-09 19:23:48 UTC:54.191.250.243(53698):basil@courtlistener:[28111]:DETAIL: There are no running transactions.
    
  • We have a webpage that can be helpful: https://www.courtlistener.com/monitoring/replication-lag/

    Generally, the lag shown on that page should be pretty small, or even zero. When you set up a new replica, though, a few new slots will show up, and they'll get backed up pretty quickly while the initial sync runs. Newer Postgres versions use a bunch of slots to sync multiple tables simultaneously, so it's OK to see lots of new slots appear.

  • You can check the disk IO, CPU, and NetworkReceiveThroughput of the publisher and subscriber. All should go up when the replication starts.
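
  • You can also spot-check things directly on the publisher. This query (a sketch) shows how much WAL each replication slot is holding back, which is a rough proxy for the lag the web page reports:

    SELECT slot_name, active,
           pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS retained_wal
    FROM pg_replication_slots
    ORDER BY slot_name;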