Skip to content

Commit

Permalink
Blog: Deferrable Constraints in PostgreSQL (#218)
Browse files Browse the repository at this point in the history
* Draft

* Edits

* Edits

* Edits

* Edits to other pages.

* Image

* Edits

* Edits

* Edits to previous articles.

* Get rid of problematic markdown rendering code from #94.

* Edits
  • Loading branch information
emmercm committed Jun 21, 2022
1 parent 7114ff1 commit 6b2be09
Show file tree
Hide file tree
Showing 9 changed files with 533 additions and 12 deletions.
2 changes: 0 additions & 2 deletions index.js
Original file line number Diff line number Diff line change
Expand Up @@ -141,8 +141,6 @@ markdownRenderer.code = (_code, infostring, escaped) => {
language,
ignoreIllegals: true
}).value : highlight.highlightAuto(code).value;
// Fix https://github.com/segmentio/metalsmith-markdown/issues/48
_code = _code.replace(new RegExp(`^[ ]{${_code.search(/\S/)}}`, 'gm'), '');
// v1.1.0
const escapeTest = /[&<>"']/;
const escapeReplace = /[&<>"']/g;
Expand Down
4 changes: 3 additions & 1 deletion src/blog/creating-a-gremlin-playground.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
---

title: Creating a Gremlin Playground
title: Creating a Gremlin Playground in Docker
date: 2022-04-22T23:16:00
permalink: blog/creating-a-gremlin-playground
tags:
- databases
- docker
- gremlin

---
Expand Down
4 changes: 3 additions & 1 deletion src/blog/creating-a-mysql-playground.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
---

title: Creating a MySQL Playground
title: Creating a MySQL Playground in Docker
date: 2022-04-27T23:26:00
permalink: blog/creating-a-mysql-playground
tags:
- databases
- docker
- mysql

---
Expand Down
4 changes: 3 additions & 1 deletion src/blog/creating-a-postgresql-playground.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
---

title: Creating a PostgreSQL Playground
title: Creating a PostgreSQL Playground in Docker
date: 2022-04-27T22:30:00
permalink: blog/creating-a-postgresql-playground
tags:
- databases
- docker
- postgres

---
Expand Down
325 changes: 325 additions & 0 deletions src/blog/deferrable-constraints-in-postgresql.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,325 @@
---

title: Deferrable Constraints in PostgreSQL
date: 2022-06-21T02:06:00
tags:
- databases
- postgres

---

Constraints in PostgreSQL are validated immediately row-by-row by default, which might be confusing when updating multiple values in columns that have a uniqueness constraint.

Consider this contrived scenario of one table with one column that has a unique constraint (and index):

```sql
CREATE TABLE numbers (
number INT,
UNIQUE (number)
);

INSERT INTO numbers
VALUES (1)
, (2);
```

If we try to update every row in the table, incrementing the value by 1, we run into a problem:

```sql
UPDATE numbers
SET number = number + 1;

-- ERROR: duplicate key value violates unique constraint "numbers_number_key"
-- DETAIL: Key (number)=(2) already exists.
```

## Immediate vs. deferred

[Constraints](https://www.postgresql.org/docs/current/ddl-constraints.html) are `NOT DEFERRABLE INITIALLY IMMEDIATE` by default in PostgreSQL.

- `INITIALLY IMMEDIATE` means that the constraint is being validated after _each_ row is updated individually by default.
- `NOT DEFERRABLE` means we _can't_ change this setting in a transaction.

We can see every constraint in our database with this query:

```sql
SELECT ns.nspname AS schema
, class.relname AS "table"
, con.conname AS "constraint"
, con.condeferrable AS "deferrable"
, con.condeferred AS deferred
FROM pg_constraint con
INNER JOIN pg_class class ON class.oid = con.conrelid
INNER JOIN pg_namespace ns ON ns.oid = class.relnamespace
WHERE con.contype IN ('p', 'u')
AND ns.nspname != 'pg_catalog'
ORDER BY 1, 2, 3;
```

And we can see that indeed the unique constraint is `NOT DEFERRABLE` (`deferrable = false`) and `INITIALLY IMMEDIATE` (`deferred = false`):

```text
schema | table | constraint | deferrable | deferred
--------+---------+--------------------+------------+----------
public | numbers | numbers_number_key | f | f
```

When a constraint is `DEFERRED` it is not validated until the transaction commits. A constraint can be `DEFERRED` two different ways:

- The constraint can be created as `INITIALLY DEFERRED` which will set the constraint to `DEFERRED` by default.
- The constraint can be temporarily `DEFERRED` if it is `DEFERRABLE` with one of these statements:

```sql
SET CONSTRAINTS numbers_number_key DEFERRED;

SET CONSTRAINTS ALL DEFERRED;
```

✅ The following types of constraints can be deferred:

- [`PRIMARY KEY`](https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-PRIMARY-KEYS) (but I wouldn't do it, see the performance considerations below)
- [`UNIQUE`](https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-UNIQUE-CONSTRAINTS)
- [`REFERENCES`](https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-FK) (foreign key)
- [`EXCLUDE`](https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-EXCLUSION)

❌ The following types of constraints can't be deferred, which means PostgreSQL deviates from the SQL standard some:

- [`NOT NULL`](https://www.postgresql.org/docs/current/ddl-constraints.html#id-1.5.4.6.6)
- [`CHECK`](https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-CHECK-CONSTRAINTS)

## Changing the constraint

[PostgreSQL v9.4 (2014)](https://www.postgresql.org/docs/9.4/release-9-4.html) added the ability to execute `ALTER TABLE <name> ALTER CONSTRAINT <setting>;`. Let's try changing the existing constraint on our table:

```sql
ALTER TABLE numbers
ALTER CONSTRAINT numbers_number_key DEFERRABLE;

-- ERROR: constraint "numbers_number_key" of relation "numbers" is not a foreign key constraint
```

We got an unhelpful error! That's because buried under `ALTER CONSTRAINT` in the [`ALTER TABLE` documentation](https://www.postgresql.org/docs/14/sql-altertable.html) we can find:

> Currently only foreign key constraints may be altered.
Which means we have to drop the constraint entirely and recreate it, but thankfully we can do that in one statement:

```sql
ALTER TABLE numbers
DROP CONSTRAINT numbers_number_key,
ADD CONSTRAINT numbers_number_key UNIQUE (number) DEFERRABLE INITIALLY DEFERRED;
```

And if we run the above `pg_constraint` query we can see the change take place:

```text
schema | table | constraint | deferrable | deferred
--------+---------+--------------------+------------+----------
public | numbers | numbers_number_key | t | t
```

And if we try that `UPDATE` statement again we can see it now succeeds:

```sql
UPDATE numbers
SET number = number + 1;

SELECT *
FROM numbers;
```

```text
number
--------
2
3
```

## The three constraint settings

Under the [`ALTER TABLE` documentation](https://www.postgresql.org/docs/current/sql-altertable.html) synopsis we can see that the legal syntax for `ALTER TABLE name ALTER CONSTRAINT` is:

> `ALTER CONSTRAINT constraint_name [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]`
This gives us three different combinations of settings we can create constraints with:

1. **`NOT DEFERRABLE [INITIALLY IMMEDIATE]` (default)**

Constraints are validated immediately and this setting can't change per transaction. This is the default for both primary keys and unique indexes.

2. **`DEFERRABLE [INITIALLY IMMEDIATE]`**

```sql
CREATE TABLE numbers (
number INT,
UNIQUE (number) DEFERRABLE INITIALLY IMMEDIATE
);
```

Constraints are validated immediately, but this setting can change per transaction. I would not recommend this setting as I consider per-connection, per-transaction, and per-query settings an anti-pattern that are likely covering up a root issue.

3. **`[DEFERRABLE] INITIALLY DEFERRED`**

```sql
CREATE TABLE numbers (
number INT,
UNIQUE (number) DEFERRABLE INITIALLY DEFERRED
);
```

Constraints are validated at transaction commit time and this setting can change per transaction. This is the setting we used above in our example.

## Use cases

There are quite a few reasons why you might want deferrable constraints, but not all the reasons follow traditional best practices.

### 1. Position / weight / priority columns

One reason to have a unique constraint on a single numeric column like we have above is so those rows can remember a human-sorted order. Imagine a to-do list application where you can drag and drop items, prioritizing them by due date or importance.

Re-ordering of items here is only possible because of deferred constraints:

```sql
CREATE TABLE todo_items (
id SERIAL PRIMARY KEY,
task VARCHAR NOT NULL,
priority INTEGER NOT NULL,
UNIQUE (priority) DEFERRABLE INITIALLY DEFERRED
);

-- Insert two to-do items
INSERT INTO todo_items (task, priority)
VALUES ('Clean the bathroom', 1)
, ('Go grocery shopping', 2);

-- Swap the order of the two to-do items
BEGIN;
UPDATE todo_items SET priority = 2 WHERE task = 'Clean the bathroom';
UPDATE todo_items SET priority = 1 WHERE task = 'Go grocery shopping';
COMMIT;
```

### 2. Creating a circular reference between two tables

I think this is a terrible idea and that you shouldn't do it because of the problems it can cause, but maybe you have a valid use case, or you inherited the situation. Here are some inserts in a transaction that only work because the foreign key constraints are deferred:

```sql
CREATE TABLE manufacturers (
name VARCHAR PRIMARY KEY,
flagship_phone_name VARCHAR NOT NULL
);

CREATE TABLE phones (
name VARCHAR PRIMARY KEY,
manufacturer_name VARCHAR NOT NULL REFERENCES manufacturers (name) DEFERRABLE INITIALLY DEFERRED
);

-- Add a circular foreign key reference

ALTER TABLE manufacturers
ADD CONSTRAINT manufacturers_latest_phone_id_fkey FOREIGN KEY (flagship_phone_name) REFERENCES phones (name) DEFERRABLE INITIALLY DEFERRED;

-- Insert some circularly-linked rows

BEGIN;

INSERT INTO manufacturers (name, flagship_phone_name)
VALUES ('Google', 'Pixel 6 Pro')
, ('Apple', 'iPhone 13 Pro Max')
, ('Samsung', 'Galaxy S22 Ultra');

INSERT INTO phones (manufacturer_name, name)
VALUES ('Google', 'Pixel 6')
, ('Google', 'Pixel 6 Pro')
, ('Apple', 'iPhone 13 Pro')
, ('Apple', 'iPhone 13 Pro Max')
, ('Samsung', 'Galaxy S22')
, ('Samsung', 'Galaxy S22 Ultra');

COMMIT;
```

### 3. Data ingestion or manual data restore

If you have a data file of SQL statements that aren't in an order that satisfies constraints (e.g. a plaintext [`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html) output), it might make sense to defer all constraints in your transaction.

In this case it might make sense to use `DEFERRABLE INITIALLY IMMEDIATE` constraints, but those have the same performance hits as `DEFERRABLE INITIALLY DEFERRED` described below, so you might as well go all or nothing.

Here is an example that only works because the foreign key constraint is deferred:

```sql
CREATE TABLE authors (
id SERIAL PRIMARY KEY,
name VARCHAR NOT NULL
);

CREATE TABLE books (
id SERIAL PRIMARY KEY,
author_id INTEGER NOT NULL REFERENCES authors (id) DEFERRABLE INITIALLY DEFERRED,
title VARCHAR NOT NULL
);

-- Insert some rows from a data file, temporarily ignoring references between tables

BEGIN;

INSERT INTO books
VALUES (1, 1, 'All Summer in a Day')
, (2, 1, 'The Martian Chronicles')
, (3, 2, 'Starship Troopers')
, (4, 2, 'Podkayne of Mars');

INSERT INTO authors
VALUES (1, 'Ray Bradbury')
, (2, 'Robert A. Heinlein');

COMMIT;
```

Just remember to reset the "start" value of your primary key sequences after a data load like that.

### 4. Deleting some rows out of order

Rather than using the potentially dangerous [`REFERENCES ... ON DELETE CASCADE`](https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-FK), you can use deferrable constraints to let you delete rows from a series of tables in a forgiving order. This could be useful when tearing down stubbed data in an integration test.

```sql
CREATE TABLE countries (
iso2 CHAR(2) PRIMARY KEY,
name VARCHAR NOT NULL
);

CREATE TABLE cities (
id SERIAL PRIMARY KEY,
country_iso2 CHAR(2) NOT NULL REFERENCES countries (iso2) DEFERRABLE INITIALLY DEFERRED,
name VARCHAR NOT NULL
);

-- Insert some rows with references

INSERT INTO countries (iso2, name)
VALUES ('IS', 'Iceland')
, ('NZ', 'New Zealand');

INSERT INTO cities (country_iso2, name)
VALUES ('IS', 'Reykjavík')
, ('IS', 'Akureyri')
, ('NZ', 'Christchurch')
, ('NZ', 'Queenstown');

-- Delete some rows in an order that temporarily violates the foreign key constraint

BEGIN;
DELETE FROM countries WHERE iso2 = 'NZ';
DELETE FROM cities WHERE country_iso2 = 'NZ';
COMMIT;
```

## Performance considerations

Deferrable constraints seem great, why aren't they the default? Or, why shouldn't I create all of my constraints as `INITIALLY DEFERRED`? The main reason is:

**Deferrable unique indexes allow a moment in time in which there are duplicate values, reducing some optimizations the query planner can take.**

The query planner can no longer know that the uniqueness constraint is satisfied at absolutely every point in time in a transaction. This can have a negative impact when joining on the unique fields or when doing a `WHERE <unique_field> IN (...)` subquery. Joe Nelson has a more complete explanation in his [blog post](https://begriffs.com/posts/2017-08-27-deferrable-sql-constraints.html#query-planner-performance-penalty).

This [Hashrocket article](https://hashrocket.com/blog/posts/deferring-database-constraints) talks about a unique column used for sorting items, and that's the exact use case I had above which led me down this deep dive. In this case, the position / weight / priority column won't be joined on, so it won't take this optimization hit.
7 changes: 5 additions & 2 deletions src/blog/faking-sequences-in-mysql.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,10 +16,13 @@ MySQL supports auto-incrementing keys, but only one column in a table can have t

## Setup

First, let's start up a MySQL server instance with Docker. For ease of example we'll use the `root` user, but you'll want to think twice about doing this in production.
First, let's start up a MySQL server instance with Docker. We'll use the command from "[Creating a MySQL Playground](/blog/creating-a-mysql-playground)":

```bash
docker run --env MYSQL_ROOT_PASSWORD=password --publish 3306:3306 --detach mysql:8
CONTAINER_ID=$(docker run --env MYSQL_ROOT_PASSWORD=mysecretpassword --detach mysql:8) &&
docker exec "${CONTAINER_ID}" mysqladmin ping --wait &&
docker exec --interactive --tty "${CONTAINER_ID}" mysql --password=mysecretpassword &&
docker rm --force --volumes "${CONTAINER_ID}" > /dev/null
```

Then, let's create our sequences table:
Expand Down
2 changes: 1 addition & 1 deletion src/blog/investigating-locks-in-mysql.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ tags:

---

Locking is an important part of an ACID-compliant database, but excessive locks can lead to performance degradation. Here are some strategies for investigating locks.
Locking is an important part of an ACID-compliant database, but excessive locks can lead to performance degradation. Here are some strategies for investigating locks in MySQL.

A major cause of [long-running queries](/blog/finding-long-running-queries-in-mysql) that might lead to timeouts or other issues are various types of locks. After finding that you have a large number of threads, or threads that are lasting longer than expected, the next thing to investigate is locks.

Expand Down

0 comments on commit 6b2be09

Please sign in to comment.