# The Expand and Contract Pattern for Zero-Downtime Migrations
You’re doing a rolling deployment. Half your instances are running v1, half are running v2. Then v2 runs a migration that renames a column. Suddenly v1 instances start panicking, your phone blows up with PagerDuty alarms for the service you just deployed, and downstream services start failing, because the column v1 expects to exist is gone.
The expand and contract pattern prevents this.
## The Problem
Traditional migrations assume all application instances update simultaneously. That’s fine for recreate deployments with downtime, but breaks badly with rolling updates where both versions run concurrently.
```sql
-- This breaks rolling deployments
ALTER TABLE users RENAME COLUMN name TO full_name;
```

The moment this runs, every v1 box querying `name` fails.
## The Pattern
Expand and contract splits breaking changes into backward-compatible steps:
- Expand: Add the new structure alongside the old one
- Migrate: Dual-write in application code, backfill existing data
- Contract: Remove the old structure once nothing uses it
Each step is a separate deployment. Each deployment is safe to run while both application versions are live.
## Example: Renaming a Column

Let’s rename `users.name` to `users.full_name`.

### Step 1: Expand

Add the new column. That’s it.

```sql
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
```

This is backward-compatible. v1 keeps using `name`, completely unaware of `full_name`.
### Step 2: Migrate
Deploy application code that writes to both columns:
```go
func (r *UserRepo) Create(ctx context.Context, user *User) error {
	_, err := r.db.ExecContext(ctx, `
		INSERT INTO users (name, full_name, email)
		VALUES ($1, $1, $2)
	`, user.FullName, user.Email)
	return err
}

func (r *UserRepo) UpdateName(ctx context.Context, id int, name string) error {
	_, err := r.db.ExecContext(ctx, `
		UPDATE users SET name = $1, full_name = $1 WHERE id = $2
	`, name, id)
	return err
}
```

Read from the new column, falling back to the old one:

```go
func (r *UserRepo) GetByID(ctx context.Context, id int) (*User, error) {
	row := r.db.QueryRowContext(ctx, `
		SELECT id, COALESCE(full_name, name), email
		FROM users
		WHERE id = $1
	`, id)

	var user User
	err := row.Scan(&user.ID, &user.FullName, &user.Email)
	return &user, err
}
```

Backfill existing rows:
```sql
UPDATE users SET full_name = name WHERE full_name IS NULL;
```

At this point both columns stay in sync through your application code. v1 instances still work because `name` is always populated.
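One caveat: on a large table, a single `UPDATE` like this holds locks and bloats WAL for the whole run. A common refinement is to backfill in batches until no rows remain. Here is a sketch of that loop; the function name and the injected `execBatch` callback are illustrative, not part of the original example:

```go
package main

import "fmt"

// backfillInBatches runs one batch of the backfill at a time until a batch
// updates fewer rows than the batch size. In production execBatch would run
// something like:
//
//	UPDATE users SET full_name = name
//	WHERE id IN (SELECT id FROM users WHERE full_name IS NULL LIMIT $1)
//
// and return the number of rows affected.
func backfillInBatches(batchSize int, execBatch func(limit int) (int64, error)) (int64, error) {
	var total int64
	for {
		n, err := execBatch(batchSize)
		if err != nil {
			return total, err
		}
		total += n
		if n < int64(batchSize) {
			// Short batch means nothing is left to backfill.
			return total, nil
		}
	}
}

func main() {
	// Simulate a table with 2500 rows still needing a backfill.
	remaining := int64(2500)
	total, _ := backfillInBatches(1000, func(limit int) (int64, error) {
		n := int64(limit)
		if remaining < n {
			n = remaining
		}
		remaining -= n
		return n, nil
	})
	fmt.Println(total) // 2500
}
```

Keeping each batch small means locks are short-lived and replication lag stays bounded while v1 and v2 keep serving traffic.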
### Step 3: Contract
Once all instances are on the new code and you’re confident no rollback is needed, clean up:
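Before contracting, it is worth confirming that the backfill and dual-writes actually converged. A quick sanity check (a suggested addition, not from the steps above):

```sql
-- Should return 0 before you drop the old column
SELECT COUNT(*)
FROM users
WHERE full_name IS NULL OR full_name <> name;
```

A non-zero count means some code path is still writing only the old column, and dropping it now would lose data.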
```sql
ALTER TABLE users DROP COLUMN name;
```

Update your application code to remove the dual-write logic:

```go
func (r *UserRepo) Create(ctx context.Context, user *User) error {
	_, err := r.db.ExecContext(ctx, `
		INSERT INTO users (full_name, email)
		VALUES ($1, $2)
	`, user.FullName, user.Email)
	return err
}
```

## Example: Splitting a Table

You need to extract the `users.address_*` columns into a separate `addresses` table.
### Step 1: Expand

```sql
CREATE TABLE addresses (
    id SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(id) UNIQUE,
    street VARCHAR(255),
    city VARCHAR(100),
    postal_code VARCHAR(20)
);
```

### Step 2: Migrate
Dual-write to both locations:
```go
func (r *UserRepo) UpdateAddress(ctx context.Context, userID int, addr Address) error {
	tx, err := r.db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback()

	// Write to old columns (for v1 instances)
	_, err = tx.ExecContext(ctx, `
		UPDATE users
		SET address_street = $1, address_city = $2, address_postal_code = $3
		WHERE id = $4
	`, addr.Street, addr.City, addr.PostalCode, userID)
	if err != nil {
		return err
	}

	// Write to new table (for v2 instances)
	_, err = tx.ExecContext(ctx, `
		INSERT INTO addresses (user_id, street, city, postal_code)
		VALUES ($1, $2, $3, $4)
		ON CONFLICT (user_id) DO UPDATE
		SET street = EXCLUDED.street,
		    city = EXCLUDED.city,
		    postal_code = EXCLUDED.postal_code
	`, userID, addr.Street, addr.City, addr.PostalCode)
	if err != nil {
		return err
	}

	return tx.Commit()
}
```

Read from the new table with fallback:
```go
func (r *UserRepo) GetAddress(ctx context.Context, userID int) (*Address, error) {
	// Try new table first
	row := r.db.QueryRowContext(ctx, `
		SELECT street, city, postal_code
		FROM addresses
		WHERE user_id = $1
	`, userID)

	var addr Address
	err := row.Scan(&addr.Street, &addr.City, &addr.PostalCode)
	if err == nil {
		return &addr, nil
	}
	if err != sql.ErrNoRows {
		return nil, err
	}

	// Fall back to old columns
	row = r.db.QueryRowContext(ctx, `
		SELECT address_street, address_city, address_postal_code
		FROM users
		WHERE id = $1
	`, userID)
	err = row.Scan(&addr.Street, &addr.City, &addr.PostalCode)
	return &addr, err
}
```

Backfill:

```sql
INSERT INTO addresses (user_id, street, city, postal_code)
SELECT id, address_street, address_city, address_postal_code
FROM users
WHERE address_street IS NOT NULL
ON CONFLICT (user_id) DO NOTHING;
```

### Step 3: Contract
```sql
ALTER TABLE users
    DROP COLUMN address_street,
    DROP COLUMN address_city,
    DROP COLUMN address_postal_code;
```

Remove the dual-write code and the fallback logic.
## Example: Changing a Column Type

Changing `price` from `INTEGER` (cents) to `DECIMAL` (dollars).

### Step 1: Expand

```sql
ALTER TABLE products ADD COLUMN price_decimal DECIMAL(10,2);
```

### Step 2: Migrate
Dual-write with conversion:
```go
func (r *ProductRepo) UpdatePrice(ctx context.Context, id int, cents int) error {
	dollars := float64(cents) / 100.0
	_, err := r.db.ExecContext(ctx, `
		UPDATE products SET price = $1, price_decimal = $2 WHERE id = $3
	`, cents, dollars, id)
	return err
}
```

Backfill:

```sql
UPDATE products SET price_decimal = price / 100.0 WHERE price_decimal IS NULL;
```

### Step 3: Contract

```sql
ALTER TABLE products DROP COLUMN price;
ALTER TABLE products RENAME COLUMN price_decimal TO price;
```

Note that the final rename is itself a breaking change for code still reading `price_decimal`, so treat it as its own small expand/contract: deploy code that reads `price` only after the rename has run.

## When to Use This Pattern
This pattern is a requirement for any deployment strategy other than big-bang (rolling, blue-green, canary), because both application versions will run against the same database while the new version is being rolled out.
If your deployment pipeline allows for maintenance windows, or you’re making non-breaking changes, then skip this pattern.
## Triggers vs Application Code
The examples above use application-level dual-writes, but you can also use database triggers to keep the columns in sync. Triggers simplify application code and guarantee consistency even for writes that bypass the application, such as ad-hoc scripts or direct database access.

Beware, though: triggers are harder to test, version, and maintain alongside your application, and they add vendor-specific logic to your schema. If all writes go through your application anyway, keeping the sync logic in code is usually cleaner.
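For reference, here is a sketch of what the trigger approach could look like for the column-rename example, assuming Postgres 11+ (the function and trigger names are illustrative):

```sql
CREATE OR REPLACE FUNCTION sync_full_name() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- Whichever column the writer populated wins
        NEW.full_name := COALESCE(NEW.full_name, NEW.name);
        NEW.name := COALESCE(NEW.name, NEW.full_name);
    ELSIF NEW.name IS DISTINCT FROM OLD.name THEN
        NEW.full_name := NEW.name;       -- v1 updated the old column
    ELSIF NEW.full_name IS DISTINCT FROM OLD.full_name THEN
        NEW.name := NEW.full_name;       -- v2 updated the new column
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_sync_full_name
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION sync_full_name();
```

With this in place the application can write either column during the migrate phase, and the trigger is dropped as part of the contract step.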