PostgreSQL Treasures
For developers who thought they knew Postgres.
13 min read
#databases, #postgresql

PostgreSQL has been quietly accumulating superpowers for decades. Most developers use it like a fancy MySQL -- and leave enormous capability on the table. Here are gems that will change how you write migrations, queries, and maintenance scripts.


1. Transform Data While Changing Column Types with USING

The problem: You stored time as a packed integer (BIGINT) -- say, 12345678 meaning 12:34:56.78 -- and now you need an actual TIME column. Normally you'd add a new column, backfill it, drop the old one. Three steps, extra risk, more downtime.

The treasure: ALTER COLUMN ... TYPE ... USING lets you apply an arbitrary expression to every row at the moment the type changes -- in a single statement.

ALTER TABLE events
  ALTER COLUMN recorded_at TYPE TIME
  USING MAKE_TIME(
      (recorded_at / 1000000)::int,        -- hours
      ((recorded_at / 10000) % 100)::int,  -- minutes
      ((recorded_at % 10000) / 100.0)      -- seconds.centiseconds
  );

Another common use -- converting a VARCHAR price column to NUMERIC:

ALTER TABLE products
  ALTER COLUMN price TYPE NUMERIC(10,2)
  USING REPLACE(price, '$', '')::NUMERIC(10,2);

The USING clause is evaluated row-by-row during the rewrite. You have access to nearly the full expression language -- math, string functions, CASE -- anything deterministic over the row's own columns. (The one exception: subqueries are not allowed in a transform expression.) No temp columns, no backfill scripts, no cleanup.

When to reach for it: Any time you're converting encoded or packed values to a proper type -- epoch integers to TIMESTAMPTZ, cents integers to NUMERIC, formatted strings to DATE. One atomic DDL statement beats a multi-step migration every time.
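
A minimal sketch of the epoch case, assuming a hypothetical sessions table with Unix epoch seconds stored in a BIGINT started_at column:

ALTER TABLE sessions
  ALTER COLUMN started_at TYPE TIMESTAMPTZ
  USING TO_TIMESTAMP(started_at);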


2. Materialized CTEs + CASE-Based Pivots to Eliminate Redundant Joins

What is a Materialized CTE?

Since PostgreSQL 12, the query planner treats a plain WITH clause as a soft optimization hint -- it may fold the CTE into the outer query if that looks faster. Add MATERIALIZED and you give the planner an explicit contract: evaluate this CTE exactly once, store the result, and reuse it.

WITH base AS MATERIALIZED (
    SELECT ...  -- runs once, result is cached in memory/temp storage
)
SELECT * FROM base WHERE ...;
-- base is not re-scanned or re-executed above

This matters most when:

- the CTE is expensive and you want a guarantee it runs exactly once
- you want an optimization fence -- outer predicates must not be pushed down into the CTE
- the planner misestimates the CTE, and inlining it produces a bad plan

The CASE-Based Pivot Trick

Instead of joining the same table twice for two different periods, scan it once and pivot using conditional aggregation:

WITH active_users AS MATERIALIZED (
    -- Expensive subquery runs exactly once
    SELECT user_id, region
    FROM get_active_users($start_date, $end_date)
),
sales_pivot AS (
    SELECT
        user_id,
        MAX(CASE WHEN period = 'current'  THEN revenue END) AS curr_revenue,
        MAX(CASE WHEN period = 'current'  THEN units   END) AS curr_units,
        MAX(CASE WHEN period = 'previous' THEN revenue END) AS prev_revenue,
        MAX(CASE WHEN period = 'previous' THEN units   END) AS prev_units
    FROM sales_summary
    WHERE period IN ('current', 'previous')
      AND user_id IN (SELECT user_id FROM active_users)
    GROUP BY user_id
)
SELECT
    u.user_id,
    u.region,
    sp.curr_revenue,
    sp.prev_revenue,
    ROUND((sp.curr_revenue - sp.prev_revenue) / NULLIF(sp.prev_revenue, 0) * 100, 2) AS pct_change
FROM active_users u
LEFT JOIN sales_pivot sp ON sp.user_id = u.user_id
ORDER BY sp.curr_revenue DESC NULLS LAST;

Why this is faster than two joins:

Approach                                        Table scans   Rows processed
Two separate joins (current + previous period)  2             2× filtered set
CASE-based pivot with GROUP BY                  1             1× filtered set

The single scan also plays nicely with the materialized CTE -- the user_id IN (...) filter is applied once against the cached result set, not re-evaluated per join.


3. Near-Zero-Downtime Table Pruning with LIKE INCLUDING ALL + Atomic Swap

The problem: A queue or log table has grown to millions of rows. You need to keep only the most recent N records. DELETE WHERE id < X is slow, holds locks, generates massive WAL, and your application is hammering inserts the whole time.

The treasure: Build a new table as a structural clone, populate it offline, then swap with a millisecond lock window.

Phase 1 -- Build the clone (no lock, zero impact)

CREATE TABLE app.event_log_new (LIKE app.event_log INCLUDING ALL);

INCLUDING ALL copies:

- column definitions with defaults, identity, and generated expressions
- CHECK and NOT NULL constraints
- indexes, including those backing the primary key and unique constraints
- storage settings, compression, extended statistics, and comments

Note: LIKE ... INCLUDING ALL does not copy foreign key constraints. If your table has outgoing FK constraints, re-add them manually on the new table before the swap.
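
A sketch of re-adding such a constraint, assuming a hypothetical user_id column that references app.users:

ALTER TABLE app.event_log_new
  ADD CONSTRAINT event_log_new_user_id_fkey
  FOREIGN KEY (user_id) REFERENCES app.users (id);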

INSERT INTO app.event_log_new
SELECT * FROM app.event_log
ORDER BY event_log_id DESC
LIMIT 10000;

This runs while inserts keep flowing into the original table. No locks, no drama.

Phase 2 -- Atomic swap (lock window: milliseconds)

BEGIN;
  LOCK TABLE app.event_log IN ACCESS EXCLUSIVE MODE;

  -- Capture rows inserted between phase 1 and the lock
  INSERT INTO app.event_log_new
  SELECT * FROM app.event_log
  WHERE event_log_id > (SELECT MAX(event_log_id) FROM app.event_log_new)
  ORDER BY event_log_id DESC
  LIMIT 10000;

  -- Advance the sequence to the current max so no IDs are reused after the swap
  SELECT setval(
    'app.event_log_event_log_id_seq',
    (SELECT MAX(event_log_id) FROM app.event_log)
  );

  ALTER TABLE app.event_log     RENAME TO event_log_old;
  ALTER TABLE app.event_log_new RENAME TO event_log;

  -- Critical: transfer sequence ownership BEFORE dropping the old table
  -- or the sequence will be dropped along with it
  ALTER SEQUENCE app.event_log_event_log_id_seq
    OWNED BY app.event_log.event_log_id;
COMMIT;

Apart from the small catch-up INSERT, the lock is held only for metadata operations -- three ALTER statements and a setval. These are catalog updates, not data rewrites. Typical lock window: a few milliseconds when the catch-up set is small.

Phase 3 -- Drop the old table (instant)

DROP TABLE app.event_log_old;

DROP TABLE doesn't zero out data on disk -- it simply unlinks the heap and index files from the filesystem. Effectively instantaneous regardless of table size.

The OWNED BY detail you cannot skip: If you drop event_log_old while the sequence still thinks it's owned by that table, the sequence goes with it -- and your next insert explodes. The ALTER SEQUENCE ... OWNED BY inside the transaction re-parents the sequence to the new table before the drop.
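
Before committing, you can sanity-check the re-parenting with pg_get_serial_sequence, which reports the sequence a column owns:

SELECT pg_get_serial_sequence('app.event_log', 'event_log_id');
-- Expect 'app.event_log_event_log_id_seq' after the rename and re-parenting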


4. DISTINCT ON -- The Clause That Replaces a Subquery

Most developers reach for a subquery or window function to get the latest row per group. Postgres has dedicated syntax for it:

-- Get the most recent order for each customer
SELECT DISTINCT ON (customer_id)
    customer_id, order_date, total_amount, status
FROM orders
ORDER BY customer_id, order_date DESC;

DISTINCT ON (expr) keeps exactly one row per unique value of expr -- the first row in the ORDER BY sequence. The rule: the DISTINCT ON columns must appear leftmost in ORDER BY.

Compare the alternatives:

-- Subquery version (verbose, often slower)
SELECT o.*
FROM orders o
JOIN (
    SELECT customer_id, MAX(order_date) AS max_date
    FROM orders GROUP BY customer_id
) latest ON o.customer_id = latest.customer_id
       AND o.order_date = latest.max_date;

-- Window function version (requires a wrapper query)
SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
    FROM orders
) t WHERE rn = 1;

-- DISTINCT ON version (one scan, no wrapper)
SELECT DISTINCT ON (customer_id)
    customer_id, order_date, total_amount
FROM orders
ORDER BY customer_id, order_date DESC;

With an index on (customer_id, order_date DESC), the DISTINCT ON version can satisfy the query with an index scan and no separate sort. The subquery version scans the table twice; the window version still numbers every row in every partition before the outer filter discards all but one.
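
The supporting index would look like this (the name idx_orders_customer_date is illustrative):

CREATE INDEX idx_orders_customer_date
  ON orders (customer_id, order_date DESC);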


5. GENERATED ALWAYS AS -- Computed Columns That Maintain Themselves

Stop writing triggers to keep derived columns in sync. Postgres 12+ supports stored computed columns natively:

CREATE TABLE contacts (
    id           SERIAL PRIMARY KEY,
    first_name   TEXT NOT NULL,
    last_name    TEXT NOT NULL,
    full_name    TEXT GENERATED ALWAYS AS (first_name || ' ' || last_name) STORED,
    email        TEXT,
    email_domain TEXT GENERATED ALWAYS AS (
        CASE WHEN email LIKE '%@%'
             THEN LOWER(SPLIT_PART(email, '@', 2))
        END
    ) STORED
);

The column is recomputed automatically on every INSERT or UPDATE. You can index it, filter on it, and join on it like any other column -- with zero application-layer maintenance.

Practical uses: full-name search columns, normalized/lowercased lookups, fiscal year derivations from a date column, or combining fields into a composite key. The expression must be immutable (no NOW(), no subqueries), but anything deterministic from other columns in the same row is fair game.
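
A sketch of that in action, reusing the contacts table above -- the generated column gets its own index and an ordinary-looking query:

CREATE INDEX idx_contacts_email_domain ON contacts (email_domain);

-- Finds every contact at a given domain with no per-row SPLIT_PART
SELECT id, full_name, email
FROM contacts
WHERE email_domain = 'example.com';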


6. Partial Indexes -- Index Only the Rows That Matter

A full index on a 50-million-row table when you only ever query a small, well-defined subset is wasteful in size, memory, and write overhead. Partial indexes let you index just the rows that matter:

-- Only index unprocessed jobs -- the hot query path
CREATE INDEX idx_jobs_pending ON jobs (created_at, priority)
WHERE status = 'pending';

-- Only index non-deleted rows for a soft-delete pattern
CREATE INDEX idx_users_active_email ON users (email)
WHERE deleted_at IS NULL;

The index is smaller, fits in shared_buffers more easily, and is faster to update because completed or deleted rows never touch it. The planner uses it automatically when the query's WHERE clause matches the index predicate.

The trap: your query's WHERE clause must semantically match the index predicate exactly. WHERE status = 'pending' uses the index; WHERE status != 'done' does not -- even if that's logically equivalent given your data.
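
A query shaped to match the predicate -- a sketch against idx_jobs_pending from above:

-- The WHERE clause matches the index predicate, so the planner can use it
SELECT *
FROM jobs
WHERE status = 'pending'
ORDER BY created_at, priority
LIMIT 50;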


7. COPY with ON_ERROR ignore -- Fault-Tolerant Bulk Loads

Importing a large CSV where a handful of rows have bad data? Instead of failing the entire load or pre-scrubbing the file, tell Postgres to skip bad rows and continue (Postgres 17+):

COPY staging.raw_imports (id, name, price, created_at)
FROM '/data/exports/products.csv'
WITH (FORMAT csv, HEADER true, ON_ERROR ignore, LOG_VERBOSITY verbose);

Rows with malformed values are skipped rather than aborting the transaction, and LOG_VERBOSITY verbose emits a notice for each rejected row telling you exactly where and why, so you can fix them separately. Note that ON_ERROR covers data-type conversion failures; constraint violations (unique, foreign key) still abort the load.

For older versions, staging the file through an all-TEXT table and validating in SQL gets you partway there -- but native ON_ERROR in COPY is cleaner by a mile.
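
A minimal sketch of that staging route, assuming raw_imports columns of INT, TEXT, NUMERIC, and TIMESTAMPTZ (the regex guards are crude and illustrative):

CREATE TABLE staging.raw_text
    (id TEXT, name TEXT, price TEXT, created_at TEXT);

COPY staging.raw_text
FROM '/data/exports/products.csv'
WITH (FORMAT csv, HEADER true);

-- Keep only rows whose values survive the guards, then cast
INSERT INTO staging.raw_imports (id, name, price, created_at)
SELECT id::INT, name, price::NUMERIC, created_at::TIMESTAMPTZ
FROM staging.raw_text
WHERE id ~ '^\d+$'
  AND price ~ '^\d+(\.\d+)?$'
  AND created_at ~ '^\d{4}-\d{2}-\d{2}';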


8. EXPLAIN (ANALYZE, BUFFERS) + explain.dalibo.com

You know EXPLAIN ANALYZE. But adding BUFFERS reveals whether your query is hitting shared cache or reading from disk -- often the difference between fast and catastrophic:

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE customer_id = 42 AND status = 'open';

Look for:

- shared read counts that dwarf shared hit -- the query is going to disk, not cache
- temp read/written -- a sort or hash is spilling past work_mem
- actual row counts wildly different from the planner's estimates -- stale statistics steering it wrong

For complex plans, dump FORMAT JSON and paste it into explain.dalibo.com for an interactive visual flame graph of your query plan. Free, no signup, invaluable.
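
Same query, JSON output ready for the visualizer:

EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT * FROM orders WHERE customer_id = 42 AND status = 'open';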


9. pg_stat_statements -- Find Your Worst Queries Without a Profiler

The pg_stat_statements extension tracks execution statistics for every distinct query shape that has run since it was last reset. It's the fastest way to find your top offenders:

-- Requires shared_preload_libraries = 'pg_stat_statements' (server restart),
-- then enable once per database:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 queries by total time spent
SELECT
    LEFT(query, 80)          AS query_snippet,
    calls,
    ROUND(total_exec_time::numeric, 2) AS total_ms,
    ROUND(mean_exec_time::numeric,  2) AS avg_ms,
    ROUND(stddev_exec_time::numeric,2) AS stddev_ms,
    rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

total_exec_time shows you where wall-clock time is actually going -- not just the slowest single query, but the query called 200,000 times that's 2ms each. That's where your optimization ROI lives.
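
A variant worth keeping handy -- ordering by mean time surfaces the individually slow queries rather than the high-volume ones:

SELECT
    LEFT(query, 80) AS query_snippet,
    calls,
    ROUND(mean_exec_time::numeric, 2) AS avg_ms
FROM pg_stat_statements
WHERE calls > 10                 -- ignore one-off queries
ORDER BY mean_exec_time DESC
LIMIT 10;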

Reset stats after a tuning session with SELECT pg_stat_statements_reset();.


10. Advisory Locks -- Application-Level Mutex Without a Table

Need to ensure only one worker processes a job at a time? Most teams build a locked_at column and a loop. Postgres has a built-in mutex primitive that lives entirely in memory:

-- Try to acquire lock for job_id 12345 (non-blocking)
SELECT pg_try_advisory_lock(12345);
-- Returns TRUE if acquired, FALSE if another session holds it

-- Do the work...

-- Release when done
SELECT pg_advisory_unlock(12345);

Advisory locks are:

- held in shared memory -- no table rows, no MVCC bloat, no vacuum overhead
- scoped to the session (or to the transaction, with the _xact variants), so a crashed worker can never leave a stale lock behind
- keyed by a BIGINT (or a pair of INTs) whose meaning is entirely up to your application

Use pg_try_advisory_lock (non-blocking) in job workers so they skip already-locked items rather than queuing up. Use pg_advisory_lock (blocking) when you need a strict serialization point.
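
For named resources, a common trick is hashing the string down to the integer key the lock functions expect -- a sketch, assuming a job named 'nightly-report':

-- hashtext() returns an INT4, which fits the advisory-lock key space
SELECT pg_advisory_lock(hashtext('nightly-report'));
-- ... strictly serialized work ...
SELECT pg_advisory_unlock(hashtext('nightly-report'));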

-- Pattern: skip if already being processed.
-- Use the transaction-scoped variant here: session-level pg_try_advisory_lock
-- inside a WHERE clause would leak locks on rows it evaluates but doesn't return.
UPDATE jobs
SET status = 'processing', started_at = NOW()
WHERE id = (
    SELECT id FROM jobs
    WHERE status = 'pending'
      AND pg_try_advisory_xact_lock(id)  -- skips locked rows; released at COMMIT
    ORDER BY priority DESC, created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING *;

Combine with FOR UPDATE SKIP LOCKED for a bulletproof queue worker with no application-layer coordination.


Summary

Technique                        The Old Way                      The Postgres Way
Type conversion with transform   Add column → backfill → drop     ALTER ... USING expr
Multi-period aggregation         Join same table N times          One scan + CASE pivot + MATERIALIZED CTE
Pruning a live table             DELETE with downtime             Clone → atomic swap → instant drop
Latest row per group             Subquery or window function      DISTINCT ON
Derived/computed columns         Triggers                         GENERATED ALWAYS AS ... STORED
Selective indexing               Full index on everything         CREATE INDEX ... WHERE condition
Fault-tolerant bulk load         Pre-scrub CSV or ETL pipeline    COPY ... ON_ERROR ignore
Query performance debugging      EXPLAIN ANALYZE alone            EXPLAIN (ANALYZE, BUFFERS) + dalibo
Finding slow queries             Application logs, guesswork      pg_stat_statements
Distributed job locking          locked_at column + polling       pg_advisory_lock + SKIP LOCKED