Saturday, 21 June 2025

DBA-Practical PostgreSQL Queries for Every DBA

 Managing a PostgreSQL database effectively requires a solid understanding of the key queries that help you monitor and maintain its performance and integrity. Whether you're a database administrator or a developer, having a set of reliable SQL commands at your fingertips can make troubleshooting, optimization, and general oversight much easier. In this post, I’ll share a curated list of essential PostgreSQL queries that cover various aspects of database management. These queries will serve as a handy reference to help you understand what's happening inside your database and ensure it runs smoothly.

PostgreSQL Monitoring Query:

Check Postgres version:

One of the most basic command

SELECT pg_catalog.version();

Postgres query to generate Create table statements:

There are some time you need to generate create table statement and run on some other environment to create the identical table structure , same like “show create table in mysql”

select table_name,'CREATE TABLE ' || table_name ||'(' ||STRING_AGG (
    column_name || ' ' || data_type ,
        ','
       ORDER BY
        table_name,
        ordinal_position
    ) ||');'
    from
information_schema.columns
where table_schema = '<schema_name>' and table_name='<table_name>'
group by
table_name;

Postgres query to check Index exists or not:

There are times when you'll want to check for the existence of an index on a particular table in every schema

select schemaname,indexdef  from pg_indexes where tablename ='<table_name>' and schemaname like 'mytest' and  
indexname='index_name';

Postgres query to Check Unused index:

Postgres query to check unused index across schema
SELECT s.schemaname,
       s.relname AS tablename,
       s.indexrelname AS indexname,
       pg_relation_size(s.indexrelid) AS index_size
FROM pg_catalog.pg_stat_user_indexes s
   JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid
WHERE s.idx_scan = 0      -- has never been scanned
  AND 0 <>ALL (i.indkey)  -- no index column is an expression
  AND NOT i.indisunique   -- is not a UNIQUE index
  AND NOT EXISTS          -- does not enforce a constraint
         (SELECT 1 FROM pg_catalog.pg_constraint c
          WHERE c.conindid = s.indexrelid)
  AND NOT EXISTS          -- is not an index partition
         (SELECT 1 FROM pg_catalog.pg_inherits AS inh
          WHERE inh.inhrelid = s.indexrelid)
ORDER BY pg_relation_size(s.indexrelid) DESC;
PostgreSQL Unused Index Detection for Specific Schema:

Identifies indexes that are rarely or never used within a targeted schema to help optimize database performance by removing unnecessary indexes.

SELECT 
    s.schemaname,
    s.relname AS tablename,
    s.indexrelname AS indexname,
    s.idx_scan,
    s.idx_tup_read,
    s.idx_tup_fetch,
    pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
    pg_relation_size(s.indexrelid) AS index_size_bytes
FROM pg_catalog.pg_stat_user_indexes s
JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid
WHERE s.schemaname = 'test'  -- Replace with your schema name
  AND s.idx_scan = 0      -- Never been scanned
  AND 0 <>ALL (i.indkey)  -- No expression indexes
  AND NOT i.indisunique   -- Not unique indexes
  AND NOT EXISTS          -- Doesn't enforce constraints
         (SELECT 1 FROM pg_catalog.pg_constraint c
          WHERE c.conindid = s.indexrelid)
  AND NOT EXISTS          -- Not a partition index
         (SELECT 1 FROM pg_catalog.pg_inherits AS inh
          WHERE inh.inhrelid = s.indexrelid)
ORDER BY pg_relation_size(s.indexrelid) DESC;

PostgreSQL Duplicate Index Detection Query:

Identifies redundant indexes with identical column definitions on the same table to eliminate unnecessary storage overhead and maintenance costs.

SELECT 
    pg_size_pretty(SUM(pg_relation_size(c.oid))::BIGINT) AS total_size,
    array_agg(c.relname) AS duplicate_indexes,
    sub.tablename,
    sub.cols AS column_pattern
FROM (
    SELECT 
        indexrelid, 
        (indrelid::regclass)::text AS tablename, 
        regexp_replace(indkey::text, '\\b0\\b', '', 'g') AS cols
    FROM pg_index 
    WHERE indisvalid
) sub
JOIN pg_class c ON c.oid = sub.indexrelid
GROUP BY sub.tablename, sub.cols 
HAVING COUNT(*) > 1
ORDER BY SUM(pg_relation_size(c.oid)) DESC;

Postgres Troubleshooting Query:

Check PG DB Processlist:

Use this query when you want to check the Processlist what query are running in detail:

SELECT user, state,pid, client_addr, query, query_start, NOW() - query_start AS elapsed
FROM pg_stat_activity
WHERE query != '<IDLE>'
-- AND EXTRACT(EPOCH FROM (NOW() - query_start)) > 1
ORDER BY elapsed DESC;

Check Postgres Idle Connection :

Find Database connection which started txn but now sitting idle without rolling back or committing

SELECT now() - state_change as idle_in_transaction_duration, 
now() - xact_start as xact_duration,* FROM pg_stat_activity 
WHERE state = 'idle in transaction' AND xact_start is not null ORDER BY 1 DESC;

PostgreSQL Vacuum and Buffer Wait Event Monitoring

This query finds all active database sessions that are currently waiting on vacuum-related operations or buffer access issues.

SELECT 
    pid,
    usename,
    application_name,
    state,
    wait_event_type,
    wait_event,
    query,
    backend_start,
    query_start
FROM pg_stat_activity 
WHERE wait_event IN (
    'AutoVacuumMain',           -- Autovacuum launcher waiting
    'VacuumDelay',              -- Vacuum delay between operations
    'XactGroupUpdate',          -- Transaction group update
    'BufferPin'                 -- Waiting for buffer pin (common during vacuum)
);

Postgres Top 5 CPU-Intensive Database Queries Performance Report:

**Summary:** This query extracts the most resource-consuming SQL statements from PostgreSQL's statistics collector, ranking them by total execution time and calculating their relative CPU impact as percentages.

/* Make sure you have this extension installed 
CREATE EXTENSION IF NOT EXISTS pg_stat_statements; 

*/

SELECT  substring(query, 1, 50) AS short_query, round(total_exec_time::numeric, 2) AS total_exec_time, calls, round(mean_exec_time::numeric, 2) AS mean_exec_time, round((100 * total_exec_time / sum(total_exec_time::numeric) OVER ())::numeric, 2) AS percentage_cpu FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 5; 

Autovacuum :

List existing Postgres Autovaccum setting:

SELECT name, setting, unit FROM pg_settings WHERE name  ~ 'autovacuum|vacuum.*cost' ORDER BY 1;

Postgres Table Vacuum & Analyze History Tracker:

Shows maintenance timestamps for user tables to identify which tables need vacuuming or statistics updates.

select relname,last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables;

PostgreSQL Autovacuum Candidate Detection Query:

Analyzes tables with dead tuples to identify which ones exceed autovacuum thresholds and need immediate vacuum operations.

WITH rel_set AS (
    SELECT
        oid,
        CASE split_part(split_part(array_to_string(reloptions, ','), 'autovacuum_vacuum_threshold=', 2), ',', 1)
            WHEN '' THEN NULL
            ELSE split_part(split_part(array_to_string(reloptions, ','), 'autovacuum_vacuum_threshold=', 2), ',', 1)::BIGINT
        END AS rel_av_vac_threshold,
        CASE split_part(split_part(array_to_string(reloptions, ','), 'autovacuum_vacuum_scale_factor=', 2), ',', 1)
            WHEN '' THEN NULL
            ELSE split_part(split_part(array_to_string(reloptions, ','), 'autovacuum_vacuum_scale_factor=', 2), ',', 1)::NUMERIC
        END AS rel_av_vac_scale_factor
    FROM pg_class
)
SELECT 
    PSUT.relname,
    to_char(PSUT.last_vacuum, 'YYYY-MM-DD HH24:MI') AS last_vacuum,
    to_char(PSUT.last_autovacuum, 'YYYY-MM-DD HH24:MI') AS last_autovacuum,
    to_char(C.reltuples, '9G999G999G999') AS n_tup,
    to_char(PSUT.n_dead_tup, '9G999G999G999') AS dead_tup,
    to_char(
        coalesce(RS.rel_av_vac_threshold, current_setting('autovacuum_vacuum_threshold')::BIGINT) + 
        coalesce(RS.rel_av_vac_scale_factor, current_setting('autovacuum_vacuum_scale_factor')::NUMERIC) * C.reltuples, 
        '9G999G999G999'
    ) AS av_threshold,
    CASE
        WHEN (
            coalesce(RS.rel_av_vac_threshold, current_setting('autovacuum_vacuum_threshold')::BIGINT) + 
            coalesce(RS.rel_av_vac_scale_factor, current_setting('autovacuum_vacuum_scale_factor')::NUMERIC) * C.reltuples
        ) < PSUT.n_dead_tup
        THEN '*'
        ELSE ''
    END AS expect_av
FROM pg_stat_user_tables PSUT
    JOIN pg_class C ON PSUT.relid = C.oid
    JOIN rel_set RS ON PSUT.relid = RS.oid 
WHERE PSUT.n_dead_tup > 1
ORDER BY PSUT.n_dead_tup DESC 
LIMIT 10;

PostgreSQL Dead Tuple Percentage Analysis:

Calculates dead tuple percentages across all tables to identify bloated tables requiring vacuum maintenance.

SELECT 
    schemaname,
    relname,
    n_dead_tup,
    n_live_tup,
    round(n_dead_tup::float / n_live_tup::float * 100) AS dead_pct,
    autovacuum_count,
    last_vacuum,
    last_autovacuum,
    last_autoanalyze,
    last_analyze 
FROM pg_stat_all_tables 
WHERE n_live_tup > 0;

PostgreSQL Tables Requiring Immediate Vacuum Attention:

Lists tables with excessive dead tuples (>1000) ranked by severity, displaying bloat ratios and maintenance history to identify urgent cleanup candidates.

SELECT 
    relname,  -- Table name
    n_live_tup,-- Count of live tuples
    n_dead_tup,-- Count of dead tuples
    trunc(100 * n_dead_tup / (n_live_tup + 1))::float "ratio%", -- Dead tuple percentage 
to_char(last_autovacuum, 'YYYY-MM-DD HH24:MI:SS') AS autovacuum_date,   -- Last autovacuum timestamp
    to_char(last_autoanalyze, 'YYYY-MM-DD HH24:MI:SS') AS autoanalyze_date  -- Last autoanalyze timestamp
FROM pg_stat_all_tables 
WHERE n_dead_tup > 1000 -- Filter tables with high dead tuple count
ORDER BY n_dead_tup;   



PostgreSQL Index Bloat Analysis for Schemas:

Identifies large, potentially bloated indexes in schemas by calculating estimated bloat based on scan frequency and size, filtering for indexes with over 500MB estimated bloat.


-- Replace the default schema name with your *schema*

SELECT
    indexes.schemaname,
    tbl.relname AS tablename,
    idx.relname AS indexname,
    bsiz.bytes / (1024 * 1024 * 1024)::numeric AS index_size_gb,          -- Index size in GB
    sub_bloat.est_bloat AS estimated_bloat_gb                              -- Estimated bloat in GB
FROM
    pg_stat_user_indexes AS idx_stat
JOIN
    pg_indexes AS indexes 
    ON idx_stat.indexrelname = indexes.indexname 
    AND idx_stat.schemaname = indexes.schemaname
JOIN
    pg_class AS idx ON idx_stat.indexrelid = idx.oid                      -- Index metadata
JOIN
    pg_class AS tbl ON idx_stat.relid = tbl.oid                           -- Table metadata
JOIN
    pg_stat_user_tables AS tbl_stat ON idx_stat.relid = tbl_stat.relid    -- Table statistics
JOIN
    LATERAL (
        SELECT
            pg_relation_size(idx.oid) AS bytes                             -- Get index size in bytes
    ) AS bsiz ON TRUE
JOIN
    LATERAL (
        SELECT
            -- Calculate estimated bloat based on scan frequency vs total scans
            (1 - idx_stat.idx_scan / greatest(0.1, sum(idx_stat.idx_scan) OVER (PARTITION BY idx_stat.indexrelid))) 
            * (pg_relation_size(idx.oid) / (1024 * 1024 * 1024)::numeric) AS est_bloat
    ) AS sub_bloat ON TRUE
WHERE
    indexes.schemaname LIKE '<SchemaName_>'                               -- Filter for schemas
    AND sub_bloat.est_bloat > 1                                            -- Filter for >1GB bloat
ORDER BY
    estimated_bloat_gb DESC;                                               -- Sort by highest bloat first

No comments:

Post a Comment

DBA-Practical PostgreSQL Queries for Every DBA

  Managing a PostgreSQL database effectively requires a solid understanding of the key queries that help you monitor and maintain its perfor...