Skip to main content
  1. Posts/

Parameters on the Control File and Primary-Standby Parameter Mismatch Issues

·1763 words·9 mins
liuzhilong62
Author
liuzhilong62
PostgreSQL DBA. Writing about database internals, production cases, and source code analysis.

PARAMETER_CHANGE and Database Parameters on the Control File
#

Some PG parameters affect the standby’s operation. These parameters are not only in the configuration file but also written to the control file. Whenever parameters change, they are written to WAL and update the control file.

The standby redoes the PARAMETER_CHANGE WAL record and writes to the standby’s control file. PARAMETER_CHANGE WAL record:

rmgr: XLOG        len (rec/tot):     54/    54, tx:          0, lsn: 27F/800001C0, prev 27F/80000148, desc: PARAMETER_CHANGE max_connections=3000 max_worker_processes=20 max_wal_senders=10 max_prepared_xacts=0 max_locks_per_xact=1024 wal_level=logical wal_log_hints=off track_commit_timestamp=on

XLOG_PARAMETER_CHANGE records these 8 parameters, which can also be viewed directly from the control file:

$ pg_controldata |grep setting
wal_level setting:                    logical
wal_log_hints setting:                on
max_connections setting:              1000
max_worker_processes setting:         20
max_wal_senders setting:              10
max_prepared_xacts setting:           0
max_locks_per_xact setting:           1024
track_commit_timestamp setting:       on

These parameters are all from the primary, even if this control file belongs to the standby.

The startup process checks 6 parameters via the CheckRequiredParameterValues function. One parameter wal_level must be >= replica. The other 5 parameters — max_connections, max_worker_processes, max_wal_senders, max_prepared_transactions, max_locks_per_transaction — are checked for primary vs standby sizing. If the standby has a smaller value, recovery is paused. If you increase the primary’s parameters directly, the standby will crash. The PG log:

FATAL,22023,"hot standby is not possible because max_connections = 2000 is a lower setting than on the master server (its value was 3000)",,,,,"WAL redo at 27F/800001C0 for XLOG/PARAMETER_CHANGE: max_connections=3000 max_worker_processes=20 max_wal_senders=10 max_prepared_xacts=0 max_locks_per_xact=1024 wal_level=logical wal_log_hints=off track_commit_timestamp=on",,,,"","startup"

6 of the 8 parameters can seriously affect standby operation. The other 2 parameters — wal_log_hints, track_commit_timestamp — are not immediately checked by the startup process. All 8 parameters being synchronized to the control file serve their own purposes.

wal_log_hints Primary-Standby Mismatch
#

Changes to wal_log_hints are recorded in WAL logs. Although not checked by the startup process, pg_rewind does check it:

perform_rewind(...)
{
...
	/*
	 * Target cluster need to use checksums or hint bit wal-logging, this to
	 * prevent from data corruption that could occur because of hint bits.
	 */
	if (ControlFile_target.data_checksum_version != PG_DATA_CHECKSUM_VERSION &&
		!ControlFile_target.wal_log_hints)
	{
		pg_fatal("target server needs to use either data checksums or \"wal_log_hints = on\"");
	}

Since wal_log_hints is WAL-related, it doesn’t make sense for pg_rewind to check whether the standby’s wal_log_hints is enabled — it should check whether the primary’s wal_log_hints is enabled. Therefore, PG synchronizes the wal_log_hints parameter to the standby’s control file, which is very reasonable.

wal_log_hints primary-standby mismatch test:

select * from t1;
checkpoint;
update t1 set b='eee';  -- observation point 1
checkpoint;  -- ignore this online checkpoint wal record
select * from t1;  -- observation point 2

-- observation action
pg_waldump 000000020000027F0000000A|tail -10

-- observing option
select t_ctid,lp,
case lp_flags
when 0 then '0:LP_UNUSED'
when 1 then 'LP_NORMAL'
when 2 then 'LP_REDIRECT'
when 3 then 'LP_DEAD'
end as lp_flags,
t_xmin,t_xmax,t_field3 as t_cid, raw_flags, info.combined_flags,substring(t_data,0,40)
from heap_page_items(get_raw_page('t1',0)) item,
LATERAL heap_tuple_infomask_flags(t_infomask, t_infomask2) info
order by lp;

on, on:

-- Observation point 1:
rmgr: Heap        len (rec/tot):     85/   208, tx:   11140182, lsn: 27F/5000CC38, prev 27F/5000CBC0, desc: HOT_UPDATE off 3 xmax 11140182 flags 0x10 ; new off 4 xmax 0, blkref #0: rel 1663/7472552/7472597 blk 0 FPW
rmgr: Transaction len (rec/tot):     46/    46, tx:   11140182, lsn: 27F/5000CD08, prev 27F/5000CC38, desc: COMMIT 2025-07-21 18:28:13.292397 CST
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 27F/5000CD38, prev 27F/5000CD08, desc: RUNNING_XACTS nextXid 11140183 latestCompletedXid 11140182 oldestRunningXid 11140183
-- Observation point 2:
rmgr: XLOG        len (rec/tot):     51/   171, tx:          0, lsn: 27F/58000110, prev 27F/580000D8, desc: FPI_FOR_HINT , blkref #0: rel 1663/7472552/7472597 blk 0 FPW
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 27F/580001C0, prev 27F/58000110, desc: RUNNING_XACTS nextXid 11140183 latestCompletedXid 11140182 oldestRunningXid 11140183

off, off:

-- Observation point 1:
rmgr: Heap        len (rec/tot):     85/   225, tx:   11140183, lsn: 27F/580003C8, prev 27F/58000390, desc: HOT_UPDATE off 4 xmax 11140183 flags 0x10 ; new off 5 xmax 0, blkref #0: rel 1663/7472552/7472597 blk 0 FPW
rmgr: Transaction len (rec/tot):     46/    46, tx:   11140183, lsn: 27F/580004B0, prev 27F/580003C8, desc: COMMIT 2025-07-21 18:33:18.192146 CST
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 27F/580004E0, prev 27F/580004B0, desc: RUNNING_XACTS nextXid 11140184 latestCompletedXid 11140183 oldestRunningXid 11140184
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 27F/58000518, prev 27F/580004E0, desc: RUNNING_XACTS nextXid 11140184 latestCompletedXid 11140183 oldestRunningXid 11140184
-- Observation point 2:

on, off:

-- Observation point 1:
rmgr: Heap        len (rec/tot):     85/   274, tx:   11140186, lsn: 27F/58000C18, prev 27F/58000BA0, desc: HOT_UPDATE off 7 xmax 11140186 flags 0x10 ; new off 8 xmax 0, blkref #0: rel 1663/7472552/7472597 blk 0 FPW
rmgr: Transaction len (rec/tot):     46/    46, tx:   11140186, lsn: 27F/58000D30, prev 27F/58000C18, desc: COMMIT 2025-07-21 18:40:17.638691 CST
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 27F/58000D60, prev 27F/58000D30, desc: RUNNING_XACTS nextXid 11140187 latestCompletedXid 11140186 oldestRunningXid 11140187
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 27F/58000D98, prev 27F/58000D60, desc: RUNNING_XACTS nextXid 11140187 latestCompletedXid 11140186 oldestRunningXid 11140187
-- Observation point 2:
rmgr: XLOG        len (rec/tot):     51/   236, tx:          0, lsn: 27F/58000E48, prev 27F/58000DD0, desc: FPI_FOR_HINT , blkref #0: rel 1663/7472552/7472597 blk 0 FPW
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 27F/58000F38, prev 27F/58000E48, desc: RUNNING_XACTS nextXid 11140187 latestCompletedXid 11140186 oldestRunningXid 11140187

off, on:

-- Observation point 1:
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 27F/58001108, prev 27F/58001090, desc: RUNNING_XACTS nextXid 11140187 latestCompletedXid 11140186 oldestRunningXid 11140187
rmgr: Heap        len (rec/tot):     85/   289, tx:   11140187, lsn: 27F/58001140, prev 27F/58001108, desc: HOT_UPDATE off 8 xmax 11140187 flags 0x10 ; new off 9 xmax 0, blkref #0: rel 1663/7472552/7472597 blk 0 FPW
rmgr: Transaction len (rec/tot):     46/    46, tx:   11140187, lsn: 27F/58001268, prev 27F/58001140, desc: COMMIT 2025-07-21 18:44:08.550109 CST
rmgr: Standby     len (rec/tot):     54/    54, tx:          0, lsn: 27F/58001298, prev 27F/58001268, desc: RUNNING_XACTS nextXid 11140188 latestCompletedXid 11140186 oldestRunningXid 11140187; 1 xacts: 11140187
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 27F/580012D0, prev 27F/58001298, desc: RUNNING_XACTS nextXid 11140188 latestCompletedXid 11140187 oldestRunningXid 11140188
-- Observation point 2:

Test summary:

  • FPI_FOR_HINT is produced when hint bits are written back; SELECT queries can produce FPI_FOR_HINT.
  • Regardless of the standby setting (on or off), when the primary is on, FPI_FOR_HINT will be produced.

Additional Knowledge: What is XLOG_RUNNING_XACTS
#

XLOG_RUNNING_XACTS is one type of RM_STANDBY_ID:

/*
 * XLOG message types
 */
#define XLOG_STANDBY_LOCK			0x00
#define XLOG_RUNNING_XACTS			0x10
#define XLOG_INVALIDATIONS			0x20

XLOG_STANDBY_LOCK: Records acquisition and release of AccessExclusiveLock, used by standby nodes to recognize lock states.

XLOG_RUNNING_XACTS: Running-xacts snapshots used for building snapshots to ensure transaction consistency.

XLOG_INVALIDATIONS: INVALIDATIONS messages for synchronizing metadata information to local backends.

 * standbydefs.h
 *	   Frontend exposed definitions for hot standby mode.

RM_STANDBY_ID is an rmgr specifically defined for hot standby read-only standbys. For local instance recovery and logical decoding scenarios that need WAL, RM_STANDBY_ID is essentially meaningless to them.

Observing WAL records during transaction commit:

commandwal record
begin;
select * from txid_current(); –11140191
commit;rmgr: Transaction len (rec/tot): 46/ 46, tx: 11140191, lsn: 27F/80000538, prev 27F/80000500, desc: COMMIT 2025-07-23 11:16:10.872724 CST
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 27F/80000568, prev 27F/80000538, desc: RUNNING_XACTS nextXid 11140192 latestCompletedXid 11140191 oldestRunningXid 11140192

The transaction ID itself — commit or abort — is synchronized by rmgr: Transaction. Snapshots are synchronized via rmgr: Standby RUNNING_XACTS.

track_commit_timestamp Primary-Standby Mismatch
#

track_commit_timestamp: the startup process activates the standby’s commit_ts functionality upon receiving the corresponding WAL, primarily for viewing xid commit times on the standby:

/*
 * Activate or deactivate CommitTs' upon reception of a XLOG_PARAMETER_CHANGE
 * XLog record during recovery.
 */
void
CommitTsParameterChange(bool newvalue, bool oldvalue)
{
	/*
	 * If the commit_ts module is disabled in this server and we get word from
	 * the primary server that it is enabled there, activate it so that we can
	 * replay future WAL records involving it; also mark it as active on
	 * pg_control.  If the old value was already set, we already did this, so
	 * don't do anything.
	 *
	 * If the module is disabled in the primary, disable it here too, unless
	 * the module is enabled locally.
	 *
	 * Note this only runs in the recovery process, so an unlocked read is
	 * fine.
	 */
	if (newvalue)
	{
		if (!commitTsShared->commitTsActive)
			ActivateCommitTs();
	}
	else if (commitTsShared->commitTsActive)
		DeactivateCommitTs();
}

track_commit_timestamp primary-standby mismatch test:

  1. Initial state: primary=on, standby=on. Both can use committed_xact and similar functions.
  2. primary=off (restart primary), standby=on (no change). Both cannot use committed_xact and similar functions.

After modifying and restarting the primary, standby replication remains normal, but committed_xact and similar functions are unusable:

$ select * from pg_last_committed_xact();
ERROR:  55000: could not get commit timestamp data
HINT:  Make sure the configuration parameter "track_commit_timestamp" is set on the primary server.
LOCATION:  error_commit_ts_disabled, commit_ts.c:385

$ show track_commit_timestamp
-> ;
 track_commit_timestamp
------------------------
 on
(1 row)

Time: 0.198 ms
$ \q
## pg_controldata |grep track_commit_timestamp
track_commit_timestamp setting:       off

PG14+ Pause Recovery
#

PG14 improved the behavior when primary parameter changes cause standby crashes. When parameters don’t meet conditions, instead of the read-only standby directly crashing, it now only pauses replication. See RecoveryRequiresIntParameter.

Pause recovery on a hot standby server if the primary changes its parameters in a way that prevents replay on the standby (Peter Eisentraut)

Previously the standby would shut down immediately

Testing PG14 parameter changes causing standby replication interruption:

2025-07-23 19:46:31.337 CST,,,141823,,6880ca5f.229ff,14,,2025-07-23 19:41:19 CST,1/0,0,LOG,00000,"recovery has paused","If recovery is unpaused, the server will shut down.","You can then restart the server after making the necessary configuration changes.",,,"WAL redo at 281/78324BE8 for XLOG/PARAMETER_CHANGE: max_connections=2000 max_worker_processes=20 max_wal_senders=10 max_prepared_xacts=0 max_locks_per_xact=1024 wal_level=logical wal_log_hints=on track_commit_timestamp=on",,,,"","startup",,0

Since replication has already stopped, changing the primary’s parameters back won’t help — the standby can’t apply subsequent changes and update the control file. So you must modify the standby’s parameters and restart (the log hint is also quite clear).

Summary of the 8 Parameters
#

When any of the 8 parameters are modified on the primary and the primary is restarted, the local control file is updated. If parameters have changed, the updated parameters are written to WAL and synchronized to downstream. The downstream redoes this PARAMETER_CHANGE WAL record, updating its local control file. The standby then determines whether primary-standby replication or other functions are available based on certain conditions.

8 Parameters Written to Control FileCheckIf not, standby (PG13-)If not, standby (PG14+)
wal_level!=minimalCannot sync, fundamentalCannot sync, fundamental
max_connectionsprimary <= standbyhot standby shutdownhot standby pause replication
max_worker_processesprimary <= standbyhot standby shutdownhot standby pause replication
max_wal_sendersprimary <= standbyhot standby shutdownhot standby pause replication
max_prepared_transactionsprimary <= standbyhot standby shutdownhot standby pause replication
max_locks_per_transactionprimary <= standbyhot standby shutdownhot standby pause replication
wal_log_hintspg_rewind prerequisite (either data checksums or wal_log_hints = on)Doesn’t affect standby syncDoesn’t affect standby sync
track_commit_timestampEnable/disable standby commit_ts functionalityDoesn’t affect standby syncDoesn’t affect standby sync

Special thanks to: Gao Changjun

Related

A Brief Analysis of PostgreSQL TRUNCATE

·2240 words·11 mins
Command Options # TRUNCATE [ TABLE ] [ ONLY ] name [ * ] [, ... ] [ RESTART IDENTITY | CONTINUE IDENTITY ] [ CASCADE | RESTRICT ] 1. ONLY: truncate only the specified table. When a table has inheritance children or child partitions, by default they are truncated together; ONLY can truncate just the inheritance parent table. Partitioned parent tables cannot specify ONLY. -- Cannot truncate only a partitioned parent table => truncate only parttable; ERROR: 42809: cannot truncate only a partitioned table HINT: Do not specify the ONLY keyword, or use TRUNCATE ONLY on the partitions directly. LOCATION: ExecuteTruncate, tablecmds.c:1655 -- truncate only the inheritance parent table, only the parent is cleaned => truncate table only parenttable; TRUNCATE TABLE => select tableoid::regclass,count(*) from parenttable group by tableoid::regclass ; tableoid | count ------------+------- childtable | 1 -- Directly truncate the inheritance parent table, child tables are also cleaned => truncate table parenttable; TRUNCATE TABLE => select tableoid::regclass,count(*) from parenttable group by tableoid::regclass ; tableoid | count ----------+------- (0 rows) 2. RESTART IDENTITY CONTINUE IDENTITY: whether to reset sequences on columns. Default is CONTINUE.

From Extremely Slow Unique Index Scan to Index Bloat

·6561 words·31 mins
How Did a Primary Key Query Access Multiple Data Pages? # Continuing from the previous article: A Classic Case of Long Transactions, Table Bloat, and LIMIT Problems, there was one point not explained in detail: Why does a query using the primary key generate so many shared hits? Why does index bloat cause access to multiple data pages? Can’t data outside the page be located through the corresponding index entry? This relates to index version management — indexes do carry some version information, but not much. Let’s first review PostgreSQL’s btree index structure.

How Does PG Access Basic System Tables Before pg_class Exists?

·1815 words·9 mins
How does the database access system tables before pg_class exists? This question can be divided into two stages: Database cluster initialization — at this point no database exists at all, so how to construct and access system tables like pg_class is a problem. Private memory initialization of system tables. PG stores system table information in the local backend process. How does the backend load pg_class during initialization? Initializing the Data Dictionary # When the database hasn’t been initialized yet, it’s obviously impossible to access the data dictionary to initialize objects like database, pg_class, etc., because without a database you can’t CREATE DATABASE, and without pg_class you can’t look up metadata information.