Some Features of PostgreSQL Logical Replication

Table of Contents

I’ve already written a fairly detailed article about logical replication before, so I won’t repeat the basics here. However, some knowledge points inevitably get missed. Recently I’ve discovered some interesting logical replication features.

replica identity and old/new values
#

replica identity is used to identify a row during logical replication. The above statement is certainly correct, but it doesn’t explain the changes in old and new data.

DEFAULT Records the old values of the columns of the primary key, if any. This is the default for non-system tables. USING INDEX index_name Records the old values of the columns covered by the named index, that must be unique, not partial, not deferrable, and include only columns marked NOT NULL. If this index is dropped, the behavior is the same as NOTHING. FULL Records the old values of all columns in the row. NOTHING Records no information about the old row. This is the default for system tables.

The PG official documentation only explains the old value situation for replica identity — for example, it doesn’t even mention that NOTHING won’t replicate update/delete. This shows the importance of old values.

Creating a replication link:

select pg_create_logical_replication_slot('pubtestlzl2','test_decoding');
pg_recvlogical -d lzldb --slot=pubtestlzl2 --start -f recv.sql &

Normal test_decoding replication link simulation:

--replica identity defaults to d: uses primary key when available; without primary key, defaults to nothing, unable to replicate update and delete
M=>  create table lzltest(a bigint primary key,b varchar(100),c varchar(100));
CREATE TABLE
M=> insert into lzltest values(1,'bbbbbb','ccccccccc');
INSERT 0 1
M=> update lzltest set b='b';
UPDATE 1

recvlogical output:

table public.lzltest: INSERT: a[bigint]:1 b[character varying]:'bbbbbb' c[character varying]:'ccccccccc'
table public.lzltest: UPDATE: a[bigint]:1 b[character varying]:'b' c[character varying]:'ccccccccc'

With replica identity as default, updating a non-primary-key field — all fields have only new values.

M=> update lzltest set a='111';
UPDATE 1
table public.lzltest: UPDATE: old-key: a[bigint]:1 new-tuple: a[bigint]:111 b[character varying]:'bb' c[character varying]:'ccccccccc'

With replica identity as default, updating the primary key — the identity column’s old and new values are decoded; other fields only have new values.

M=> alter table lzltest replica identity full;
ALTER TABLE

M=>  update lzltest set b='b';
UPDATE 1
table public.lzltest: UPDATE: old-key: a[bigint]:2 b[character varying]:'b' c[character varying]:'ccccccccc' new-tuple: a[bigint]:2 b[character varying]:'b' c[character varying]:'ccccccccc'

With replica identity set to full, both old and new values for the entire row are preserved.

Whether in default (primary key) or full mode, all column information is recorded. The difference lies in whether old data is present. In default mode:

insert: inherently new data, so naturally no old values — all column new values are recorded.
update: records new values for all columns; only the identity column has old values (if the identity column was updated).
delete: inherently old data, but not all columns are necessarily recorded. The same rule applies: only the identity column has old values — only the identity column is recorded.

Summary: When replica identity is default, regardless of the operation (INSERT, UPDATE, DELETE), as long as it’s old data, only the identity column is recorded; as long as it’s new data, all columns are recorded.

When changing from default to full, the decoded log volume difference isn’t particularly large, because new data always includes all columns. (Excluding scenarios that are entirely deletes) the log volume decoded under full is less than twice that of default.

pgoutput cannot be peeked
#

Create a replication slot using pgoutput:

select pg_create_logical_replication_slot('pubtestlzl','pgoutput');

Then try to peek or receive — both fail:

select * from pg_logical_slot_peek_changes('pubtestlzl',null,null);
pg_recvlogical -d lzldb --slot=pubtestlzl --start -f recv.sql &

pg_recvlogical: error: could not send replication command "START_REPLICATION SLOT "pubtestlzl" LOGICAL 0/0": ERROR:  client sent proto_version=0 but we only support protocol 1 or higher
CONTEXT:  slot "pubtestlzl", output plugin "pgoutput", in the startup callback
pg_recvlogical: disconnected; waiting 5 seconds to try again

You cannot peek or use pg_recvlogical to receive from a pgoutput replication slot. Since pgoutput is the output plugin for publish-subscribe, this plugin cannot be manually peeked or received…

Publish-Subscribe Doesn’t Have to Be PG-to-PG
#

create publication and create subscription are PG internal commands that can also be used to create links between PG databases. Third-party software can similarly use create publication and simulate subscriptions to create replication slots. This is better than directly creating replication slots because publications can manage replicated tables.

TOAST and Logical Decoding
#

TOAST columns being sent are NOT decoded! This means an entire row of data may only have part of it transmitted (when TOAST columns themselves haven’t been updated).

Normal decoding decodes all columns:

--Create a test_decoding replication slot
=>  select pg_create_logical_replication_slot('logical_dest','test_decoding');
 pg_create_logical_replication_slot 
------------------------------------
 (logical_dest,349/A80040E0)
(1 row)

--Create a table with small columns
=> create table test1(a int primary key,b varchar(100),c varchar(100));
CREATE TABLE

=> select* from pg_replication_slots;
  slot_name   |    plugin     | slot_type | datoid | database | temporary | active | active_pid |  xmin  | catalog_xmin | restart_lsn  | confirmed_flush_lsn | wal_status | safe_wal_size 
--------------+---------------+-----------+--------+----------+-----------+--------+------------+--------+--------------+--------------+---------------------+------------+---------------
 logical_dest | test_decoding | logical   | 418679 | lzldb    | f         | f      |     [null] | [null] |    872483335 | 349/A80040A8 | 349/A80040E0        | reserved   |        [null]
(1 row)


=>  insert into test1 values (1,'qwer','qwer');
INSERT 0 1
Time: 0.915 ms
=> select * from pg_logical_slot_peek_changes('logical_dest',null,null);
     lsn      |    xid    |                                               data                                               
--------------+-----------+--------------------------------------------------------------------------------------------------
 349/A8004C78 | 872483335 | BEGIN 872483335
 349/A80103E8 | 872483335 | COMMIT 872483335
 349/A8018B30 | 872483369 | BEGIN 872483369
 349/A8018B30 | 872483369 | table public.test1: INSERT: a[integer]:1 b[character varying]:'qwer' c[character varying]:'qwer'
 349/A8018C50 | 872483369 | COMMIT 872483369
(5 rows)
--insert is decoded, containing all columns

=> update test1 set b='zxcv' where c='qwer';
UPDATE 1
Time: 4.005 ms
=>  select * from pg_logical_slot_peek_changes('logical_dest',null,null);
     lsn      |    xid    |                                               data                                               
--------------+-----------+--------------------------------------------------------------------------------------------------
 349/A8004C78 | 872483335 | BEGIN 872483335
 349/A80103E8 | 872483335 | COMMIT 872483335
 349/A8018B30 | 872483369 | BEGIN 872483369
 349/A8018B30 | 872483369 | table public.test1: INSERT: a[integer]:1 b[character varying]:'qwer' c[character varying]:'qwer'
 349/A8018C50 | 872483369 | COMMIT 872483369
 349/A801D018 | 872483378 | BEGIN 872483378
 349/A801D018 | 872483378 | table public.test1: UPDATE: a[integer]:1 b[character varying]:'zxcv' c[character varying]:'qwer'
 349/A801D098 | 872483378 | COMMIT 872483378
(8 rows)
--update is decoded, containing all columns

Normally, without TOAST, decoded data includes all columns of the row.

TOAST decoding test:

--Enlarge the columns
=>  alter table test1 alter column b type varchar(3000);
ALTER TABLE
Time: 8.091 ms
=>   alter table test1 alter column c type varchar(3000);
ALTER TABLE
Time: 0.937 ms

--A batch random function
=> create or replace function f_random_str(length INTEGER) returns character varying
-> LANGUAGE plpgsql
-> AS $$
-> DECLARE
-> result varchar(3000);
-> BEGIN
-> SELECT array_to_string(ARRAY(SELECT chr((65 + round(random() * 25)) :: integer)
-> FROM generate_series(1,length)), '') INTO result;
-> return result;
-> END
-> $$;
CREATE FUNCTION

--Insert data
=>  insert into test1 values (2,f_random_str(2000),f_random_str(2000));
INSERT 0 1

--Check for TOAST
=> SELECT                                              
->   n.nspname as schema,                              
-> s.oid::regclass as relname,                     
-> s.reltoastrelid::regclass as toast_name,        
-> pg_relation_size(s.reltoastrelid) AS toast_size 
->   FROM                                              
-> pg_class s join pg_namespace n                  
-> on s.relnamespace = n.oid                       
->   WHERE                                             
-> relkind = 'r'                                   
-> AND reltoastrelid <> 0                          
-> AND n.nspname = 'public'                        
->   ORDER BY                                          
-> 3 DESC;
 schema | relname |        toast_name        | toast_size 
--------+---------+--------------------------+------------
 public | test1   | pg_toast.pg_toast_418714 |       8192
(1 row)

--Update via primary key, updating a TOAST column
=> update test1 set b='zxcv' where a=2;
UPDATE 1

=>  select * from pg_logical_slot_peek_changes('logical_dest',null,null);
     lsn      |    xid    |                                                                                                                                                                                      
--------------+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
 349/A851FD90 | 872483420 | BEGIN 872483420
 349/A85216E0 | 872483420 | table public.test1: INSERT: a[integer]:2 b[character varying]:'GIORCXQQWDBGTUNDZXAWMPYOUEGTECWTVQGDQGSPMEPJNPUQIFMESLRASBZWGONETRENDCHLDWVTDWJLTGRYUMFDOWHLEYLUTECPOVCYXFIATLKVEQTHSC'
 349/A85218A0 | 872483420 | COMMIT 872483420
 349/A8525CA8 | 872483429 | BEGIN 872483429
 349/A8525D50 | 872483429 | table public.test1: UPDATE: a[integer]:2 b[character varying]:'zxcv' c[character varying]:unchanged-toast-datum
 349/A8525DE0 | 872483429 | COMMIT 872483429

Column c, which has TOAST and was not involved in the update, has no decoded data — it directly outputs toast datum unchanged: unchanged-toast-datum.

Testing with wal2json:

=>  select pg_create_logical_replication_slot('logical_json','wal2json');
 pg_create_logical_replication_slot 
------------------------------------
 (logical_json,349/A87CAB58)
(1 row)

=>  update test1 set b='zxcv' where a=2;
UPDATE 1
=>  \pset format wrapped
Output format is wrapped.
=>  \pset columns 200
Target width is 200.
=>  select * from pg_logical_slot_peek_changes('logical_json',null,null);
-[ RECORD 1 ]------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
lsn  | 349/A87CACF8
xid  | 872483495
data | {"change":[{"kind":"update","schema":"public","table":"test1","columnnames":["a","b"],"columntypes":["integer","character varying(3000)"],"columnvalues":[2,"zxcv"],"oldkeys":{"keynames":["a"],.
     |."keytypes":["integer"],"keyvalues":[2]}}]}

=> update test1 set b='zxcv' where a=1;
UPDATE 1
Time: 1.391 ms
=>  select * from pg_logical_slot_peek_changes('logical_json',null,null);
-[ RECORD 1 ]------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
lsn  | 349/A87CACF8
xid  | 872483495
data | {"change":[{"kind":"update","schema":"public","table":"test1","columnnames":["a","b"],"columntypes":["integer","character varying(3000)"],"columnvalues":[2,"zxcv"],"oldkeys":{"keynames":["a"],.
     |."keytypes":["integer"],"keyvalues":[2]}}]}
-[ RECORD 2 ]------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
lsn  | 349/A8CCA0D8
xid  | 872483509
data | {"change":[{"kind":"update","schema":"public","table":"test1","columnnames":["a","b","c"],"columntypes":["integer","character varying(3000)","character varying(3000)"],"columnvalues":[1,"zxcv".
     |.,"qwer"],"oldkeys":{"keynames":["a"],"keytypes":["integer"],"keyvalues":[1]}}]}
 --When updating, column c data is not decoded

wal2json shows the same behavior.

MySQL’s binlog_row_image parameter can adjust whether binlog records large fields:

full (Log all columns)
minimal (Log only changed columns, and columns needed to identify rows)
noblob (Log all columns, except for unneeded BLOB and TEXT columns)

PG has absolutely no such control — by default, TOAST columns are not decoded, and there are no other options to configure~

replica identity and old/new values#

pgoutput cannot be peeked#

Publish-Subscribe Doesn’t Have to Be PG-to-PG#

TOAST and Logical Decoding#

Related

replica identity and old/new values
#

pgoutput cannot be peeked
#

Publish-Subscribe Doesn’t Have to Be PG-to-PG
#

TOAST and Logical Decoding
#