Skip to main content
  1. Posts/

Getting Started with pg_rewind

·842 words·4 mins
liuzhilong62
Author
liuzhilong62
PostgreSQL DBA. Writing about database internals, production cases, and source code analysis.

What is pg_rewind?
#

pg_rewind is a PostgreSQL-provided tool. When the timelines of two PG instances diverge, pg_rewind can synchronize them. (For example, the primary is running, the standby failover has been running for a while — at this point the primary and standby timelines have diverged.)

pg_rewind compares the sizes of files between the source and target, then copies differing files from source to target, including configuration files. However, it does not compare unchanged files, so pg_rewind runs efficiently on large databases with few changes.

pg_rewind can be used after a standby failover: even if the standby has been running independently for some time, it can be pulled back to the same state as the primary and become a standby again.

During execution, pg_rewind compares the divergence point between primary (source) and standby (target), and transmits the primary’s WAL logs after the divergence point to the standby. Therefore, if the primary’s WAL after the divergence point is also lost, rewind won’t copy nonexistent WAL logs, and the standby will still fail to become a standby. The solution is to use restore.

!!! When using pg_rewind, back up the target instance. pg_rewind directly overwrites the target database’s files. If rewind fails, the target database may be unable to start.

Using pg_rewind
#

After a primary-standby switchover, the old primary continues running, causing timeline inconsistency. The old primary cannot start as a standby for the new primary.

When attempting to start the standby, a timeline error appears:

LOG:  entering standby mode
FATAL:  requested timeline 2 is not a child of this server's history
DETAIL:  Latest checkpoint is at 0/6000028 on timeline 1, but in the history of the requested timeline, the server forked off from that timeline at 0/4000098.
LOG:  startup process (PID 22321) exited with exit code 1
LOG:  aborting startup due to startup process failure
LOG:  database system is shut down

At this point, rewind is needed to realign the primary and standby.

  1. Configure pg_hba on the current primary Set up login permissions for the pg_rewind user to access the source database. hba changes require a database restart.
vi $source/pg_hba.conf
host    all       pg         172.17.100.150/32          trust

pg_rewind requires a high-privilege user. Newer PG versions allow granting privileges; older versions should use a superuser. My environment is PG 9.6, so I use the OS superuser directly.

  1. wal_log_hints = on parameter configuration Append wal_log_hints = on to the target database’s postgres.conf, then start and shut down the target database once (at this point the primary is running and the standby is shut down).
vi $dest/postgres.conf

wal_log_hints = on
  1. Execute pg_rewind
[pg@lzl pg96data_sla]$ /pg/pg96/bin/pg_rewind --target-pgdata /pg/pg96data_pri --source-server='host=172.17.100.150 port=5433 user=pg password=oracle  dbname=postgres'
servers diverged at WAL position 0/4000098 on timeline 1
rewinding from last common checkpoint at 0/4000028 on timeline 1
Done!
  1. Configure standby parameters Modify IP, port, directory, etc. in postgres.conf and recovery.conf. pg_rewind also copies configuration files over.
[pg@lzl pg96data_pri]$ mv recovery.done recovery.conf
[pg@lzl pg96data_pri]$ vi recovery.conf
[pg@lzl pg96data_pri]$ vi postgres.conf
  1. Start the standby
[pg@lzl pg96data_pri]$  /pg/pg96/bin/pg_ctl -D /pg/pg96data_sla -l /pg/pg96data_sla/server.log start        
server starting
[pg@lzl pg96data_sla]$ psql -p5433 postgres
psql (9.6.17)
postgres=# \x
Expanded display is on.
postgres=# select * from  pg_stat_replication ;
-[ RECORD 1 ]----+------------------------------
pid              | 24766
usesysid         | 16384
usename          | lzl
application_name | walreceiver
client_addr      | 172.17.100.150
client_hostname  | 
client_port      | 47345
backend_start    | 2021-07-30 07:44:05.582546+00
backend_xmin     | 
state            | streaming
sent_location    | 0/4033790
write_location   | 0/4033790
flush_location   | 0/4033790
replay_location  | 0/4033790
sync_priority    | 0
sync_state       | async

Common Issues
#

pg_rewind Error 1
#

could not fetch remote file "global/pg_control": ERROR:  must be superuser to read files
Failure, exiting

Solution: Use a high-privilege user.

postgres=# \du
                                   List of roles
  Role name  |                         Attributes                         | Member of 
-------------+------------------------------------------------------------+-----------
 lzl         | Replication                                                | {}
 pg          | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 rewind_user |                                                            | {}

The pg user is the built-in superuser that comes with the PG server, matching the PG installation user. The OS installation user certainly has permission to modify pg_control.

/pg/pg96/bin/pg_rewind --target-pgdata /pg/pg96data_pri --source-server='host=172.17.100.150 port=5433 user=pg password=oracle  dbname=postgres'

pg_rewind Error 2
#

could not connect to server: FATAL:  no pg_hba.conf entry for host "172.17.100.150", user "rewind_user", database "postgres"
Failure, exiting

No pg_hba.conf entry configured for the connection. Solution: Configure pg_hba for the user, e.g.:

host    all       pg         172.17.100.150/32          trust

pg_rewind Error 3
#

[pg@lzl pg96data_sla]$   /pg/pg96/bin/pg_rewind --target-pgdata /pg/pg96data_pri --source-server='host=172.17.100.150 port=5433 user=pg password=oracle  dbname=postgres'

target server needs to use either data checksums or "wal_log_hints = on"

Root causes:

  1. full_page_writes (enabled by default)
  2. wal_log_hints must be set to on, or PG must have checksums enabled at initdb time.

Solution: Add wal_log_hints = on to the target database’s postgres.conf, then start and shut down the target database once (the target was already shut down — it must be started and shut down again for the parameter to take effect).

vi postgres.conf  # add to target database config

wal_log_hints = on

Restart the target database to apply:

[pg@lzl pg96data_sla]$  /pg/pg96/bin/pg_ctl -D /pg/pg96data_pri -l /pg/pg96data_pri/server.log start      
server starting
[pg@lzl pg96data_sla]$  /pg/pg96/bin/pg_ctl -D /pg/pg96data_pri -l /pg/pg96data_pri/server.log stop
waiting for server to shut down.... done

References
#

https://www.postgresql.org/docs/9.6/app-pgrewind.html

Related

How I Got Into Wuhan University's Part-Time Master's Program

·5043 words·24 mins
Why Did I Want to Pursue a Part-Time Master’s? # To improve my academic credentials. My undergraduate degree is from an ordinary university. A higher degree can add a bit of competitiveness in my career. I once submitted my resume to a state-owned enterprise and was completely ghosted. But a colleague with better academic credentials in the same office got through. So for state-owned enterprises, higher education is the knock on the door. To make up for failing the graduate entrance exam as a senior and revive the dream of graduate studies. Learning is never wrong — this is my creed. Differences Between Full-Time and Part-Time Graduate Programs # Study Mode # Full-time means you quit your job; part-time allows you to keep working. This basically locks in part-time as the only option for most working people.

OGG Oracle-to-PostgreSQL Sync — Hands-On Steps

·1927 words·10 mins
Source DB: Oracle (11.2.0.4) 192.168.10.141 Target DB: PGSQL (10.12) 192.168.10.128 OGG software version: (19.1.0.0.4) OGG download: Oracle GoldenGate Downloads glibc issue handling: https://www.cnblogs.com/hxlasky/p/16779047.html 1. Install OGG Software on Source and Target # Source: A. Configure response file: oggcore.rsp oracle.install.responseFileVersion=/home/oracle/oggcore.rsp INSTALL_OPTION=ORA11g SOFTWARE_LOCATION=/oracle/ogg START_MANAGER=false MANAGER_PORT=7809 DATABASE_LOCATION=/oracle/db/11.2.0.4 INVENTORY_LOCATION=/oracle/oraInventory UNIX_GROUP_NAME=oinstall B. Silent install OGG ./runInstaller -silent -nowait -responseFile /home/oracle/oggcore.rsp oracle@szgtsp431-or@ecsdb>./runInstaller -silent -nowait -responseFile /home/oracle/oggcore.rsp Starting Oracle Universal Installer... Checking Temp space: must be greater than 120 MB. Actual 32405 MB Passed Checking swap space: must be greater than 150 MB. Actual 2048 MB Passed Preparing to launch Oracle Universal Installer from /tmp/OraInstall2020-08-14_08-57-27AM. Please wait ... You can find the log of this install session at: /oracle/oraInventory/logs/installActions2020-08-14_08-57-27AM.log Successfully Setup Software. The installation of Oracle GoldenGate Core was successful. Please check '/oracle/oraInventory/logs/silentInstall2020-08-14_08-57-27AM.log' for more details. 2. Set Database to Archive Mode # oracle@szgtsp431-or@ecsdb>sqlplus / as sysdba SQL*Plus: Release 11.2.0.4.0 Production on Fri Aug 14 09:06:34 2020 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> archive log list; Database log mode Archive Mode Automatic archival Enabled Archive destination /oracle/oradata/archivelog Oldest online log sequence 19 Next log sequence to archive 21 Current log sequence 21 3. Enable Force Logging and Minimum Supplemental Logging # alter database force logging; alter database add supplemental log data; alter system switch logfile; Verify force logging and minimum supplemental logging enabled:

OGG PostgreSQL-to-Oracle Sync — Hands-On Steps

·1338 words·7 mins
OGG software version: (19.1.0.0.4) Oracle version: 11.2.0.4 PG version: pg10 OGG download: https://www.oracle.com/technetwork/middleware/goldengate/downloads/index.html glibc issue handling: https://www.cnblogs.com/hxlasky/p/16779047.html 1. Create Database and Table on Source # [root@node2 ~]# su - postgres Last login: Tue Jul 21 21:08:52 CST 2020 on pts/0 [postgres@node2 ~]$ pg_ctl -D /opt/pgsql_data -l logfile start waiting for server to start.... done server started postgres=# create database test postgres=# \c lzldb postgres=# create table tab1(id int primary key,name varchar(20)) 2. Create Database and Table on Target # sqlplus / as sysdba SQL> create table ORALZL.tab1(id number primary key,name varchar2(20)); 3. Extract and Install OGG for PostgreSQL # -- Unlike OGG for Oracle, OGG for PG only needs extraction. Oracle version requires running runInstaller. [postgres@node1 ~]$ id postgres uid=54323(postgres) gid=54330(postgres) groups=54330(postgres) [postgres@node1 ~]$ exit logout [root@node1 ~]# mkdir /ogg [root@node1 ~]# chown -R postgres /ogg [root@node1 ~]# chmod -R 755 /ogg [root@node1 ~]# [root@node1 soft]# ls -l total 240744 -rw-r--r--. 1 root root 87028695 Jul 22 02:51 19100200714_ggs_Linux_x64_PostgreSQL_64bit.zip [root@node1 soft]# chmod 777 19100200714_ggs_Linux_x64_PostgreSQL_64bit.zip [root@node1 soft]# unzip 19100200714_ggs_Linux_x64_PostgreSQL_64bit.zip Archive: 19100200714_ggs_Linux_x64_PostgreSQL_64bit.zip inflating: ggs_Linux_x64_PostgreSQL_64bit.tar inflating: OGG-19.1.0.0-README.txt inflating: release-notes-oracle-goldengate_19.1.0.200714.pdf [root@node1 soft]# chmod 777 ggs_Linux_x64_PostgreSQL_64bit.tar [root@node1 soft]# su - postgres [postgres@node1 ~]$ cd /soft [postgres@node1 soft]$ tar -xf ggs_Linux_x64_PostgreSQL_64bit.tar -C /ogg 4. Configure PG User Environment Variables # Source PG: