PostgreSQL主从切换测试
PostgreSQL主从切换测试
1 switchover 切换测试
1.1 test1切换为备库,test2切换为主库
1.1.1 test2备库查询主从状态
[pg12@test2 ~]$ repmgr -f /home/pg12/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------
1 | test1 | primary | * running | | default | 100 | 1 | host=192.168.10.100 user=repmgr port=5432 dbname=postgres
2 | test2 | standby | running | test1 | default | 100 | 1 | host=192.168.10.101 user=repmgr port=5432 dbname=postgres
1.1.2 test2备库执行切换
[pg12@test2 ~]$ repmgr -f /home/pg12/conf/repmgr.conf standby switchover --siblings-follow
NOTICE: executing switchover on node "test2" (ID: 2)
WARNING: option "--sibling-nodes" specified, but no sibling nodes exist
NOTICE: local node "test2" (ID: 2) will be promoted to primary; current primary "test1" (ID: 1) will be demoted to standby
NOTICE: stopping current primary node "test1" (ID: 1)
NOTICE: issuing CHECKPOINT on node "test1" (ID: 1)
DETAIL: executing server command "/home/pg12/soft/bin/pg_ctl -D '/home/pg12/repmgr' -W -m fast stop"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
2024-09-15 03:33:06.219 EDT [13671] LOG: replication terminated by primary server
2024-09-15 03:33:06.219 EDT [13671] DETAIL: End of WAL reached on timeline 1 at 0/321B618.
2024-09-15 03:33:06.219 EDT [13671] FATAL: could not send end-of-streaming message to primary: no COPY in progress
2024-09-15 03:33:06.219 EDT [13667] LOG: invalid record length at 0/321B618: wanted 24, got 0
2024-09-15 03:33:06.222 EDT [20814] FATAL: could not connect to the primary server: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/321B5A0
NOTICE: promoting standby to primary
DETAIL: promoting server "test2" (ID: 2) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
2024-09-15 03:33:07.509 EDT [13667] LOG: received promote request
2024-09-15 03:33:07.509 EDT [13667] LOG: redo done at 0/321B5A0
2024-09-15 03:33:07.509 EDT [13667] LOG: last completed transaction was at log time 2024-09-15 03:33:05.489002-04
2024-09-15 03:33:07.515 EDT [13667] LOG: selected new timeline ID: 2
2024-09-15 03:33:07.557 EDT [13667] LOG: archive recovery complete
2024-09-15 03:33:07.560 EDT [13666] LOG: database system is ready to accept connections
NOTICE: STANDBY PROMOTE successful
DETAIL: server "test2" (ID: 2) was successfully promoted to primary
INFO: local node 1 can attach to rejoin target node 2
DETAIL: local node's recovery point: 0/321B5A0; rejoin target node's fork point: 0/321B618
NOTICE: setting node 1's upstream to node 2
WARNING: unable to ping "host=192.168.10.100 user=repmgr port=5432 dbname=postgres"
DETAIL: PQping() returned "PQPING_NO_RESPONSE"
NOTICE: starting server using "/home/pg12/soft/bin/pg_ctl -w -D '/home/pg12/repmgr' start"
NOTICE: NODE REJOIN successful
DETAIL: node 1 is now attached to node 2
NOTICE: node "test2" (ID: 2) promoted to primary, node "test1" (ID: 1) demoted to standby
NOTICE: switchover was successful
DETAIL: node "test2" is now primary and node "test1" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully
1.1.3 查看主从状态
–test2 执行
[pg12@test2 ~]$ repmgr -f /home/pg12/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------
1 | test1 | standby | running | test2 | default | 100 | 2 | host=192.168.10.100 user=repmgr port=5432 dbname=postgres
2 | test2 | primary | * running | | default | 100 | 2 | host=192.168.10.101 user=repmgr port=5432 dbname=postgres
–test1 执行
[pg12@test1 repmgr]$ repmgr -f /home/pg12/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------
1 | test1 | standby | running | test2 | default | 100 | 2 | host=192.168.10.100 user=repmgr port=5432 dbname=postgres
2 | test2 | primary | * running | | default | 100 | 2 | host=192.168.10.101 user=repmgr port=5432 dbname=postgres
当前test2为主库,test1为备库
1.2 test1切换为主库,test2切换为备库
1.2.1 test1备库查询主从状态
[pg12@test1 repmgr]$ repmgr -f /home/pg12/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------
1 | test1 | standby | running | test2 | default | 100 | 2 | host=192.168.10.100 user=repmgr port=5432 dbname=postgres
2 | test2 | primary | * running | | default | 100 | 2 | host=192.168.10.101 user=repmgr port=5432 dbname=postgres
1.2.2 test1备库执行切换
[pg12@test1 repmgr]$ repmgr -f /home/pg12/conf/repmgr.conf standby switchover --siblings-follow
NOTICE: executing switchover on node "test1" (ID: 1)
WARNING: option "--sibling-nodes" specified, but no sibling nodes exist
NOTICE: local node "test1" (ID: 1) will be promoted to primary; current primary "test2" (ID: 2) will be demoted to standby
NOTICE: stopping current primary node "test2" (ID: 2)
NOTICE: issuing CHECKPOINT on node "test2" (ID: 2)
DETAIL: executing server command "/home/pg12/soft/bin/pg_ctl -D '/home/pg12/repmgr' -W -m fast stop"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/323AAC8
NOTICE: promoting standby to primary
DETAIL: promoting server "test1" (ID: 1) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "test1" (ID: 1) was successfully promoted to primary
INFO: local node 2 can attach to rejoin target node 1
DETAIL: local node's recovery point: 0/323AAC8; rejoin target node's fork point: 0/323AB40
NOTICE: setting node 2's upstream to node 1
WARNING: unable to ping "host=192.168.10.101 user=repmgr port=5432 dbname=postgres"
DETAIL: PQping() returned "PQPING_NO_RESPONSE"
NOTICE: starting server using "/home/pg12/soft/bin/pg_ctl -w -D '/home/pg12/repmgr' start"
NOTICE: NODE REJOIN successful
DETAIL: node 2 is now attached to node 1
NOTICE: node "test1" (ID: 1) promoted to primary, node "test2" (ID: 2) demoted to standby
NOTICE: switchover was successful
DETAIL: node "test1" is now primary and node "test2" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully
1.1.3 查看主从状态
–test1 执行
[pg12@test1 repmgr]$ repmgr -f /home/pg12/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------
1 | test1 | primary | * running | | default | 100 | 3 | host=192.168.10.100 user=repmgr port=5432 dbname=postgres
2 | test2 | standby | running | test1 | default | 100 | 3 | host=192.168.10.101 user=repmgr port=5432 dbname=postgres
–test2 执行
[pg12@test2 repmgr]$ repmgr -f /home/pg12/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------
1 | test1 | primary | * running | | default | 100 | 3 | host=192.168.10.100 user=repmgr port=5432 dbname=postgres
2 | test2 | standby | running | test1 | default | 100 | 3 | host=192.168.10.101 user=repmgr port=5432 dbname=postgres
当前test1为主库,test2为备库
2 failover 切换测试
2.1 模拟主库故障
[pg12@test1 repmgr]$ pg_ctl stop
waiting for server to shut down.... done
server stopped
2.2 查看主从状态
test2 已转换为主库
[pg12@test2 repmgr]$ repmgr -f /home/pg12/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------
1 | test1 | primary | - failed | ? | default | 100 | | host=192.168.10.100 user=repmgr port=5432 dbname=postgres
2 | test2 | primary | * running | | default | 100 | 4 | host=192.168.10.101 user=repmgr port=5432 dbname=postgres
WARNING: following issues were detected
- unable to connect to node "test1" (ID: 1)
HINT: execute with --verbose option to see connection error messages
2.3 重新注册备库
[pg12@test1 repmgr]$ repmgr -f /home/pg12/conf/repmgr.conf node rejoin -d 'host=192.168.10.101 user=repmgr port=5432 dbname=postgres' --force-rewind
NOTICE: executing pg_rewind
DETAIL: pg_rewind command is "/home/pg12/soft/bin/pg_rewind -D '/home/pg12/repmgr' --source-server='host=192.168.10.101 user=repmgr port=5432 dbname=postgres'"
pg_rewind: servers diverged at WAL location 0/3255E40 on timeline 3
pg_rewind: no rewind required
NOTICE: 0 files copied to /home/pg12/repmgr
NOTICE: setting node 1's upstream to node 2
WARNING: unable to ping "host=192.168.10.100 user=repmgr port=5432 dbname=postgres"
DETAIL: PQping() returned "PQPING_NO_RESPONSE"
NOTICE: starting server using "/home/pg12/soft/bin/pg_ctl -w -D '/home/pg12/repmgr' start"
NOTICE: NODE REJOIN successful
DETAIL: node 1 is now attached to node 2
2.4 查看主从状态
–test1 执行
[pg12@test1 repmgr]$ repmgr -f /home/pg12/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------
1 | test1 | standby | running | test2 | default | 100 | 3 | host=192.168.10.100 user=repmgr port=5432 dbname=postgres
2 | test2 | primary | * running | | default | 100 | 4 | host=192.168.10.101 user=repmgr port=5432 dbname=postgres
[pg12@test1 repmgr]$ psql -p 5432
psql (12.9)
Type "help" for help.
postgres=# checkpoint;
CHECKPOINT
postgres=# \q
[pg12@test1 repmgr]$ repmgr -f /home/pg12/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------
1 | test1 | standby | running | test2 | default | 100 | 4 | host=192.168.10.100 user=repmgr port=5432 dbname=postgres
2 | test2 | primary | * running | | default | 100 | 4 | host=192.168.10.101 user=repmgr port=5432 dbname=postgres
–test2 执行
[pg12@test2 repmgr]$ repmgr -f /home/pg12/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-----------------------------------------------------------
1 | test1 | standby | running | test2 | default | 100 | 4 | host=192.168.10.100 user=repmgr port=5432 dbname=postgres
2 | test2 | primary | * running | | default | 100 | 4 | host=192.168.10.101 user=repmgr port=5432 dbname=postgres