PostgreSQL Hot Standby (Streaming Replication)

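Starting with the 9.x releases, PostgreSQL provides streaming replication, which together with hot standby makes read/write splitting possible. It is a typical high-availability setup for PG and is what we traditionally call a hot standby, comparable to Oracle Data Guard, SQL Server mirroring, or MySQL read/write splitting. Compared with those databases it has both similarities and differences, which are discussed later. Below are the steps to set up PG streaming replication, followed by a test.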

Environment:
VMware Workstation 8.0
OS: CentOS 6.2
Database: PostgreSQL 9.1.3

Two virtual machines
MASTER: 192.168.2.130
SLAVE: 192.168.2.129

Environment variables
[postgres@localhost ~]$ echo $PGHOME
/home/postgres

[postgres@localhost ~]$ echo $PGDATA
/database/pgdata

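Step 1: Install PostgreSQL
Omitted. On the slave it is enough to install the software; the database cluster does not need to be initialized.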

Step 2: Create the replication user

Run on the master:
CREATE USER repuser REPLICATION LOGIN CONNECTION LIMIT 3 ENCRYPTED PASSWORD 'repuser';

Step 3: Configure the access file pg_hba.conf on the master

Add one line (a /32 mask restricts access to the slave's address only):
host replication repuser 192.168.2.129/32 md5

Step 4: Edit postgresql.conf on the master

max_wal_senders = 1
wal_level = hot_standby

archive_mode = on
archive_command = 'cd ./'

hot_standby = on
wal_keep_segments = 64

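Notes:
max_wal_senders is the maximum number of concurrent connections from slaves; set it to at least the number of slaves you have.
wal_level controls how much information is written to the write-ahead log; for streaming replication with a readable standby it must be set to hot_standby.
wal_keep_segments is the minimum number of past WAL segment files kept in pg_xlog for slaves to fetch. Its default in 9.1 is 0; a segment is normally 16 MB.
hot_standby only takes effect on a standby. Setting it on the master is harmless, and the setting travels to the slave with the copied data directory.
Archiving can also be turned off. It is used for point-in-time recovery and is not required for streaming replication; the archive_command above ('cd ./') is a placeholder that archives nothing.
Changes to wal_level, max_wal_senders and archive_mode require a restart of the master to take effect.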
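
To get a feel for what wal_keep_segments costs on disk, the arithmetic can be sketched as below. This is only an illustration: wal_keep_mb is a hypothetical helper name, and it assumes the default 16 MB segment size (a compile-time setting that may differ on a custom build).

```shell
#!/bin/sh
# Estimate the minimum disk space (in MB) that wal_keep_segments keeps in pg_xlog.
# Assumption: 16 MB per WAL segment unless a second argument says otherwise.
wal_keep_mb() {
    segments=$1
    segment_mb=${2:-16}
    echo $(( segments * segment_mb ))
}

wal_keep_mb 64    # prints 1024
```

With the value used in this article, the master keeps at least about 1 GB of WAL available for the slave to catch up from.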

Step 5: Base backup (on the master)
5.1: Start the file-level backup. This requires wal_level to be archive or hot_standby, which was set above.
select pg_start_backup('Replication work');

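5.2: Copy the $PGDATA directory to the slave server. The contents of pg_xlog can be excluded, because they are cleared on the slave anyway. Also watch out for permissions when creating the tar archive. This time tar failed with:

tar (child): Cannot open: Permission denied
tar: Error is not recoverable: exiting now

The problem was resolved by giving the postgres user ownership of /database (chown postgres).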

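Create the archive (run from /database; the --exclude option is placed before the directory name so that it applies regardless of the tar version)
tar czvf pgdata.tar.gz --exclude=pgdata/pg_xlog pgdata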

If PostgreSQL is already installed on the standby and its data directory has the same name, stop the standby database first and rename the data directory
mv pgdata pgdata.old

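Copy the archive to the slave and extract it there in /database (the data directory must end up owned by the postgres user)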
scp pgdata.tar.gz root@192.168.2.129:/database/
tar xzvf pgdata.tar.gz

5.3: Once the steps above are finished, end the backup on the master
select pg_stop_backup(),current_timestamp;
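
The packaging in step 5.2 can be sketched as a small function. This is a minimal sketch, not the exact command used above: pack_pgdata is a hypothetical helper, and it excludes only the contents of pg_xlog, so an empty pg_xlog directory is kept in the archive.

```shell
#!/bin/sh
# Pack a data directory for the standby, leaving out the contents of pg_xlog.
# pack_pgdata is a hypothetical helper, not a PostgreSQL tool.
#   $1: parent directory of pgdata (for example /database)
#   $2: absolute path of the archive to create
pack_pgdata() {
    parent=$1
    archive=$2
    # -C switches to the parent directory so member names start with "pgdata/".
    # --exclude is given before the member name so it applies on any tar version.
    tar -C "$parent" --exclude='pgdata/pg_xlog/*' -czf "$archive" pgdata
}

# Example with the paths from this article (run between pg_start_backup and pg_stop_backup):
# pack_pgdata /database /database/pgdata.tar.gz
```

Listing the result with tar tzf before copying it is a cheap way to confirm that no WAL segments were included.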

Step 6: Configure the slave

6.1: postgresql.conf
hot_standby = on

6.2: recovery.conf
$ cp $PGHOME/share/recovery.conf.sample $PGDATA/recovery.conf
$ vi $PGDATA/recovery.conf    # add the following three lines
standby_mode = 'on'
trigger_file = '/database/pgdata/postgresql.trigger.1949'
primary_conninfo = 'host=192.168.2.130 port=1949 user=repuser password=repuser keepalives_idle=60'
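
The three settings can also be written by a script instead of editing the sample file by hand. The following is a minimal sketch under that assumption; write_recovery_conf is a hypothetical helper, and it overwrites any existing recovery.conf in the target directory.

```shell
#!/bin/sh
# Write a minimal recovery.conf for a streaming-replication standby.
# write_recovery_conf is a hypothetical helper; the three settings are the
# ones listed in step 6.2.
#   $1: standby data directory   $2: master host   $3: master port
#   $4: replication user         $5: password
write_recovery_conf() {
    datadir=$1; host=$2; port=$3; user=$4; password=$5
    cat > "$datadir/recovery.conf" <<EOF
standby_mode = 'on'
trigger_file = '$datadir/postgresql.trigger.$port'
primary_conninfo = 'host=$host port=$port user=$user password=$password keepalives_idle=60'
EOF
}

# Example with the values from this article:
# write_recovery_conf /database/pgdata 192.168.2.130 1949 repuser repuser
```

Creating the file named by trigger_file later (for example with touch) ends recovery on the standby and promotes it to a read/write server.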

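6.3: Configure the .pgpass file (on the slave)
Add a password file so that the slave can connect to the master without being prompted for a password. For replication connections the database field must be replication. Because primary_conninfo above already contains the password, this file is optional.
192.168.2.130:1949:replication:repuser:repuser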
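
A short sketch of creating that entry from the shell follows. add_pgpass_entry is a hypothetical helper; it writes replication as the database field, which is what replication connections look up, and it tightens the file mode because libpq ignores a password file that group or others can access.

```shell
#!/bin/sh
# Append a replication entry to a password file and restrict its permissions.
# add_pgpass_entry is a hypothetical helper.
#   $1: password file (normally ~/.pgpass)   $2: host   $3: port
#   $4: user                                   $5: password
add_pgpass_entry() {
    pgpass=$1
    # Format: hostname:port:database:username:password
    printf '%s:%s:replication:%s:%s\n' "$2" "$3" "$4" "$5" >> "$pgpass"
    # Only the owner may read or write the file.
    chmod 600 "$pgpass"
}

# Example with the values from this article:
# add_pgpass_entry ~/.pgpass 192.168.2.130 1949 repuser repuser
```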

6.4: On the slave, delete the pid file and pg_xlog that were copied over from the master, then recreate an empty pg_xlog
$ rm -rf $PGDATA/pg_xlog
$ rm -f $PGDATA/postmaster.pid
$ mkdir $PGDATA/pg_xlog

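Step 7: Start the slave
Start the standby normally (pg_ctl -D $PGDATA -l pg.log start); if anything goes wrong, check the log.
Once replication is running it can be verified through the CSV log. That is not configured here, so we look at the processes directly.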

7.1 Processes on the master:

[postgres@localhost ~]$ ps -ef|grep postgres
root 2454 2438 0 20:25 pts/0 00:00:00 su - postgres
postgres 2461 2454 0 20:25 pts/0 00:00:00 -bash
postgres 2535 1 0 20:26 pts/1 00:00:00 /home/postgres/bin/postgres -D /database/pgdata
postgres 2537 2535 0 20:26 ? 00:00:00 postgres: writer process
postgres 2538 2535 0 20:26 ? 00:00:00 postgres: wal writer process
postgres 2539 2535 0 20:26 ? 00:00:00 postgres: autovacuum launcher process
postgres 2540 2535 0 20:26 ? 00:00:00 postgres: archiver process
postgres 2541 2535 0 20:26 ? 00:00:00 postgres: stats collector process
postgres 3079 2535 0 21:56 ? 00:00:00 postgres: wal sender process repuser 192.168.2.129(45446) streaming 0/C01EDB8
postgres 3116 2535 0 22:02 ? 00:00:00 postgres: postgres postgres 192.168.2.1(52648) idle
postgres 3118 2535 0 22:02 ? 00:00:00 postgres: postgres test 192.168.2.1(52649) idle
postgres 3120 2535 0 22:02 ? 00:00:00 postgres: postgres test 192.168.2.1(52654) idle
root 3126 2490 0 22:04 pts/1 00:00:00 su - postgres
postgres 3214 3128 0 22:16 pts/1 00:00:00 grep postgres
postgres 3128 3126 0 22:04 pts/1 00:00:00 -bash
postgres 3213 3128 2 22:16 pts/1 00:00:00 ps -ef

7.2 Processes on the slave:

[postgres@localhost ~]$ ps -ef|grep postgres
postgres 2856 1 0 21:54 pts/2 00:00:00 /home/postgres/bin/postgres -D /database/pgdata
postgres 2858 2856 0 21:54 ? 00:00:00 postgres: startup process recovering 000000010000000000000003
postgres 2859 2856 0 21:54 ? 00:00:00 postgres: writer process
postgres 2860 2856 0 21:54 ? 00:00:00 postgres: stats collector process
postgres 2899 2856 0 21:56 ? 00:00:00 postgres: wal receiver process streaming 0/C01ED28
postgres 3007 2856 0 22:02 ? 00:00:00 postgres: postgres postgres 192.168.2.1(52652) idle
postgres 3013 2856 0 22:03 ? 00:00:00 postgres: postgres test 192.168.2.1(52657) idle
postgres 3014 2856 0 22:03 ? 00:00:00 postgres: postgres test 192.168.2.1(52658) idle
root 3020 2756 0 22:04 pts/2 00:00:00 su - postgres
postgres 3022 3020 0 22:04 pts/2 00:00:00 -bash
postgres 3091 3022 4 22:15 pts/2 00:00:00 ps -ef
postgres 3092 3022 0 22:15 pts/2 00:00:00 grep postgres

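At this point WAL files have also appeared under pg_xlog on the slave, and the backup_label file created by pg_start_backup has been renamed to backup_label.old.

Log contents: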
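
The streaming positions shown in the process lists above can be pulled out with a small filter, which makes it easier to compare master and slave. This is a rough sketch: streaming_position is a hypothetical helper, and it relies on the process titles looking like the ones in this article, which may vary between versions.

```shell
#!/bin/sh
# Print the WAL position reported by wal sender / wal receiver processes.
# Reads `ps -ef` style lines on stdin; the position is the last field.
streaming_position() {
    awk '/wal (sender|receiver) process/ && /streaming/ { print $NF }'
}

# Usage on either server:
# ps -ef | streaming_position
```

For the listings above this yields 0/C01EDB8 on the master and 0/C01ED28 on the slave; the two were captured about a minute apart, so a small difference is expected.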
[postgres@localhost ~]$ more pgsql.log
LOG: database system was shut down in recovery at 2012-04-23 18:33:25 PDT
LOG: entering standby mode
LOG: streaming replication successfully connected to primary
LOG: redo starts at 0/8000020
LOG: consistent recovery state reached at 0/C000000
LOG: database system is ready to accept read only connections

Step 8: Test
On Master:
test=# select * from kenyon;
 id |  name
----+--------
  2 | kenyon
(1 row)

test=# insert into kenyon values (2,'kenyon testing data');
INSERT 0 1
test=#

On Slave:
test=# select * from kenyon;
 id |        name
----+---------------------
  2 | kenyon
  2 | kenyon testing data
(2 rows)

test=# delete from kenyon where id = 2;
ERROR: cannot execute DELETE in a read-only transaction

The newly inserted row has been replicated, and sessions on the slave are read-only.
