Postgres databases are currently being used as:
- Data storage for puppetdb
We have two puppet profiles for installing postgres:
- role::postgresql::master
- role::postgresql::slave
Make sure that there are only a single master
Architecture
The current postgres architecture is a simple Master->Slave setup without automatic failover.
Verification
There are some commands which can be useful to verify psql's operation:
# # To check if a database is master or slave: # psql -h <ip-address> -U postgres -c 'SELECT pg_is_in_recovery();' -qtA f = master; t = slave
Installing new nodes
WARNING: This script deletes the database on the host where it is run. Do NOT run this script on the master node, as that will cause data loss!
When a new node is installed, it needs to be connected to the master manually. This can be done with the following script from our tools-repository:
# postgres-joinMaster.sh <master_host> Stopping PostgreSQL Cleaning up old cluster directory Starting base backup as replicator pg_basebackup: initiating base backup, waiting for checkpoint to complete pg_basebackup: checkpoint completed 40842/40842 kB (100%), 1/1 tablespace NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup pg_basebackup: base backup completed Startging PostgreSQL
When a node is joined to the cluster it generates a configfile at '/var/lib/postgresql/9.6/main/recovery.conf' which makes this database server connect to that host everytime it starts.
Failover
If the primary database server fails hard, and is not coming back a new master needs to be selected. Then the remaining slaves needs to be configured to stream from the new master. After this point, if the old master comes back, it is a seperate database-server (ie: split-brain), and it should thus be configured as a slave server listening to the new master.
Promote slave to master
root@pgsql1:~# touch /var/lib/postgresql/9.6/main/triggerfile
Make other slaves follow the new master
root@pgsql2:~# postgres-joinMaster.sh pgsql1 Stopping PostgreSQL Cleaning up old cluster directory Starting base backup as replicator pg_basebackup: initiating base backup, waiting for checkpoint to complete pg_basebackup: checkpoint completed 40741/40741 kB (100%), 1/1 tablespace NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup pg_basebackup: base backup completed Add trigger config Startging PostgreSQL
Make sure that the keepalive ip for postgres is assigned to the new master
After a new master is selected, make sure the new master is has gotten the postgres IP from keepalived. (In this case it is the IP-address 192.168.254.41)
root@pgsql1:~# ip add 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 00:50:56:8d:1f:aa brd ff:ff:ff:ff:ff:ff inet 192.168.254.5/26 brd 192.168.254.63 scope global ens32 valid_lft forever preferred_lft forever inet 192.168.254.41/32 scope global ens32 valid_lft forever preferred_lft forever inet6 2001:700:1d00:ec10:250:56ff:fe8d:1faa/64 scope global mngtmpaddr dynamic valid_lft 2591813sec preferred_lft 604613sec inet6 fe80::250:56ff:fe8d:1faa/64 scope link valid_lft forever preferred_lft forever