We are using Pacemaker and Corosync to automate failovers. We noticed one behaviour: when the primary node is rebooted, the standby node takes over as primary, which is fine. But when the rebooted node comes back online and its services are started, it takes back the primary role. It should ideally rejoin as standby. Are we missing any configuration?
Output of pcs resource defaults:

resource-stickiness: INFINITY
migration-threshold: 0

Stickiness is already set to INFINITY. Please suggest.
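As a sanity check, the scheduler's promotion decision (which is driven by per-node master scores rather than by resource-stickiness alone) can be inspected directly. A minimal sketch, assuming the Pacemaker CLI tools are installed on a cluster node and that the pgsql agent stores its master score under the conventional master-pgsql attribute name:

```shell
# Show scores and the planned actions from the live CIB;
# -s prints allocation/promotion scores, -L reads the live cluster.
crm_simulate -sL | grep -i -e promotion -e master

# Read the master score the pgsql resource agent has set on each node
# (attribute name master-pgsql is an assumption based on the usual
# master-<resource> convention used by crm_master).
crm_attribute -N Node1 -n master-pgsql -l reboot -G
crm_attribute -N Node2 -n master-pgsql -l reboot -G
```

If the returning node shows a higher master score than the current primary, that would explain the switch-back regardless of the INFINITY stickiness.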
Adding Config details:
[root@Node1 heartbeat]# pcs config show -l
Cluster Name: cluster1
Corosync Nodes:
 Node1 Node2
Pacemaker Nodes:
 Node1 Node2

Resources:
 Master: msPostgresql
  Meta Attrs: master-node-max=1 clone-max=2 notify=true master-max=1 clone-node-max=1
  Resource: pgsql (class=ocf provider=heartbeat type=pgsql)
   Attributes: master_ip=10.70.10.1 node_list="Node1 Node2" pgctl=/usr/pgsql-9.6/bin/pg_ctl pgdata=/var/lib/pgsql/9.6/data/ primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" psql=/usr/pgsql-9.6/bin/psql rep_mode=async restart_on_promote=true restore_command="cp /var/lib/pgsql/9.6/data/archivedir/%f %p"
   Meta Attrs: failure-timeout=60
   Operations: demote interval=0s on-fail=stop timeout=60s (pgsql-demote-interval-0s)
               methods interval=0s timeout=5s (pgsql-methods-interval-0s)
               monitor interval=4s on-fail=restart timeout=60s (pgsql-monitor-interval-4s)
               monitor interval=3s on-fail=restart role=Master timeout=60s (pgsql-monitor-interval-3s)
               notify interval=0s timeout=60s (pgsql-notify-interval-0s)
               promote interval=0s on-fail=restart timeout=60s (pgsql-promote-interval-0s)
               start interval=0s on-fail=restart timeout=60s (pgsql-start-interval-0s)
               stop interval=0s on-fail=block timeout=60s (pgsql-stop-interval-0s)
 Group: master-group
  Resource: vip-master (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=24 ip=10.70.10.2
   Operations: monitor interval=10s on-fail=restart timeout=60s (vip-master-monitor-interval-10s)
               start interval=0s on-fail=restart timeout=60s (vip-master-start-interval-0s)
               stop interval=0s on-fail=block timeout=60s (vip-master-stop-interval-0s)
  Resource: vip-rep (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=24 ip=10.70.10.1
   Meta Attrs: migration-threshold=0
   Operations: monitor interval=10s on-fail=restart timeout=60s (vip-rep-monitor-interval-10s)
               start interval=0s on-fail=stop timeout=60s (vip-rep-start-interval-0s)
               stop interval=0s on-fail=ignore timeout=60s (vip-rep-stop-interval-0s)

Stonith Devices:
Fencing Levels:

Location Constraints:
Ordering Constraints:
  promote msPostgresql then start master-group (score:INFINITY) (non-symmetrical)
  demote msPostgresql then stop master-group (score:0) (non-symmetrical)
Colocation Constraints:
  master-group with msPostgresql (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
Ticket Constraints:

Alerts:
 No alerts defined

Resources Defaults:
 resource-stickiness: INFINITY
 migration-threshold: 0
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: cluster1
 cluster-recheck-interval: 60
 dc-version: 1.1.19-8.el7-c3c624ea3d
 have-watchdog: false
 no-quorum-policy: ignore
 start-failure-is-fatal: false
 stonith-enabled: false

Node Attributes:
 Node1: pgsql-data-status=STREAMING|ASYNC
 Node2: pgsql-data-status=LATEST

Quorum:
  Options:
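For reference, the pgsql-data-status values shown under Node Attributes are permanent node attributes written by the pgsql resource agent, and they can be read on either node. A minimal sketch of how we have been checking them (standard crm_attribute options; nothing here is specific to our setup):

```shell
# Read the permanent (lifetime=forever) data-status attribute
# the pgsql resource agent keeps for each node.
crm_attribute -l forever -N Node1 -n pgsql-data-status -G
crm_attribute -l forever -N Node2 -n pgsql-data-status -G
```

We have not changed these by hand; we are only using them to confirm which node the agent considers LATEST before and after the reboot.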
Thanks!