I'm running into a tricky error after adding an IPVS configuration to our load balancer.
My Keepalived configuration provides a pair of VRRP instances and a set of real servers that listen on the same machines, each bound to a different interface and port.
Once the configuration is activated, everything works as expected from outside the load balancers: the master is up and serving requests through the VIP and the real_servers.
But on the backup server itself, HTTPS clients try to connect to the standby VIP interface instead of reaching the VRRP master, which results in an error:
$ curl https://10.52.1.3
curl: (7) Failed to connect to 10.52.1.3 port 443: No route to host
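To tell "No route to host" apart from a refused connection or a timeout without going through curl, the same connection attempt can be reproduced with a plain socket. This is only a minimal sketch; the function name `connect_errno` is hypothetical and not part of my setup.

```python
import errno
import socket


def connect_errno(host, port, timeout=2.0):
    """Try a TCP connection; return None on success or the symbolic error name."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as exc:
        # EHOSTUNREACH is the errno behind curl's "No route to host";
        # timeouts carry no errno, so fall back to the exception class name.
        return errno.errorcode.get(exc.errno, type(exc).__name__)
```

For example, `connect_errno("10.52.1.3", 443)` run on the backup node would be expected to return `"EHOSTUNREACH"` if it hits the same failure curl reports.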
The Keepalived setup is a fairly standard master-standby configuration:
global_defs {
    enable_script_security
    script_user root
    lvs_sync_daemon ens21 internal_tls_4 ttl 10
}
vrrp_sync_group haproxy {
    group {
        lb
    }
}
vrrp_script haproxy_check_script {
    script "/usr/local/bin/haproxy-healthcheck.py"
    interval 3   # checking every 3 seconds (default: 5 seconds)
    fall 1       # require 1 failure for KO (default: 3)
    rise 3       # require 3 successes for OK (default: 6)
    user nobody  # allow scripts to run with a custom user
}
vrrp_instance internal_tls_4 {
    interface ens21
    state BACKUP
    virtual_router_id 3
    use_vmac
    vmac_xmit_base
    advert_int 1
    priority 125
    authentication {
        auth_type PASS
        auth_pass XXX
    }
    unicast_src_ip 10.52.1.2
    unicast_peer {
        10.52.1.8
    }
    virtual_ipaddress {
        10.52.1.3/26
    }
    track_script {
        haproxy_check_script
    }
}
The issue appears once IPVS is enabled through a Keepalived virtual_server section:
virtual_server 10.52.1.3 443 {
    delay_loop 6
    lb_algo wrr
    lb_kind NAT
    protocol TCP
    real_server 10.52.1.2 10204 {
        weight 1
        TCP_CHECK {
            connect_timeout 1
        }
    }
    real_server 10.52.1.8 10204 {
        weight 0
        TCP_CHECK {
            connect_timeout 1
        }
    }
}
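A note on the weights in this section: with wrr, new connections are spread in proportion to weight, and a real server with weight 0 stays in the IPVS table but gets no new connections. The sketch below only illustrates that arithmetic for the two entries above; it is not how the kernel scheduler is implemented, and the function name is hypothetical.

```python
def new_connection_shares(real_servers):
    """Map each schedulable real server to its share of new connections."""
    total = sum(real_servers.values())
    if total == 0:
        return {}  # nothing is schedulable
    return {
        server: weight / total
        for server, weight in real_servers.items()
        if weight > 0  # weight 0 means no new connections
    }


# Weights copied from the virtual_server section above.
REAL_SERVERS = {"10.52.1.2:10204": 1, "10.52.1.8:10204": 0}
```

With these weights, every new connection to 10.52.1.3:443 handled by this node's IPVS table goes to 10.52.1.2:10204.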
When I disable the virtual_server section, everything returns to normal: curl is routed to the master's VIP interface and requests succeed.
It seems that with the kernel IPVS configuration enabled, the backup node is tricked into thinking it can answer requests for the VIP itself, although it has no address allocated on the VRRP interface.
I initially thought this would be a sysctl issue with the ARP configuration, but tinkering with it led to the same results.
Here is my sysctl.conf configuration:
net.ipv4.ip_nonlocal_bind=1
net.ipv4.ip_forward=1
net.ipv4.conf.all.arp_ignore=1
net.ipv4.conf.all.arp_announce=1
net.ipv4.conf.all.arp_filter=0
net.ipv4.conf.all.rp_filter=0
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.ens21.arp_filter=1
net.ipv4.conf.ens21.rp_filter=0
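Since the kernel combines the `all` and per-interface values for these settings (the maximum of the two is used for arp_ignore, arp_announce and rp_filter, and arp_filter is enabled if either one is set), the values that actually apply to ens21 can be worked out from the file. The following is a minimal sketch under one loud assumption: keys missing from the file are treated as 0, which may not match the running kernel. The helper names are hypothetical.

```python
SYSCTL_CONF = """\
net.ipv4.ip_nonlocal_bind=1
net.ipv4.ip_forward=1
net.ipv4.conf.all.arp_ignore=1
net.ipv4.conf.all.arp_announce=1
net.ipv4.conf.all.arp_filter=0
net.ipv4.conf.all.rp_filter=0
net.ipv4.conf.default.rp_filter=0
net.ipv4.conf.ens21.arp_filter=1
net.ipv4.conf.ens21.rp_filter=0
"""


def parse_sysctl(text):
    """Parse key=value lines into a dict of integers, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = int(value.strip())
    return settings


def effective(settings, param, iface):
    """Combine conf.all and conf.<iface> by taking the maximum.

    Assumption: a key absent from the file counts as 0.
    """
    all_value = settings.get(f"net.ipv4.conf.all.{param}", 0)
    iface_value = settings.get(f"net.ipv4.conf.{iface}.{param}", 0)
    return max(all_value, iface_value)
```

For the file above, `effective(parse_sysctl(SYSCTL_CONF), "arp_filter", "ens21")` evaluates to 1, while the same call with `"rp_filter"` evaluates to 0.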