0 Basic concepts
Applications get high availability through load balancers
In a production environment, applications are usually deployed on multiple servers behind a load balancer. The most popular load balancers today are HAProxy, nginx, Apache, ...
Let's assume our app is running on 2 servers, S1 and S2, which are behind load balancer LB.
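As a sketch, the relevant part of an HAProxy config for this setup could look like the following (the backend IPs and ports are made up for illustration, and the global/defaults sections are omitted):

frontend app_front
    bind *:80
    default_backend app_back

backend app_back
    balance roundrobin
    # "check" enables health checks: a failed server stops receiving traffic
    server S1 10.88.88.41:8080 check
    server S2 10.88.88.42:8080 check

The "check" keyword is what lets LB notice a dead server, as described below.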
High availability of the load balancer itself
If S1 goes down, LB detects the failure and proxies traffic only to S2. Our app is still available.
But what happens if the LB itself goes down? Oops!
The solution is to make the LB itself highly available. We cannot just add another layer of load balancers in front of the load balancer, and keep stacking forever. Here comes "Keepalived".
Keepalived uses the VRRP protocol to dynamically assign a floating IP to one of the LB servers. At any given time, only one of the LB servers is "active". Once the active server goes down or stops working properly, it is demoted, and another LB server is elected to be "active".
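A quick way to see which LB currently owns the floating IP (using the interface name and VIP from the configuration example in section 2 below; adjust to your setup):

# run on each LB server; only the active one prints the address
ip addr show dev enp0s25 | grep 10.88.88.100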
When keepalived reselects the active node
The most obvious scenario in which keepalived reselects the active node is when keepalived itself is down (which may be caused by the server powering off). Standby nodes detect this by no longer receiving heartbeats from the active node. This works out of the box and does NOT need any extra configuration.
But sometimes this simple "server down" trigger is not enough. For example, when keepalived works with HAProxy, keepalived running well doesn't mean HAProxy on the same server runs well. In this case, keepalived must health-check HAProxy (via a track_script) to determine whether the node is still good.
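The check command used later in this post, "killall -0 haproxy", is a common trick: signal 0 is never actually delivered, it only tests whether a process with that name exists. A minimal sketch of its behavior (assuming a systemd-managed HAProxy):

killall -0 haproxy; echo $?   # prints 0 while HAProxy is running
systemctl stop haproxy
killall -0 haproxy; echo $?   # now prints a non-zero exit code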
Keepalived vs VRRP
Keepalived is not 100% compliant with VRRP. For example, VRRP requires a specific virtual MAC address for the virtual IP, while keepalived happily works with the physical MAC. Another example is multicast, which VRRP requires, while keepalived also works perfectly with unicast.
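You can observe this on the wire: VRRP advertisements are carried over IP protocol 112. A sketch with tcpdump (interface name taken from the example below):

# in keepalived's unicast mode the advertisements go to the peer's IP
# instead of the VRRP multicast group 224.0.0.18
tcpdump -i enp0s25 -n 'ip proto 112'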
2 Keepalived configuration example
In this example, there are 2 nodes, each running keepalived + HAProxy.
Node 1, IP address 10.88.88.20
file /etc/keepalived/keepalived.conf
vrrp_script chk_haproxy {
    # "killall -0" checks whether a haproxy process exists (exit 0 = yes)
    script "killall -0 haproxy"
    # run the check every 2 seconds
    interval 2
    # while the check succeeds, add 10 to this node's priority
    weight 10
}

vrrp_instance VI_1 {
    state MASTER
    interface enp0s25
    virtual_router_id 51
    priority 101
    # send a VRRP advertisement every second
    advert_int 1
    unicast_src_ip 10.88.88.20
    unicast_peer {
        10.88.88.30
    }
    # the floating IP shared by the two nodes
    virtual_ipaddress {
        10.88.88.100
    }
    track_script {
        chk_haproxy
    }
}
Node 2, IP address 10.88.88.30
file /etc/keepalived/keepalived.conf
vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 10
}

vrrp_instance VI_1 {
    # differs from Node 1 only in initial state, priority, and swapped IPs
    state BACKUP
    interface enp0s25
    virtual_router_id 51
    priority 100
    advert_int 1
    unicast_src_ip 10.88.88.30
    unicast_peer {
        10.88.88.20
    }
    virtual_ipaddress {
        10.88.88.100
    }
    track_script {
        chk_haproxy
    }
}
3 Testing examples
When Node 1 starts HAProxy and keepalived, its effective priority is 101 + 10 = 111;
When Node 2 starts HAProxy and keepalived, its effective priority is 100 + 10 = 110;
So Node 1 is "active", Node 2 is "backup".
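To verify this on a live setup, a sketch (assuming systemd service names haproxy and keepalived, and an HAProxy frontend bound to port 80):

# on both nodes
systemctl enable --now haproxy keepalived

# follow keepalived's MASTER/BACKUP transitions live
journalctl -u keepalived -f

# from any machine in the subnet: the app should answer on the VIP
curl -s http://10.88.88.100/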
Scenario 1: Power off Node 1
Node 1 no longer sends heartbeats to Node 2. Node 2 detects this and promotes itself to "active".
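A convenient way to reproduce this scenario without actually pulling the power is to stop keepalived on Node 1, so that its advertisements stop (a sketch, assuming systemd):

# on Node 1: simulate the node going away
systemctl stop keepalived

# on Node 2: after a few missed advertisements the VIP shows up here
ip addr show dev enp0s25 | grep 10.88.88.100

# on Node 1: bring it back; with priority 111 it preempts
# (preemption is keepalived's default) and reclaims the VIP
systemctl start keepalived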
Scenario 2: HAProxy down on Node 1
Node 1 detects the HAProxy failure and lowers its priority to 111 - 10 = 101. Now Node 1's priority (101) is lower than Node 2's (110), so Node 1 demotes itself to "backup" and Node 2 promotes itself to "active".
Scenario 3: HAProxy starts again on Node 1
After HAProxy starts, Node 1 detects it and raises its priority back to 101 + 10 = 111. Now Node 1's priority (111) is higher than Node 2's (110), so Node 1 promotes itself back to "active" and Node 2 returns to "backup".
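Scenarios 2 and 3 can be tested together (a sketch, assuming systemd):

# on Node 1: kill HAProxy; chk_haproxy fails within ~2s ("interval 2"),
# and the priority drops from 111 to 101
systemctl stop haproxy

# on Node 2: it now wins the election and holds the VIP
ip addr show dev enp0s25 | grep 10.88.88.100

# on Node 1: restart HAProxy; the priority returns to 111
# and the VIP moves back
systemctl start haproxy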