Keepalived
is routing software written in C. By maintaining a VIP (Virtual IP),
it provides simple and robust load-balancing and high-availability facilities for Linux systems.
Architecture

Notes:
Key concepts
- A virtual IP address (VIP or VIPA) is an IP address that doesn't correspond to an actual physical network interface. Uses for VIPs include network address translation (especially one-to-many NAT), fault tolerance, and mobility.
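LVS (Linux Virtual Server)
Linux Virtual Server is a virtual server cluster system. How it works: a client sends a request to the LVS VIP; LVS forwards the request to a backend server according to the configured forwarding mode and scheduling algorithm; the backend server processes the request and the response is returned to the client.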
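ARP (Address Resolution Protocol)
is a protocol in the TCP/IP suite used to resolve IP addresses to MAC addresses. It is used within a local area network to find the physical address that corresponds to an IP address.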
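- Virtual Router Redundancy Protocol
VRRP (Virtual Router Redundancy Protocol)
combines several routing devices into a single virtual router and uses the virtual router's IP address as the hosts' default gateway for communicating with external networks.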
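As a rough illustration of the VRRP idea above, the sketch below picks the master from a set of routers by priority. The function name elect_master and the NAME:PRIORITY argument format are invented for this example. It only models the basic rule that the highest priority wins; real VRRP exchanges advertisements over the network and breaks ties by comparing IP addresses, which this sketch does not do (the first entry listed wins a tie here).

```shell
#!/bin/bash
# elect_master NAME:PRIORITY [NAME:PRIORITY ...]
# Hypothetical helper: prints the name with the highest priority.
elect_master() {
    local best_name="" best_prio=-1 entry name prio
    for entry in "$@"; do
        name="${entry%%:*}"   # text before the first ':'
        prio="${entry##*:}"   # text after the last ':'
        if [ "$prio" -gt "$best_prio" ]; then
            best_name="$name"
            best_prio="$prio"
        fi
    done
    if [ -n "$best_name" ]; then
        echo "$best_name"
    fi
}

# Example: elect_master host1:100 host2:90
```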
Installation and deployment
Environment
VMware Fusion Pro, with promiscuous mode enabled on the custom network (VMware Fusion -> Preferences -> Network -> check "Require authentication to enter promiscuous mode"):
- host1: 172.20.0.21
- host2: 172.20.0.22
- vip: 172.20.0.20
Configure ip_nonlocal_bind
on every machine (this lets services such as haproxy start and bind to the VIP even when the host does not currently hold it):
echo net.ipv4.ip_nonlocal_bind=1 >> /etc/sysctl.conf
sysctl -p
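Running the echo command above more than once appends a duplicate line each time. A minimal sketch of an idempotent alternative follows; the helper name ensure_line is made up for this example, and it assumes the setting is written exactly as net.ipv4.ip_nonlocal_bind=1 with no surrounding spaces.

```shell
#!/bin/bash
# ensure_line LINE FILE
# Hypothetical helper: append LINE to FILE only if no identical line exists yet.
ensure_line() {
    local line="$1" file="$2"
    # -x: match the whole line, -F: fixed string, -q: quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (as root):
#   ensure_line 'net.ipv4.ip_nonlocal_bind=1' /etc/sysctl.conf && sysctl -p
```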
Installing on CentOS
Install the software on host1
and host2:
yum install -y nginx keepalived
sed -i "s#Welcome to CentOS#Welcome to CentOS`ip a | grep "172.20" | awk '{print $2}'`#g" /usr/share/nginx/html/index.html
The IP address written into the page shows which nginx instance served the request.
systemctl start nginx.service
systemctl enable nginx.service
systemctl start keepalived.service
systemctl enable keepalived.service
Configure nginx
- nginx liveness check script
/usr/local/bin/check_nginx_alive.sh
#!/bin/bash
# Count running nginx processes; a non-zero exit tells keepalived the check failed.
nginx_count=$(ps -C nginx --no-headers | wc -l)
if [ "$nginx_count" -eq 0 ]; then
    echo 'nginx server is dead'
    exit 1
fi
exit 0
Alternatively, use the systemctl is-active nginx
command for the check.
chmod +x /usr/local/bin/check_nginx_alive.sh
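The systemctl is-active variant mentioned above could look roughly like this. It is only a sketch: the function name check_service_alive is invented here, and it assumes nginx runs as a systemd unit named nginx.

```shell
#!/bin/bash
# check_service_alive UNIT
# Hypothetical helper: returns 0 if systemd reports UNIT as active, 1 otherwise.
check_service_alive() {
    local unit="$1"
    if systemctl is-active --quiet "$unit"; then
        return 0
    fi
    echo "$unit is not active" >&2
    return 1
}

# In a keepalived check script the last line would be:
#   check_service_alive nginx
```

Unlike counting processes with ps, this reports the state systemd tracks for the unit, so it does not see an nginx that was started outside systemd.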
Configure keepalived on host1
host1 is the MASTER, with priority 100
- /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    # vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_script check_nginx_alive {
    script "/usr/local/bin/check_nginx_alive.sh"
    interval 3
    weight -15
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.20.0.20 dev ens33
    }
    track_script {
        check_nginx_alive
    }
}
virtual_server 172.20.0.20 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP
    real_server 172.20.0.21 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port 80
        }
    }
}
systemctl restart keepalived
Configure keepalived on host2
host2 is the BACKUP, with priority 90 (lower than host1)
- /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    # vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_script check_nginx_alive {
    script "/usr/local/bin/check_nginx_alive.sh"
    interval 3
    weight -15
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.20.0.20 dev ens33
    }
    track_script {
        check_nginx_alive
    }
}
virtual_server 172.20.0.20 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP
    real_server 172.20.0.22 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port 80
        }
    }
}
systemctl restart keepalived
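Whether the VIP moves in this setup comes down to simple arithmetic on the two priorities and the script weight. The sketch below models it under my understanding of keepalived's behaviour: a negative weight is subtracted from the priority while the tracked script fails, a positive weight is added while it succeeds, and the result stays within 1..254. The function name effective_priority is hypothetical.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT SCRIPT_OK
# SCRIPT_OK: 1 = tracked script passes, 0 = tracked script fails.
# Hypothetical helper that prints the priority a node would advertise.
effective_priority() {
    local base="$1" weight="$2" script_ok="$3"
    local prio="$base"
    if [ "$weight" -lt 0 ] && [ "$script_ok" -eq 0 ]; then
        prio=$((base + weight))      # failing check lowers the priority
    elif [ "$weight" -gt 0 ] && [ "$script_ok" -eq 1 ]; then
        prio=$((base + weight))      # passing check raises the priority
    fi
    # Keep the value inside the range usable by non-owner VRRP routers.
    if [ "$prio" -lt 1 ]; then prio=1; fi
    if [ "$prio" -gt 254 ]; then prio=254; fi
    echo "$prio"
}

# host1 with nginx stopped: effective_priority 100 -15 0
# host2 with nginx running: effective_priority 90 -15 1
```

With the values used here, host1 drops from 100 to 85 when nginx stops, which is below host2's 90, so host2 takes over. A weight smaller in magnitude than the gap between the two priorities (for example -5) would not trigger a failover.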
Verification
$ ip a show ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:50:56:33:42:98 brd ff:ff:ff:ff:ff:ff
inet 172.20.0.21/24 brd 172.20.0.255 scope global ens33
valid_lft forever preferred_lft forever
inet 172.20.0.20/32 scope global ens33
valid_lft forever preferred_lft forever
systemctl stop nginx
After nginx is stopped on host1, the VIP moves to host2, and http://172.20.0.20 is now served by host2's nginx.
systemctl start nginx
After nginx is started again on host1, the VIP moves back to host1, and http://172.20.0.20 is served by host1's nginx again.
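The manual check with ip a can also be scripted. The sketch below reads the one-line-per-address output of ip -4 -o addr show on standard input and reports whether a given address is present. The function name holds_vip is made up for this example, and it assumes the address is the fourth whitespace-separated field, as in the usual iproute2 output.

```shell
#!/bin/bash
# holds_vip VIP
# Hypothetical helper: reads `ip -4 -o addr show` output on stdin and
# returns 0 if VIP is one of the listed addresses, 1 otherwise.
holds_vip() {
    local vip="$1"
    awk -v vip="$vip" '
        { split($4, parts, "/"); if (parts[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example, run on each host:
#   ip -4 -o addr show dev ens33 | holds_vip 172.20.0.20 && echo "this host holds the VIP"
```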
Miscellaneous
Troubleshooting commands
tcpdump -i ens33 vrrp -n
arp -n
haproxy liveness check script
vrrp_script check_haproxy {
    script "/usr/local/bin/check_haproxy_alive.sh"
    interval 1
    weight -15
    fall 3
    rise 2
    timeout 2
}
/usr/local/bin/check_haproxy_alive.sh
#!/bin/bash
# This will return 0 when it successfully talks to the haproxy daemon via the socket
# Failures return 1
echo "show info" | socat unix-connect:/var/lib/haproxy/stats stdio > /dev/null
chmod a+x /usr/local/bin/check_haproxy_alive.sh
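The fall 3 and rise 2 settings in the vrrp_script block above mean a single failed or successful run does not flip the state immediately. The sketch below is a simplified model of that debouncing, not keepalived's actual code: it assumes the script starts in the "up" state, needs fall consecutive failures to go "down", and rise consecutive successes to come back "up". The function name simulate_fall_rise is invented for this example.

```shell
#!/bin/bash
# simulate_fall_rise FALL RISE RESULT [RESULT ...]
# RESULT is a script exit code per run (0 = success, non-zero = failure).
# Hypothetical helper: prints the state ("up" or "down") after all runs.
simulate_fall_rise() {
    local fall="$1" rise="$2"
    shift 2
    local state="up" streak=0 result
    for result in "$@"; do
        if [ "$state" = "up" ]; then
            if [ "$result" -ne 0 ]; then
                streak=$((streak + 1))
                if [ "$streak" -ge "$fall" ]; then state="down"; streak=0; fi
            else
                streak=0             # a success resets the failure streak
            fi
        else
            if [ "$result" -eq 0 ]; then
                streak=$((streak + 1))
                if [ "$streak" -ge "$rise" ]; then state="up"; streak=0; fi
            else
                streak=0             # a failure resets the success streak
            fi
        fi
    done
    echo "$state"
}

# Example: simulate_fall_rise 3 2 1 1 1 0 0
```

With interval 1, this model suggests that roughly three seconds of consecutive failures are needed before the weight of -15 is applied.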
/usr/local/bin/check_haproxy_alive2.sh
Another possible implementation?
#!/bin/bash
/usr/bin/killall -0 haproxy || systemctl restart haproxy