Keepalived Split-Brain

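Split-brain occurs when, for some reason, two keepalived high-availability servers cannot detect each other's heartbeat within the configured time. Each one then claims ownership of the resources and services, even though both servers are still alive.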


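Common causes of split-brain:

1. Network faults, such as a loose network cable on a server
2. A hardware failure that crashes a server
3. firewalld running on both the master and the backup, which blocks the VRRP heartbeat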


# Failure demonstration: start firewalld on both nodes
[root@lb01 ~]# systemctl start firewalld
[root@lb02 ~]# systemctl start firewalld


# Check both nodes: each one holds the VIP 10.0.0.3, which confirms the split-brain
[root@lb01 ~]# ip addr | grep 10.0.0.3
    inet 10.0.0.3/32 scope global eth0

[root@lb02 ~]# ip addr | grep 10.0.0.3
    inet 10.0.0.3/32 scope global eth0
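
The manual check on the two nodes can also be scripted. The following is a minimal sketch, assuming the VIP 10.0.0.3 used in this article. The function name `has_vip` is hypothetical and not part of keepalived. It reads `ip addr` output on stdin, so it can be tried on captured output as well as on a live node.

```shell
#!/bin/sh
# has_vip VIP: reads `ip addr` style output on stdin and succeeds
# (exit status 0) only if VIP is listed as an inet address.
# The function name is illustrative, not part of keepalived.
has_vip() {
    # -F matches the address literally, so the dots are not regex wildcards.
    # -q suppresses output; only the exit status is used.
    # The trailing "/" stops 10.0.0.3 from matching 10.0.0.30.
    grep -qF "inet $1/"
}

# Example use on a node (not executed here):
#   ip addr show eth0 | has_vip 10.0.0.3 && echo "this node holds the VIP"
```

If the message is printed on lb01 and lb02 at the same time, both nodes have claimed the VIP and the pair is in split-brain.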


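# Access is refused at this point; firewall rules need to be configured
[root@lb01 ~]# firewall-cmd --add-service=http
success
[root@lb02 ~]# firewall-cmd --add-service=http
success
[root@lb02 ~]# firewall-cmd --direct --permanent --add-rule ipv4 filter INPUT 0 --in-interface eth0 --destination 224.0.0.18 --protocol vrrp -j ACCEPT
# --permanent only writes the stored configuration; run firewall-cmd --reload to activate the rule
# After that, the page is accessible again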
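
Because either node may end up holding the VIP, it is safer to apply the same rules on both lb01 and lb02 and make them survive a restart. The sketch below is one way to do that, assuming the interface eth0 and the VRRP multicast address 224.0.0.18 from the commands above. The names `allow_vrrp` and `RUN` are hypothetical. By default the function only prints the commands (dry run), so nothing changes until `RUN` is set to an empty value and the script is run as root.

```shell
#!/bin/sh
# RUN defaults to "echo", so the commands are printed instead of executed.
# Set RUN to an empty value (RUN= ./script.sh) to really apply them as root.
RUN=${RUN-echo}

# allow_vrrp IFACE: open HTTP and VRRP permanently, then activate the rules.
allow_vrrp() {
    iface=$1
    # --permanent writes the stored configuration only.
    $RUN firewall-cmd --permanent --add-service=http
    $RUN firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 0 \
        --in-interface "$iface" --destination 224.0.0.18 --protocol vrrp -j ACCEPT
    # The reload loads the stored configuration into the running firewall.
    $RUN firewall-cmd --reload
}

# Example (not executed here): allow_vrrp eth0
```

Reviewing the printed commands first, then running the same script with `RUN=` on each node, keeps the two firewalls identical.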


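Nginx listens on all IP addresses by default. When the VIP floats to a node, it is as if that node's nginx gained one more address, the VIP, on its network interface, so requests sent to the VIP reach the nginx on that machine.

However, if nginx goes down, user requests fail, yet keepalived itself is still running and no failover happens. A script is therefore needed to check whether Nginx is alive and, if it is not, stop keepalived.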


[root@lb01 ~]# vim check_web.sh
#!/bin/sh
# Count the running nginx processes
nginxpid=$(ps -C nginx --no-headers | wc -l)
# 1. Check whether Nginx is alive; if it is not, try to start it
if [ "$nginxpid" -eq 0 ]; then
    systemctl start nginx
    sleep 3
    # 2. After waiting 3 seconds, check the Nginx status again
    nginxpid=$(ps -C nginx --no-headers | wc -l)
    # 3. If Nginx is still not running, stop Keepalived so that the VIP fails over
    if [ "$nginxpid" -eq 0 ]; then
        systemctl stop keepalived
    fi
fi
# Make the script executable
[root@lb01 ~]# chmod +x /root/check_web.sh
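
A variation on the script above is to let the check only report status through its exit code instead of stopping keepalived itself. keepalived treats a non-zero exit status from a tracked script as a failed check and can then release the VIP on its own. This is a sketch of that alternative, not the approach used in this article; the function names `nginx_count` and `check_web` are hypothetical.

```shell
#!/bin/sh
# nginx_count: print the number of running nginx processes.
# Kept as a separate function so the decision logic below can be
# exercised without a real nginx.
nginx_count() {
    ps -C nginx --no-headers | wc -l
}

# check_web: return 0 if nginx is running, 1 otherwise.
# It does not restart nginx and does not touch keepalived.
check_web() {
    count=$(nginx_count)
    if [ "$count" -gt 0 ]; then
        return 0
    fi
    return 1
}

# In a standalone check script, the last line would simply be:
#   check_web
# so that the script's exit status is the result of the check.
```

With this variant keepalived keeps running on the failed node, so the node can rejoin by itself once nginx is back, at the cost of not attempting an automatic nginx restart.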


[root@lb01 ~]# vim /etc/keepalived/keepalived.conf
global_defs {
    router_id lb01
}
# Run the script every 5 seconds; the script must finish within 5 seconds, otherwise it is interrupted and run again
vrrp_script check_web {
    script "/root/check_web.sh"
    interval 5
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 50
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.3
    }
    # Call and run the script
    track_script {
        check_web
    }
}
# The script is called from the master's keepalived. In preemptive mode it only needs to be configured on the master. (Note: in non-preemptive mode, both servers need the script.)
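
For comparison, the backup node in the preemptive setup needs no track_script block. The sketch below writes a possible lb02 configuration from the shell. The values `router_id lb02` and `priority 90` are assumptions (any priority lower than the master's 100 would do), and the function name `write_backup_conf` is hypothetical. The remaining values mirror the master configuration above, since both nodes must share the same virtual_router_id, authentication, and VIP.

```shell
#!/bin/sh
# write_backup_conf FILE: write a BACKUP-node keepalived.conf to FILE.
# router_id and priority are assumed values for illustration.
write_backup_conf() {
    cat > "$1" <<'EOF'
global_defs {
    # assumed name for the second node
    router_id lb02
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 50
    # assumed: lower than the master's priority of 100
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.3
    }
}
EOF
}

# Example on lb02 (not executed here):
#   write_backup_conf /etc/keepalived/keepalived.conf
#   systemctl restart keepalived
```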

