RedHat 8.0/9.0 LVS Installation Manual (Part 6)

Author: IT168  Editor: Li Lian  2007-07-04 17:34

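6. At this point the configuration looks fine, but one failure case remains: if the Primary Director suddenly loses its network connection and the connection later comes back, the LVS server ends up active on both the Primary and the Secondary Director at the same time. If you are curious, you can reproduce this on the Primary Director by taking its network interface down:

# ifconfig eth0 down

Wait about a minute, then bring the interface back up and restore the default route:

# ifconfig eth0 up
# route add -net 0.0.0.0 gw 10.144.43.254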

Then log in to linux142 and linux187 and run ipvsadm -l on each. You will find that LVS/Direct Routing is active on both machines, which is clearly not what we want.
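
To compare the two directors quickly, the listing can be reduced to a count of virtual services. The following is a minimal sketch, not part of the original setup: the function name count_virtual_services is made up for illustration, and it assumes the usual ipvsadm listing in which each virtual service line starts with TCP, UDP or FWM.

```shell
#!/bin/sh
# count_virtual_services (hypothetical helper): read an ipvsadm listing on
# stdin and print how many virtual-service lines it contains. Assumes each
# virtual service line starts with TCP, UDP or FWM.
count_virtual_services() {
    awk '$1 == "TCP" || $1 == "UDP" || $1 == "FWM" { n++ } END { print n + 0 }'
}

# Possible usage from an admin host, assuming ssh access to both directors:
#   for h in linux142 linux187; do
#       printf '%s: ' "$h"
#       ssh "$h" ipvsadm -L -n | count_virtual_services
#   done
```

A non-zero count on both hosts at the same time would indicate the double-activation problem described above.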

To solve this problem we use the mon daemon.

The idea is as follows:

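(1) At regular intervals, ping the gateway at 10.144.43.254. If the gateway fails to respond six times in a row, stop the LVS service (/etc/init.d/lvs stop), on the assumption that the local network card has failed and cannot ping out. Even if it is actually the gateway that has died, the whole network is cut off from the outside at that point and keeping LVS active serves no purpose, so it should be stopped either way.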

(2) When the gateway becomes pingable again, restart the heartbeat service (/sbin/service heartbeat restart) to reclaim control. After a short while the Primary Director takes back the LVS server role, and the Slave Director reverts to being a RealServer and Backup Director.
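
The two rules above amount to a small state machine. The sketch below only illustrates that logic and is not how mon implements it: the function name decide_actions and the "ok"/"fail" input format are assumptions made for this example, and the function prints the action instead of running it.

```shell
#!/bin/sh
# decide_actions (hypothetical helper): read one probe result per line
# ("ok" or "fail") and print the action taken at each state change.
#   $1: consecutive failures required before LVS is stopped (default 6)
decide_actions() {
    threshold=${1:-6}
    fails=0
    down=0
    while read -r result; do
        if [ "$result" = "ok" ]; then
            if [ "$down" -eq 1 ]; then
                echo "restart-heartbeat"   # would run: /sbin/service heartbeat restart
                down=0
            fi
            fails=0
        else
            fails=$((fails + 1))
            if [ "$fails" -ge "$threshold" ] && [ "$down" -eq 0 ]; then
                echo "stop-lvs"            # would run: /etc/init.d/lvs stop
                down=1
            fi
        fi
    done
}

# Example: six failures followed by a recovery
#   printf 'fail\nfail\nfail\nfail\nfail\nfail\nok\n' | decide_actions 6
```

With a probe every 10 seconds, six consecutive failures correspond to roughly one minute of outage before LVS is stopped.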

7. To keep both LVS servers from being active at once, we add one more hostgroup to the mon service. The contents of /etc/mon/mon.cf are as follows:

#
# Extremely basic mon.cf file
#
#
# global options
#
cfbasedir   = /etc/mon
pidfile     = /var/run/mon.pid
statedir    = /var/run/mon/state.d
logdir      = /var/run/mon/log.d
dtlogfile   = /var/run/mon/log.d/downtime.log
alertdir    = /usr/lib/mon/alert.d
mondir      = /usr/lib/mon/mon.d
maxprocs    = 20
histlength = 100
randstart   = 60s
authtype    = userfile
userfile    = /etc/mon/userfile
#
# group definitions (hostnames or IP addresses)
#
hostgroup server1 10.144.43.175
hostgroup server2 10.144.43.142
hostgroup server3 10.144.43.187
# network gateway
hostgroup server4 10.144.43.254
watch server1
    service webcache
    interval 5s
    monitor http.monitor -p 8080 -t 10
    allow_empty_group
    period wd {Sun-Sat}
    alert lvs.alert -P tcp -V 10.144.43.185:8080 -R 10.144.43.175 -W 5 -F dr
    alertevery 1h
    alertafter 6
    upalert lvs.alert -P tcp -V 10.144.43.185:8080 -R 10.144.43.175 -W 5 -F dr -u 1

watch server2
    service webcache
    interval 5s
    monitor http.monitor -p 8080 -t 10
    period wd {Sun-Sat}
    alert lvs.alert -P tcp -V 10.144.43.185:8080 -R 10.144.43.142 -W 5 -F dr
    alertafter 6
    alertevery 1h
    upalert lvs.alert -P tcp -V 10.144.43.185:8080 -R 10.144.43.142 -W 5 -F dr -u 1

watch server3
    service webcache
    interval 5s
    monitor http.monitor -p 8080 -t 10
    period wd {Sun-Sat}
    alert lvs.alert -P tcp -V 10.144.43.185:8080 -R 10.144.43.187 -W 5 -F dr
    alertafter 6
    alertevery 1h
    numalerts 24
    upalert lvs.alert -P tcp -V 10.144.43.185:8080 -R 10.144.43.187 -W 5 -F dr -u 1

watch server4
    service ping
    interval 10s
# which monitor to use for the test
    monitor ping.monitor 10.144.43.254
    period wd {Sun-Sat}
# send at most one alert per hour
    alertevery 1h
# send the first alert only after six consecutive failures
    alertafter 6
# send at most 12 alerts
    numalerts 12
# on alert, call heartbeat.alert
    alert heartbeat.alert
# on upalert, call heartbeat.alert -u
    upalert heartbeat.alert -u
# See /usr/doc for the original example...
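
Since the new watch block only works if its hostgroup is defined, a quick consistency check can catch a typo before mon is restarted. The following is a rough sketch under the assumption that hostgroup and watch entries each start their own line, as in the listing above; the function name undefined_watch_groups is invented for this example.

```shell
#!/bin/sh
# undefined_watch_groups (hypothetical helper): read a mon.cf on stdin and
# print every group named in a "watch" line that has no "hostgroup" line.
undefined_watch_groups() {
    awk '
        $1 == "hostgroup" { defined[$2] = 1 }
        $1 == "watch"     { watched[$2] = 1 }
        END {
            for (g in watched)
                if (!(g in defined)) print g
        }
    ' | sort
}

# Possible usage:
#   undefined_watch_groups < /etc/mon/mon.cf
```

No output means every watched group has a matching hostgroup definition.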

8. As the /etc/mon/mon.cf above shows, we have to write our own script to be called when an alert fires. Here I wrote a simple Perl script and placed it at /usr/lib/mon/alert.d/heartbeat.alert.

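#!/usr/bin/perl
# heartbeat.alert - heartbeat/LVS alert for mon
#
# Called by mon when the gateway ping check changes state:
#   alert   (gateway unreachable):       stop the local LVS service
#   upalert (gateway reachable, -u set): restart heartbeat
#
use strict;
use warnings;
use Getopt::Std;

# mon normally puts its own options (-s, -g, -h, -t, -l) before the
# arguments from mon.cf, so declare them here or -u is never reached.
my %opt;
getopts("s:g:h:t:l:u", \%opt);
my $service = "/sbin/service";

if ($opt{u}) {
    # restart the heartbeat service
    system("$service heartbeat restart");
} else {
    # stop the LVS server
    system("/etc/init.d/lvs stop");
}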
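
Before wiring the alert into mon, it can help to exercise the same decision by hand without touching any service. The sketch below is a shell rendering of that logic, not a replacement for the Perl script: the DRY_RUN variable and the function name heartbeat_alert are introduced here for illustration, and the option list assumes mon passes -s, -g, -h, -t and -l ahead of -u.

```shell
#!/bin/sh
# heartbeat_alert (hypothetical helper): choose the command the alert would
# run. With DRY_RUN=1 the command is printed instead of executed.
heartbeat_alert() {
    up=0
    OPTIND=1
    # Accept the options mon is assumed to pass so that -u is reached.
    while getopts "s:g:h:t:l:u" opt; do
        case "$opt" in
            u) up=1 ;;
            *) ;;   # values of the other options are not needed here
        esac
    done
    if [ "$up" -eq 1 ]; then
        cmd="/sbin/service heartbeat restart"
    else
        cmd="/etc/init.d/lvs stop"
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

# Possible usage:
#   DRY_RUN=1 heartbeat_alert -s ping -g server4 -h 10.144.43.254 -u
```

Running it once with -u and once without shows which branch each kind of alert would take.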

9. Testing the system

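Make sure the settings and configuration files on linux187 and linux142 are identical, then restart the heartbeat service on both machines at the same time. The Linux-HA system is now complete and ready for testing. For example: unplug the Director's network cable for a while and check whether the Secondary takes over, then plug it back in and check whether the Primary resumes its original Director role. You can also reboot the Primary and check whether the Secondary takes over; when the master comes back up, the Secondary Director should still be the active one. Some parameters may not be quite right at first, but you can tune them gradually.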
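
Takeover is not instantaneous, so during these tests it is convenient to poll for the expected state rather than check once. The helper below is a generic sketch; the name wait_until and the timeout values in the usage comment are assumptions for illustration, not settings from this setup.

```shell
#!/bin/sh
# wait_until TIMEOUT INTERVAL COMMAND... (hypothetical helper): rerun COMMAND
# every INTERVAL seconds until it succeeds or TIMEOUT seconds have elapsed.
# Returns 0 on success and 1 on timeout.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Possible usage on the Secondary after unplugging the Primary's cable,
# assuming a virtual service line starts with TCP in the ipvsadm listing:
#   wait_until 120 5 sh -c 'ipvsadm -L -n | grep -q "^TCP"' \
#       && echo "Secondary has taken over"
```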

http://netadmin.77169.com/HTML/20040615183900.html
