DR mode, i.e. Direct Routing mode


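1) How it works: When a client sends a web request to the VIP, the LVS server uses the VIP to select the corresponding pool of Real-servers and, according to the scheduling algorithm, picks one Real-server from that pool. LVS records the connection in its hash table and forwards the client's request packet to the chosen Real-server, which then sends the reply packet directly to the client. When the client sends further packets, LVS consults the hash table entry it just recorded and forwards the packets that belong to this connection straight to the same Real-server. When the connection is closed or times out, the entry is removed from the hash table.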
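The flow above can be sketched as an ipvsadm configuration on the LVS server. This is only a minimal sketch: the VIP, the Real-server addresses, port 80 and the round-robin scheduler are assumptions made for illustration and do not come from the setup described here. The function only prints the commands, so nothing changes until the output is run as root.

```shell
#!/bin/sh
# Hypothetical addresses, chosen only for illustration.
VIP=192.168.18.200
RIPS="192.168.18.21 192.168.18.22"

# Print (not run) the ipvsadm commands for a DR-mode virtual service.
dr_rules() {
    # -A adds a virtual service on VIP:80; -s rr selects round-robin scheduling.
    echo "ipvsadm -A -t $VIP:80 -s rr"
    for rip in $RIPS; do
        # -a adds a Real-server to the pool; -g selects direct routing.
        echo "ipvsadm -a -t $VIP:80 -r $rip:80 -g"
    done
}

dr_rules
```

Once such rules are loaded, `ipvsadm -L -n -c` lists the connection entries that LVS is currently tracking.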

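2) Some details of DR mode:

1> LVS and the Real-servers must be on the same network segment (in the same broadcast domain), because in DR mode LVS forwards a packet by rewriting only its destination MAC address to that of the chosen Real-server.

2> LVS does not need IP forwarding enabled.

3> The ARP problem: in DR mode the VIP is normally configured on every Real-server, on a loopback alias:

/sbin/ifconfig lo:0 inet VIP netmask 255.255.255.255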

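i) The reason is that when LVS forwards the client's packet to the Real-server, the destination IP address of the packet is still the VIP. If the Real-server finds that the destination IP is not one of its own addresses, it concludes that the packet is not meant for it and drops it, so the VIP has to be bound to an interface on the Real-server. When the Real-server sends the reply to the client, it swaps the source and destination addresses of the packet and replies directly to the client.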
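On systems that use iproute2 instead of ifconfig, the same binding can be sketched as follows. The VIP used here is a hypothetical address, and the function only prints the commands rather than running them.

```shell
#!/bin/sh
# Print (not run) the commands that bind a VIP to the loopback of a Real-server.
# $1 = VIP; the address passed in below is hypothetical.
vip_bind_cmds() {
    vip=$1
    # /32 is the prefix-length form of netmask 255.255.255.255.
    echo "ip addr add $vip/32 dev lo label lo:0"
    # Verification step: list the IPv4 addresses present on lo.
    echo "ip -4 addr show dev lo"
}

vip_bind_cmds 192.168.18.200
```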

ii) About ARP broadcasts:




System   Interface   MAC Address         IP Address
HN       eth0        00:0c:29:b3:a2:54   192.168.18.10
HN       eth3        00:0c:29:b3:a2:68   192.168.18.11
HN       eth4        00:0c:29:b3:a2:5e   192.168.18.12
client   eth0        00:0c:29:d2:c7:aa   192.168.18.129

When I ping 192.168.18.10 from 192.168.18.129, tcpdump shows:

00:0c:29:d2:c7:aa > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.10 tell 192.168.18.129
00:0c:29:b3:a2:5e > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:5e
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:54
00:0c:29:b3:a2:68 > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:68
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:5e, IPv4, length 98: 192.168.18.129 > 192.168.18.10: ICMP echo request, id 32313, seq 1, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, IPv4, length 98: 192.168.18.10 > 192.168.18.129: ICMP echo reply, id 32313, seq 1, length 64
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:5e, IPv4, length 98: 192.168.18.129 > 192.168.18.10: ICMP echo request, id 32313, seq 2, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, IPv4, length 98: 192.168.18.10 > 192.168.18.129: ICMP echo reply, id 32313, seq 2, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp who-has 192.168.18.129 tell 192.168.18.10
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, ARP, length 60: arp reply 192.168.18.129 is-at 00:0c:29:d2:c7:aa

All three interfaces sent an ARP reply, but 192.168.18.129 used the MAC address of eth4, the first to answer, as the destination for the ping requests. Because 192.168.18.10 is the destination address in the ICMP packets, the ping replies are sent out through eth0.
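The observation that several interfaces answer for one IP can also be checked mechanically. The helper below is a small sketch that assumes capture lines in exactly the format shown above (newer tcpdump versions print ARP replies differently); the function name arp_responders is made up for this example.

```shell
#!/bin/sh
# List the distinct MAC addresses that answered an ARP query for one IP.
# $1 = IP being resolved; stdin = capture lines in the format shown above.
arp_responders() {
    awk -v ip="$1" '
        # Fields 7-11 of a reply line: arp reply IP is-at MAC
        $7 == "arp" && $8 == "reply" && $9 == ip && $10 == "is-at" { print $11 }
    ' | LC_ALL=C sort -u
}

# The three ARP replies from the capture above.
arp_responders 192.168.18.10 <<'EOF'
00:0c:29:b3:a2:5e > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:5e
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:54
00:0c:29:b3:a2:68 > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:68
EOF
```

More than one MAC in the output means more than one interface is answering ARP for that address.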


sysctl -w net.ipv4.conf.all.arp_filter=1


sysctl -w net.ipv4.conf.all.arp_ignore=1

sysctl -w net.ipv4.conf.all.arp_announce=2
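Values set with sysctl -w are lost on reboot. One way to keep the arp_ignore and arp_announce settings is a sysctl configuration fragment, sketched below. The function only prints the fragment; the file name mentioned afterwards is an assumption, not something this setup prescribes.

```shell
#!/bin/sh
# Print a sysctl configuration fragment with the two ARP settings used above.
# Nothing is written or applied here; redirect the output to a file yourself.
arp_sysctl_fragment() {
    cat <<'EOF'
# Reply to ARP only for addresses configured on the receiving interface.
net.ipv4.conf.all.arp_ignore = 1
# Let the kernel choose the ARP sender address instead of copying the IP packet source.
net.ipv4.conf.all.arp_announce = 2
EOF
}

arp_sysctl_fragment
```

As root, the output could be saved to a file such as /etc/sysctl.d/90-lvs-dr.conf (hypothetical name) and loaded with `sysctl -p /etc/sysctl.d/90-lvs-dr.conf`.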

Pinging 192.168.18.10 from 192.168.18.129 again, tcpdump now shows:

00:0c:29:d2:c7:aa > ff:ff:ff:ff:ff:ff, ARP, length 60: arp who-has 192.168.18.10 tell 192.168.18.129
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp reply 192.168.18.10 is-at 00:0c:29:b3:a2:54
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, IPv4, length 98: 192.168.18.129 > 192.168.18.10: ICMP echo request, id 32066, seq 1, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, IPv4, length 98: 192.168.18.10 > 192.168.18.129: ICMP echo reply, id 32066, seq 1, length 64
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, IPv4, length 98: 192.168.18.129 > 192.168.18.10: ICMP echo request, id 32066, seq 2, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, IPv4, length 98: 192.168.18.10 > 192.168.18.129: ICMP echo reply, id 32066, seq 2, length 64
00:0c:29:b3:a2:54 > 00:0c:29:d2:c7:aa, ARP, length 60: arp who-has 192.168.18.129 tell 192.168.18.10
00:0c:29:d2:c7:aa > 00:0c:29:b3:a2:54, ARP, length 60: arp reply 192.168.18.129 is-at 00:0c:29:d2:c7:aa

As you can see, only eth0 answers the ARP request now.




The arp_announce/arp_ignore reference:

arp_announce - INTEGER
    Define different restriction levels for announcing the local
    source IP address from IP packets in ARP requests sent on
    interface:
    0 - (default) Use any local address, configured on any interface
    1 - Try to avoid local addresses that are not in the target's
        subnet for this interface. This mode is useful when target
        hosts reachable via this interface require the source IP
        address in ARP requests to be part of their logical network
        configured on the receiving interface. When we generate the
        request we will check all our subnets that include the
        target IP and will preserve the source address if it is from
        such subnet. If there is no such subnet we select source
        address according to the rules for level 2.
    2 - Always use the best local address for this target.
        In this mode we ignore the source address in the IP packet
        and try to select local address that we prefer for talks with
        the target host. Such local address is selected by looking
        for primary IP addresses on all our subnets on the outgoing
        interface that include the target IP address. If no suitable
        local address is found we select the first local address
        we have on the outgoing interface or on all other interfaces,
        with the hope we will receive reply for our request and
        even sometimes no matter the source IP address we announce.

    The max value from conf/{all,interface}/arp_announce is used.

    Increasing the restriction level gives more chance for
    receiving answer from the resolved target while decreasing
    the level announces more valid sender's information.

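arp_announce controls which local address is used as the source address of outgoing ARP requests:
"0" means the source address of the IP packet is used as the source address of the ARP request.
"1" means the source address of the IP packet is used only if it belongs to a subnet on this interface that includes the target IP; otherwise the rules for "2" apply.
"2" means the source address of the IP packet is ignored and the system chooses the best local address for the target.

When a machine on the internal network wants to send an IP packet to the outside, it needs the router's MAC address, so it sends an ARP request that contains its own IP address and MAC address. By default Linux uses the source IP of the IP packet as the sender IP in the ARP request, rather than the address of the sending device. In an architecture like LVS, all outgoing packets carry the same VIP as their source, so the ARP request contains the VIP together with the device's MAC. When the router receives this ARP request it updates its ARP cache, which amounts to IP spoofing: the VIP gets hijacked, and that causes problems.

Here is a scenario to illustrate arp_announce. The Real-server has these IP addresses: 202.106.1.100 (public local address), 172.16.1.100 (private local address), 202.106.1.254 (VIP). If the ARP request triggered by an IP packet sent to the client carries 202.106.1.254 (the VIP) as its source address, the VIP on LVS is overwritten: the ARP mapping on the neighboring device now points the VIP at a MAC address of the Real-server, so the VIP on LVS stops working.

arp_ignore - INTEGER
    Define different modes for sending replies in response to
    received ARP requests that resolve local target IP addresses:
    0 - (default): reply for any local target IP address, configured
        on any interface
    1 - reply only if the target IP address is local address
        configured on the incoming interface
    2 - reply only if the target IP address is local address
        configured on the incoming interface and both with the
        sender's IP address are part from same subnet on this interface
    3 - do not reply for local addresses configured with scope host,
        only resolutions for global and link addresses are replied
    4-7 - reserved
    8 - do not reply for all local addresses

    The max value from conf/{all,interface}/arp_ignore is used
    when ARP request is received on the {interface}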
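Both reference entries state that the larger of the conf/all and conf/{interface} values takes effect. A small helper, sketched below, makes it easy to see which value actually applies to one interface. The function name and its optional third argument (a base directory, present only so the function can be tried against a scratch directory) are assumptions for this example.

```shell
#!/bin/sh
# Print the effective value of an ARP sysctl (arp_ignore or arp_announce)
# for one interface: the larger of conf/all/<name> and conf/<iface>/<name>.
# $1 = setting name, $2 = interface, $3 = optional base directory
# (defaults to the real /proc path on Linux).
effective_arp_setting() {
    name=$1
    iface=$2
    base=${3:-/proc/sys/net/ipv4/conf}
    all_val=$(cat "$base/all/$name") || return 1
    if_val=$(cat "$base/$iface/$name") || return 1
    if [ "$all_val" -gt "$if_val" ]; then
        echo "$all_val"
    else
        echo "$if_val"
    fi
}
```

On a Real-server it would be called as `effective_arp_setting arp_ignore eth0`.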







