I recently had an issue in the network of a customer who was experiencing a large amount of broadcast, unknown unicast and multicast (BUM for short) traffic within one of their layer 2 network segments. My customer is a company that offers dedicated servers and virtual machines to all sorts of customers and with all sorts of hardware configurations. Over the past few years they did not invest much time and money into their network, which is why no filters for spoofed/forged traffic and no loop protection are in place so far.
This is why we initially searched for a loop in the network. We enabled spanning-tree with bpdu-guard on all customer ports to shut down a port in case we receive any BPDU packets back from a customer device. We configured LLDP on the devices to check whether the physical connections between them are in line with the design of the network, and they were. Lastly, we started logging MAC changes to a central syslog server to verify whether MAC addresses were flapping or moving between ports. None of these measures helped and the issue persisted.
Our second idea was that one of the devices was behaving like a hub instead of a switch. A switch tries to learn on which physical interface a certain MAC is present and writes that information into a MAC table. A hub, on the other hand, simply sends the packets out on all interfaces.
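To make the difference concrete, here is a minimal sketch of both behaviours. It is a toy model, not how any real device is implemented: the class name `LearningSwitch`, the function `hub_forward`, the integer port numbers and the short MAC strings are all made up for illustration, and aging of entries is ignored.

```python
class LearningSwitch:
    """Toy model of MAC learning; ports are plain integers (hypothetical)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # source MAC -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out on."""
        # Learn only from the source address of the frame.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination sits behind the ingress port, nothing to do
        return [out_port]


def hub_forward(ports, in_port):
    """A hub repeats every frame on all ports except the one it arrived on."""
    return [p for p in ports if p != in_port]


sw = LearningSwitch([1, 2, 3])
print(sw.forward(1, "aa", "bb"))  # [2, 3]  "bb" is still unknown, so flood
print(sw.forward(2, "bb", "aa"))  # [1]     "aa" was learned on port 1
print(sw.forward(1, "aa", "bb"))  # [2]     "bb" is now known on port 2
```

A switch that has no entry for the destination MAC behaves exactly like the hub for that frame, which is why a missing entry shows up as unknown unicast traffic on every port.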
This is when we decided to dig deeper with a packet capture on a new virtual machine that was not running any software. Besides a large number of packets generated by various devices in the network, a quite concerning share was coming from one specific source. We identified one of the redundant routers as the source of the packets and confirmed our hypothesis by disabling the interface on that router. Disabling the interface removes the BUM traffic from the network but also reduces the redundancy, which is not a good long-term solution. This is when I started drawing my own diagram of the network in an abstracted form and added more and more information to the drawing over time.
The redundancy of the gateway is created by running VRRP on both routers. Router 1 has the higher priority and is therefore elected as the master and acts as the gateway for traffic leaving the subnet. Traffic sent to the network from the outside can take either path to the server: via Router 1 and Switch 1, or via Router 2, Switch 2 and Switch 1. The traffic being switched on different paths to the destination seems to be the source of the issue. LACP is not used for the connections between the devices and therefore could not be the cause either.
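The election described above can be sketched in a few lines. This is a simplified illustration, not the routers' actual configuration: the priorities and IP addresses are invented, the function name `elect_master` is hypothetical, and details such as preemption settings and the special priority of the address owner are left out. The tie-break on the higher primary IP address follows the VRRP specification.

```python
import ipaddress


def elect_master(routers):
    """Pick the VRRP master from (name, priority, primary_ip) tuples.

    Highest priority wins; on equal priority the higher primary IP wins.
    """
    best = max(routers, key=lambda r: (r[1], ipaddress.ip_address(r[2])))
    return best[0]


# Hypothetical values, chosen only to mirror the setup in the text.
routers = [
    ("Router 1", 200, "192.0.2.2"),
    ("Router 2", 100, "192.0.2.3"),
]
print(elect_master(routers))  # Router 1
```

Note that the election only decides which router answers for the gateway address. It does not stop the backup router from forwarding traffic that arrives from the outside, which is what makes the second path possible.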
The issue seems to be somewhere in layer 2 of the network stack. Router 2 regularly updates its ARP table to match the IP to the MAC of the server. This entry is kept in the ARP table on the router for 20 minutes by default. During the process of learning the MAC/IP combination, both switches see MAC addresses on their physical interfaces and create entries for these MACs in their MAC tables. The entries in the MAC table are only created for the source address of an Ethernet frame and not for the destination. On Juniper switches, these entries have a default aging time of 300 seconds, or 5 minutes, which is shorter than the lifetime of the ARP entry.
This leads us to the following situation: the switches “forget” on which port the server is connected before the router starts re-evaluating its ARP entry and thereby refreshes the entry on the switches.
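The timing gap can be put into numbers with a small sketch. It uses the two defaults mentioned above (20 minutes for the ARP entry, 300 seconds for the MAC entry) and rests on one simplifying assumption: the switch sees the server's source MAC only when the router re-resolves its ARP entry, and no other traffic from the server crosses that switch in between. The function names are made up for this example.

```python
ARP_TIMEOUT_S = 20 * 60  # ARP entry lifetime on the router (20 minutes)
MAC_AGING_S = 300        # MAC table aging time on the switch (5 minutes)


def flooding_window(arp_timeout_s=ARP_TIMEOUT_S, mac_aging_s=MAC_AGING_S):
    """Seconds per ARP cycle in which the switch has no entry for the server."""
    return max(0, arp_timeout_s - mac_aging_s)


def is_flooded(t, arp_timeout_s=ARP_TIMEOUT_S, mac_aging_s=MAC_AGING_S):
    """True if a frame sent t seconds after the first ARP refresh is flooded.

    Assumption: the MAC entry is refreshed only at each ARP refresh.
    """
    since_refresh = t % arp_timeout_s
    return since_refresh >= mac_aging_s


window = flooding_window()
print(window)                  # 900
print(window / ARP_TIMEOUT_S)  # 0.75
print(is_flooded(100))         # False, the MAC entry is still fresh
print(is_flooded(600))         # True, the entry aged out 300 seconds ago
```

Under this assumption the switch has no MAC entry for the server during 15 of every 20 minutes, and any frame it receives for the server in that time is sent out on all ports as unknown unicast.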