IEEE 802.1D defines the standard Spanning Tree Protocol (STP), which eliminates loops in a network. Without it, data frames can circulate and multiply in a loop, causing network congestion and disrupting normal communication. Using the spanning tree algorithm, STP determines where loops may exist, blocks ports on redundant links, and prunes the network into a loop-free tree structure so that devices do not receive duplicated data frames. When the active path fails, STP restores connectivity over a previously blocked redundant link to keep services running. Rapid Spanning Tree Protocol (RSTP) and Multiple Spanning Tree Protocol (MSTP) were developed on the basis of STP. The three protocols share the same basic principles; RSTP and MSTP are improved versions of STP. Maipu implements the VIST and Rapid-VIST spanning tree protocols, which are compatible with Cisco spanning tree. Rapid-VIST improves convergence performance and inherits the loop elimination function of VIST.
In STP, the following basic concepts are defined:
- Root bridge: Root of the tree structure that the network finally forms. The device with the highest priority (that is, the numerically smallest bridge ID) acts as the root bridge.
- Root Port (RP): The port on a non-root bridge that has the lowest path cost to the root bridge. The device communicates with the root bridge through this port.
- Designated bridge: If the device sends Bridge Protocol Data Unit (BPDU) configuration information to a directly connected device or directly connected LAN, the device is regarded as the designated bridge of the directly connected device or directly connected LAN.
- Designated port: The port through which the designated bridge forwards BPDU configuration information.
- Path cost: It indicates the link quality and is related to the link rate. Usually, a higher link rate means a smaller path cost and therefore a better link.
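As a rough illustration of how path cost relates to link rate, the sketch below uses the recommended values from IEEE 802.1D-2004 (20,000,000 divided by the link speed in Mbit/s). The function name is hypothetical, and the defaults on a given device may differ, so treat this as an assumption rather than the device's actual behavior.

```python
def recommended_path_cost(speed_mbps: int) -> int:
    """Return the IEEE 802.1D-2004 recommended path cost for a link speed.

    Assumption: the device follows the 802.1D-2004 table; real devices may
    use other defaults (for example the older 16-bit values).
    """
    if speed_mbps <= 0:
        raise ValueError("link speed must be positive")
    cost = 20_000_000 // speed_mbps
    # The standard limits the value to the range 1..200,000,000.
    return max(1, min(cost, 200_000_000))


for speed in (10, 100, 1_000, 10_000):
    print(f"{speed:>6} Mbit/s -> path cost {recommended_path_cost(speed)}")
```

A faster link gets a smaller cost, so the spanning tree prefers it when several paths lead to the root bridge.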
The devices that run STP implement calculation of the spanning tree by exchanging BPDU packets, and finally form a stable topology structure. BPDU packets are categorized into the following two types:
- Configuration BPDUs: Also called BPDU configuration messages, they are used to calculate and maintain the spanning tree topology.
- Topology Change Notification (TCN) BPDUs: When the network topology structure changes, they are used to inform other devices of the change.
BPDU packets contain information that is required in spanning tree calculation. The major information includes:
- Root bridge ID: It consists of the root bridge priority and the MAC address.
- Root path cost: It is the minimum path cost to the root bridge.
- Designated bridge ID: It consists of the designated bridge priority and the MAC address.
- Designated port ID: It consists of the designated port priority and port number.
- Message Age: Time that the BPDU configuration message has already existed while propagating through the network.
- Hello Time: Interval at which BPDU configuration messages are sent.
- Forward Delay: Delay applied to port status migration.
- Max Age: Maximum lifetime of a configuration message stored in a device.
The election process of STP is as follows:
- Initialize port configuration messages.
The local device initially takes itself as the root bridge, generates BPDU configuration messages, and sends them. In these BPDU packets, the root bridge ID and the designated bridge ID are both the local bridge ID, the root path cost is 0, and the designated port is the transmitting port.
Each port of the device also generates a port configuration message that is used for spanning tree calculation. In the port configuration message, the root bridge ID and the designated bridge ID are both the local bridge ID, the root path cost is 0, and the designated port is the local port.
- Update port configuration messages.
After the local device receives a BPDU configuration message from another device, it compares the message with the port configuration message of the receiving port. If the received configuration message is better, the device uses the received BPDU configuration message to replace the port configuration message. If the port configuration message is better, the device does not perform any operation.
The principle of comparison is as follows: The root bridge IDs, root path costs, designated bridge IDs, designated port IDs, and receiving port IDs are compared in that order.
For each item, the smaller value is better. If the values of an item are the same, the next item is compared.
The device that sends the optimal configuration message in the entire network is selected as the root bridge.
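The comparison rule described in this step amounts to a lexicographic comparison, which Python tuples provide directly. The sketch below is illustrative only; the type and field names are hypothetical, and bridge IDs are modeled as (priority, MAC as integer) pairs.

```python
from typing import NamedTuple, Tuple


class PriorityVector(NamedTuple):
    """Fields in the order in which they are compared."""
    root_bridge_id: Tuple[int, int]        # (priority, MAC as integer)
    root_path_cost: int
    designated_bridge_id: Tuple[int, int]  # (priority, MAC as integer)
    designated_port_id: Tuple[int, int]    # (port priority, port number)
    receiving_port_id: Tuple[int, int]     # (port priority, port number)


def is_better(a: PriorityVector, b: PriorityVector) -> bool:
    """Smaller is better; ties fall through to the next field."""
    return a < b


root = (4096, 0x001122334455)
via_fast_link = PriorityVector(root, 4, (32768, 0x00AA00000001), (128, 1), (128, 2))
via_slow_link = PriorityVector(root, 19, (32768, 0x00AA00000001), (128, 1), (128, 3))

print(is_better(via_fast_link, via_slow_link))  # same root, lower cost wins
```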
- Select port roles and port status.
All ports of the root bridge are designated ports and are in the Forwarding status. Each non-root bridge selects the optimal port configuration message among all of its ports. The port that received this message is selected as the root port, and the root port is in the Forwarding status. For each of the other ports, the device calculates a designated port configuration message from the root port configuration message.
The calculation method is as follows: The root bridge ID is the root bridge ID of the root port configuration message, the root path cost is the sum of the root path cost of the root port configuration message and the path cost of the root port, the designated bridge ID is the bridge ID of the local device, and the designated port is the local port.
The port role is determined by comparing the port configuration message with the calculated designated port configuration message. If the designated port configuration message is better, the local port is selected as the designated port and enters the Forwarding status. The port configuration message is then replaced by the designated port configuration message, and the designated port sends port configuration messages periodically at the Hello Time interval. If the port configuration message is better, the port is blocked. It enters the Discarding status, and its port configuration message is not modified.
After the root bridge, root port, and designated port are selected, the tree structure network topology is set up successfully. Only the root port and the designated port can forward data. The other ports are in the Discarding status. They can only receive configuration messages but cannot send configuration messages or forward data.
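The role selection on a single bridge can be sketched as follows. This is a simplified model, not the device's implementation: all names are hypothetical, and, as an assumption taken from IEEE 802.1D, the path cost of the receiving port is added to the received root path cost before candidate root ports are compared.

```python
from typing import Dict, NamedTuple, Tuple

BridgeId = Tuple[int, int]  # (priority, MAC as integer)
PortId = Tuple[int, int]    # (port priority, port number)


class Msg(NamedTuple):
    root_id: BridgeId
    root_path_cost: int
    designated_bridge_id: BridgeId
    designated_port_id: PortId


def select_roles(bridge_id: BridgeId,
                 port_costs: Dict[PortId, int],
                 received: Dict[PortId, Msg]) -> Dict[PortId, str]:
    """Return 'root', 'designated', or 'blocked' for every local port."""
    # Candidate root paths: received message plus the local port cost.
    candidates = [
        ((m.root_id, m.root_path_cost + port_costs[p],
          m.designated_bridge_id, m.designated_port_id, p), p)
        for p, m in received.items()
    ]
    best = min(candidates, default=None)
    own = (bridge_id, 0, bridge_id)
    if best is None or own <= best[0][:3]:
        # Nothing beats the local bridge: it is the root bridge.
        return {p: "designated" for p in port_costs}

    vector, root_port = best
    root_id, root_cost = vector[0], vector[1]
    roles = {root_port: "root"}
    for p in port_costs:
        if p == root_port:
            continue
        mine = Msg(root_id, root_cost, bridge_id, p)  # designated port message
        theirs = received.get(p)
        roles[p] = "designated" if theirs is None or mine < theirs else "blocked"
    return roles


root = (4096, 0x11)
roles = select_roles(
    bridge_id=(32768, 0x22),
    port_costs={(128, 1): 4, (128, 2): 4, (128, 3): 19},
    received={
        (128, 1): Msg(root, 0, root, (128, 1)),           # link to the root bridge
        (128, 2): Msg(root, 4, (32768, 0x21), (128, 2)),  # link to a better peer
    },
)
print(roles)
```

In this example port 1 becomes the root port, port 2 is blocked because the neighbor's message is better than the locally calculated one, and port 3, which hears no other bridge, becomes a designated port.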
If the root port of a non-root bridge fails to receive configuration messages periodically, the active path is regarded as faulty. The device regenerates a BPDU configuration message and a TCN BPDU with itself as the root bridge and sends them. These messages trigger recalculation of the spanning tree, and a new active path is obtained.
Until they receive the new configuration messages, the other devices are not aware of the topology change, so their root ports and designated ports continue to forward data along the original path. The newly selected root port and designated ports migrate to the Forwarding status only after two Forward Delay periods. This ensures that the new configuration message has been propagated through the entire network and prevents the temporary loops that could occur if both the old and the new root ports and designated ports forwarded data.
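The effect of the timers on recovery time can be estimated with a little arithmetic. The sketch below assumes the IEEE 802.1D default values (Hello Time 2 s, Max Age 20 s, Forward Delay 15 s) and the timer relationship required by that standard; the function names are hypothetical, and the defaults configured on a given device should be checked separately.

```python
def timers_are_consistent(hello: int = 2, max_age: int = 20,
                          forward_delay: int = 15) -> bool:
    """Check the IEEE 802.1D relationship between the three timers (seconds)."""
    return 2 * (forward_delay - 1) >= max_age >= 2 * (hello + 1)


def worst_case_recovery(max_age: int = 20, forward_delay: int = 15) -> int:
    """Seconds until a new port forwards after silent loss of the root path.

    The stored configuration message first has to age out (Max Age), then
    the new port waits two Forward Delay periods before forwarding.
    """
    return max_age + 2 * forward_delay


print(timers_are_consistent())   # True for the assumed defaults
print(worst_case_recovery())     # 20 + 2 * 15 = 50
```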
RSTP defined in IEEE 802.1w is developed based on STP, and it is the improved version of STP. RSTP realizes fast migration of port status and hence shortens the time required for a network to set up stable topology. RSTP is improved in the following aspects:
- It introduces the alternate port as a backup for the root port. If the root port is blocked, the alternate port can quickly switch over to become the new root port.
- It introduces the backup port as a backup for the designated port. If the designated port is blocked, the backup port can quickly switch over to become the new designated port.
- On a point-to-point link between two directly connected devices, the designated port can enter the Forwarding status without delay after a single handshake (proposal and agreement) with the downstream bridge.
- Some ports are not connected to other bridges or shared links but are directly connected to user terminals. These ports are defined as edge ports. Status changes of edge ports do not affect network connectivity, so these ports can enter the Forwarding status without delay.
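The difference between the two standby roles in the list above comes down to who sent the better BPDU that caused the port to be blocked. The following sketch assumes the usual RSTP rule (a better BPDU from another bridge gives an alternate port, a better BPDU from the local bridge itself gives a backup port); the function name and bridge ID representation are hypothetical.

```python
from typing import Tuple

BridgeId = Tuple[int, int]  # (priority, MAC as integer)


def standby_role(local_bridge_id: BridgeId,
                 sender_of_better_bpdu: BridgeId) -> str:
    """Classify a blocked RSTP port by the origin of the superior BPDU."""
    if sender_of_better_bpdu == local_bridge_id:
        # The port hears the local bridge's own designated port, for example
        # over a shared segment: it backs up that designated port.
        return "backup"
    # The port offers another path toward the root: it backs up the root port.
    return "alternate"


me = (32768, 0x22)
print(standby_role(me, (32768, 0x21)))  # alternate
print(standby_role(me, me))             # backup
```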
However, both RSTP and STP form a single spanning tree, which has the following shortcomings:
- Only one spanning tree is available in the entire network. If the network size is large, the network convergence takes a long time.
- Packets of all VLANs are forwarded through one spanning tree, therefore no load balancing is achieved.
MSTP, defined in IEEE 802.1s, is an improvement on STP and RSTP and is backward compatible with both. MSTP introduces the concepts of region and instance. It divides a network into multiple regions, and each region can contain multiple instances. One instance can be mapped to one or more VLANs, and each instance corresponds to one spanning tree. A port may have different roles and statuses in different instances. In this way, packets of different VLANs are forwarded along their own paths.
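The VLAN-to-instance mapping can be pictured as a simple table. The sketch below is a hypothetical model, not a device API; it assumes the usual MSTP convention that VLANs 1 to 4094 belong to instance 0 unless they are explicitly mapped elsewhere, and it rejects a VLAN that is mapped to two instances.

```python
from typing import Dict, Iterable

MIN_VLAN, MAX_VLAN = 1, 4094


def build_vlan_table(instance_vlans: Dict[int, Iterable[int]]) -> Dict[int, int]:
    """Return a VLAN -> instance table; unmapped VLANs stay in instance 0."""
    table = {vlan: 0 for vlan in range(MIN_VLAN, MAX_VLAN + 1)}
    assigned: Dict[int, int] = {}
    for instance, vlans in instance_vlans.items():
        for vlan in vlans:
            if not MIN_VLAN <= vlan <= MAX_VLAN:
                raise ValueError(f"VLAN {vlan} is out of range")
            if assigned.get(vlan, instance) != instance:
                raise ValueError(f"VLAN {vlan} is mapped to two instances")
            assigned[vlan] = instance
            table[vlan] = instance
    return table


table = build_vlan_table({1: range(10, 20), 2: [20, 30]})
print(table[10], table[30], table[100])  # instance of VLAN 10, 30, and 100
```

Because each instance runs its own spanning tree, VLANs 10 to 19 and VLANs 20 and 30 in this example can be forwarded along different paths.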
In MSTP, the following concepts are added:
- MST region: It consists of multiple devices in the switching network and the network segments between them. The devices in an MST region must meet the following requirements: The spanning tree function is enabled on the devices. They have the same MST region name, MSTP revision level, and VLAN mapping table. They are directly connected physically.
- Internal Spanning Tree (IST): It is the spanning tree of instance 0 in each region.
- Common Spanning Tree (CST): If each MST region is regarded as a single device, the CST is the spanning tree that connects the MST regions.
- Common and Internal Spanning Tree (CIST): It consists of the ISTs of MST regions and the CSTs between the MST regions. It is a single spanning tree that connects all devices in the network.
- Multiple Spanning Tree Instance (MSTI): A spanning tree inside an MST region. Each instance has its own independent MSTI.
- Common root: CIST root.
- Region root: Root of each IST and MSTI in an MST region. In an MST region, each instance has an independent spanning tree, so the region roots of different instances may be different. The root bridge of instance 0 is the region root of the region.
- Region edge ports: They are located at the edge of an MST region and connect the region to other MST regions.
- External path cost: It is the minimum path cost from a port to the common root.
- Internal path cost: It is the minimum path cost from a port to the region root.
- Master port: It is the region edge port with the minimum path cost to the common root in an MST region. The role of a master port in an MSTI is the same as its role in a CIST.
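The MST region membership requirement from the list above (same region, same level, same VLAN mapping table) can be expressed as a comparison of three configuration items. The sketch below is a hypothetical model: it takes "level" to mean the revision level, treats unmapped VLANs as belonging to instance 0, and does not check the other requirements (spanning tree enabled, direct physical connection).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class MstRegionConfig:
    region_name: str
    revision_level: int
    vlan_to_instance: Dict[int, int] = field(default_factory=dict)

    def instance_of(self, vlan: int) -> int:
        # VLANs without an explicit mapping belong to instance 0.
        return self.vlan_to_instance.get(vlan, 0)


def same_region_config(a: MstRegionConfig, b: MstRegionConfig) -> bool:
    """True if two devices carry identical MST region configuration."""
    if (a.region_name, a.revision_level) != (b.region_name, b.revision_level):
        return False
    vlans = set(a.vlan_to_instance) | set(b.vlan_to_instance)
    return all(a.instance_of(v) == b.instance_of(v) for v in vlans)


sw1 = MstRegionConfig("campus", 1, {10: 1, 20: 2})
sw2 = MstRegionConfig("campus", 1, {10: 1, 20: 2, 30: 0})
sw3 = MstRegionConfig("campus", 2, {10: 1, 20: 2})

print(same_region_config(sw1, sw2))  # explicit mapping to instance 0 changes nothing
print(same_region_config(sw1, sw3))  # revision level differs
```

Devices whose configurations differ in any of the three items end up in different regions and interact only through the CIST.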
The election rule of MSTP is similar to that of STP: by comparing configuration messages, the bridge with the highest priority in the network is elected as the root bridge of the CIST. Each MST region calculates its IST, the MST regions together calculate the CST, and all of these together form the CIST of the entire network. Based on the mapping between VLANs and spanning tree instances, each MST region calculates an independent spanning tree (MSTI) for each instance.
The Cisco proprietary spanning tree protocols PVST and RPVST both introduce the concept of instance, where each VLAN corresponds to one instance. A port can have different roles and states in different instances, so packets of different VLANs are forwarded along their own paths.
New MSTP definitions and precautions in an MLAG environment:
Operation principle: The two MLAG nodes are presented as a single spanning tree root bridge. The VMAC uses the MLAG System MAC, and access devices receive BPDU packets based on the VMAC. The Peer-Link port also participates in spanning tree calculation and sends and receives BPDUs, but with some special handling, which is described under the BA function in section 9.2.3.
- IROOT: Intraconnection Root Port
- IBACK: Intraconnection Backup Port
- MLAG-MSTI: The corresponding MSTI instance of MLAG-VLAN
- root priority: The specified root bridge priority of MLAG
- bridge priority: The specified bridge priority of MLAG
- stp-pseudo: The pseudo information mode
The configuration restrictions of the pseudo information mode:
- The root priority (default value 0) of the CIST and of all MLAG-MSTIs in the pseudo information must be better than the bridge priority (default value 32768) of all bridges in the whole network, including the MLAG nodes themselves.
- In addition to ensuring that the paired nodes are the root bridge (their root priority is the best in the whole network), the root guard function must be enabled on the access ports (all non-peer-link ports) of the MLAG nodes to prevent spanning tree calculation errors caused by receiving BPDUs with a higher priority. Root guard is not yet supported in VIST mode.
- The Bridge-Assurance function needs to be configured on the peer link.
- For the CIST instance and the MLAG-MSTI instances, the root priority of the two nodes must be configured to be the same, so that the two nodes present the same root bridge to both DHD and SD. Conversely, to make the two nodes appear to SD as two completely independent bridges that participate in spanning tree calculation, the root priority of the two nodes must be configured differently.
MLAG currently does not support configuration consistency checking. It is recommended that you manually make sure that the relevant configurations are the same on the two MLAG nodes. An inconsistent spanning tree configuration may lead to incorrect spanning tree calculation and other problems, and the expected behavior may not be achieved.
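Because the consistency check is manual, a small offline script can help compare the two nodes against the restrictions above. The following is only a sketch under stated assumptions: the data model, field names, and port names are hypothetical and are not read from the device, "better" is taken to mean numerically smaller, and only the case where both nodes must present the same root bridge is checked.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class MlagNodeStp:
    name: str
    root_priority: Dict[int, int]        # instance -> pseudo root priority
    bridge_assurance_on_peer_link: bool
    access_ports: Set[str] = field(default_factory=set)
    root_guard_ports: Set[str] = field(default_factory=set)


def check_mlag_pair(a: MlagNodeStp, b: MlagNodeStp,
                    mlag_instances: Iterable[int],
                    all_bridge_priorities: Iterable[int]) -> List[str]:
    """Return a list of violated restrictions (empty if none were found)."""
    problems = []
    best_bridge = min(all_bridge_priorities)
    for inst in mlag_instances:
        pa = a.root_priority.get(inst, 0)  # documented default is 0
        pb = b.root_priority.get(inst, 0)
        if pa != pb:
            problems.append(f"instance {inst}: root priority differs ({pa} vs {pb})")
        if max(pa, pb) >= best_bridge:
            problems.append(f"instance {inst}: root priority is not better than "
                            f"bridge priority {best_bridge}")
    for node in (a, b):
        if not node.bridge_assurance_on_peer_link:
            problems.append(f"{node.name}: Bridge-Assurance missing on peer link")
        missing = node.access_ports - node.root_guard_ports
        if missing:
            problems.append(f"{node.name}: root guard missing on {sorted(missing)}")
    return problems


n1 = MlagNodeStp("node1", {0: 0, 1: 0}, True, {"te0/1"}, {"te0/1"})
n2 = MlagNodeStp("node2", {0: 0, 1: 4096}, False, {"te0/1", "te0/2"}, {"te0/1"})
for line in check_mlag_pair(n1, n2, mlag_instances=[0, 1],
                            all_bridge_priorities=[32768, 4096]):
    print(line)
```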