Hash based ECMP load balancing algorithm

Context

Data center networks every so often use compactly interconnected topologies to deliver high bandwidth for internal data exchange. In such network, it is precarious to employ effective load balancing schemes so that all the available bandwidth resources can be utilized. In order to utilize all the available bandwidth traffic need to be routed across the network instead of overloading a single path. Equal Cost Multipath (ECMP) enable the usage of multiple equal cost paths from the source node to the destination node in the network. The advantage of using this algorithm is that the traffic can be split more uniformly to the whole network avoiding congestion and increasing bandwidth consumption.

This blog explains hash based ecmp load balancing algorithm.

Background

ECMP load balancing refers to distributing traffic more evenly by installing entries for multiple best paths to the switch’s forwarding layer and using load balancing algorithm to identify flows and distribute them to different paths. If number of flows is large this algorithm helps in distributing the traffic along with other unused paths in order to reduce the congestion. ECMP implementation enables us to evenly spread flows across the network leading to best utilization of multiple links towards the destination. It reduces the time taken in data transfer up to large extent due to less congestion. This algorithm helps us in improving network performance in SDN enabled networks.

ecmp_basic_dgrm

An example of ECMP load balancing is shown above. Here four equal cost paths are available to route the packet to destination. With use of ECMP algorithm all four paths are getting utilized to route packets. This helps in distributing traffic evenly across the network and reduces congestion. Additionally, these four ECMP paths are backups for each other. If one of the paths fails, traffic could be split between the other three paths after failure detection.

Load balancing algorithm

ECMP implementation ensures, if there are numerous best paths to get to a specific MAC
address (i.e. multiple paths have the same cost metric), the controller installs rules in switches in such a way so that all the paths to reach that MAC address are utilized instead of overloading a single path. This is accomplished by identifying rules that match on more fields in conjunction with the destination MAC address. This algorithm finds all possible shortest paths to reach the destination and perform modulus operation on the hash value with number of shortest possible routes and select the final path based on the output. Once final path is decided controller installs the rule on all switches of selected path and forwards the packet to destination host.
This implementation has been categorized into two steps.

1. Multiple best path selection algorithm

When the first packet of a flow arrives on source switch it forwards the packet to SDN
controller as switch does not have rules that match with it. The controller then checks to see what could be the possible ways to reach the packet’s destination MAC address. First it tries to figure out the destination switch from which destination host is connected. We are
populating a dictionary where key is host’s MAC address and value is a tuple consisting of
switch and port with which host is connected. The Controller looks up in this dictionary to get the destination switch

  • If destination MAC address entry is updated : It gets the respective switch and the
    port on which destination is connected. If source switch and destination switch both are same then controller simply installs the rule to forward the packet on out port as there is only one possible path to reach the destination. If source switch and destination switch both are different we try to get all possible shortest paths to reach the destination using Dijkstra algorithm. We did slight modification in Dijkstra algorithm to get multiple shortest paths between source and destination instead of a single path. Topology information is represented as non directional
    graph and stored as an adjacency list. All edges have equal cost. This algorithm computes the weight of the path where weight of the path is sum of weight of constituent edges and returns paths with minimum weightage. This gives us all possible shortest path to reach the destination.

Here is the algorithm to compute all shortest path from source to destination.

def get_all_paths(self,graph,src):
    '''gets adjacent nodes of source'''
    self.topodict = deepcopy(graph)
    length = len(self.topodict)
    type_ = type(self.topodict)
    nodes = self.topodict.keys()
    pdb.set_trace()
    visited = [src]
    path = {src:{src:[]}}
    nodes.remove(src)
    distance_graph = {src:0}
    pre = next = src
    paths = {}
    while nodes:
        distance = float('inf')
        for v in visited:
           pdb.set_trace()
           for d in nodes:
               if d in self.topodict[v].keys():
                   new_dist = self.topodict[src][v] + self.topodict[v][d]
                   if new_dist < distance:
                       distance = new_dist
                       next = d
                       pre = v
                       self.topodict[src][d] = new_dist
                       path[src][next] = [pre]
                   elif new_dist == distance:
                       if not d in path[src].keys():
                           path[src][d] = []
                       path[src][d].append(v)
        distance_graph[next] = distance
        visited.append(next)
        nodes.remove(next)

    return path
  • If destination MAC address entry is not updated: In that case it send the packet out
    on all ports except the one it came in and learns the location of the hosts.
    We populate dictionary with this entry so that next time when packet for that destination arrives instead of flooding all ports we can just send to the appropriate port. This is based on backward learning concept. Next packet in the flow will follow the above mentioned logic and ECMP forwarding rule will be installed for the flow.

 

2. Hashed Routing

When forwarding a packet the switch must decide which path to use. Fundamentally, <n>
paths exist in the forward and reverse path direction for given source and destination.
In hashed routing a modulo <n> hash of various header information is performed. Header
fields used for hash calculation includes source IP address, destination IP address, protocol, Source port and Destination port. Hash function returns ECMPstyle
5 tuple hash for packets, otherwise 0. We perform modulus operation on the hash value with total number of best possible routes. The output of this modulus indicates which of the paths to use.

Header fields
   {
        hash_input[0] = ip.srcip
        hash_input[1] = ip.dstip
        hash_input[2] = ip.protocol
        hash_input[3] = source port
        hash_input[4] = destination port
   }

Ideally, the header information and hash algorithm are chosen such that an arbitrary traffic pattern will be equally distributed on the paths while ensuring that a single flow remains on the same path for the life of the flow in order to preserve inorder
packet delivery.

This algorithm has been depicted well with flow diagram shown below.

flowdgm_modified

This algorithm helps in reducing congestion in network. Due to less congestion it would also help in reducing the time taken in data transfer. Also with use of this algorithm network performance of shuffle phase data transfers in Hadoop MapReduce can be enhanced.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s