The OpenFlow Overlay


Integrating SDN beyond the data center is still the wild west. The needs are very similar: traffic classification and policy instantiation at the edge. Data center solutions then abstract the underlying network with tunneling/encapsulation mechanisms to achieve IP mobility, among other things. Outside the data center, some may care about choosing a forwarding path that accounts for constraints beyond the SPF calculation, and some may not. What everyone does care about is the edge. SDN in the DC is being defined today as flow-based forwarding on the edge (wildcarding available) and exact-match forwarding in the core (CAM/LPM). Whether it is enterprise security services or provider service insertion to generate new revenue, it is all about the edge. How do we begin to integrate SDN without replacing our networks and without vendor applications to do it? Using an approach similar to the data center solutions that Martin Casado and team pioneered with DC tunnels, we can use VIDs, VPLS and, in one case, GRE tunnels over which I’m pinning specific flows for tapping (just as a DC overlay would, but in edge HW) to avoid having to deal with the native core.

Since IP mobility is less of a concern than pathing, I can replace L2 tunnels with application-specific OpenFlow overlay flows. That doesn’t mean it needs to be OpenFlow. Scott Shenker (a co-founder at Nicira along with Casado and McKeown) recently made a powerful point that (paraphrasing) “the OpenFlow edge can solve any problem that someone would be willing to pay for”.

Evolving towards a simple identifier, or even metadata, that describes the forwarding path at the aggregation layer, mainly to avoid wildcard matching, is a logical path described in the publication by Martin and others, Fabric: A Retrospective on Evolving SDN. While it is DC focused, the effect for the enterprise is reducing lookups to CAM as opposed to the much more expensive TCAM. I translate the concept as being similar to how MPLS uses FECs (Forwarding Equivalence Classes) and RSVP for pre-allocation and reallocation of paths. Nothing solves QoS by current means short of pushing policy to in-box processes. I don’t expect that to change, but if it does, it will happen on the edge also.

I freely admit that I have been negligent in keeping up with the flurry of provider overlay work in standards bodies, so the next coming may be right there. But for now, to determine pathing we are stuck with complex protocols living in the closed monolith. The path forward in the core may be coupling protocols with the SDN API approach. API interoperability, or even commonality, is not something I am optimistic about, at least in the near term. Open network operating systems (e.g. a vendor-agnostic NOS) are more than likely a requirement to get to a widely adopted, interoperable SDN API landscape. Self provisioning (see Rob Sherwood’s publication), virtualization or, dare I say, data analytics of those paths are the eventual goal. That debate is for another day. Flow-based forwarding offers the granularity needed for edge services, and OpenFlow is as good a messaging protocol for it as any.

Until we have some abstracted simplicity in the core of the network, I humbly submit there are some use cases where application-aware flow forwarding in the aggregation makes sense, particularly without de facto SDN API approaches. One such use case we are working on is letting trusted data backup traffic bypass (expensive) firewalls or any other black box that puts a control plane in the data path. The tradeoff is the complexity of the multiple control protocols and hardware involved in something like traffic engineering, compared to the also complex software algorithms needed to manage flow forwarding. That is why I am interested in a fairly easy mechanism for getting a base network that can install OpenFlow forwarding rules as a tool overlaid on top of a fully functioning network, with the plethora of protocols we know, love and hate running underneath. Applying graphing functions to labels to ease that forwarding complexity is something someone is undoubtedly working on, but it probably isn’t going to be solved in the near term.

OpenFlow as a Tool

OpenFlow is a tool. It opens up new ways to design networks. Like anything else, it only makes sense if you use it to solve problems in IT (e.g. save money or make money). I have some specific requirements that don’t necessarily fall into the category Shenker was pointing to. The architect’s role will become significantly more important and complex with new distribution models. There are two basic use cases that begin solving many problems some of us experience today. Both presume you are rolling the code yourself or working with some vendor beta applications:

  1. Ingest, classify and forward traffic at the edge of the network for policy and services.
  2. Flow forwarding for specific applications across multiple nodes, either end to end or hacked through native forwarding networks. I say hacked for a reason. Pushing a flow through, say, a native VLAN requires turning off MAC address learning and STP, basically flooding BUM traffic across a P2P circuit. Provider bridging (QinQ) cannot get here soon enough.

L2 VLANs Still Suck

BUM traffic is what I consider one of the hardest problems outside of controlled data centers with known quantities. It is particularly problematic when dealing with physical hardware limitations: think 1 million flow rules vs. the 750-3k rules of Trident+. There are three options, and none of them are wonderful other than the obligatory drop: register, react or flood. But that’s a rabbit hole. For now, I am resigned to flooding, since current wiring closet switches have limited ability to process packets, pushing them over a single serial bus only to run into a CPU about as powerful as your smartphone. For the next few months I am leaving it with “normal” flooding. The drawback is Spanning Tree (shocking). So it leaves us doing VLAN translation to stitch between VIDs.
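
To make the VLAN translation idea concrete, here is a minimal sketch of what stitching between VIDs means at the frame level: rewrite the 12-bit VID inside the 802.1Q tag and leave the priority bits alone. The function name, the VID values and the mapping table are all hypothetical, and in practice the switch would do this in hardware via a flow action rather than in Python.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def translate_vid(frame: bytes, vid_map: dict) -> bytes:
    """Rewrite the VLAN ID of a single-tagged Ethernet frame.

    vid_map is a hypothetical {old_vid: new_vid} table. Untagged frames
    and frames whose VID is not in the table are returned unchanged.
    """
    if len(frame) < 16:
        return frame
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return frame
    vid = tci & 0x0FFF
    new_vid = vid_map.get(vid, vid) & 0x0FFF
    # Keep PCP/DEI (top 4 bits of the TCI), swap only the VID.
    new_tci = (tci & 0xF000) | new_vid
    return frame[:14] + struct.pack("!H", new_tci) + frame[16:]


# Example frame: broadcast dst, made-up src, VLAN 100 with priority 5, IPv4 payload.
dst = b"\xff" * 6
src = bytes.fromhex("020000000001")
tagged = dst + src + struct.pack("!HH", TPID_8021Q, (5 << 13) | 100)
tagged += struct.pack("!H", 0x0800) + b"payload"

stitched = translate_vid(tagged, {100: 200})
```

The same mapping applied in the opposite direction on the far side of the link is what stitches the two VIDs together.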

L2 is still a problem (as Scott Lowe pointed out recently) for gateway resolution from a client. If we proxy ARP requests, we can handle the client gateway requests and set/rewrite the dMAC, which requires flexible silicon. Registration in Ethernet would be lovely someday.
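
As a rough illustration of the proxy ARP half of that idea, the sketch below answers a client's ARP request for its gateway with a MAC address of our choosing. It only builds the reply bytes using the standard ARP layout; the addresses, function names and the decision to answer only for one gateway IP are assumptions for the example, and the dMAC rewrite on subsequent traffic is not shown.

```python
import socket
import struct

ETH_TYPE_ARP = 0x0806
ARP_REQUEST, ARP_REPLY = 1, 2


def build_arp(oper, sha, spa, tha, tpa, eth_dst):
    """Build an Ethernet + ARP frame (IPv4 over Ethernet) from raw bytes."""
    eth = eth_dst + sha + struct.pack("!H", ETH_TYPE_ARP)
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, oper) + sha + spa + tha + tpa
    return eth + arp


def proxy_arp_reply(request, gateway_ip, gateway_mac):
    """Return an ARP reply for the gateway, or None if we should not answer."""
    if len(request) < 42:
        return None
    (eth_type,) = struct.unpack("!H", request[12:14])
    (oper,) = struct.unpack("!H", request[20:22])
    client_mac, client_ip = request[22:28], request[28:32]
    target_ip = request[38:42]
    gw_ip = socket.inet_aton(gateway_ip)
    if eth_type != ETH_TYPE_ARP or oper != ARP_REQUEST or target_ip != gw_ip:
        return None
    gw_mac = bytes.fromhex(gateway_mac.replace(":", ""))
    # Answer on behalf of the gateway, unicast back to the client.
    return build_arp(ARP_REPLY, gw_mac, gw_ip, client_mac, client_ip,
                     eth_dst=client_mac)


# Hypothetical client asking "who has 10.0.0.1?"
client_mac = bytes.fromhex("020000000001")
request = build_arp(ARP_REQUEST, client_mac, socket.inet_aton("10.0.0.50"),
                    b"\x00" * 6, socket.inet_aton("10.0.0.1"),
                    eth_dst=b"\xff" * 6)
reply = proxy_arp_reply(request, "10.0.0.1", "02:00:00:00:00:fe")
```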

The OpenFlow Overlay Design

It’s pretty simple: put in a table-miss policy of normal on the edge nodes if you are only interested in edge processing or service insertion, or on the edge nodes and the nodes in between if you want end to end. This means the rule matches everything that wasn’t matched by a higher priority rule. An OpenFlow table is an ordered list keyed off of the rule priorities. Keep in mind, if you use TCAM for OpenFlow, depending on the agent you are probably taking away capacity, or capability altogether, from functions such as ACLs that use that same TCAM. Hybrid kit is also required, but I haven’t run across many boxes that don’t support OFPP_NORMAL or something very similar, such as Brocade’s hybrid OF pipeline.

Rules are as simple as popping in a match-all rule with the lowest priority that uses the OpenFlow normal action. This default drain is similar to a default gateway in prefix routing, like so:

[Figure: Hybrid OpenFlow Normal]
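
For readers who think better in code, here is a small Python model of that priority-ordered lookup with a single match-all rule at priority 0 handing everything to the normal pipeline. It is a toy, not a switch agent: the `FlowRule` class, the field names and the drop-on-no-match fallback are assumptions made for illustration (real behaviour when no rule matches depends on the OpenFlow version and the agent).

```python
from dataclasses import dataclass, field


@dataclass
class FlowRule:
    priority: int
    match: dict = field(default_factory=dict)    # empty dict wildcards everything
    actions: list = field(default_factory=list)


def rule_matches(rule: FlowRule, packet: dict) -> bool:
    """A rule matches when every field it specifies equals the packet's value."""
    return all(packet.get(k) == v for k, v in rule.match.items())


def lookup(table: list, packet: dict) -> list:
    """Walk the rules from highest to lowest priority; first match wins."""
    for rule in sorted(table, key=lambda r: r.priority, reverse=True):
        if rule_matches(rule, packet):
            return rule.actions
    return ["DROP"]  # assumption for this model only


# The "default drain": match everything, lowest priority, use the normal pipeline.
table = [FlowRule(priority=0, match={}, actions=["NORMAL"])]

packet = {"in_port": 1, "eth_type": 0x0800, "ipv4_dst": "10.0.0.5"}
result = lookup(table, packet)
```

With only this rule installed, every packet falls through to the legacy L2/L3 forwarding the box was already doing.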

Once a handful of devices have that rule in place, you can then begin instantiating OpenFlow rules to add policy on the edge or even stitch a forwarding path to bypass black boxes or manage a custom constraint based SLA.

[Figure: Hybrid OpenFlow Normal With Applications]
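
A sketch of what layering an application on top of the normal rule could look like, using the backup traffic bypass mentioned earlier. The rule format is a plain-dict model, and the backup server address, TCP port and output port are hypothetical values picked for the example.

```python
def lookup(table, packet):
    """Return the actions of the highest-priority rule whose fields all match."""
    candidates = [
        rule for rule in table
        if all(packet.get(k) == v for k, v in rule["match"].items())
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: rule["priority"])["actions"]


table = [
    # Underlay: everything not claimed by an application uses normal forwarding.
    {"priority": 0, "match": {}, "actions": ["NORMAL"]},
    # Overlay: steer trusted backup traffic out a port that bypasses the firewall.
    {
        "priority": 1000,
        "match": {
            "eth_type": 0x0800,
            "ip_proto": 6,
            "ipv4_dst": "10.20.0.10",   # hypothetical backup server
            "tcp_dst": 873,             # hypothetical backup service port
        },
        "actions": ["output:2"],        # hypothetical bypass port
    },
]

backup_pkt = {"in_port": 1, "eth_type": 0x0800, "ip_proto": 6,
              "ipv4_dst": "10.20.0.10", "tcp_dst": 873}
web_pkt = {"in_port": 1, "eth_type": 0x0800, "ip_proto": 6,
           "ipv4_dst": "10.20.0.10", "tcp_dst": 443}

backup_actions = lookup(table, backup_pkt)
web_actions = lookup(table, web_pkt)
```

Only the backup flow is pinned; the web request to the same host still takes whatever path the native network gives it. Removing the overlay rule returns the backup traffic to the normal path as well.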

What this leaves you with is an underlay that works exactly as it always did, but adds the ability to install OpenFlow instructions that manipulate particular traffic proactively, or even reactively at a small scale, with plenty of tolerance for issues using current HW. Again, it doesn’t need to be end to end, but without an encap better than VLANs or setting next hops, it’s a hack.

[Figure: OpenFlow Overlay]

As part of the order of operations, each packet is matched against TCAM, so performance is the same as it would normally be. As applications are developed by vendors or rolled yourself, it will become clear where functionality using OpenFlow makes sense. The “OpenFlow doesn’t scale” nonsense only applies when you use the wrong tool in the wrong place. OpenFlow is capable of operating at both ends of the state distribution model (full and eventual consistency). Hardware constraints and use cases should determine the use.

Deployment Considerations

There is a laundry list of what to look out for. Well over a book’s worth, ironically 🙂 A couple of basics for the simple framework above are the following:

  • Controller placement isn’t that big of a deal since all flows are pushed proactively. There may or may not be a learning curve for networking folks running a server.
  • Failure scenarios should be tested. The nice thing about proactive flows is that rules stay put until a flowmod is sent to remove them or the network element reboots.
  • Vendor OF firmware is incredibly fragmented (to be expected), which makes it hard to understand. Your best bet is to learn Open vSwitch, since many have it under the hood, or at least similar datapaths in the form of ships-in-the-night bridges (br0, br1 etc).
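
On the failure scenario point, the reason proactive rules stay put is that they are typically installed with idle and hard timeouts of zero, which in OpenFlow means the entry never expires on its own. The sketch below models that expiry logic; the `Flow` class and the flow names are made up for the example, and a real test should still pull the controller link and watch the switch's actual flow table.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    name: str
    idle_timeout: int = 0     # seconds without a hit before expiry; 0 = never
    hard_timeout: int = 0     # seconds since install before expiry; 0 = never
    installed_at: float = 0.0
    last_hit: float = 0.0


def expired(flow: Flow, now: float) -> bool:
    """True when either non-zero timeout has elapsed."""
    if flow.hard_timeout and now - flow.installed_at >= flow.hard_timeout:
        return True
    if flow.idle_timeout and now - flow.last_hit >= flow.idle_timeout:
        return True
    return False


def surviving(flows: list, now: float) -> list:
    """Names of the flows still in the table at time `now`."""
    return [f.name for f in flows if not expired(f, now)]


flows = [
    Flow("proactive-normal"),                  # both timeouts 0: permanent
    Flow("reactive-learned", idle_timeout=10), # ages out when idle
]

early = surviving(flows, now=5)
later = surviving(flows, now=3600)   # e.g. an hour with the controller unreachable
```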

While this is not a purist look at OpenFlow usage, it is what I see as a common sense, low risk approach to establishing a framework. It also allows for existing infrastructure investment protection that many of us need, IF your vendor writes the firmware for it. This leaves a network or some edges of your network with capabilities such as:

  • Manipulate the forwarding tables to install flows to solve some problems.
  • Add simple disaggregated services to your network (TAP, FW, NAC, IDS etc.)
  • Establish a testbed for you to develop or deploy some control or policy applications.

Many of those problems do not require hundreds of thousands of flow rules but rather dozens of flows. This isn’t advocating that everyone should go do this. It is hacking and developing at this point until the majority of vendors retrain and sell the products. Reference architectures will be few and far between; since industry verticals have diverse needs, so will the solutions (cue the progressive integrators). We are close to seeing vendor supported applications come to market outside of the data center, but to leverage those, the network will need to become a platform rather than the rigid model it is today. Take a look at the open source community network centric projects out there.

Here are a couple of videos from a series of quick and dirty labs I put together. This is a simple example of adding services to the disaggregated edge of the network without needing much in the way of hardware capabilities or compute resources.

You can try this lab yourself with the following downloads:

Part 1 is a quick overview of the concepts

Part 2 is a simple lab using OpenDaylight and Mininet

Get Started!: Check out this post for instructions: Installing Mininet, OpenDaylight and Open vSwitch

Feel free to ping me or, better, jump on #opendaylight, #openvswitch or #openflow and interact with peers and even contribute. Thanks for stopping by!

About the Author

Brent Salisbury

I have over 15 years of experience wearing various hats: network engineer, architect, devops and software engineer. I currently have the pleasure of working at the company that develops my favorite software I have ever used, Docker. My comments here are my personal thoughts and opinions.

  1. venkat 07-30-2013

    I apologize if this is a naive question but in the example above, if port A1 is STP blocked or its peer port is STP blocked, wouldn’t the device drop the packet instead of obeying the OF forwarding rule given that it’s a hybrid switch?
    So does the controller have to stitch a path along the STP path?

  2. Brent Salisbury 08-01-2013

    Hi Venkat. That is a wonderful point. You are absolutely correct, STP is about a decade too old. Ports should be reported as STP blocking, but in most cases OF switch firmware reports them as down ports, since firmware tends not to properly report 802.1D etc. blocking yet. Alternatively, a controller may not process the STP field in port-status messages properly.

    Net is STP blocking links are bad. The current solution (other than TRILL) for non-blocking fabrics is LAG/MLAG bundling. The problem there is that if you want to custom balance traffic based on a flow’s application classification, (M)LAG bundles appear as a single logical port to the controller, since the traffic is hashed independently in the PHY.

    The best solution I have come up with is to run combinations of VIDs on 802.1Q interconnects and then do VID translations for OF traffic. By rewriting, you can balance blocking links for native traffic and not lose links to blocking. Theoretically there may be workarounds in the spec, but I haven’t seen any implemented. Hybrid OF is tricky early on, but it is necessary to find a roadmap if it is to be adopted.

    Thanks for bringing that up. It’s a great topic. Jump on #opendaylight if you want to discuss it further.


  3. Kanat 09-08-2013

    hmmm… I’d like to see SDN as a device that would finally give netadmins a comprehensive unified LAN/WLAN (wireless is particularly important since it will push the access switch into oblivion in the next 5 years or so) management solution without vendor lock-in.
    And I mean full-cycle – discovery, config, monitoring, troubleshooting and audit.
    I think today that’s the #1 problem – jumping between 2-3 GUIs, then a couple of CLIs. We’re running the networks with the same tools as we had 10 years ago – ping, traceroute, SNMP/Netflow (although those last 2 are kind of iffy in campus).
    Just keeping the lights on, especially if you’ve got more than one vendor in each food group (R&S, Wireless, Sec), is painful. Yeah, we’re used to it. But that don’t make it right. Not any more at least.
    OpenFlow fits well for this task in terms of data plane management. But what about the config/control part?

  4. Chad 10-18-2013

    When I initially left a comment I seem to have clicked the -Notify me when new comments are added- checkbox and from now on whenever a comment is added I receive four emails with the exact same comment. There has to be a means you can remove me from that service?

  5. Brent Salisbury 10-20-2013

    Hi Chad, I know on other websites that I leave comments on I can manage those here

    Maybe take a peek there and see if it shows up. Not really sure otherwise. I will poke around when I get a few minutes this week.