The Network Iceberg
The impact of compute virtualization on the data center network has been profound: there are now more virtual ports than physical ports. That simple measurement overshadows every other disruption in networking, and we arrived at it through the widespread adoption of OS virtualization.
The next revision of compute virtualization is the application. The last significant bastion of efficiency is to remove the OS from the equation and virtualize the application itself. At the helm of this compute disruption is Docker. Depending on a provider's oversubscription, a typical VM-to-CPU ratio is in the range of 20-50 VMs per physical core. The consolidation and eventual removal of the guest OS will significantly change that subscription ratio. The condensation of the application-to-CPU ratio, and the demand it will impose on the network, is pretty damn exciting. There is of course some debate around security, the shared kernel, etc., but we had the same conversations in the infancy of OS virtualization. Atomic and CoreOS are both cool projects; if you haven't seen them before, take a look. These lightweight, purpose-built OSes are pretty awesome.
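To put rough numbers on that density shift, here is a back-of-envelope sketch in Go. The host core count and the container density multiplier are illustrative assumptions for discussion, not measurements:

```go
package main

import "fmt"

// virtualPorts estimates edge port counts per host. The inputs are
// assumptions: cores per host, VMs packed per core, and a hypothetical
// multiplier for how much denser containers pack than VMs.
func virtualPorts(cores, vmsPerCore, containerMultiplier int) (vmPorts, containerPorts int) {
	vmPorts = cores * vmsPerCore
	containerPorts = vmPorts * containerMultiplier
	return vmPorts, containerPorts
}

func main() {
	// A 16-core host at 20 VMs/core, with containers assumed 10x denser.
	vm, ct := virtualPorts(16, 20, 10)
	fmt.Printf("VM edge ports per host: %d, container edge ports: %d\n", vm, ct)
}
```

Even at the conservative end of the ranges above, the edge goes from hundreds of virtual ports per host to thousands, which is the density explosion the rest of this post is concerned with.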
The Docker Impact on Networking Density
What this means for networking is a hyper-dense environment, embedded deeply inside the host operating system. Not only is virtual port density going to explode, but the traditional middlebox services are being disaggregated as well.
If they haven't already, network ops and architects need to evaluate the impact of the new soft edge. The 2-tier spine/leaf architecture was a response to a realignment of bandwidth away from the old 95/5 assumption (95% of traffic north/south, leaving the data center, and only 5% remaining east/west) and the resulting need for bisectional bandwidth. The 3rd tier is back (or never left), and it will require a deep, fundamental understanding of networking in order to scale reliably. It will be reasonable to expect thousands of IP addresses hanging off each top-of-rack port (PAT has never fit many providers' needs). Uptime is sexy, and it is also the number one measurement of ops' value to most organizations. All that cost savings means much less when we are riding high on two 9's.
There are inherent complexities to networks that aren't going anywhere, but fortunately we do know how to compartmentalize that complexity. Breaking down the silo between the edge virtualization teams and physical network ops is critical moving forward. This is much more significant than the underlay vs. overlay conversations, and much more about getting the right architectures in place: orchestrated provisioning coupled with the operational visibility needed for troubleshooting. There is no one architecture that fits all needs; everything is a trade-off, but reliability is a non-negotiable core value. We have learned (often the hard way) on other projects that significant drift in network state across numerous platforms is not an easy consistency problem to solve.
The early SDN model took us to a bizarre, nascent land of complexity and fragility. The good news is that the effort wasn't wasted; it was a refinement. It is now crystal clear that pieces of networking, both new and old, will be composed to absorb the compute reconstruction that will place enormous demands on the network. My brilliant cohorts at our new venture SocketPlane.io (@botchagalupe, @MadhuVenugopal, @dave_tucker) and I are excited to do our part in the community to develop a networking solution based on the practical lessons of the past and the innovations of the present. Just as important is staying laser-focused on ops' and developers' needs. This is the most exciting part: we have been in the trenches for much of our careers, so there is no speculating about the problems we are solving. We know the problems and have the scars to prove it. So come join the community in this exciting transformation and leave your mark; it's a fantastic time to stand and be counted.
Take a look at one of our proposals, #8951. Its intention was to get the community thinking about the need for basic network building blocks out of the box in Docker. Solomon Hykes and the rest of the Docker team are awesome and really get the open source community, so we are excited to assist in scaling out Docker networking. Long story short, it's building L3 switches with the ability to lay OVS services on top in every edge node. This is the only approach we are comfortable saying, with confidence, won't leave users as sad pandas on fragile infra. No shortcuts, especially when facing at least a 10x-20x increase in density. Provisioning will be measured in fractions of a second; the days of having a VM's entire boot time to get a port provisioned are gone. Workloads could start and finish in the time it takes to boot a VM. To avoid race conditions we are disaggregating everything and letting our old trusty protocols update peer nodes with state changes. It's a long thread but lots of fun, so take a read and leave your thoughts if you are interested in the subject. There is also a survey if anyone is interested in giving direct feedback about networking to all of the studs at Docker here.
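For a concrete taste of what "laying OVS services on top in every edge node" can look like, here is a minimal sketch using standard `ovs-vsctl` commands. The bridge name, port name, and remote IP are placeholders, not anything prescribed by the proposal:

```shell
# Create a per-host OVS bridge for container traffic
# (bridge name is a placeholder).
ovs-vsctl add-br br-containers

# Attach a VXLAN tunnel port toward a peer edge node so container
# traffic can cross hosts (remote_ip is an example value).
ovs-vsctl add-port br-containers vxlan0 \
    -- set interface vxlan0 type=vxlan options:remote_ip=192.168.1.11
```

Each edge node carrying its own bridge and tunnel state is what lets provisioning happen locally, in fractions of a second, rather than waiting on a central device.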
Feel free to email/tweet, etc. We live and breathe collaboration. Community and open source aren't something we say we can "live with if we have to," or some strategic "first hit is free" nonsense. They are at the core of what we do and what we believe the future of infra will be. I also have a series on Go programming as an alternative to Python for sysops/netops that I am excited to wrap up and post soon. Go is an elegant yet practical language that dominates PaaS, and it's a really fun language to use; we are having a blast with it. It blends the niceties of both low-level and high-level interfaces and suits a wide range of experience levels and use cases. Go offers a smooth transition from a high-level, Python-like feel and OO-style struct composition down to lower-level memory management concepts like pointers (backed by garbage collection), with C-like speeds at a fraction of C's compile time for test-driven hackers like me 🙂 Catch you soon, thanks for stopping by, and we'll see you upstream!
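As a quick preview of that Go series, here is a small, self-contained sketch of the two ideas mentioned above: struct composition (embedding, Go's alternative to classical inheritance) and pointer receivers. The type and field names are made up for illustration:

```go
package main

import "fmt"

// Device is a base type; Switch embeds it, picking up its fields
// and methods through composition rather than inheritance.
type Device struct {
	Name string
}

// Rename uses a pointer receiver, so it mutates the Device in place.
func (d *Device) Rename(n string) { d.Name = n }

// Switch is composed of a Device plus its own fields.
type Switch struct {
	Device
	Ports int
}

func main() {
	sw := Switch{Device: Device{Name: "leaf-1"}, Ports: 48}
	sw.Rename("leaf-2") // promoted method from the embedded Device
	fmt.Println(sw.Name, sw.Ports)
}
```

The promoted `Rename` method mutating `sw` through a pointer is exactly the kind of gentle on-ramp from Python-style ergonomics to explicit memory semantics that makes Go a good fit for ops tooling.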
@botchagalupe demoing our tech preview: