For past 5-6 years I’ve been in business of deploying cloud solutions for our customers. Vast majority of that was some form of OpenStack, either a simple cloud or a complicated one. But when you think about it – what is a simple cloud? It’s easy to say that small amount of machines makes an easy, and large amount of machines makes a complicated cloud. But, that is not true. Complexity of a typical IaaS solution is pretty much determined by network complexity. Network, in all shapes and forms, from the underlay network to the customer’s overlay network requirements. I’ll try to explain how we deal with the underlay part in this blog.
It’s not a secret that a traditional tree like network architecture just doesn’t work for cloud environments. There are multiple reasons why; it doesn’t scale very well, it requires big OSI layer 2 domains and… well, it’s based on OSI layer 2. Debugging issues on that level is never a joyful experience. Therefore, for IaaS environments one really wants to do a modern design in a form of a spine-leaf architecture. Layer 3 spine-leaf architecture. This allows us to have bunch of smaller layer 2 domains, which then nicely correlate to availability zones, power zones, etc. However, managing environments with multiple layer 2 and therefore even more layer 3 domains requires a bit of rethinking. If we truly want to be effective in deploying and operating a cloud across multiple different layer 2 domains we need to think of the network in a bit more abstract mode. Luckily, this is nothing new.
In traditional approach to network, we talk about TORs, management fabric, BMC/OOB fabric, etc. These are most of the time layer 2 concepts. Fabric, after all, is a collection of switches. But the approach is correct; we should always talk about networks in abstract terms. Instead of talking about subnets and VLANs, we should talk about purpose of the network. This becomes important when we talk about spine-leaf architecture and multiple different subnets that serve the same purpose. In rack 1, subnet 172.16.1.0/24 is management network, but in rack 2, management network is on subnet 192.168.1.0/24, and so on. It’s obvious that it’s much nicer to abstract those subnets into a ‘management network’. Still, nothing new. We do this every day.
So… Why do our tools and applications still require us to use VLANs, subnets and IPs? If we deploy same application across different racks, why do we have to keep separate configurations for each of the units of the same application? What we really want is to have all of our Keystones listening on OpenStack Public API network, and not on subnets 192.168.10.0/24, 192.168.20.0/24 and 192.168.30.0/24. We end up thinking about application on a network, but we configure differently exact copies of the same application (units) on different subnets. Clearly our configuration tools are not doing what we want, but rather forcing us to transform our way of thinking into what those tools need. It’s a paradox that OpenStack is not that complicated, rather it’s made complicated by the tools used to deploy it.
While trying to solve this problem in our deployments at Canonical, we came up with concept of spaces. A space would be this abstracted network that we have in our heads, but somehow fail to put into our tools. Again, spaces are not a revolutionary concept in networking, they have been in our heads all this time. So, how do we implement spaces at Canonical?
We have grown concept of spaces across all of our tooling; MAAS, juju and charms. When we configure MAAS to manage our bare metal machines, we do not define networks as subnets or VLANs, we rather define networks as spaces. A space has a purpose, description and few other attributes. VLANs, and indirectly subnets too, become properties of the space, instead of other way around. This also means that when we deploy a machine, we deploy it connected to a space. When we deploy a machine, we usually do not deploy it on a specific network, but rather with specific requirements; must be able to talk to X, must have Y CPU and Z RAM. If you ever asked yourself why does it take so much time to rack and stack a server, it’s because of this disconnect of what we want and how we handle the configuration.
We’ve also enabled Juju to make this kind of requests – it asks MAAS for machines that is connected to a space, or set of spaces. It then exposes this spaces to charms, so that each charm knows what kind of networks this application has on its disposal. This allows us to do ‘juju deploy keystone –bind public=public-space -n3’; deploy three keystones, connect them to a public-space, a space defined in MAAS. What VLAN will that be, which subnet or an IP, we do not care; the charm will get information from Juju about these “low level” terms (VLANs, IPs). We humans do not think of VLANs and subnets and IPs; at best we think in OSI layer 1 terms.
Sounds a bit complicated? Let’s flip it the other way around. What I can do now is define my application as “3 units of keystone, which use internal network for SQL, public network for exposing API, internal network for OpenStack’s internal communication and is also exposed on OAM network for management purposes” and this is precisely how we deploy OpenStack. In fact, the Juju bundle looks like this:
Those who follow OpenStack development will notice that something similar has landed in OpenStack recently; routed provider networks. It’s the same concept, solving the same problem. It’s nice to see how Juju uses this out of the box.
Big thanks to MAAS, Juju, charms and OpenStack communities for doing this. It allowed us to deploy complex applications with a breeze, and therefore shifted our focus to bigger picture, IaaS modeling and some other, new challenges!