TLDR: To start, this is not going to be one of those traditional “Containers vs VMs” conversations. This series is more of a first-hand account of the evolution of application infrastructures from the operations team’s point of view: from physical servers all the way to the rise of Kubernetes.
The rise in popularity of Kubernetes is a tale of overcoming the operational complexities of scaling application infrastructure to support the growing demand for applications and services.
I like to think of it as a story of abstraction, in which we have added flexibility and scalability by subtracting dependencies that slowed operations. We still have not removed all the complexities. Hell, you could easily argue things got more complex during this evolution, but this progression has driven results that have changed the way technology impacts the world we live in today.
Let’s dive deeper into what this means by taking you through my accounts of moving from manually configuring servers to managing at-scale DevOps operations.
Part 2: Virtual Machines
To ease the burdens of physical servers, virtual machines abstracted the OS and its processes away from the underlying hardware. This removed many of the operational complexities mentioned in the previous article.
First, VMs solved the setup problem. You could snapshot a configured machine (or define its image declaratively), store the image, and later provision it as many times as needed to guarantee a consistent initial state. Provisioning still took several minutes, but that was nothing compared to the hours or days needed in the pre-VM world.
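To make that concrete, here is a minimal sketch of “provision the same image many times” using a cloud SDK. The image ID, region, instance type, and counts are all hypothetical placeholders, and the same idea applies to any provider.

```python
# Minimal sketch (hypothetical IDs and sizes): launch several identical VMs
# from one stored image so every instance starts from the same known state.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # the image captured from the snapshot (hypothetical ID)
    InstanceType="t3.medium",
    MinCount=3,                       # three copies, all starting from the same initial state
    MaxCount=3,
)

for instance in response["Instances"]:
    print(instance["InstanceId"], instance["State"]["Name"])
```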
VMs also greatly improved your ability to scale. They allowed you to maximize the capacity of your physical infrastructure or, with a few clicks, grow your footprint by buying VMs of any size from cloud providers (no more trips to remote data centers for me!). The rise of configuration automation tools like Chef, Puppet, and later Ansible made initial setup and mass deployment management much easier.
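As a rough illustration of what those tools changed, here is a hedged sketch (the IPs, inventory file, and playbook name are made up) of applying one Ansible playbook to a batch of freshly provisioned VMs instead of configuring each one by hand.

```python
# Minimal sketch (hypothetical file names and IPs): converge a fleet of new VMs
# to the same configuration with one playbook run instead of per-box setup.
import subprocess

new_vm_ips = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]  # e.g. returned by the provisioning step

# Write a throwaway inventory listing the new hosts under one group.
with open("web_servers.ini", "w") as inventory:
    inventory.write("[web]\n")
    inventory.write("\n".join(new_vm_ips) + "\n")

# ansible-playbook applies the tasks in site.yml (hypothetical) to every host
# in the inventory, so initial setup is a single command for the whole fleet.
subprocess.run(
    ["ansible-playbook", "-i", "web_servers.ini", "site.yml"],
    check=True,
)
```

The point is not the specific tool; it is that the desired configuration lives in files that can be applied to one machine or a thousand.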
However, we didn’t truly solve the dependency problem. Each VM still required a full-blown operating system running on top of a hypervisor, and you would still run multiple processes per VM (we always need a monitoring agent...). The result was the same issue we had on physical servers: processes relying on different versions of the same library would suddenly stop working after a seemingly unrelated library upgrade.
These dependencies were slowing teams down and paved the way for the next evolutionary stage in application infrastructure: