Public cloud was a great idea, but it's time to stop the bleeding
I remember how I felt when I first heard about Amazon Web Services. This massive company was going to rent out spare server space? To anybody? I didn't have to shell out thousands of dollars, buy a huge server for my budding business, and then pay again to connect it to the internet? Yes, please!
I mean, come on, it was a no-brainer. I was just a developer back then, building a metaverse-centric chat application. What did I know about complex internet networking, hardware security protocols, and all of that "other stuff" that I relied on operators for? Not much -- or at least not enough that I wanted to do it myself, even if I had had the time -- which I didn't.
So I made the decision that so many companies of all sizes have made since 2006: I fired up an EC2 instance and I put my application on it.
Honestly, it seemed miraculous. So many things I didn't have to think about anymore: server maintenance, making sure my internet stays up, making sure my internet bandwidth is sufficient for the app, protecting the server from attack (other than via the OS or my application, which are, obviously, still my responsibility). I could even go ahead and use other AWS services to play with new technologies, such as Elastic MapReduce back in the day to process large datasets, or today's Interactive Video Service to broadcast a weekly retro movie night to a global audience.
But there's a dark side to all this, and like me, many companies are starting to realize it. While there are certainly benefits to being able to access these kinds of resources on demand, there are disadvantages to relying on them -- for big companies even more than for small ones.
The increasing costs of public cloud
For me to look at my budget and realize I'm spending $250 a month on AWS is painful, but it wouldn't even be noticed at a typical enterprise. Unfortunately, a typical enterprise that's reliant on public cloud might be spending thousands, hundreds of thousands, or even, as CNBC reported in 2019 of Apple, a million dollars a day. A day. Thirty million a month, a third of a BILLION dollars a year.
It sneaks up on you, getting worse the more success (read: happy users) you have. And those users can be end users, or just developers who make use of more and more services at your public cloud provider.
Those costs can also be tricky to understand, let alone contain. A simple application might use dozens of different services, each with its own cost. Those services can also be resistant to easy cleanup, leaving "zombie" services up and running (and continuing to bill) long after the developers using them have moved on to something else, even if those developers have been diligent about cleaning up. (Ask me how I know this.)
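To make that concrete, here's a minimal sketch of the kind of bookkeeping this takes: total up what each service costs, then flag anything nobody has touched in a while. Everything in it is an assumption for illustration; the line items, resource names, dates, and the 30-day idle threshold are made up, and a real version would pull this data from your provider's billing export rather than a hard-coded list.

```python
from collections import defaultdict
from datetime import date

# Hypothetical billing line items: (service, resource_id, monthly_cost_usd, last_used).
# These records are invented for illustration only.
LINE_ITEMS = [
    ("compute", "vm-web-1", 120.00, date(2024, 5, 30)),
    ("compute", "vm-demo-old", 45.00, date(2024, 1, 12)),
    ("storage", "bucket-logs", 18.50, date(2024, 5, 29)),
    ("streaming", "channel-test", 66.50, date(2024, 2, 3)),
]


def cost_by_service(items):
    """Sum the monthly cost of every resource, grouped by service."""
    totals = defaultdict(float)
    for service, _resource_id, cost, _last_used in items:
        totals[service] += cost
    return dict(totals)


def find_zombies(items, today, max_idle_days=30):
    """Return IDs of resources that have been idle longer than max_idle_days."""
    return [
        resource_id
        for _service, resource_id, _cost, last_used in items
        if (today - last_used).days > max_idle_days
    ]


if __name__ == "__main__":
    today = date(2024, 6, 1)  # fixed date so the example is repeatable
    print(cost_by_service(LINE_ITEMS))
    print(find_zombies(LINE_ITEMS, today))
```

Even in this toy version, two of the four resources are still billing months after anyone last used them, which is exactly how a small bill quietly turns into a bigger one.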
And then there are the less direct costs. You'd think that by using public cloud you can avoid hiring cloud operations people of your own, but just the opposite is true. For every public cloud's API you're using extensively, you'll need skilled personnel to manage it and ensure that your applications and operations are working properly, and those people are not cheap. And if you're using more than one public cloud, you'll need people skilled in all of those APIs.
Public cloud and vendor lock-in
It's the APIs that get you in other ways, too. Sure, it's simple for me to pop open the web UI every Saturday night and turn on my streaming instances to get ready for the show. But if you're a large company that has dozens or hundreds or thousands of machines over multiple services and perhaps tens of thousands of users who have to be authenticated... you are not doing things that way.
No, instead, your cloud operators are using the APIs these services provide to script your operations. It's the Right Thing To Do. Unfortunately, it also means that all of your operations are targeted at one specific provider, and if you ever want to move, either because another provider has better prices or because you simply aren't happy with the service you're getting, you're going to have to rebuild all of your automation in order to make that happen. (And that also involves having your people learn the new APIs, or hiring new people.)
Other reasons public cloud can be an issue
There are other problems with putting all of your eggs in the public cloud basket, and unfortunately, most of them are rooted in the typical structure of the solution: specifically, sharing a server with strangers.
The first that comes to mind is performance, or what we were already calling the "noisy neighbor problem" 10 years ago. Basically, it means that if another of the provider's customers on the same server is doing something resource intensive, it's your application that is going to suffer.
This can be made even worse by the practice of "oversubscription," or packing more onto a server than the physical server can actually accommodate. For example, a server with 512GB of physical RAM and 32 physical processors might have virtual machines assigned to it representing 2TB of vRAM and 128 virtual processors, with the assumption that not all of those machines will be accessing those resources at the same time. As public cloud customers, we don't have any control over those oversubscription rates, unless we pay (considerably) more for dedicated servers.
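The arithmetic behind that example is simple enough to sketch. The snippet below is just an illustration using the numbers from the paragraph above; the `Host` class and `oversubscription_ratios` function are names I've made up, and I'm assuming 2TB means 2048GB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Physical capacity of a server and the virtual resources assigned to it."""
    physical_ram_gb: int
    physical_cpus: int
    allocated_vram_gb: int
    allocated_vcpus: int


def oversubscription_ratios(host):
    """Return (RAM ratio, CPU ratio): virtual resources promised per physical unit."""
    ram_ratio = host.allocated_vram_gb / host.physical_ram_gb
    cpu_ratio = host.allocated_vcpus / host.physical_cpus
    return ram_ratio, cpu_ratio


if __name__ == "__main__":
    # 512GB RAM and 32 processors, with 2TB (2048GB) of vRAM and 128 vCPUs assigned
    host = Host(physical_ram_gb=512, physical_cpus=32,
                allocated_vram_gb=2048, allocated_vcpus=128)
    ram_ratio, cpu_ratio = oversubscription_ratios(host)
    print(f"RAM {ram_ratio:.0f}:1, CPU {cpu_ratio:.0f}:1")  # prints "RAM 4:1, CPU 4:1"
```

In other words, the server in that example has promised four times as much memory and processing power as it physically has, and everything is fine only as long as the tenants don't all show up at once.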
Another aspect of sharing resources like that is the issue of security. Technically speaking, individual virtual machines, networks, and other resources should not be able to access each other's data, but, well ... I think we all know that security is never iron-clad. In addition, a vulnerability in one application or VM that gives an attacker access to the underlying resources can put all your data and applications using those resources at risk.
Plus, as my colleague John Jainschigg points out, "The biggest security risk used to be S3 buckets open to the internet, and the complexity of public cloud permissioning schemes and so on are also complex, and therefore problematic. You gain a lot of security by just having things on a private datacenter."
Even if you discount security, however, there's another issue you might want to take into consideration: compliance. If you're in an industry that regulates how and where you handle your data, you are going to want to think about whether a public cloud provider is going to be appropriate. Are you required to keep your data on premise? To keep it within a particular geographic location? Maybe you need to ensure that you keep control of your data both in transit and at rest. Compliance varies by industry, so make sure you know your regulations.
So if you're thinking "well, maybe it IS time I started to move away from public cloud," what can you do instead?
What to do instead
First off, it's important to realize that just because public cloud isn't an ideal solution doesn't mean you need to stay away from it completely; it just means that you want to ensure that if you do use it, you do it in a way that doesn't trap you in an unhealthy relationship.
The first thing to do is investigate whether you'd be better off using a private cloud solution. For example, my personal app uses 4 servers, they're only turned on sporadically, and I manage them myself. To move to a solution like on-prem OpenStack would be ridiculous for me; I'm much better off staying where I am. On the other hand, if you're an enterprise with 500 VMs across two datacenters that expects 20 percent growth over the next 4 years, you can save more than $1 million by moving to Mirantis OpenStack on Kubernetes (MOSK) on-premise. And of course the more resources you're using, the more you'd save.
That doesn't just go for VMs; many companies are using public cloud to host other solutions such as Kubernetes clusters, and there are similar savings to be had by moving to an on-premise solution such as Mirantis Kubernetes Engine (MKE).
And again, none of this says that you have to quit using public clouds altogether to escape from these problems. For example, public cloud can still be used to augment existing capacity, or for short-term projects. You might use it to experiment before bringing the final production version in-house. The key here is to take advantage of the fact that many on-premise solutions can be deployed onto these clouds without shackling yourself to their APIs. For example, rather than deploying Kubernetes using Amazon's EKS, and thereby locking yourself into their way of doing things, you can deploy a solution such as MKE on EC2 instances, then manage them as usual with Lens. Then when you're ready to bring the project back in-house, the only thing that changes is the location of the cluster. Your operations people don't need to learn a whole new set of APIs just to get the cluster up and running.
Ah, the operations people. What if you're on public cloud so that they don't have to deal with the cloud itself? Well, moving to private cloud doesn't necessarily mean saddling your operations people with extra work that was formerly done by the magical elves behind the public cloud. Instead, you can avail yourself of ZeroOps for On-Prem, and let our global bench of experts work with you to create the perfect solution, then help support it so those operations people can continue to do the important work of supporting the business that they're doing now (except without all the aggravation of dealing with the public cloud).
Conclusion
I wasn't being disingenuous when I talked about how excited I was when I first heard about public cloud; it really was a great thing. And in some ways, it still is. But for too many years companies have been tying themselves into situations that are hard to get out of. Now they're looking at the cost (both literal and figurative), wondering if they made a mistake, and looking into solutions such as Mirantis OpenStack on Kubernetes and Mirantis Kubernetes Engine.
No, you didn't make a mistake going to public cloud. But now it's time to come home. Schedule a chat with a Mirantis expert today and we’ll help you find a way to stop the bleeding.