Service Mesh for Mere Mortals: The history of service mesh and how it fits in vs. Kubernetes
We are thrilled to launch our inaugural book from Mirantis Press, Service Mesh for Mere Mortals, by Bruce Basil Mathews. Below we've shared the first three chapters of the book, where you can learn about the history of service mesh, and how service mesh fits in versus container orchestrators like Kubernetes. Download the full 100+ page eBook for free.
Introduction
I am writing this book to help those who have already made the leap from monolithic application development targeted at mainframes to virtualized, three-tier, SOA-oriented software development, and who have now begun investigating microservices architecture for application development.
Many of you who have reached this plateau started out using technologies such as Docker for containerization and have already found the need for some type of automated orchestration using engines such as Mesos or Kubernetes for this purpose.
As you may or may not yet have discovered, although the orchestration engines do a very good job of coordinating and maintaining a specific number of instances of a microservice, scaling them up and down as needed, they are NOT very good at maintaining a Zero Trust security model. Nor can they fully participate in Role-Based Access Control, strict network security, or the "standardization" of platform services such as service discovery, load balancing, and application-level encryption without the addition of significant external services and modifications to their internal resource definitions.
This is where the need for a Service Mesh add-on comes into play. Service Meshes provide Policy Engines and other features specifically targeted at filling these voids.
This book provides a deeper understanding of the available Service Meshes and their features and benefits, but most of all, it gives you the experience of actually using a Service Mesh so you can gain enough of an understanding of how they work to take on your own projects.
Since there are clear indications in the industry that specific types of Service Meshes are being used more commonly than others, I have made the deliberate choice to provide more detailed information about Service Meshes employing a Sidecar Architecture. The most popular one of the Sidecar oriented Service Meshes today is Istio, so we'll be using that for our examples. The concepts, however, are the same for all of the Sidecar Service Meshes.
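As a small taste of what is to come: with Istio, enabling the Sidecar Architecture can be as simple as labeling a namespace so that Istio's admission webhook automatically injects an Envoy proxy container into every pod created there. The namespace name below is just an illustrative placeholder.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: bookstore               # hypothetical application namespace
  labels:
    istio-injection: enabled    # tells Istio to auto-inject the Envoy sidecar
```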
The Basics: How Did We Get Here?
Before discussing Service Meshes, let's lay a little foundation and provide a bit of context. We will be supplying the needed foundation in the first few chapters, but never fear! If you are a bit more advanced on the subject, we'll be getting into some really hairy specifics and some things that even most knowledgeable readers may not be aware of in the later chapters. (But in an understandable way, I promise.) So please bear with me as we get warmed up!
Let's begin up front with something even many techies may not realize: the service mesh has its beginnings in the development of the three-tiered architecture model for applications, which was used to develop and deploy the large majority of web-targeted applications.
At the time, the "mesh" was fairly simplistic; access to the "web tier" initiated a call to the "app tier," which in turn passed a call to the "database tier." The whole thing was tailored to provide sufficient performance back to the "web tier," where presumably an end user was awaiting a response. As the internet grew and applications became more and more complex, however, the three-tiered application model began to break down under heavier and heavier load.
This led to the advent of microservices that decomposed the monolithic web, app, and data tiers into multiple discrete components that operated independently.
At the same time, the dedicated networks between tiers became more and more generalized, and the concept of "east-west" traffic between all microservices (as opposed to the "north-south" traffic between the application and external consumers) was adopted.
Companies that relied on fast internet and network throughput to provide their services, such as Netflix, Google, Facebook, and Twitter, initially followed the standard coding practices developed for C and Java, creating vast libraries of functions and platform services, such as load balancing, telemetry, and circuit breaker patterns, to "standardize" these operations for developers across all of their microservices.
Voila! The birth of the modern Service Mesh (still in its infancy and needing its diapers changed) was upon us.
As these libraries of functions and operations became more widely used, cracks in the armor quickly started showing up. Here are a few examples:
- Suppose one of these companies owning a library (say, Twitter and Finagle) decides to add a feature, or to change the behavior of an existing feature. First, every instance where that library is in use by any developer needs to be recompiled. Upon update, the developer's code using the library may, and probably will, break. So, the whole thing has to be coordinated across a massive swath of the internet for each change to occur.
- Say a feature is becoming "deprecated" and it will cease to be included in the library as it is superseded by a newer feature or capability. The same type of difficult coordination is involved.
- Since there were and are no "standards" for the contents, protocols, or access points to these libraries, if you want to move an application from Twitter to Google, it must be rewritten.
The current cloud-native workaround for the use of libraries is the use of proxies, which take in requests, return responses, and provide a layer of abstraction in front of the actual library. A proxy-based architecture allows for updates and changes that do not require recompilation or extreme coordination to implement. Implementing the service mesh in proxies places decisions about the platform functions and features used in applications in the hands of the platform operations and engineering teams, leaving developers to focus on the application's logic.
So, at this point, we had decomposed monolithic applications into manageable individual services. Some were implemented as REST URLs, which didn't allow for individual scalability. To solve that, many were implemented as virtual machines, but that wasted the space and maintenance overhead of a full operating system for very little benefit.
Containerization solved this problem, but didn't address the fundamental aspects of automated scaling up and down, network security, and platform service standardizations that would need to follow.
In addition, distributing these containers across multiple servers compounded the problems exponentially, requiring some type of global, reusable solution.
Orchestration engines help with the scaling and operational issues of maintaining application uptime, but platform standardization, automated service discovery, load balancing, encryption, and so on were still left in the hands of the application developer to resolve.
Why Didn't We Leave This to the Container Orchestrator?
To understand why the developers of container orchestration engines such as Mesos, OpenShift, and Kubernetes did not simply embed all of the features and functions needed to manage and maintain north-south and east-west network traffic in a cluster of containers, you first need some of the background behind it: what a container "orchestrator" is, and what it is and is NOT designed to do.
Container orchestration automates the scheduling, deployment, networking, scaling, health monitoring, and management of containers. Deploying and scaling containers up and down across an enterprise can be challenging without some type of automation to perform load-balancing and resource allocation, all while maintaining strict security.
Container orchestration automates many of the more mundane tasks related to these operations in a predictably repeatable way. This makes the processes far more efficient, as they no longer involve human intervention to execute.
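For readers who have not seen one, here is a minimal sketch of the kind of declarative specification an orchestrator automates. Given this hypothetical Kubernetes Deployment, the orchestrator keeps three replicas running, rescheduling containers whenever pods or nodes fail; the name and image below are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: catalog
spec:
  replicas: 3                   # the orchestrator maintains exactly this count
  selector:
    matchLabels:
      app: catalog
  template:
    metadata:
      labels:
        app: catalog
    spec:
      containers:
      - name: catalog
        image: example.com/catalog:1.0   # hypothetical image
        ports:
        - containerPort: 8080
```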
Although the automation parts of the orchestration engine make things easier, they also make security far more difficult to manage properly. For example, many organizations running container orchestration technologies assign full cluster administrator privileges to their users for day-to-day operational requirements that cannot be automated. But multiple applications are hosted across the same clusters of physical servers, so these administrators have access to applications other than their own.
This architecture violates the entire concept of "least privilege." The problem can be mitigated by implementing a Role-Based Access Control (RBAC) management scheme within the clusters, and Service Meshes provide policy engines that make such a scheme easier to achieve.
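To give a flavor of such a policy engine, here is a minimal sketch of an Istio AuthorizationPolicy that allows only one service identity to call a workload, implicitly denying everyone else. The namespace, labels, and service account names are hypothetical.

```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: payments-allow-frontend
  namespace: prod                  # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: payments                # applies only to the "payments" workload
  action: ALLOW
  rules:
  - from:
    - source:
        # only the frontend's service-account identity may call payments
        principals: ["cluster.local/ns/prod/sa/frontend"]
```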
The role of kube-proxy and the ingress controller
Orchestration engines typically route the network traffic between individual nodes over a virtual overlay network, which is usually managed by the orchestrator and is often opaque to existing network security and management tools. The routing rules governing north-south and east-west traffic are maintained by a container hosted on each node of the cluster, called kube-proxy, and by containers hosted in the control plane, called the Kubernetes ingress controller.
The kube-proxy container proxies the network connections between nodes required by the application and maintains network rules on each of the nodes.
External connections to pods are handled by an edge proxy known as an ingress controller, because it is configured and maintained using Ingress resources within the Kubernetes framework. These network rules, which may involve external services such as load balancers, allow network communication to your pods from network sessions inside or outside of your cluster.
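As a rough illustration, here is what a minimal Ingress resource might look like; the ingress controller watches resources like this and translates them into edge-proxy routing rules. The hostname and backend service name are hypothetical placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
  - host: shop.example.com         # hypothetical external hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web              # hypothetical backend Service
            port:
              number: 80
```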
All of this traffic is automated and "hidden" from external network operations.
Another complication of the kube-proxy/ingress approach implemented within the Kubernetes framework is that the various methods provided to accomplish the same end, such as exposing a service endpoint externally from within the cluster, are all implemented very differently.
For example:
- The Kubernetes NodePort implementation exposes a Service endpoint on a single static port on every node.
- The default load balancing mechanism within the Kubernetes framework consumes an IP address for each Service, and requires a separate LoadBalancer instance for each IP address.
- The ingress methodology is implemented strictly using the Kubernetes API framework, so it cannot cover every case needed to integrate with external access points; some external access points, such as F5 load balancers, are not recorded as part of the Kubernetes framework itself.
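To make the contrast concrete, here is a minimal sketch of the first two exposure methods above, assuming a hypothetical "web" application; the names and ports are illustrative placeholders.

```yaml
# NodePort: exposes the Service on a static port (here 30080) on every node.
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web                     # matches the pods of a hypothetical web app
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080
---
# LoadBalancer: asks the platform for a dedicated external IP for this Service.
apiVersion: v1
kind: Service
metadata:
  name: web-lb
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
```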
Where the Service Mesh fits in
Service Meshes help standardize all of these needs within a central, flexible methodology and service that relies more on declarative application programming than on imperative application programming. In other words, it enables the developer to focus on the "what" and the "why" rather than the "how." Service Meshes also typically provide entry points for introducing outside probes for network management and telemetry gathering purposes, under strict security control.
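As one sketch of this declarative style, an Istio VirtualService might split traffic between two versions of a service like this. It assumes a DestinationRule elsewhere defines the v1 and v2 subsets, and the service name is a placeholder.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews                      # hypothetical in-mesh service
  http:
  - route:
    - destination:
        host: reviews
        subset: v1               # defined in a separate DestinationRule
      weight: 90                 # the "what": 90% of traffic to v1...
    - destination:
        host: reviews
        subset: v2
      weight: 10                 # ...10% to v2; the mesh proxies handle the "how"
```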
Container orchestrators strive to maintain the scale and density of workloads. This means that by default, they can place workloads of differing sensitivity levels on the same host without some type of intervention in the process. Service Meshes can incorporate node labels within policies that enable workloads to be pinned to specific nodes, allowing for isolated deployments, as sketched below. (Kubernetes can achieve a similar effect by creating taints, tolerations, and admission controllers, but because they are defined separately, and because of other limitations, they're not a complete solution.)
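For comparison, here is a minimal sketch of pinning in plain Kubernetes: a nodeSelector ties a pod to nodes carrying a given label. The label key/value and image are hypothetical; a mesh policy engine can express the same intent alongside its other security policies.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: payments
spec:
  nodeSelector:
    sensitivity: high            # hypothetical label on isolated, dedicated nodes
  containers:
  - name: payments
    image: example.com/payments:1.0   # hypothetical image
```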
As you can see, Service Meshes fill several holes that the container orchestrator could not manage without significant rigidity and overhead being introduced into their operation, and they do it in a consistent and isolated way. This is why the use of a Service Mesh is becoming so popular among application developers.
There are currently many proxied Service Mesh options to choose from that fill the need quite handily. We will be going over them in the next few chapters. Keep reading!
This is an excerpt from the book, Service Mesh for Mere Mortals, by Bruce Basil Mathews. To continue reading, download the free 100+ page eBook here.