Defining a Zero-Downtime Deployment
Learn the definition of a zero-downtime deployment.
We'll cover the following
Updating a single-replica MongoDB cannot demonstrate the true power of Deployments. We need a scalable service. It’s not that MongoDB cannot be scaled (it can), but it is not as straightforward as with an application that was designed to be scalable. We’ll jump to the second application in the stack and create a Deployment with a ReplicaSet that will create Pods based on the go-demo-2 image.
Zero-downtime deployment is a prerequisite for higher frequency releases.
Looking into the Definition#
Let’s take a look at the Deployment definition of the API go-demo-api.
We’ll skip explaining apiVersion, kind, and metadata, since they always follow the same pattern.
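The line references in the walkthrough below point to the definition file used in the course, which is not reproduced in this lesson. As a rough guide, a minimal sketch of such a Deployment might look like the following; the resource name, labels, and image are assumptions, and the line numbers referenced below belong to the original file rather than to this sketch.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-demo-2-api          # name assumed for this sketch
spec:
  replicas: 3                  # same meaning as in the ReplicaSet from the previous chapter
  selector:
    matchLabels:
      type: api
      service: go-demo-2
  minReadySeconds: 1           # wait one second before considering a Pod available
  revisionHistoryLimit: 5      # keep five old ReplicaSets so we can roll back to them
  strategy:
    type: RollingUpdate        # replace Pods gradually, without downtime
  template:                    # the same PodTemplate we used before
    metadata:
      labels:
        type: api
        service: go-demo-2
    spec:
      containers:
      - name: api
        image: vfarcic/go-demo-2   # image name assumed; the tag strategy is discussed below
```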
- Line 5-7: The spec section has a few of the fields we haven’t seen before, and a few of those we are familiar with. The replicas and the selector are the same as what we used in the ReplicaSet from the previous chapter.
- Line 11: minReadySeconds defines the minimum number of seconds before Kubernetes starts considering the Pods healthy. We set the value of this field to 1 second. The default value is 0, meaning that the Pods will be considered available as soon as they are ready and, when specified, the livenessProbe returns OK. If in doubt, omit this field and leave it at the default value of 0. We defined it mostly for demonstration purposes.
- Line 13: The next field is revisionHistoryLimit. It defines the number of old ReplicaSets we can roll back to. Like most of the fields, it is set to the sensible default value of 10. We changed it to 5 and, as a result, we will be able to roll back to any of the previous five ReplicaSets.
- Line 14: The strategy can be either of the RollingUpdate or the Recreate type. The latter will kill all the existing Pods before an update. Recreate resembles the process we used in the past, when the typical strategy for deploying a new release was first to stop the existing one and then put a new one in its place. This approach inevitably leads to downtime. The only case when this strategy is useful is when applications are not designed for two releases to coexist. Unfortunately, that is still more common than it should be. If you’re in doubt whether your application is like that, ask yourself the following question: would there be an adverse effect if two different versions of my application were running in parallel? If that’s the case, the Recreate strategy might be a good choice, but you must be aware that you cannot accomplish zero-downtime deployments with it.
Deployment Strategies#
Let’s look at both the Recreate and the RollingUpdate strategies in a bit more detail.
Recreate strategy#
The Recreate strategy is much better suited for our single-replica database. We should have set up native database replication (not the same as the Kubernetes ReplicaSet object), but that is out of the scope of this chapter.
If we’re running the database as a single replica, we must mount a network drive volume. That allows us to avoid data loss when updating it or in case of a failure. Since most databases (MongoDB included) cannot have multiple instances writing to the same data files, killing the old release before creating a new one is a good strategy when replication is absent. We’ll apply it later.
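As a sketch of what the Recreate strategy could look like in the database’s Deployment (the name, labels, and most other fields below are assumptions for illustration; we’ll define the real Deployment later):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-demo-2-db           # hypothetical name for the database Deployment
spec:
  replicas: 1                  # a single replica, so two releases cannot coexist
  selector:
    matchLabels:
      type: db
      service: go-demo-2
  strategy:
    type: Recreate             # kill the old Pod before creating the new one
  template:
    metadata:
      labels:
        type: db
        service: go-demo-2
    spec:
      containers:
      - name: db
        image: mongo:3.3       # third-party image pinned to a specific version
```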
RollingUpdate strategy#
The RollingUpdate strategy is the default type, for a good reason. It allows us to deploy new releases without downtime. It creates a new ReplicaSet with zero replicas and, depending on other parameters, increases the replicas of the new one while decreasing those of the old one. The process is finished when the replicas of the new ReplicaSet entirely replace those of the old one.
When RollingUpdate is the strategy of choice, it can be fine-tuned with the maxSurge and maxUnavailable fields. The former defines the maximum number of Pods that can exceed the desired number (set using replicas). It can be set to an absolute number (e.g., 2) or a percentage (e.g., 35%). The total number of Pods will never exceed the desired number (set using replicas) and the maxSurge combined. The default value is 25%.
maxUnavailable defines the maximum number of Pods that are not operational. If, for example, the number of replicas is set to 15 and this field is set to 4, the minimum number of Pods running at any given moment would be 11. Just like the maxSurge field, this one also defaults to 25%. If this field is not specified, at least 75% of the desired Pods will always be available.
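To make the arithmetic concrete, here is a sketch of a strategy section using the numbers from the example above (the values are illustrative, not taken from any of the course’s definitions):

```yaml
# Fragment of a Deployment spec (values are illustrative)
spec:
  replicas: 15
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 4         # at most 15 + 4 = 19 Pods may exist during the update
      maxUnavailable: 4   # at least 15 - 4 = 11 Pods must be operational at any moment
```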
In most cases, the default values of the Deployment-specific fields are a good option. We changed the defaults only to better demonstrate the options we can use. We’ll remove them from most of the Deployment definitions that follow.
The template#
The template is the same PodTemplate we used before. A best practice is to be explicit with image tags, as we were when we set mongo:3.3. However, that might not always be the best strategy for the images we build ourselves. As long as we employ the right practices, we can rely on our latest tags being stable. Even if we discover they’re not, we can remedy that quickly by creating a new latest tag. We cannot expect the same from third-party images, though. They must always be tagged to a specific version.
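As a sketch of the two cases (the image names are assumptions based on the applications used in this stack):

```yaml
# Fragment of a Pod template (image names are illustrative)
containers:
- name: api
  image: vfarcic/go-demo-2:latest   # our own image: latest only until the pipeline sets explicit tags
# A third-party image, by contrast, must always be pinned to a specific version:
# - name: db
#   image: mongo:3.3
```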
⚠️ Never deploy third-party images based on latest tags. By being explicit with the release, we have more control over what is running in production, as well as what should be the next upgrade.
We won’t always use latest for our services; we’ll use it only for the initial Deployments. Assuming that we do our best to keep the latest tag stable and production-ready, it is handy when setting up the cluster for the first time. After that, each new release will use a specific tag. Our automated continuous deployment pipeline will do that for us in one of the next chapters.