Making the Application Fault-Tolerant

How can we make the application fault-tolerant?#

Now we have a decent experiment that validates whether our Pod exists, whether it is in the correct state, and whether it runs under the right conditions. Then it destroys the Pod, only to repeat the same validations of its state. The experiment confirms something you should already know: deploying a Pod by itself does not make the application fault-tolerant. Bare Pods are not fault-tolerant, and they are not even highly available. For now, though, it is the lack of fault tolerance that interests us, and we confirmed it through the experiment.

We did something we shouldn't do: we deployed a Pod directly instead of using one of Kubernetes' higher-level constructs. That was intentional. I wanted to start with a few simple experiments and progress over time towards more complex ones. I promise not to repeat such a naive definition of the application in the future.

Let’s correct the situation.

We know through the experiment that our application is not fault-tolerant. When an instance of our app is destroyed, Kubernetes does not create a new one in its place. So, how can we fix that? If you have any experience with Kubernetes, you should already know the answer: we are going to make our application fault-tolerant by creating a Deployment instead of a Pod.

Inspecting the definition of deployment.yaml#

Let’s take a look at a new definition.
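Assuming the definition is saved in a file named deployment.yaml (the path may differ in your setup), we can display it as follows.

```shell
# Print the Deployment definition (adjust the path to match your repository layout)
cat deployment.yaml
```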

The output is as follows.

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-demo-8
  labels:
    app: go-demo-8
spec:
  selector:
    matchLabels:
      app: go-demo-8
  template:
    metadata:
      labels:
        app: go-demo-8
    spec:
      containers:
      - name: go-demo-8
        image: vfarcic/go-demo-8:0.0.1
        env:
        - name: DB
          value: go-demo-8-db
        - name: VERSION
          value: "0.0.1"
        ports:
        - containerPort: 8080
        livenessProbe:
          httpGet:
            path: /
            port: 8080
        readinessProbe:
          httpGet:
            path: /
            port: 8080
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 50m
            memory: 20Mi

The new definition is a Deployment named go-demo-8. It contains a single container defined in almost the same way as the Pod we defined before. The major difference is that the container is now managed by a Deployment rather than defined as a standalone Pod. We already know that destroying instances created as bare Pods does us no good. You surely knew that before, but now we have “proof” that it is really true.

So, we are changing our strategy, and we’re going to create a Deployment. Since you are familiar with Kubernetes, you know that a Deployment creates a ReplicaSet and that ReplicaSet creates one or more Pods. More importantly, you know that ReplicaSet’s job is to ensure that a specific number of Pods is always running.
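Once such a Deployment is applied, that ownership chain can be observed directly. A sketch, using the app=go-demo-8 label from the definition above:

```shell
# List the Deployment, the ReplicaSet it created, and the Pods owned by that ReplicaSet
kubectl get deployment,replicaset,pod -l app=go-demo-8
```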

Applying the definition#

Let’s apply that definition.
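One way to do that, assuming the definition is stored in deployment.yaml:

```shell
# Create (or update) the Deployment from the definition file
kubectl apply -f deployment.yaml
```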

Next, we’re going to wait until the Pods of that Deployment roll out.
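The rollout can be awaited with a standard kubectl command; the command blocks until all the Pods of the Deployment are ready.

```shell
# Wait until the Deployment's Pods are rolled out and ready
kubectl rollout status deployment go-demo-8
```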

Running chaos experiment#

Finally, we’re going to execute the same chaos experiment we executed before.
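With the Chaos Toolkit CLI, that looks roughly as follows; the experiment file name is a placeholder, so substitute whichever file holds the experiment from the previous lesson.

```shell
# Re-run the same chaos experiment against the new Deployment-managed Pods
chaos run terminate-pod.yaml
```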

This time, we can see that everything is green. At the very beginning, all three probes passed. Then, the action to terminate the pod was executed, and we waited for 10 seconds. After that, the experiment executed probes again and validated that the state met the requirements, just as it did before executing the actions. Everything worked.

With that experiment, we can confirm that if we destroy an instance of our application, then nothing terrible will happen. Or, at least, we know that Pods will continue running because they are now controlled by a ReplicaSet created by a Deployment.
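We can also confirm this by hand, outside the experiment: delete the Pods matching the app label and list them again, and the ReplicaSet will have created replacements. A sketch:

```shell
# Destroy the Pods belonging to the Deployment
kubectl delete pod -l app=go-demo-8

# List the Pods; the ReplicaSet schedules new ones to replace those we deleted
kubectl get pods -l app=go-demo-8
```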

The conclusion#

Now that we have improved our application, no matter how silly its first version was, we have confirmed through a chaos experiment that it is indeed fault-tolerant. Bear in mind that fault-tolerant is not the same as highly available; we have yet to verify the latter. What matters is that we now know that instances of our application will be recreated no matter how we destroy them or what happens to them. Kubernetes, through the Deployment definition, makes sure that the specified number of instances is always running.


In the next lesson, we will remove the resources that we have created.
