Terminating Application Dependencies
This lesson will cover a chaos experiment carried out to check how the application behaves if we terminate a replica of the database.
There’s one more scenario in this area that we should try: what happens if we destroy an instance of a dependency of our application? As you already know, our demo application depends on MongoDB. We saw what happens when we destroy an instance of the application itself. Next, we’ll explore how the application behaves if we terminate a replica of the database (the dependency).
Inspecting the definition of health-db.yaml
We are going to take a look at yet another chaos experiment definition.
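If you want to follow along, you can print the definition yourself. A minimal example, assuming the file is named health-db.yaml and sits in the current directory (adjust the path to wherever you keep your experiment definitions):

cat health-db.yaml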
The output is as follows.
version: 1.0.0
title: What happens if we terminate an instance of the DB?
description: If an instance of the DB is terminated, dependant applications should still be operational.
tags:
- k8s
- pod
- http
configuration:
  ingress_host:
    type: env
    key: INGRESS_HOST
steady-state-hypothesis:
  title: The app is healthy
  probes:
  - name: app-responds-to-requests
    type: probe
    tolerance: 200
    provider:
      type: http
      timeout: 3
      verify_tls: false
      url: http://${ingress_host}/demo/person
      headers:
        Host: go-demo-8.acme.com
method:
- type: action
  name: terminate-db-pod
  provider:
    type: python
    module: chaosk8s.pod.actions
    func: terminate_pods
    arguments:
      label_selector: app=go-demo-8-db
      rand: true
      ns: go-demo-8
  pauses:
    after: 2
Checking the difference between health-http.yaml and health-db.yaml
As you’re accustomed to by now, we’ll output the differences between that definition and the one we used before. That will help us spot what really changed, given that the differences are subtle.
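One way to produce that comparison, assuming both definitions are in the current directory, is with the standard diff command:

diff health-http.yaml health-db.yaml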
The output is as follows.
2,3c2,3
< title: What happens if we terminate an instance of the application?
< description: If an instance of the application is terminated, the applications as a whole should still be operational.
---
> title: What happens if we terminate an instance of the DB?
> description: If an instance of the DB is terminated, dependant applications should still be operational.
27c27
< name: terminate-app-pod
---
> name: terminate-db-pod
33c33
< label_selector: app=go-demo-8
---
> label_selector: app=go-demo-8-db
We can see that, if we ignore the title, the description, and the name, what really changed is the label_selector. We’ll terminate one of the Pods with the label app set to go-demo-8-db. In other words, we are doing the same thing as before. Instead of terminating instances of our application, we are terminating instances of the database that is the dependency of our app.
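Before running the experiment, you might want to confirm that the new selector actually matches the database Pod. A quick check, using the namespace and label from the definition above (the exact Pod name will differ in your cluster):

kubectl --namespace go-demo-8 get pods --selector app=go-demo-8-db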
Let’s see what happens. Remember, in the previous experiment, we terminated a Pod of the application itself, and we could continue sending requests because the application is highly available.
Running the chaos experiment and inspecting the output
Let’s see what happens when we do the same to the database.
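As before, we run the experiment with the Chaos Toolkit CLI, assuming the definition is in health-db.yaml in the current directory:

chaos run health-db.yaml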
The output, without timestamps, is as follows.
[... INFO] Validating the experiment's syntax
[... INFO] Experiment looks valid
[... INFO] Running experiment: What happens if we terminate an instance of the DB?
[... INFO] Steady state hypothesis: The app is healthy
[... INFO] Probe: app-responds-to-requests
[... INFO] Steady state hypothesis is met!
[... INFO] Action: terminate-db-pod
[... INFO] Pausing after activity for 2s...
[... INFO] Steady state hypothesis: The app is healthy
[... INFO] Probe: app-responds-to-requests
[... ERROR] => failed: activity took too long to complete
[... WARNING] Probe terminated unexpectedly, so its tolerance could not be validated
[... CRITICAL] Steady state probe 'app-responds-to-requests' is not in the given tolerance so failing this experiment
[... INFO] Let's rollback...
[... INFO] No declared rollbacks, let's move on.
[... INFO] Experiment ended with status: deviated
[... INFO] The steady-state has deviated, a weakness may have been discovered
The initial probe passed, and the action was successful: it terminated an instance of the database and waited for two seconds. After that, the post-action probe failed. Our application does not respond to requests when the associated database is not there.
We know that Kubernetes will recreate the terminated Pod of the database. However, there is downtime between destroying the existing Pod and the new one being up and running. Not only is the database unavailable during that period, but our application is unavailable as well.
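If you want to observe that recovery window yourself, you can watch the database Pod being terminated and recreated while the experiment runs; kubectl’s --watch flag streams the status changes as they happen:

kubectl --namespace go-demo-8 get pods --selector app=go-demo-8-db --watch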
Assignment
Unlike the previous examples, where I showed you how to fix the problem, from now on you will get homework. The goal is for you to figure out how to solve this problem yourself. How can you make this experiment a success? You should already have a clue from how we fixed the application; it’s not really that different. Think about it, take a break from reading, and execute the experiment again once you have a solution.
If the homework is too difficult, I’ll give you a couple of tips. Do the same thing to the database that we did to the application. Just as you need to run multiple instances of the application to make it highly available, you need multiple instances of the database: our demo application depends on it and will never be highly available if the database isn’t. Since the database is stateful, you probably want a StatefulSet instead of a Deployment, and I strongly recommend that you don’t try to define all of that yourself. There is an excellent Helm chart in the stable channel. Search for “MongoDB stable channel Helm chart,” and you will find a definition of MongoDB that can be replicated and made highly available.
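To make the direction of the homework a bit more concrete, here is a rough sketch rather than the definitive fix: the release name and chart values below are assumptions, so verify them against whichever chart you end up using. You will likely also need to point the application and the experiment’s label_selector at whatever labels the chart assigns to its Pods.

# Illustration only: install MongoDB as a replica set with Helm.
# Release name and values are assumptions; check the chart's documentation.
helm repo add stable https://charts.helm.sh/stable
helm install go-demo-8-db stable/mongodb \
  --namespace go-demo-8 \
  --set replicaSet.enabled=true \
  --set replicaSet.replicas.secondary=2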
In the next lesson, we will remove the resources that we have created.