Monitoring Health
Learn why it is important to monitor the health of services and how Kubernetes Probes can help to achieve this.
Why monitor health?#
The go-demo-2 Docker image is designed to fail on the first sign of trouble. In cases like that, there is no need for any health checks. When things go wrong:
- The main process stops.
- The container hosting the main process stops as well.
- Kubernetes restarts the failed container.
However, not all services are designed to fail fast. Even those that are might still benefit from additional health checks. For example, a back-end API can be up and running but, due to a memory leak, serves requests much slower than expected. Such a situation might benefit from a health check that would verify whether the service responds within, for example, two seconds.
Kubernetes Probes#
We can exploit Kubernetes liveness and readiness probes for that.
Liveness Probe#
livenessProbe can be used to confirm whether a container should be running. If the probe fails, Kubernetes will kill the container and apply the restart policy, which defaults to Always.
Readiness Probe#
We’ll leave readinessProbe for later since it is directly tied to Services. Instead, we’ll explore livenessProbe. Both are defined in the same way, so the experience with one of them can be easily applied to the other.
Understanding the updated pod definition#
Let’s take a look at an updated definition of the Pod we used thus far.
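The sketch below approximates the contents of go-demo-2-health.yml, reconstructed from the explanations that follow (the images, the port, and the environment variable are assumptions carried over from the earlier go-demo-2 examples). The line numbers in the explanations refer to this listing.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: go-demo-2
  labels:
    type: stack
spec:
  containers:
  - name: db
    image: mongo:3.3 # Assumed image, carried over from earlier lessons
  - name: api
    image: vfarcic/go-demo-2 # Assumed image, carried over from earlier lessons
    env:
    - name: DB
      value: localhost
    livenessProbe:
      httpGet:
        path: /this/path/does/not/exist # Guaranteed to fail
        port: 8080
      initialDelaySeconds: 5 # Delay before the first probe
      timeoutSeconds: 2      # Defaults to 1
      periodSeconds: 5       # Defaults to 10
      failureThreshold: 1    # Defaults to 3
```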
- Line 8-12: Don’t get confused by seeing two containers in this Pod. Those two should be defined in separate Pods. However, since that would require knowledge we have yet to obtain, and go-demo-2 doesn’t work without a database, we’ll have to stick with the example that specifies two containers. It won’t take long until we break it into pieces.
- Line 16-19: The additional definition is inside the livenessProbe. We defined that the action should be httpGet, followed by the path and the port of the service. Since /this/path/does/not/exist is true to itself, the probe will fail, thus showing us what happens when a container is unhealthy. The host is not specified since it defaults to the Pod IP.
- Line 20-23: We declared that the first execution of the probe should be delayed by five seconds (initialDelaySeconds), that requests should time out after two seconds (timeoutSeconds), that the process should be repeated every five seconds (periodSeconds), and how many failed attempts it takes before giving up (failureThreshold).
Liveness probe in action#
Let’s take a look at the probe in action.
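Assuming the definition above is saved as go-demo-2-health.yml in the current directory, a command along these lines creates the Pod:

```bash
# Create the Pod that includes the liveness probe
kubectl create \
    -f go-demo-2-health.yml
```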
We created the Pod with the probe. Now we must wait until the probe fails a few times. A minute is more than enough. Once we’re done waiting, we can describe the Pod.
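A sketch of the wait-and-describe step, with the same file-name assumption:

```bash
# Give the probe time to fail a few times
sleep 60

# Inspect the Pod defined in the file
kubectl describe \
    -f go-demo-2-health.yml
```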
The bottom of the output contains the Pod’s events.
We can see that, once the container started, the probe was executed, and that it failed. As a result, the container was killed, only to be created again. In the events, we can see that the process was repeated three times (3x over ...).
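The restart count also shows up in the Pod listing. Assuming the same file, the RESTARTS column of kubectl get reflects how many times the container was recreated:

```bash
# The RESTARTS column shows how many times the container was restarted
kubectl get \
    -f go-demo-2-health.yml
```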
Please visit the Probe v1 core API reference if you’d like to learn all the available options.
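For instance, httpGet is not the only action a probe can perform; exec and tcpSocket are also available. A minimal sketch of an exec-based liveness probe (the command and the file it checks are illustrative, not part of go-demo-2):

```yaml
livenessProbe:
  exec:
    command: # Illustrative: succeeds as long as /tmp/healthy exists
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
```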
Pods are (almost) useless (by themselves)#
Pods are fundamental building blocks in Kubernetes. In most cases, you will not create Pods directly. Instead, you’ll use higher-level constructs like Controllers.
Pods are disposable. They are not long-lasting services. Even though Kubernetes does its best to ensure that the containers in a Pod are (almost) always up-and-running, the same cannot be said for Pods. If a Pod fails, gets destroyed, or gets evicted from a Node, it will not be rescheduled. At least, not without a Controller. Similarly, if a whole Node is destroyed, all the Pods running on it cease to exist. Pods do not heal by themselves. Excluding some special cases, Pods are not meant to be created directly.
📝 Do not create Pods by themselves. Let one of the controllers create Pods for you.
Destroying Everything#
We’ll remove the cluster so we can start the next chapter fresh.
Troubleshooting tips for minikube#
If you are working with minikube, use the following command to delete the cluster:
```bash
minikube delete
```