Implement Health checks (#178)

* add healthchecks to moonraker, klipper and ustreamer images
Markus Küffner
2024-11-11 23:03:40 +01:00
committed by Markus Küffner
parent f3ba780790
commit 9ef0e9df1a
10 changed files with 95 additions and 4 deletions

@@ -8,6 +8,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
## [Unreleased]
### Added
* klipper & moonraker: generate version file during build to correctly display versions
* klipper, moonraker & ustreamer: add healthchecks to container images
### Fixed
### Changed
### Removed

@@ -39,6 +39,9 @@ RUN groupadd klipper --gid 1000 \
RUN mkdir -p printer_data/run printer_data/gcodes printer_data/logs printer_data/config \
&& chown -R klipper:klipper /opt/*
COPY --chown=klipper:klipper health.py ./
HEALTHCHECK --interval=5s CMD ["python3", "/opt/health.py"]
COPY --chown=klipper:klipper --from=build /opt/klipper ./klipper
COPY --chown=klipper:klipper --from=build /opt/venv ./venv

@@ -119,3 +119,10 @@ none
|`run`|Default runtime Image for klippy|Yes|
|`tools`|Build Tools for MCU code compilation|Yes|
|`hostmcu`|Runtime Image for the klipper_mcu binary|Yes|
## Healthcheck
`/opt/health.py` is executed every 5 seconds inside the container.
The script does the following:
* Queries Klipper's `info` endpoint via its Unix socket
* Checks whether the reported state is `ready`
* If the state is not `ready`, the script exits with a failure status to indicate that the container is unhealthy
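
The response handling described above can be sketched as a small pure function, which makes the check easy to test without a running Klipper instance. This is a minimal sketch, not the shipped script: the names `ETX`, `klipper_state` and `is_healthy` are hypothetical, and the sample payloads in the usage note are illustrative. It assumes, as the health script does, that each JSON message is terminated by a `0x03` byte.

```python
import json
from typing import Optional

# Terminator byte that the health script appends to requests and strips from responses
ETX = b"\x03"


def klipper_state(raw: bytes) -> Optional[str]:
    """Return the `state` field of a raw `info` response, or None if it is missing."""
    # Keep only the first framed message; anything after the terminator is ignored
    payload = raw.split(ETX, 1)[0]
    try:
        return json.loads(payload)["result"]["state"]
    except (ValueError, KeyError, TypeError):
        # Empty, malformed, or unexpectedly shaped response
        return None


def is_healthy(raw: bytes) -> bool:
    """Healthy only when the reported state is exactly `ready`."""
    return klipper_state(raw) == "ready"
```

For example, `is_healthy(b'{"id": 666, "result": {"state": "ready"}}\x03')` evaluates to `True`, while an empty or malformed response evaluates to `False`.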

docker/klipper/health.py Normal file

@@ -0,0 +1,22 @@
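#!/usr/bin/env python3
"""Container healthcheck: exit 0 if Klipper reports state "ready", else 1."""
import json
import socket
import sys

socket_address = "/opt/printer_data/run/klipper.sock"
message = {"id": 666, "method": "info"}

try:
    # Set up socket connection
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        # Fail instead of blocking forever if Klipper does not answer
        sock.settimeout(3)
        sock.connect(socket_address)
        # Send message; each JSON message is terminated by 0x03
        sock.sendall(json.dumps(message).encode() + b"\x03")
        # Receive until the terminator arrives or the peer closes the connection
        response = b""
        while b"\x03" not in response:
            chunk = sock.recv(4096)
            if not chunk:
                break
            response += chunk
    state = json.loads(response.split(b"\x03", 1)[0])["result"]["state"]
except (OSError, ValueError, KeyError, TypeError):
    # Socket unreachable or response malformed - unhealthy
    sys.exit(1)

# State is ready - healthy, anything else - unhealthy
sys.exit(0 if state == "ready" else 1)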

@@ -39,6 +39,7 @@ RUN apt update \
systemd \
sudo \
git \
jq \
&& apt clean
WORKDIR /opt
@@ -48,6 +49,9 @@ RUN groupadd moonraker --gid 1000 \
RUN mkdir -p printer_data/run printer_data/gcodes printer_data/logs printer_data/database printer_data/config \
&& chown -R moonraker:moonraker /opt/*
COPY --chown=moonraker:moonraker health.sh ./
HEALTHCHECK --interval=5s CMD ["bash", "/opt/health.sh"]
COPY --chown=moonraker:moonraker --from=build /opt/moonraker ./moonraker
COPY --chown=moonraker:moonraker --from=build /opt/venv ./venv

@@ -85,3 +85,13 @@ services:
|---|---|---|
|`build`|Pull Upstream Codebase and build python venv|No|
|`run`|Default runtime Image|Yes|
## Healthcheck
`/opt/health.sh` is executed every 5 seconds inside the container.
The script does the following:
* Queries the `/server/info` endpoint of Moonraker
* Performs the following checks:
  * `failed_components` is empty
  * `klippy_connected` is `true`
  * `klippy_state` is `ready`
* If any of the above requirements is not met, the script exits with a failure status to indicate that the container is unhealthy
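
The three checks listed above can also be expressed as a function over the parsed `/server/info` response. This is a minimal sketch in Python for illustration only (the shipped check is the shell script). The function name `moonraker_failures` and the failure messages are hypothetical. It returns the list of failed checks rather than a bare exit code, which may help when debugging an unhealthy container.

```python
from typing import Any, Dict, List


def moonraker_failures(server_info: Dict[str, Any]) -> List[str]:
    """Return a description of every failed check; an empty list means healthy."""
    result = server_info.get("result") or {}
    failures = []
    # klippy_connected must be the JSON boolean true
    if result.get("klippy_connected") is not True:
        failures.append("klippy is not connected")
    # klippy_state must be exactly "ready"
    if result.get("klippy_state") != "ready":
        failures.append("klippy_state is %r, expected 'ready'" % result.get("klippy_state"))
    # failed_components must be empty (a missing field is treated as empty)
    failed = result.get("failed_components") or []
    if failed:
        failures.append("failed components: " + ", ".join(map(str, failed)))
    return failures
```

A healthcheck built on this sketch would exit with status 0 when the returned list is empty and with status 1 otherwise.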

docker/moonraker/health.sh Executable file

@@ -0,0 +1,17 @@
#!/bin/bash
## Query moonraker once and extract the fields needed for the checks
serverinfo=$(curl -s localhost:7125/server/info)
klippy_connected=$(echo -n "${serverinfo}" | jq -r '.result.klippy_connected')
klippy_state=$(echo -n "${serverinfo}" | jq -r '.result.klippy_state')
failed_components=$(echo -n "${serverinfo}" | jq -r '.result.failed_components | length')
if [ "$klippy_connected" == "true" ] \
  && [ "$klippy_state" == "ready" ] \
  && [ "$failed_components" -eq 0 ]; then
  ## moonraker is up and connected to klippy
  exit 0
else
  ## moonraker started w/ failed components and/or is not connected to klippy
  exit 1
fi

@@ -39,6 +39,8 @@ RUN apt update \
libbsd0 \
libgpiod2 \
v4l-utils \
curl \
jq \
&& apt clean
WORKDIR /opt
@@ -46,6 +48,9 @@ RUN groupadd ustreamer --gid 1000 \
&& useradd ustreamer --uid 1000 --gid ustreamer \
&& usermod ustreamer --append --groups video
COPY --chown=ustreamer:ustreamer health.sh ./
HEALTHCHECK --interval=5s CMD ["bash", "/opt/health.sh"]
COPY --chown=ustreamer:ustreamer --from=build /opt/ustreamer/src/ustreamer.bin ./ustreamer
## Start ustreamer

@@ -52,3 +52,12 @@ none
|---|---|---|
|`build`|Pull Upstream Codebase and build application|No|
|`run`|Default runtime Image|Yes|
## Healthcheck
`/opt/health.sh` is executed every 5 seconds inside the container.
The script does the following:
* Fetches the server state as JSON from the `/state` endpoint
* Checks the following values:
  * `.ok` is `true`, which indicates that ustreamer is working
  * `.result.source.online` is `true`, which indicates that the source (webcam) is returning an image rather than `NO SIGNAL`
* If any of the above requirements is not met, the script exits with a failure status to indicate that the container is unhealthy
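
The two value checks above can be sketched as a single predicate over the parsed state document. This is an illustrative Python sketch, not the shipped shell script. The function name `ustreamer_healthy` is hypothetical, and only the `.ok` and `.result.source.online` fields named above are assumed to exist.

```python
from typing import Any


def ustreamer_healthy(state: Any) -> bool:
    """True only if ustreamer reports ok and the video source is online."""
    try:
        # Both fields must be the JSON boolean true
        return state["ok"] is True and state["result"]["source"]["online"] is True
    except (KeyError, TypeError):
        # Missing fields or an unexpected document shape count as unhealthy
        return False
```

Treating a missing field as unhealthy mirrors the shell script, where `jq` prints `null` for an absent key and the string comparison against `true` fails.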

docker/ustreamer/health.sh Executable file

@@ -0,0 +1,13 @@
#!/bin/bash
## Query the ustreamer state once and extract the fields needed for the checks
state=$(curl -s localhost:8080/state)
ok=$(echo "$state" | jq -r '.ok')
online=$(echo "$state" | jq -r '.result.source.online')
if [ "$ok" == "true" ] && [ "$online" == "true" ]; then
  ## ustreamer is ok and source is online
  exit 0
else
  ## ustreamer is not ok or source is not online
  exit 1
fi