I have a Sensu installation available to me that I am expected to use for all my health monitoring. I am trying to figure out how best to monitor that my Rancher server is healthy. There does not seem to be a health check endpoint documented anywhere. What I have right now is a check that the Docker instance is running and that I get a 200 response from http://localhost:8080/login. I’m sure that is good enough for many hard-down conditions, but it would be nice to have an endpoint that gives a much more nuanced view of the health of my server.
You can monitor it through the API. Different endpoints are available depending on what you want to monitor. For example, I monitor service status that way: the API returns JSON objects, and I check that state=active. I am not using Sensu yet, however.
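For anyone wanting to wire this into Sensu, here is a rough sketch of what such a check might look like in Python. It is not the script I actually run. It assumes the endpoint returns a collection object with a "data" list whose items carry "name" and "state" fields, and the URL in the usage note is only a guess at a typical install, so adjust both for your Rancher version. It also assumes the API is reachable without credentials; if access control is enabled you would need to add API key authentication. The exit codes follow the usual Sensu/Nagios convention (0 OK, 1 warning, 2 critical, 3 unknown).

```python
import json
import sys
import urllib.request

# Sensu uses the Nagios exit-code convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_services(payload):
    """Return (exit_code, message) for a parsed API response.

    Assumes the payload looks like {"data": [{"name": ..., "state": ...}]}.
    """
    services = payload.get("data", [])
    if not services:
        return UNKNOWN, "UNKNOWN: no services returned"
    # Collect every service whose state is anything other than "active".
    not_active = [
        "{}={}".format(s.get("name", "?"), s.get("state", "?"))
        for s in services
        if s.get("state") != "active"
    ]
    if not_active:
        return CRITICAL, "CRITICAL: not active: " + ", ".join(not_active)
    return OK, "OK: {} services active".format(len(services))


def fetch_services(url, timeout=10):
    """Fetch and parse the JSON document at url (no authentication)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def main(url):
    """Run the check once, print a one-line status, return the exit code."""
    try:
        code, message = evaluate_services(fetch_services(url))
    except Exception as exc:  # connection refused, timeout, bad JSON, ...
        code, message = CRITICAL, "CRITICAL: API request failed: {}".format(exc)
    print(message)
    return code


if __name__ == "__main__" and len(sys.argv) > 1:
    # The URL is passed on the command line by the Sensu check definition.
    sys.exit(main(sys.argv[1]))
```

You would point the Sensu check command at the script with the services URL as its only argument, for example something like http://localhost:8080/v1/services (again, an assumed path). Keeping evaluate_services separate from the HTTP call makes it easy to test against a saved API response.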
The Rancher server could post metrics to a Graphite database if you have one (or InfluxDB running the Graphite adapter) - https://github.com/Rucknar/Guide_Rancher_Monitoring#rancher-graphite-support