Rancher live metrics report container memory and CPU usage at about twice the actual usage.
This is the workload metric for total memory usage:
sum(container_memory_working_set_bytes{namespace="$namespace",pod_name=~"$podName", container_name!=""}) by (pod_name)
This returns two time series for the same container, so the sum counts the container twice.
In one of my examples:
|container_memory_working_set_bytes{container="jenkins",container_name="jenkins",endpoint="https-metrics",id="/kubepods/besteffort/pod430a7363-bf15-4faf-a51e-b3c558fa440e/ae7eee945b919b1226e97c22336dcb04b7621b02a672293580125b2c04c27f7d",image="jenkins/jenkins@sha256:710f4c447e32577ff0382757ebb2e3df05173b6bca2a3186bea7aeb7d4f76f3b",instance="x.x.x.x:10250",job="expose-kubelets-metrics",name="k8s_jenkins_jenkins-7c64c79f9-r44s2_testresourcelimits_430a7363-bf15-4faf-a51e-b3c558fa440e_0",namespace="testresourcelimits",node="xxxxxxx",pod="jenkins-7c64c79f9-r44s2",pod_name="jenkins-7c64c79f9-r44s2",service="expose-kubelets-metrics"}|733532160|
|container_memory_working_set_bytes{container="jenkins",container_name="jenkins",endpoint="https-metrics",instance="x.x.x.x:10250",job="expose-kubelets-metrics",namespace="testresourcelimits",node="xxxxx",pod="jenkins-7c64c79f9-r44s2",pod_name="jenkins-7c64c79f9-r44s2",service="expose-kubelets-metrics"}|733536256|
So a sum over these gives roughly 2 x 733536256 (1467068416 in total), and the Rancher charts report double the real value. The same happens in the Rancher Grafana dashboards.
However, in the built-in Rancher monitoring, I adjusted the query to filter out one of the two time series by requiring a non-empty name label:
My fix:
sum(container_memory_working_set_bytes{namespace="$namespace",pod_name=~"$podName",container_name!="",name!=""}) by (pod_name)
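To illustrate why a `name!=""` matcher removes the duplicate, here is a minimal Python sketch that mimics the aggregation on the two series shown above. It is only an illustration, not how Rancher or Prometheus implement it: the `sum_by_pod` helper is hypothetical, and the labels are trimmed down to the ones that matter here.

```python
# The two series returned for the same container (labels trimmed for brevity).
series = [
    {
        "labels": {
            "container_name": "jenkins",
            "pod_name": "jenkins-7c64c79f9-r44s2",
            "name": "k8s_jenkins_jenkins-7c64c79f9-r44s2_testresourcelimits_"
                    "430a7363-bf15-4faf-a51e-b3c558fa440e_0",
        },
        "value": 733532160,
    },
    {
        # Second series: same container, but no "name" (and no "id"/"image") label.
        "labels": {
            "container_name": "jenkins",
            "pod_name": "jenkins-7c64c79f9-r44s2",
        },
        "value": 733536256,
    },
]


def sum_by_pod(series, require_name=False):
    """Hypothetical helper mimicking sum(...) by (pod_name).

    With require_name=True it also applies the name!="" matcher.
    """
    totals = {}
    for s in series:
        labels = s["labels"]
        # In PromQL a missing label behaves like an empty string,
        # so name!="" drops series that have no "name" label at all.
        if require_name and labels.get("name", "") == "":
            continue
        pod = labels["pod_name"]
        totals[pod] = totals.get(pod, 0) + s["value"]
    return totals


unfiltered = sum_by_pod(series)
filtered = sum_by_pod(series, require_name=True)
print(unfiltered)  # {'jenkins-7c64c79f9-r44s2': 1467068416}
print(filtered)    # {'jenkins-7c64c79f9-r44s2': 733532160}
```

Without the matcher, both series land in the same `pod_name` group and are added together; with it, only the series that carries a `name` label is counted.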
Total CPU usage has the same issue. It is computed with this metric:
sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)
and I remove the duplicates with:
My fix:
sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!="",name!=""}[5m])) by (pod_name)
This removes the duplicate value, and the Grafana dashboards then report correctly. However, the Rancher charts in the UI dashboard still show double the value. Can this please be fixed in the Rancher UI charts as well? A solution would be much appreciated.
Thanks
Dheep