Monitoring Kubernetes

Introduced in GitLab 9.0.

GitLab can automatically detect and monitor Kubernetes metrics.

Requirements

The Prometheus and Kubernetes integration services must be enabled.

Metrics supported

| Name | Query |
| ---- | ----- |
| Average Memory Usage (MB) | (sum(avg(container_memory_usage_bytes{container_name!="POD",environment="%{ci_environment_slug}"}) without (job))) / count(avg(container_memory_usage_bytes{container_name!="POD",environment="%{ci_environment_slug}"}) without (job)) /1024/1024 |
| Average CPU Utilization (%) | sum(avg(rate(container_cpu_usage_seconds_total{container_name!="POD",environment="%{ci_environment_slug}"}[2m])) without (job)) * 100 |
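
The queries in the table above contain a %{ci_environment_slug} placeholder that stands for the slug of the environment being displayed. The following Python sketch shows what a query looks like once that placeholder is filled in. It is illustrative only: the function name render_query and the slug value "production" are assumptions, and this is not a description of how GitLab performs the substitution internally.

```python
# CPU query copied from the table above; %{ci_environment_slug} is the placeholder.
CPU_QUERY_TEMPLATE = (
    'sum(avg(rate(container_cpu_usage_seconds_total{container_name!="POD",'
    'environment="%{ci_environment_slug}"}[2m])) without (job)) * 100'
)


def render_query(template: str, environment_slug: str) -> str:
    """Return the query with the placeholder replaced by a concrete slug.

    Hypothetical helper for illustration; a plain string replacement is assumed.
    """
    return template.replace("%{ci_environment_slug}", environment_slug)


if __name__ == "__main__":
    # "production" is an assumed example slug.
    print(render_query(CPU_QUERY_TEMPLATE, "production"))
```

The rendered string can be pasted into the Prometheus expression browser to check that the selector returns series for the intended environment.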

Configuring Prometheus to monitor Kubernetes node metrics

Before Prometheus can collect Kubernetes metrics, a Prometheus server must be up and running. You have two options here:

Specifying the Environment

To isolate and display only the relevant CPU and memory metrics for a given environment, GitLab needs a way to detect which containers belong to that environment. Because these metrics are tracked at the container level, traditional Kubernetes labels are not available.

Instead, the Deployment or DaemonSet name should begin with CI_ENVIRONMENT_SLUG. It can be followed by a hyphen (-) and additional content if desired. For example, a deployment name of review-homepage-5620p5 would match the review/homepage environment.
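
The naming rule above can be sketched as a small Python check. This is one reading of the rule, not GitLab's actual matching code: the function name is hypothetical, and the slug value "review-homepage" is assumed for illustration (the real slug for an environment is generated by GitLab).

```python
def deployment_matches_environment(deployment_name: str, environment_slug: str) -> bool:
    """Check whether a Deployment or DaemonSet name follows the naming rule.

    Assumed interpretation: the name is either exactly the slug, or the slug
    followed by a hyphen and additional content.
    """
    if deployment_name == environment_slug:
        return True
    return deployment_name.startswith(environment_slug + "-")


if __name__ == "__main__":
    slug = "review-homepage"  # assumed example slug
    for name in ("review-homepage-5620p5", "review-homepage", "staging-web"):
        print(name, deployment_matches_environment(name, slug))
```

Requiring the hyphen after the slug keeps a name such as review-homepagex from being attributed to the wrong environment under this reading.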

If you are using GitLab Auto-Deploy and one of the two provided Kubernetes monitoring solutions, the environment label will be automatically added.

Displaying Canary metrics

Introduced in GitLab 10.2.

GitLab also gathers Kubernetes metrics for canary deployments, allowing easy comparison between the currently deployed version and the canary.

These metrics expect an environment label of the form $CI_ENVIRONMENT_SLUG-canary to isolate the canary metrics. If you are using GitLab Auto-Deploy, this label will be automatically configured for you.
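
As a rough illustration of the label form described above, the Python sketch below derives the canary label from an environment slug and builds the two label selectors that would separate stable and canary series. The function names and the slug value "production" are assumptions made for this example.

```python
def canary_environment_label(environment_slug: str) -> str:
    """Return the label value of the form $CI_ENVIRONMENT_SLUG-canary."""
    return f"{environment_slug}-canary"


def environment_selectors(environment_slug: str) -> dict:
    """Build the environment label selectors for the stable and canary series.

    Hypothetical helper; the selector syntax mirrors the queries on this page.
    """
    return {
        "stable": f'environment="{environment_slug}"',
        "canary": f'environment="{canary_environment_label(environment_slug)}"',
    }


if __name__ == "__main__":
    # "production" is an assumed example slug.
    for kind, selector in environment_selectors("production").items():
        print(kind, selector)
```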

Canary metrics supported

| Name | Query |
| ---- | ----- |
| Average Memory Usage (MB) | (sum(avg(container_memory_usage_bytes{container_name!="POD",environment="%{ci_environment_slug}-canary"}) without (job))) / count(avg(container_memory_usage_bytes{container_name!="POD",environment="%{ci_environment_slug}-canary"}) without (job)) /1024/1024 |
| Average CPU Utilization (%) | sum(avg(rate(container_cpu_usage_seconds_total{container_name!="POD",environment="%{ci_environment_slug}-canary"}[2m])) without (job)) * 100 |