Grafana Plugin (grafana/v1-alpha)
The Grafana plugin is an optional plugin that scaffolds Grafana dashboards for viewing the default metrics exported by projects that use controller-runtime.
When to use it?
- If you want to use Grafana to observe the metrics that controller-runtime exports and Prometheus collects.
How to use it?
Prerequisites:
- Your project must use controller-runtime to expose its default controller metrics, and Prometheus must collect them.
- Access to Prometheus.
- Prometheus should have an endpoint exposed. (For prometheus-operator, this is similar to: http://prometheus-k8s.monitoring.svc:9090)
- The endpoint is ready to be, or already is, the data source of your Grafana. See Add a data source.
- Access to Grafana. Make sure you have:
- Dashboard edit permission
- Prometheus Data source
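Before importing a dashboard, it can help to confirm that Prometheus is actually collecting the controller metrics. The following is a minimal sketch that uses the Prometheus HTTP API instant-query endpoint (/api/v1/query). The helper names build_query_url and has_series are hypothetical, and the base URL in the usage note is only the prometheus-operator style address from the prerequisites, so adjust it to your cluster.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression."""
    query = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def has_series(base_url, promql, timeout=5.0):
    """Return True if the query succeeds and yields at least one series.

    This performs a real HTTP GET, so it only works from a place where
    the Prometheus endpoint is reachable (an assumption of this sketch).
    """
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = json.load(resp)
    return body.get("status") == "success" and bool(body["data"]["result"])
```

For example, has_series("http://prometheus-k8s.monitoring.svc:9090", "controller_runtime_reconcile_total") should return True once the controller's metrics endpoint is being scraped.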
Basic Usage
The Grafana plugin is attached to the init subcommand and the edit subcommand:
# Initialize a new project with the grafana plugin
kubebuilder init --plugins grafana.kubebuilder.io/v1-alpha
# Enable the grafana plugin in an existing project
kubebuilder edit --plugins grafana.kubebuilder.io/v1-alpha
The plugin will create a new directory and scaffold the JSON files under it (i.e. grafana/controller-runtime-metrics.json).
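To sanity-check the scaffolded file before importing it, you could load it and list the panels it defines. This is a sketch only: it assumes the file follows the usual Grafana dashboard JSON model, with a top-level "panels" array whose entries carry a "title" and where row panels may nest further "panels". The helper names load_dashboard and panel_titles are hypothetical.

```python
import json
from pathlib import Path


def load_dashboard(path="grafana/controller-runtime-metrics.json"):
    """Parse the scaffolded dashboard file; raises if it is not valid JSON."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def panel_titles(dashboard):
    """Collect panel titles in order, descending into nested row panels."""
    titles = []
    for panel in dashboard.get("panels", []):
        if panel.get("title"):
            titles.append(panel["title"])
        # Row panels may carry their own "panels" list.
        titles.extend(panel_titles(panel))
    return titles
```

Calling panel_titles(load_dashboard()) from the project root would then print one title per scaffolded panel, which is a quick way to confirm the file is intact.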
Showcase:
See an example of how to use the plugin in your project:
Now, let’s check how to use the Grafana dashboards:
- Copy the JSON file.
- Visit <your-grafana-url>/dashboard/import to import a new dashboard.
- Paste the JSON content to Import via panel json, then press the Load button.
- Select the data source for Prometheus metrics.
- Once the JSON is imported in Grafana, the dashboard is ready.
Grafana Dashboard
Controller Runtime Reconciliation total & errors
- Metrics:
- controller_runtime_reconcile_total
- controller_runtime_reconcile_errors_total
- Query:
- sum(rate(controller_runtime_reconcile_total{job="$job"}[5m])) by (instance, pod)
- sum(rate(controller_runtime_reconcile_errors_total{job="$job"}[5m])) by (instance, pod)
- Description:
- Per-second rate of total reconciliations as measured over the last 5 minutes
- Per-second rate of reconciliation errors as measured over the last 5 minutes
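To make "per-second rate over the last 5 minutes" concrete, here is a simplified sketch of what rate() computes for a counter. It is not Prometheus's exact algorithm: it handles counter resets but skips the extrapolation to the window boundaries that Prometheus applies. The function name and the sample values are hypothetical.

```python
def per_second_rate(samples):
    """Average per-second increase of a counter over the sampled window.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Simplification: no extrapolation to the window edges.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset (e.g. the pod restarted),
        # so the new value itself is the increase since the reset.
        increase += cur if cur < prev else cur - prev
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    return increase / elapsed


# Hypothetical scrape data for a 5 minute (300 s) window.
total = [(0, 100), (150, 250), (300, 400)]
errors = [(0, 4), (150, 4), (300, 7)]
print(per_second_rate(total))   # 1.0 reconciliations per second
print(per_second_rate(errors))  # 0.01 errors per second
```

Dividing the error rate by the total rate (here 0.01 / 1.0) gives the fraction of reconciliations that failed in the window.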
Controller CPU & Memory Usage
- Metrics:
- process_cpu_seconds_total
- process_resident_memory_bytes
- Query:
- rate(process_cpu_seconds_total{job="$job", namespace="$namespace", pod="$pod"}[5m]) * 100
- process_resident_memory_bytes{job="$job", namespace="$namespace", pod="$pod"}
- Description:
- Per-second rate of CPU usage as measured over the last 5 minutes
- Resident memory, in bytes, of the running controller process
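The "* 100" in the CPU query turns CPU-seconds consumed per second (that is, cores in use) into a percentage of one core, so a value above 100 means the process is using more than one core. The sketch below reproduces that arithmetic and converts the memory gauge to MiB for readability. The function names and the numbers are hypothetical.

```python
def cpu_percent(cpu_seconds_start, cpu_seconds_end, window_seconds):
    """CPU usage as a percentage of a single core over the window."""
    cores = (cpu_seconds_end - cpu_seconds_start) / window_seconds
    return cores * 100.0


def bytes_to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024 * 1024)


# process_cpu_seconds_total went from 12.0 to 13.5 over 300 seconds.
print(cpu_percent(12.0, 13.5, 300))       # about 0.5 (% of one core)
# process_resident_memory_bytes reported 52428800.
print(bytes_to_mib(52428800))             # 50.0
```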
Seconds of P50/90/99 Items Stay in Work Queue
- Metrics:
- workqueue_queue_duration_seconds_bucket
- Query:
- histogram_quantile(0.50, sum(rate(workqueue_queue_duration_seconds_bucket{job="$job", namespace="$namespace"}[5m])) by (instance, name, le))
- Description:
- Seconds an item stays in the workqueue before being requested.
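histogram_quantile estimates a quantile from cumulative bucket counts by locating the bucket that contains the target rank and interpolating linearly inside it. The sketch below is a simplified model of that idea for classic buckets with non-negative upper bounds; it is not Prometheus's implementation and skips several of its edge cases. The bucket values are hypothetical.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) pairs that includes
    the +Inf bucket, mirroring the `le` label of a *_bucket metric.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket: report the highest
                # finite upper bound instead of infinity.
                return buckets[-2][0]
            # Linear interpolation inside the matching bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical per-second bucket rates for one workqueue.
buckets = [(0.1, 10.0), (0.5, 40.0), (1.0, 50.0), (math.inf, 50.0)]
for q in (0.50, 0.90, 0.99):
    print(q, round(histogram_quantile(q, buckets), 3))
```

Because the result is interpolated, its accuracy depends on how fine the bucket boundaries are around the quantile of interest.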
Seconds of P50/90/99 Items Processed in Work Queue
- Metrics:
- workqueue_work_duration_seconds_bucket
- Query:
- histogram_quantile(0.50, sum(rate(workqueue_work_duration_seconds_bucket{job="$job", namespace="$namespace"}[5m])) by (instance, name, le))
- Description:
- Seconds that processing an item from the workqueue takes.
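The panel titles mention P50/90/99, while only the 0.50 query is listed here. Assuming the other two series differ only in the quantile argument (an assumption, not something this page confirms), the three queries can be generated from one template. The helper name quantile_queries is hypothetical.

```python
# Template matching the 0.50 query listed above; only the quantile
# and the metric name vary.
QUERY_TEMPLATE = (
    'histogram_quantile({q:.2f}, sum(rate({metric}'
    '{{job="$job", namespace="$namespace"}}[5m])) '
    'by (instance, name, le))'
)


def quantile_queries(metric, quantiles=(0.50, 0.90, 0.99)):
    """Return one PromQL query string per requested quantile."""
    return [QUERY_TEMPLATE.format(q=q, metric=metric) for q in quantiles]


for query in quantile_queries("workqueue_work_duration_seconds_bucket"):
    print(query)
```

The same helper works for workqueue_queue_duration_seconds_bucket, since both panels use the same query shape.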
Add Rate in Work Queue
- Metrics:
- workqueue_adds_total
- Query:
- sum(rate(workqueue_adds_total{job="$job", namespace="$namespace"}[5m])) by (instance, name)
- Description:
- Per-second rate of items added to work queue
Retries Rate in Work Queue
- Metrics:
- workqueue_retries_total
- Query:
- sum(rate(workqueue_retries_total{job="$job", namespace="$namespace"}[5m])) by (instance, name)
- Description:
- Per-second rate of retries handled by workqueue
Subcommands
The Grafana plugin implements the following subcommands:
- edit ($ kubebuilder edit [OPTIONS])
- init ($ kubebuilder init [OPTIONS])
Affected files
The following scaffolds will be created or updated by this plugin:
- grafana/*.json
Further resources
- Refer to a sample of servicemonitor provided by the kustomize plugin
- Check the plugin implementation
- Grafana docs on importing a JSON file
- The usage of servicemonitor by Prometheus Operator