March 6, 2025

How to Visualize Kubernetes Metrics with Grafana

Learn how to effectively monitor and visualize Kubernetes metrics using Grafana and Prometheus for enhanced cluster performance and reliability.

Want to monitor your Kubernetes cluster like a pro? Grafana makes it simple to visualize metrics, identify issues, and optimize performance. By combining Grafana with Prometheus, you can track key Kubernetes data like CPU usage, memory consumption, and pod performance - all in real time.

Key Takeaways:

Why monitor Kubernetes metrics? Prevent downtime, optimize resources, and scale efficiently.
What is Grafana? A tool to create dashboards for visualizing metrics from sources like Prometheus.
Setup essentials: Configure Kubernetes, install Prometheus, and link it to Grafana.
What you’ll learn: Set up dashboards, create alerts, and track critical metrics like CPU and memory usage.

By the end, you’ll know how to set up a monitoring stack that keeps your Kubernetes environment healthy and efficient.

Before You Start

Make sure all necessary components are set up before diving into Kubernetes metrics visualization. You'll need to configure your Kubernetes cluster, Grafana, and Prometheus in that order.

Set Up Your Kubernetes Cluster

Before adding monitoring tools, confirm that your Kubernetes cluster is up and running. For production, managed services like Google Kubernetes Engine (GKE), Amazon EKS, or Azure AKS are great options. For local testing, tools like Minikube or Docker Desktop work well.

Key cluster requirements:

Component	Minimum Specification
Memory	512 MB
CPU	1 core
Storage	1 GB free space
Network	Port 3000 accessible (Grafana's default HTTP port)

You’ll also need to enable RBAC (Role-Based Access Control) in your cluster to allow Prometheus to collect metrics. Confirm your cluster is accessible by running:

kubectl cluster-info

Set Up Grafana

Install Grafana on your system. It works on Ubuntu, Red Hat, macOS, and Windows. Ensure your system meets these requirements:

Memory: At least 750 MB
CPU: At least 0.25 cores
Storage: 1 GB of free disk space
Database: Compatible with SQLite 3, MySQL 8.0+, or PostgreSQL 12+

"Grafana allows you to query, visualize, alert on, and understand your metrics no matter where they are stored. Create, explore, and share beautiful dashboards with your team and foster a data-driven culture." - Grafana Labs

Once installed, connect Grafana to Prometheus to start pulling Kubernetes metrics.

Set Up Prometheus

Prometheus is essential for collecting Kubernetes metrics. Install it using Helm, the Kubernetes package manager. Start by creating a namespace for your monitoring stack:

kubectl create namespace monitoring

Use a values.yaml file to fine-tune the installation and manage resources effectively.

Key components to configure:

Component	Purpose
Persistent Volume Claim (PVC)	Stores collected metrics
Service Account	Handles RBAC permissions
ConfigMap	Holds Prometheus configurations
Deployment	Manages Prometheus pods
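
As a rough illustration of what such a values.yaml might contain, here is a minimal sketch that assumes you install the kube-prometheus-stack chart from the prometheus-community Helm repository. The keys follow that chart's layout, but the retention period, resource sizes, storage class, and password are illustrative assumptions, not recommendations.

```yaml
# values.yaml: minimal sketch for the kube-prometheus-stack chart.
# Every size, duration, and name below is an assumption to adapt to your cluster.
prometheus:
  prometheusSpec:
    retention: 15d                    # how long metrics stay on disk
    resources:
      requests:
        cpu: 500m
        memory: 2Gi
      limits:
        memory: 4Gi
    storageSpec:                      # produces the PVC that stores collected metrics
      volumeClaimTemplate:
        spec:
          storageClassName: standard  # assumption: use a class that exists in your cluster
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi
grafana:
  enabled: true                       # the chart can deploy Grafana alongside Prometheus
  adminPassword: change-me            # placeholder: set your own or use a Secret
alertmanager:
  enabled: true
```

A file like this is typically passed to Helm with the -f flag at install time, targeting the monitoring namespace created above.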

Setting Up Prometheus for Kubernetes

Learn how to configure Prometheus to gather Kubernetes metrics and monitor your cluster effectively.

Collect Kubernetes Metrics

Prometheus collects data from defined endpoints. To simplify setup, use the kube-prometheus-stack Helm chart, which includes:

Component	Purpose	Key Metrics Collected
Node-Exporter	Metrics at the node level	CPU, memory, and disk usage
Kube-State-Metrics	Metrics for Kubernetes objects	Pod status, deployment state
Alertmanager	Manages alerts	Alert status, notification delivery

To configure Prometheus for scraping, create a ServiceMonitor resource. Here's an example:

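apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubernetes-metrics
  namespace: monitoring
  # The Prometheus Operator only picks up ServiceMonitors that match its
  # serviceMonitorSelector, so add whatever labels your installation expects.
spec:
  endpoints:
  - interval: 30s        # scrape every 30 seconds
    port: metrics        # name of the Service port that exposes metrics
  # With no namespaceSelector, only Services in this namespace are matched.
  selector:
    matchLabels:
      app: kubernetes    # label carried by the Services to scrape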

Once you set up metric collection, focus on tracking the most relevant performance indicators.

Main Metrics to Track

Keep an eye on these Kubernetes metrics to ensure efficient monitoring:

Metric Type	Description	Warning Threshold
Node CPU Usage	CPU usage per node	Above 80%
Node Memory Active	Active memory as a share of total node memory	Above 90%
Pod Resource Usage	Resource consumption per container	Above 85% of limits
API Server Latency	API request response time	Above 500ms
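
To make the table concrete, the sketch below expresses each metric as a PromQL recording rule inside a PrometheusRule resource. It assumes the Prometheus Operator, node-exporter, kube-state-metrics, and cAdvisor metrics are available, as they are with kube-prometheus-stack. The resource name and the recorded series names are hypothetical, and only memory is shown for pod resource usage.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: key-metrics-recording        # hypothetical name
  namespace: monitoring
spec:
  groups:
  - name: key-metrics.rules
    rules:
    # Node CPU usage in percent (warning threshold: 80)
    - record: instance:node_cpu_usage:percent
      expr: |
        100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))
    # Active memory as a percentage of node total (warning threshold: 90)
    - record: instance:node_memory_active:percent
      expr: |
        100 * node_memory_Active_bytes / node_memory_MemTotal_bytes
    # Container memory as a percentage of its limit (warning threshold: 85)
    - record: container:memory_usage_vs_limit:percent
      expr: |
        100 * sum by (namespace, pod, container) (container_memory_working_set_bytes{container!=""})
          / sum by (namespace, pod, container) (kube_pod_container_resource_limits{resource="memory"})
    # 99th percentile API server latency in seconds (warning threshold: 0.5)
    - record: verb:apiserver_request_latency:p99_seconds
      expr: |
        histogram_quantile(0.99,
          sum by (le, verb) (rate(apiserver_request_duration_seconds_bucket{verb!~"WATCH|CONNECT"}[5m])))
```

Recording the expressions once keeps dashboard panels and alerts pointed at the same definitions, though you can equally paste each expr straight into a Grafana panel.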

After identifying these metrics, configure service discovery to make monitoring more dynamic.

Kubernetes Component Discovery

Prometheus uses Kubernetes service discovery to monitor all components automatically. Start by creating a ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
# Read-only access to the objects that service discovery enumerates
- apiGroups: [""]        # "" is the core API group
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
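
A ClusterRole grants nothing until it is bound to an identity. The following sketch binds the role above to a service account. The account name prometheus-k8s and the monitoring namespace are assumptions here; substitute the account your Prometheus pods actually run as.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-k8s          # hypothetical: the account Prometheus runs as
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-k8s
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-k8s          # the ClusterRole defined above
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: monitoring
```

If you installed Prometheus through a Helm chart, equivalent RBAC objects are usually created for you, so check before adding duplicates.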

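To opt services into automatic discovery, annotate them as shown below. These prometheus.io/* annotations are a convention rather than a built-in feature: they take effect only if your Prometheus scrape configuration contains relabeling rules that read them. Set the port to the one your service exposes metrics on (9090 here is only an example):

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "9090"
  prometheus.io/path: "/metrics"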
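
For the annotations to have any effect, Prometheus needs a scrape job whose relabeling rules read them. The fragment below is a sketch modeled on the widely used example configuration. The job name is arbitrary, and how you load it depends on your setup: with kube-prometheus-stack it would likely go under the chart's additional scrape configs rather than into a hand-edited prometheus.yml.

```yaml
scrape_configs:
  - job_name: kubernetes-service-endpoints   # arbitrary job name
    kubernetes_sd_configs:
      - role: endpoints                      # discover every Service endpoint
    relabel_configs:
      # Keep only targets whose Service is annotated prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use prometheus.io/path as the metrics path when it is set
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Rewrite the target address to use the port from prometheus.io/port
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: $1:$2
      # Carry namespace and service name over as ordinary labels
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: service
```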

In production testing with a 100-node cluster, kube-state-metrics latency was observed as follows:

50th percentile: 259.62ms
90th percentile: 475.00ms
99th percentile: 906.67ms

Link Grafana to Prometheus

Once Prometheus and Grafana are set up, you can connect them to start creating dashboards for Kubernetes metrics.

Set Up Prometheus as a Data Source

To link Prometheus in Grafana, follow these steps:

Go to Connections > Add new connection.
Search for Prometheus and click Add new data source.

Use the following configuration settings:

Setting	Value	Description
Name	prometheus	A unique name for the data source
URL	http://prometheus:9090	Address of your Prometheus server
Scrape interval	30s	Matches the scrape interval in Prometheus
HTTP Method	GET	The method used for requests

When running in containers, use service names instead of localhost to avoid connection issues.
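
If you prefer configuration files to clicking through the UI, the same settings can be expressed as a Grafana provisioning file. This is a sketch that mirrors the table above. The URL assumes a Service named prometheus is reachable from Grafana, and the file would normally live in Grafana's provisioning/datasources directory.

```yaml
apiVersion: 1
datasources:
  - name: prometheus
    type: prometheus
    access: proxy                  # Grafana's backend performs the requests
    url: http://prometheus:9090    # service name, not localhost
    isDefault: true
    jsonData:
      httpMethod: GET
      timeInterval: 30s            # should match the Prometheus scrape interval
```

Grafana reads provisioning files at startup, so restart it after adding or changing one.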

Verify the Connection

After configuring the data source:

Click Save & test to validate the setup.
Navigate to Explore > Metrics to query Prometheus data.
Run the query `up` to check that metrics are being retrieved; it returns 1 for every target Prometheus can scrape and 0 for every target it cannot.

If the connection is successful, you'll see the metric data displayed in the Explore view. Once verified, you can start creating Kubernetes dashboards.

Troubleshoot Common Issues

If you encounter problems, these tips might help:

Issue	Solution	Why It Matters
Network Access	Check firewall rules and container networking	Ensures services can communicate
URL Configuration	Verify hostnames and port numbers	Connects Grafana to the correct backend
Performance	Reduce data points and adjust scrape intervals	Keeps dashboards responsive

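For secure environments, enable TLS to encrypt traffic between Prometheus and Grafana. A Secret on its own does not turn TLS on; it stores the certificate and private key so that whichever component terminates TLS can mount them. A TLS Secret looks like this:

apiVersion: v1
kind: Secret
metadata:
  name: prometheus-tls
type: kubernetes.io/tls
data:
  tls.crt: base64_encoded_cert   # placeholder: base64-encoded PEM certificate
  tls.key: base64_encoded_key    # placeholder: base64-encoded PEM private key

Once Prometheus (or the proxy in front of it) serves HTTPS with this certificate and the Grafana data source URL uses https://, your metrics data is encrypted in transit.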

Build Kubernetes Dashboards

After successfully integrating Prometheus, it's time to create dashboards that provide clear insights into your Kubernetes environment.

Use Ready-Made Dashboards

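Pre-built dashboards make Kubernetes monitoring easier. Grafana can import community dashboards published on grafana.com, and the kube-prometheus-stack chart ships with a set of its own. These templates save you time by offering detailed visualizations right out of the box.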

To import a pre-made dashboard:

Import the Dashboard: Go to Dashboards > New > Import, then paste the dashboard JSON or upload a JSON file.
Set Up Dashboard Settings: Add key details like your Prometheus data source, dashboard name, folder location, and a unique identifier.
Adjust Default Values: Modify variables to fit your cluster, such as namespace filters, node selectors, time ranges, and refresh intervals.
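
Importing through the UI works well for one-off dashboards. For repeatable setups, Grafana can also load dashboard JSON files from disk through a dashboard provider. The sketch below shows such a provider file. The provider name, folder, and path are assumptions, and the file belongs in Grafana's provisioning/dashboards directory.

```yaml
apiVersion: 1
providers:
  - name: kubernetes-dashboards            # hypothetical provider name
    folder: Kubernetes                     # Grafana folder the dashboards appear in
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30              # how often Grafana rescans the path
    allowUiUpdates: true                   # permit edits from the UI
    options:
      path: /var/lib/grafana/dashboards    # assumption: directory holding the JSON files
```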

If these templates don’t fully meet your needs, you can create custom dashboards tailored to your specific metrics.

Make Your Own Dashboards

Custom dashboards allow you to focus on the metrics that matter most. Start by clicking Dashboards > New > New Dashboard in Grafana.

Panel Type	Ideal For	Key Metrics
Time series	Tracking trends	CPU, memory, network usage
Heatmap	Identifying patterns	Request latency, pod scaling
Table	Viewing detailed data	Container states, error rates
Stat	Highlighting single values	Uptime, availability

Steps for building custom panels:

Select the best visualization type for your data.
Use PromQL to configure metric queries.
Add clear panel titles and descriptions for better understanding.

Dashboard Design Tips

Follow these tips to ensure your dashboards are both effective and user-friendly:

Organize Metrics Logically: Group related metrics and place critical alerts or status indicators where they’re easy to spot.
Keep Dashboards Responsive: Limit panels to 12–15 per dashboard, set reasonable refresh intervals, and use template variables to filter data.
Stay Consistent: Use uniform naming conventions, layouts, and color schemes across dashboards for a cohesive experience.

For infrastructure monitoring, consider using the USE method (Utilization, Saturation, Errors) for resource metrics and the RED method (Rate, Errors, Duration) for service-level indicators.

Advanced Features

Add Dashboard Variables

Dashboard variables help make Grafana visualizations more interactive and efficient. They let you filter data across multiple panels at once, streamlining workflows and keeping dashboards organized.

To set up dashboard variables:

Query variable for namespaces: Use the following query to fetch namespaces:
```
label_values(kube_pod_info, namespace)
```
Dependent variable for pods: Create a variable to filter pods by namespace:
```
label_values(kube_pod_info{namespace="$namespace"}, pod)
```

With this setup, you can easily drill down from namespaces to individual pods, simplifying troubleshooting. Once your variables are ready, you can move on to defining alert rules to monitor these metrics.
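
To show how the two variables fit into a dashboard, here is a minimal sketch of a dashboard definition wrapped in a ConfigMap. It assumes Grafana runs with a dashboard sidecar that watches for the grafana_dashboard label, as the Grafana Helm chart can be configured to do. The ConfigMap name, dashboard title, and panel are hypothetical, and a dashboard exported from a real Grafana instance will contain more fields than shown here.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pods-by-namespace-dashboard     # hypothetical name
  namespace: monitoring
  labels:
    grafana_dashboard: "1"              # assumption: label the Grafana sidecar watches
data:
  pods-by-namespace.json: |-
    {
      "title": "Pods by Namespace",
      "uid": "pods-by-namespace",
      "refresh": "30s",
      "templating": {
        "list": [
          {
            "name": "namespace",
            "type": "query",
            "query": "label_values(kube_pod_info, namespace)",
            "refresh": 2
          },
          {
            "name": "pod",
            "type": "query",
            "query": "label_values(kube_pod_info{namespace=\"$namespace\"}, pod)",
            "refresh": 2
          }
        ]
      },
      "panels": [
        {
          "type": "timeseries",
          "title": "CPU usage for $pod",
          "gridPos": {"h": 8, "w": 24, "x": 0, "y": 0},
          "targets": [
            {
              "expr": "sum by (container) (rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[5m]))",
              "legendFormat": "{{container}}"
            }
          ]
        }
      ]
    }
```

Because the pod variable's query references $namespace, Grafana refreshes the pod list whenever the namespace selection changes, and the panel query follows both.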

Configure Alert Rules

Dynamic dashboards become even more useful with alert rules that ensure you're always on top of critical issues. Here's an example of how to set up alerts:

Alert Type	Threshold Example	Alert Channel
CPU Usage	>80% for 15 minutes	Slack
Memory Saturation	>90% for 5 minutes	Email
Pod Restarts	>3 in 1 hour	PagerDuty

These alerts allow you to act quickly when thresholds are breached, ensuring smooth operations.
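
One way to implement the thresholds in the table is with Prometheus alerting rules, sketched below as a PrometheusRule resource. This assumes the Prometheus Operator is installed. The alert names and severity labels are hypothetical, and delivery to Slack, email, or PagerDuty is assumed to be handled by Alertmanager routes that are not shown here.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kubernetes-basic-alerts         # hypothetical name
  namespace: monitoring
spec:
  groups:
  - name: kubernetes-basic.alerts
    rules:
    - alert: NodeHighCpuUsage
      expr: |
        100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 80
      for: 15m                          # must hold for 15 minutes before firing
      labels:
        severity: warning               # assumption: routed to Slack
      annotations:
        summary: "CPU usage on {{ $labels.instance }} has been above 80% for 15 minutes"
    - alert: NodeMemorySaturation
      expr: |
        100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 90
      for: 5m
      labels:
        severity: warning               # assumption: routed to email
      annotations:
        summary: "Memory usage on {{ $labels.instance }} has been above 90% for 5 minutes"
    - alert: PodRestartingFrequently
      expr: |
        increase(kube_pod_container_status_restarts_total[1h]) > 3
      labels:
        severity: critical              # assumption: routed to PagerDuty
      annotations:
        summary: "{{ $labels.namespace }}/{{ $labels.pod }} restarted more than 3 times in the last hour"
```

Grafana's own alerting can evaluate the same PromQL expressions if you would rather manage rules and contact points from the Grafana UI.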

Combine Metrics, Logs, and Traces

For complete observability, Grafana lets you integrate metrics, logs, and traces into one platform. The LGTM Stack (Loki, Grafana, Tempo, Mimir) from Grafana Labs is designed to make this integration seamless.

Steps to Integrate:

Set up matching labels between Prometheus metrics and Loki logs.
Instrument your applications to include traceIDs in logs.
Configure Loki as a data source using the Loki gateway's domain name.
Use LogQL queries to correlate metrics, logs, and traces.

This approach transforms isolated metrics into a unified observability system.
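
As an illustration of the data source and trace ID steps, the provisioning sketch below registers Loki and Tempo in Grafana and turns trace IDs found in log lines into links to the matching trace. Both URLs are hypothetical in-cluster DNS names, the Tempo UID is an arbitrary choice, and the regular expression assumes your applications log the ID as traceID=<value>.

```yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki-gateway.monitoring.svc.cluster.local   # hypothetical gateway domain name
    jsonData:
      derivedFields:
        - name: TraceID
          matcherRegex: "traceID=(\\w+)"   # assumes log lines contain traceID=<id>
          datasourceUid: tempo             # must match the uid of the Tempo data source below
          url: "$${__value.raw}"           # $$ escapes provisioning variable expansion
  - name: Tempo
    type: tempo
    uid: tempo                             # arbitrary UID referenced above
    access: proxy
    url: http://tempo.monitoring.svc.cluster.local:3200     # hypothetical address and port
```

Matching labels between Prometheus and Loki (the first step) are set where logs and metrics are collected, not in this file.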

"[What] Grafana Labs is really aiming to do now is make Grafana into a full observability platform. That means not only showing users their metrics, logs, and traces, but also providing them with the ability to correlate between them and be able to use all three of those pillars to quickly solve their issues." - Andrej Ocenas, Full Stack Developer, Grafana Labs

OptiAPM Enterprise Services

OptiAPM Kubernetes Monitoring

OptiAPM goes beyond basic configurations to provide enterprise-level support for Kubernetes monitoring. Their approach enhances Grafana visualizations, offering deeper insights and helping teams address potential issues before they escalate.

Monitoring Aspect	Key Benefits	Business Impact
Resource Utilization	Tracks CPU, memory, and storage in real time	Helps allocate resources efficiently and cut cloud costs
Performance Metrics	Monitors application response times	Boosts user experience and system reliability
Cluster Health	Detects node issues and pod distribution automatically	Minimizes downtime and ensures availability

With OptiAPM, teams can monitor resource usage across clusters, keep tabs on storage volume changes, analyze energy consumption, and use machine learning to predict CPU and memory requirements.

Expert Monitoring Support

OptiAPM also provides expert support to help maintain peak system performance. Their services include:

Complete Visibility: By combining metrics, logs, and traces, OptiAPM offers a full view of your clusters, making it easier to spot and fix performance issues.
Advanced Analytics: Machine learning tools enable predictive scaling and performance forecasting.
Custom Dashboards: Tailored Grafana dashboards deliver real-time metrics, resource trends, custom alert thresholds, and automated reports.

These solutions simplify complex metrics, turning them into actionable insights through Grafana, while ensuring your clusters run efficiently and reliably.

Conclusion

Summary

Visualizing Kubernetes metrics with Grafana comes down to a straightforward process: collect metrics with Prometheus, add Prometheus as a Grafana data source, and build dashboards and alerts on top. If Grafana runs in Docker while Prometheus runs on the host, use http://host.docker.internal:9090 as the data source URL instead of localhost. Grafana dashboards provide real-time visibility into Kubernetes environments, and Grafana's more than 900,000 active installations point to its maturity for enterprise use. By tracking the CPU, memory, pod, and API server metrics covered here, you can strengthen your Kubernetes deployments and improve how they run. These basics lay the groundwork for further improvements.

Further Steps

Take your setup to the next level with these advanced strategies:

Focus Area	Key Actions	Benefits
Resource Management	Adjust resource quotas and enable predictive autoscaling	Avoid over-allocation and improve resource efficiency
Cost Optimization	Monitor resource usage and use dynamic storage management	Lower operational costs and boost efficiency
Advanced Monitoring	Combine logs, metrics, and traces	Gain a more complete view of your system

Businesses aiming to get the most out of Kubernetes monitoring might want to look into Grafana Enterprise Metrics, which offers better scalability and governance features. Tools like Grafana Mimir or VictoriaMetrics can also help with long-term data storage and query performance.

To stay ahead, focus on proactive monitoring and keep your dashboards updated regularly. If your infrastructure needs more advanced observability, professional support options are worth considering.