March 6, 2025

How to Visualize Kubernetes Metrics with Grafana

Learn how to effectively monitor and visualize Kubernetes metrics using Grafana and Prometheus for enhanced cluster performance and reliability.

Want to monitor your Kubernetes cluster like a pro? Grafana makes it simple to visualize metrics, identify issues, and optimize performance. By combining Grafana with Prometheus, you can track key Kubernetes data like CPU usage, memory consumption, and pod performance, all in real time.

Key Takeaways:

  • Why monitor Kubernetes metrics? Prevent downtime, optimize resources, and scale efficiently.
  • What is Grafana? A tool to create dashboards for visualizing metrics from sources like Prometheus.
  • Setup essentials: Configure Kubernetes, install Prometheus, and link it to Grafana.
  • What you’ll learn: Set up dashboards, create alerts, and track critical metrics like CPU and memory usage.

By the end, you’ll know how to set up a monitoring stack that keeps your Kubernetes environment healthy and efficient.

Before You Start

Make sure all necessary components are set up before diving into Kubernetes metrics visualization. You'll need to configure your Kubernetes cluster, Grafana, and Prometheus in that order.

Set Up Your Kubernetes Cluster

Before adding monitoring tools, confirm that your Kubernetes cluster is up and running. For production, managed services like Google Kubernetes Engine (GKE), Amazon EKS, or Azure AKS are great options. For local testing, tools like Minikube or Docker Desktop work well.

Key cluster requirements:

| Component | Minimum Specification |
| --- | --- |
| Memory | 512 MB |
| CPU | 1 core |
| Storage | 1 GB free space |
| Network | Port 3000 accessible |

You’ll also need to enable RBAC (Role-Based Access Control) in your cluster to allow Prometheus to collect metrics. Confirm your cluster is accessible by running:

kubectl cluster-info

Set Up Grafana

Install Grafana on your system. It works on Ubuntu, Red Hat, macOS, and Windows. Ensure your system meets these requirements:

  • Memory: At least 750 MB
  • CPU: At least 0.25 cores
  • Storage: 1 GB of free disk space
  • Database: Compatible with SQLite 3, MySQL 8.0+, or PostgreSQL 12+

"Grafana allows you to query, visualize, alert on, and understand your metrics no matter where they are stored. Create, explore, and share beautiful dashboards with your team and foster a data-driven culture." - Grafana Labs

Once installed, connect Grafana to Prometheus to start pulling Kubernetes metrics.

Set Up Prometheus

Prometheus is essential for collecting Kubernetes metrics. Install it using Helm, the Kubernetes package manager. Start by creating a namespace for your monitoring stack:

kubectl create namespace monitoring

Use a values.yaml file to fine-tune the installation and manage resources effectively.

Key components to configure:

| Component | Purpose |
| --- | --- |
| Persistent Volume Claim (PVC) | Stores collected metrics |
| Service Account | Handles RBAC permissions |
| ConfigMap | Holds Prometheus configurations |
| Deployment | Manages Prometheus pods |
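
As a rough illustration, a values.yaml for the kube-prometheus-stack Helm chart might look like the sketch below. The retention period, resource figures, storage size, and the release name in the comments are assumptions chosen for illustration, not recommendations from this guide. Check the chart's default values before relying on any key.

```yaml
# values.yaml: a minimal sketch for the kube-prometheus-stack Helm chart.
# Hypothetical usage (release name and file name are assumptions):
#   helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
#   helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
#     --namespace monitoring --values values.yaml
prometheus:
  prometheusSpec:
    retention: 15d              # assumed retention period
    resources:
      requests:
        cpu: 500m               # assumed sizing, tune for your cluster
        memory: 2Gi
    storageSpec:
      volumeClaimTemplate:      # creates the PVC that stores collected metrics
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 20Gi     # assumed volume size
grafana:
  enabled: true                 # deploy the Grafana instance bundled with the chart
alertmanager:
  enabled: true                 # deploy Alertmanager for alert routing
```

Keeping these settings in a file rather than in command-line flags makes the installation repeatable and easy to review in version control.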

Setting Up Prometheus for Kubernetes

Learn how to configure Prometheus to gather Kubernetes metrics and monitor your cluster effectively.

Collect Kubernetes Metrics

Prometheus collects data from defined endpoints. To simplify setup, use the kube-prometheus-stack Helm chart, which includes:

| Component | Purpose | Key Metrics Collected |
| --- | --- | --- |
| Node-Exporter | Metrics at the node level | CPU, memory, and disk usage |
| Kube-State-Metrics | Metrics for Kubernetes objects | Pod status, deployment state |
| Alertmanager | Manages alerts | Alert status, notification delivery |

To configure Prometheus for scraping, create a ServiceMonitor resource. Here's an example:

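apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubernetes-metrics
  namespace: monitoring
  labels:
    # Assumption: this label must match the Prometheus serviceMonitorSelector.
    # kube-prometheus-stack selects on its Helm release name by default.
    release: kube-prometheus-stack
spec:
  # Only Services in this namespace are matched unless namespaceSelector is set.
  endpoints:
  - interval: 30s   # scrape every 30 seconds
    port: metrics   # name of the Service port that exposes metrics
  selector:
    matchLabels:
      app: kubernetes   # scrape Services carrying this label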
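
A ServiceMonitor only produces scrape targets when a matching Service exists. The sketch below shows a Service that the example above would select. The Service name, pod label, and port number are hypothetical; the parts that matter are the app: kubernetes label and the port named metrics.

```yaml
# Hypothetical Service selected by the kubernetes-metrics ServiceMonitor.
apiVersion: v1
kind: Service
metadata:
  name: example-app            # assumed name
  namespace: monitoring        # same namespace as the ServiceMonitor
  labels:
    app: kubernetes            # matches spec.selector.matchLabels
spec:
  selector:
    app: example-app           # assumed label on the pods exposing metrics
  ports:
  - name: metrics              # matches the ServiceMonitor endpoint port name
    port: 8080                 # assumed metrics port
    targetPort: 8080
```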

Once you set up metric collection, focus on tracking the most relevant performance indicators.

Main Metrics to Track

Keep an eye on these Kubernetes metrics to ensure efficient monitoring:

| Metric Type | Description | Warning Threshold |
| --- | --- | --- |
| Node CPU Usage | CPU usage per node | Above 80% |
| Node Memory Active | Active memory usage in bytes | Above 90% |
| Pod Resource Usage | Resource consumption per container | Above 85% of limits |
| API Server Latency | API request response time | Above 500ms |
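
To make the table concrete, here is one possible way to precompute these metrics as Prometheus recording rules. This is a sketch under assumptions: the rule names, the release label, and the 5-minute rate windows are illustrative, and the expressions assume that node-exporter, cAdvisor, and kube-state-metrics metrics are being scraped.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cluster-usage-recording      # assumed name
  namespace: monitoring
  labels:
    release: kube-prometheus-stack   # assumed; must match the Prometheus ruleSelector
spec:
  groups:
  - name: cluster-usage.rules
    rules:
    # Node CPU usage in percent (compare against the 80% threshold).
    - record: instance:node_cpu_usage:percent
      expr: |
        100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))
    # Active memory as a percentage of total node memory (90% threshold).
    - record: instance:node_memory_active:percent
      expr: |
        100 * node_memory_Active_bytes / node_memory_MemTotal_bytes
    # Container memory as a percentage of its configured limit (85% threshold).
    - record: container:memory_usage_vs_limit:percent
      expr: |
        100 * sum by (namespace, pod, container) (container_memory_working_set_bytes{container!=""})
          / sum by (namespace, pod, container) (kube_pod_container_resource_limits{resource="memory"})
    # 99th percentile API server request latency in seconds (0.5 equals 500ms).
    - record: verb:apiserver_request_latency_seconds:p99
      expr: |
        histogram_quantile(0.99, sum by (le, verb) (rate(apiserver_request_duration_seconds_bucket[5m])))
```

Recording rules keep dashboard queries short and cheap, since Grafana panels can read the precomputed series instead of re-evaluating the full expression on every refresh.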

After identifying these metrics, configure service discovery to make monitoring more dynamic.

Kubernetes Component Discovery

Prometheus uses Kubernetes service discovery to monitor all components automatically. Start by creating a ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
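
A ClusterRole grants nothing until it is bound to an identity. The following sketch binds the role above to a ServiceAccount. The ServiceAccount name and its placement in the monitoring namespace are assumptions; use whichever account your Prometheus pods actually run as.

```yaml
# ServiceAccount that the Prometheus pods run as (name is an assumption).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-k8s
  namespace: monitoring
---
# Grants the prometheus-k8s ClusterRole to that ServiceAccount cluster-wide.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-k8s
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-k8s
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: monitoring
```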

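To enable annotation-based discovery, annotate Kubernetes services with the following. These annotations are a convention rather than a built-in feature: they only take effect if your Prometheus scrape configuration contains relabeling rules that read them, and ServiceMonitor-based setups do not use them by default. Set the port to wherever your application exposes its metrics.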

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "9090"
  prometheus.io/path: "/metrics"
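
For reference, a scrape job that honors these annotations could look roughly like the sketch below, which follows the commonly used relabeling pattern for Kubernetes service discovery. The job name is an assumption, and where this fragment lives depends on how Prometheus is deployed (for example, directly in prometheus.yml).

```yaml
scrape_configs:
  - job_name: kubernetes-service-endpoints   # assumed job name
    kubernetes_sd_configs:
      - role: endpoints                      # discover every Service endpoint
    relabel_configs:
      # Keep only targets whose Service has prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use prometheus.io/path as the metrics path when it is set.
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Rewrite the target address to use the port from prometheus.io/port.
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      # Carry namespace and Service name over as query-friendly labels.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: service
```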

In production testing with a 100-node cluster, kube-state-metrics latency was observed as follows:

  • 50th percentile: 259.62ms
  • 90th percentile: 475.00ms
  • 99th percentile: 906.67ms

Once Prometheus and Grafana are set up, you can connect them to start creating dashboards for Kubernetes metrics.

Set Up Prometheus as a Data Source

To link Prometheus in Grafana, follow these steps:

  1. Go to Connections > Add new connection.
  2. Search for Prometheus and click Add new data source.

Use the following configuration settings:

| Setting | Value | Description |
| --- | --- | --- |
| Name | prometheus | A unique name for the data source |
| URL | http://prometheus:9090 | Address of your Prometheus server |
| Scrape interval | 30s | Matches the scrape interval in Prometheus |
| HTTP Method | GET | The method used for requests |

When running in containers, use service names instead of localhost to avoid connection issues.
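
If you prefer configuration files over clicking through the UI, the same settings can be expressed as a Grafana data source provisioning file. This is a sketch that mirrors the table above; the file name is an assumption, and the file is typically placed in Grafana's provisioning/datasources directory.

```yaml
# prometheus-datasource.yaml (assumed file name)
apiVersion: 1
datasources:
  - name: prometheus               # unique name for the data source
    type: prometheus
    access: proxy                  # Grafana's backend issues the requests
    url: http://prometheus:9090    # service name, not localhost, inside containers
    isDefault: true
    jsonData:
      httpMethod: GET              # method used for queries
      timeInterval: 30s            # should match the Prometheus scrape interval
```

Provisioned data sources are recreated on startup, which keeps Grafana instances consistent across environments.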

Verify the Connection

After configuring the data source:

  • Click Save & test to validate the setup.
  • Navigate to Explore > Metrics to query Prometheus data.
  • Run the query up, which returns 1 for every target Prometheus scraped successfully, to check that metrics are being retrieved.

If the connection is successful, you'll see the metric data displayed in the Explore view. Once verified, you can start creating Kubernetes dashboards.

Troubleshoot Common Issues

If you encounter problems, these tips might help:

| Issue | Solution | Why It Matters |
| --- | --- | --- |
| Network Access | Check firewall rules and container networking | Ensures services can communicate |
| URL Configuration | Verify hostnames and port numbers | Connects Grafana to the correct backend |
| Performance | Reduce data points and adjust scrape intervals | Keeps dashboards responsive |

For secure environments, enable TLS to encrypt traffic between Prometheus and Grafana. Start by storing the certificate and private key in a Kubernetes Secret like this:

apiVersion: v1
kind: Secret
metadata:
  name: prometheus-tls
type: kubernetes.io/tls
data:
  tls.crt: base64_encoded_cert
  tls.key: base64_encoded_key

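On its own, the Secret only stores the certificate and key. To actually encrypt metrics in transit, configure Prometheus (or the proxy in front of it) to serve HTTPS with this certificate, then change the Grafana data source URL to https.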
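
One way to put the certificate to use is Prometheus's web configuration file, which is passed to the server with the --web.config.file flag. The sketch below assumes the prometheus-tls Secret is mounted into the Prometheus container at /etc/prometheus/tls, which is a hypothetical path; adjust it to your own volume mounts.

```yaml
# web-config.yml: tells Prometheus to serve its HTTP endpoints over TLS.
tls_server_config:
  cert_file: /etc/prometheus/tls/tls.crt   # assumed mount path of the Secret's tls.crt
  key_file: /etc/prometheus/tls/tls.key    # assumed mount path of the Secret's tls.key
```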

Build Kubernetes Dashboards

After successfully integrating Prometheus, it's time to create dashboards that provide clear insights into your Kubernetes environment.

Use Ready-Made Dashboards

Pre-built dashboards, such as those shared by the community on grafana.com, make Kubernetes monitoring easier. These templates save you time by offering detailed visualizations right out of the box.

To import a pre-made dashboard:

  1. Import the Dashboard
    Go to Dashboards > New > Import. Upload a dashboard JSON file, paste the JSON, or enter a dashboard ID from grafana.com.
  2. Set Up Dashboard Settings
    Add key details like your Prometheus data source, dashboard name, folder location, and a unique identifier.
  3. Adjust Default Values
    Modify variables to fit your cluster, such as namespace filters, node selectors, time ranges, and refresh intervals.
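
The import steps above can also be automated with a dashboard provisioning file, so that Grafana loads dashboard JSON from disk at startup. The following is a minimal sketch; the provider name, folder name, and path are assumptions.

```yaml
# dashboards.yaml: placed in Grafana's provisioning/dashboards directory.
apiVersion: 1
providers:
  - name: kubernetes                 # assumed provider name
    orgId: 1
    folder: Kubernetes               # Grafana folder the dashboards appear in
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30        # how often Grafana rescans the path
    allowUiUpdates: true             # permit edits from the Grafana UI
    options:
      path: /var/lib/grafana/dashboards   # assumed directory holding dashboard JSON files
```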

If these templates don’t fully meet your needs, you can create custom dashboards tailored to your specific metrics.

Make Your Own Dashboards

Custom dashboards allow you to focus on the metrics that matter most. Start by clicking Dashboards > New > New Dashboard in Grafana.

| Panel Type | Ideal For | Key Metrics |
| --- | --- | --- |
| Time series | Tracking trends | CPU, memory, network usage |
| Heatmap | Identifying patterns | Request latency, pod scaling |
| Table | Viewing detailed data | Container states, error rates |
| Stat | Highlighting single values | Uptime, availability |

Steps for building custom panels:

  • Select the best visualization type for your data.
  • Use PromQL to configure metric queries.
  • Add clear panel titles and descriptions for better understanding.

Dashboard Design Tips

Follow these tips to ensure your dashboards are both effective and user-friendly:

  1. Organize Metrics Logically
    Group related metrics and place critical alerts or status indicators where they’re easy to spot.
  2. Keep Dashboards Responsive
    Limit panels to 12–15 per dashboard, set reasonable refresh intervals, and use template variables to filter data.
  3. Stay Consistent
    Use uniform naming conventions, layouts, and color schemes across dashboards for a cohesive experience.

For infrastructure monitoring, consider using the USE method (Utilization, Saturation, Errors) for resource metrics and the RED method (Rate, Errors, Duration) for service-level indicators.

Advanced Features

Add Dashboard Variables

Dashboard variables help make Grafana visualizations more interactive and efficient. They let you filter data across multiple panels at once, streamlining workflows and keeping dashboards organized.

To set up dashboard variables:

  • Query variable for namespaces: Use the following query to fetch namespaces:
    label_values(kube_pod_info, namespace)
    
  • Dependent variable for pods: Create a variable to filter pods by namespace:
    label_values(kube_pod_info{namespace="$namespace"}, pod)
    

With this setup, you can easily drill down from namespaces to individual pods, simplifying troubleshooting. Once your variables are ready, you can move on to defining alert rules to monitor these metrics.

Configure Alert Rules

Dynamic dashboards become even more useful with alert rules that ensure you're always on top of critical issues. Here's an example of how to set up alerts:

| Alert Type | Threshold | Example Alert Channel |
| --- | --- | --- |
| CPU Usage | >80% for 15 minutes | Slack |
| Memory Saturation | >90% for 5 minutes | Email |
| Pod Restarts | >3 in 1 hour | PagerDuty |

These alerts allow you to act quickly when thresholds are breached, ensuring smooth operations.
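
One way to express the thresholds from the table is as Prometheus alerting rules, which Prometheus evaluates and Alertmanager routes to channels such as Slack, email, or PagerDuty. Grafana-managed alert rules are an alternative. In this sketch the alert names, severity labels, and release label are assumptions, and the channel routing itself would live in the Alertmanager configuration, which is not shown.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cluster-alerts               # assumed name
  namespace: monitoring
  labels:
    release: kube-prometheus-stack   # assumed; must match the Prometheus ruleSelector
spec:
  groups:
  - name: cluster.alerts
    rules:
    # CPU usage above 80% for 15 minutes.
    - alert: NodeHighCpuUsage
      expr: |
        100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 80
      for: 15m
      labels:
        severity: warning
      annotations:
        summary: "CPU usage above 80% on {{ $labels.instance }}"
    # Memory usage above 90% for 5 minutes.
    - alert: NodeMemorySaturation
      expr: |
        100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 90
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Memory usage above 90% on {{ $labels.instance }}"
    # More than 3 container restarts within the last hour.
    - alert: PodRestartingFrequently
      expr: |
        increase(kube_pod_container_status_restarts_total[1h]) > 3
      labels:
        severity: critical
      annotations:
        summary: "{{ $labels.namespace }}/{{ $labels.pod }} restarted more than 3 times in 1 hour"
```

The for clause is what turns a momentary spike into a sustained condition: the expression must stay true for the whole duration before the alert fires.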

Combine Metrics, Logs, and Traces

For complete observability, Grafana lets you integrate metrics, logs, and traces into one platform. The LGTM Stack (Loki, Grafana, Tempo, Mimir) from Grafana Labs is designed to make this integration seamless.

Steps to Integrate:

  • Set up matching labels between Prometheus metrics and Loki logs.
  • Instrument your applications to include traceIDs in logs.
  • Configure Loki as a data source using the Loki gateway's domain name.
  • Use LogQL queries to correlate metrics, logs, and traces.
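
The Loki part of these steps can be sketched as a data source provisioning file. The gateway URL, the Tempo data source UID, and the assumption that log lines contain text like traceID=abc123 are all hypothetical; adapt them to your deployment and log format.

```yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki-gateway.monitoring.svc.cluster.local   # assumed gateway domain name
    jsonData:
      derivedFields:
        # Turn trace IDs found in log lines into links to the tracing data source.
        - name: TraceID
          matcherRegex: "traceID=(\\w+)"   # assumes logs contain traceID=<id>
          url: "$${__value.raw}"           # $$ escapes variable expansion in provisioning files
          datasourceUid: tempo             # assumed UID of a Tempo data source
```

With a derived field in place, each matching log line in Explore shows a link that opens the corresponding trace, which is what makes the jump from logs to traces a single click.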

This approach transforms isolated metrics into a unified observability system.

"[What] Grafana Labs is really aiming to do now is make Grafana into a full observability platform. That means not only showing users their metrics, logs, and traces, but also providing them with the ability to correlate between them and be able to use all three of those pillars to quickly solve their issues." - Andrej Ocenas, Full Stack Developer, Grafana Labs

OptiAPM Enterprise Services

OptiAPM Kubernetes Monitoring

OptiAPM goes beyond basic configurations to provide enterprise-level support for Kubernetes monitoring. Their approach enhances Grafana visualizations, offering deeper insights and helping teams address potential issues before they escalate.

| Monitoring Aspect | Key Benefits | Business Impact |
| --- | --- | --- |
| Resource Utilization | Tracks CPU, memory, and storage in real time | Helps allocate resources efficiently and cut cloud costs |
| Performance Metrics | Monitors application response times | Boosts user experience and system reliability |
| Cluster Health | Detects node issues and pod distribution automatically | Minimizes downtime and ensures availability |

With OptiAPM, teams can monitor resource usage across clusters, keep tabs on storage volume changes, analyze energy consumption, and use machine learning to predict CPU and memory requirements.

Expert Monitoring Support

OptiAPM also provides expert support to help maintain peak system performance. Their services include:

  • Complete Visibility: By combining metrics, logs, and traces, OptiAPM offers a full view of your clusters, making it easier to spot and fix performance issues.
  • Advanced Analytics: Machine learning tools enable predictive scaling and performance forecasting.
  • Custom Dashboards: Tailored Grafana dashboards deliver real-time metrics, resource trends, custom alert thresholds, and automated reports.

These solutions simplify complex metrics, turning them into actionable insights through Grafana, while ensuring your clusters run efficiently and reliably.

Conclusion

Summary

Grafana and Prometheus together turn Kubernetes monitoring into a straightforward process. Grafana dashboards provide real-time visibility into Kubernetes environments, and with over 900,000 active installations, Grafana has a solid track record in enterprise use. If Grafana runs in Docker while Prometheus runs on the host, use http://host.docker.internal:9090 as the data source URL instead of localhost. By tracking the metrics covered here, you can make your Kubernetes deployments more reliable and efficient. These basics lay the groundwork for further improvements.

Further Steps

Take your setup to the next level with these advanced strategies:

| Focus Area | Key Actions | Benefits |
| --- | --- | --- |
| Resource Management | Adjust resource quotas and enable predictive autoscaling | Avoid over-allocation and improve resource efficiency |
| Cost Optimization | Monitor resource usage and use dynamic storage management | Lower operational costs and boost efficiency |
| Advanced Monitoring | Combine logs, metrics, and traces | Gain a more complete view of your system |

Businesses aiming to get the most out of Kubernetes monitoring may want to look into Grafana Enterprise Metrics, which offers better scalability and governance features. Tools like Grafana Mimir or VictoriaMetrics can also help with long-term data storage and query performance.

To stay ahead, focus on proactive monitoring and keep your dashboards updated regularly. If your infrastructure needs more advanced observability, professional support options are worth considering.
