Monitoring & Scaling
Why Monitoring & Scaling
Picture a busy train station. Trains (applications) arrive and depart constantly, and station managers must watch schedules and passenger flow, then adjust capacity when crowds surge. Without monitoring, delays pile up; without scaling, passengers are stranded.
Monitoring and scaling in Kubernetes are the station managers, ensuring workloads run smoothly, resources are optimized, and clusters adapt to demand.
Monitoring in Kubernetes
- Metrics Collection: Prometheus scrapes metrics from pods, nodes, and services.
- Visualization: Grafana provides dashboards for real‑time insights.
- Logging: Tools like Fluentd or Loki capture logs for troubleshooting.
- Tracing: Jaeger or OpenTelemetry track requests across microservices.
Analogy: Monitoring is like station cameras and dashboards, showing managers where trains are, how crowded platforms are, and spotting issues early.
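To make the metrics-collection step concrete, here is a minimal sketch of how a scrape target can be declared when Prometheus is managed by the Prometheus Operator (the operator installed in the exercise below). The Service label `app: frontend`, the port name `metrics`, and the `team: frontend` label are assumptions for illustration, not values from this guide.

```yaml
# Sketch only: tells an operator-managed Prometheus which Services to scrape.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: frontend-monitor        # hypothetical name
  labels:
    team: frontend              # assumed; must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: frontend             # assumed label on the Service exposing the pods
  endpoints:
  - port: metrics               # assumed *named* port on that Service
    path: /metrics
    interval: 30s               # scrape every 30 seconds
```

The ServiceMonitor only has an effect if the application actually exposes Prometheus-format metrics on that port and a Prometheus resource selects it.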
Scaling in Kubernetes
- Horizontal Pod Autoscaler (HPA): Adds/removes pods based on CPU/memory usage.
- Vertical Pod Autoscaler (VPA): Adjusts pod CPU and memory requests based on observed usage.
- Cluster Autoscaler (CA): Adds/removes nodes to match workload demand.
- Advanced Scaling: Combine autoscalers with custom metrics (e.g., request latency).
Analogy: Scaling is like adding more trains or longer carriages when passenger demand spikes.
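As a rough sketch of the last two ideas, the manifests below show a VPA in recommendation-only mode next to an HPA driven by a custom latency metric. Several things here are assumptions: the VPA components are installed in the cluster (they are not part of core Kubernetes), a metrics adapter such as Prometheus Adapter serves the custom metrics API, and the metric name `http_request_latency_p95_seconds` is hypothetical.

```yaml
# Sketch 1: VPA that only publishes recommendations ("Off" = never evicts pods).
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: frontend-vpa                 # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  updatePolicy:
    updateMode: "Off"                # recommend only; apply the values by hand
---
# Sketch 2: HPA that scales on a per-pod custom metric instead of CPU.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa-latency         # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_request_latency_p95_seconds   # hypothetical metric from the adapter
      target:
        type: AverageValue
        averageValue: "500m"         # 0.5 seconds averaged across pods
```

Keeping the VPA in "Off" mode avoids having it and an HPA fight over the same CPU signal. Only one HPA should target a given Deployment, so the latency metric would normally be added as another entry in an existing HPA's metrics list rather than deployed alongside it.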
Global Context
- Enterprises: Use monitoring + scaling to ensure uptime for mission‑critical apps.
- Cloud Providers: Offer managed monitoring (e.g., Amazon CloudWatch, Google Cloud Monitoring, formerly Stackdriver) integrated with autoscaling.
- Community: Prometheus (a CNCF graduated project) and Grafana (open source, maintained by Grafana Labs) are widely adopted as the de‑facto monitoring stack.
Hands‑On Exercise
- Test Scaling:
  - Run a load test with kubectl run or the hey tool.
  - Observe pods scaling up/down automatically.
- Reflect: How do monitoring tools act as station dashboards, and autoscalers as extra trains, ensuring smooth operations?
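For the load-test step, one possible load generator is a throwaway pod that requests the application in a tight loop, similar to what a kubectl run one-liner would start. This is a sketch that assumes a Service named `frontend` listens on port 80 in the same namespace; adjust the URL to your own setup.

```yaml
# Sketch only: a disposable pod that hammers the frontend Service.
apiVersion: v1
kind: Pod
metadata:
  name: load-generator             # hypothetical name
spec:
  restartPolicy: Never             # do not restart once the loop is stopped
  containers:
  - name: load
    image: busybox:1.28
    command: ["/bin/sh", "-c"]
    args:
    # Assumes a Service called "frontend" on port 80 in this namespace.
    - while sleep 0.01; do wget -q -O- http://frontend > /dev/null; done
```

While it runs, `kubectl get hpa frontend-hpa --watch` shows the replica count reacting. Deleting the pod stops the load, and the HPA scales back down after its stabilization window.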
Set Up HPA:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa
spec:
  # The workload whose replica count the HPA manages
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percent of each pod's CPU request, averaged across pods
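One detail the HPA depends on: a Utilization target is measured against each container's CPU request, and the numbers come from the resource metrics API (typically provided by metrics-server). A minimal sketch of a matching Deployment follows; the image name and the request sizes are assumptions for illustration.

```yaml
# Sketch only: the Deployment that frontend-hpa targets.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  # spec.replicas is left out on purpose so the HPA owns the replica count.
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: example.com/frontend:1.0   # hypothetical image
        resources:
          requests:
            cpu: 200m        # 70% utilization means about 140m average per pod
            memory: 128Mi
          limits:
            memory: 256Mi
```

Without a CPU request on the containers, the HPA cannot compute utilization and will not scale on this metric.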
Deploy Prometheus + Grafana:
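kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml
(This installs the Prometheus Operator and its CRDs; Prometheus and Grafana instances are created separately. The raw URL is needed because the github.com/.../blob/ page serves HTML rather than the manifest, and create avoids the annotation size limit that apply can hit on the large CRDs.)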
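With the operator running, a Prometheus server is requested by creating a Prometheus custom resource. The sketch below is a minimal example under a few assumptions: a ServiceAccount named `prometheus` with read access to pods, services, and endpoints already exists, and ServiceMonitors are labeled `team: frontend` (a hypothetical label). Grafana is not covered by this resource and would be installed on its own.

```yaml
# Sketch only: asks the operator to run one Prometheus server.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus   # assumed to exist with suitable RBAC
  serviceMonitorSelector:
    matchLabels:
      team: frontend               # assumed label; selects matching ServiceMonitors
  resources:
    requests:
      memory: 400Mi
```

The operator reacts to this resource by creating the underlying StatefulSet and generating the scrape configuration from the selected ServiceMonitors.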
The Hacker’s Notebook
- Prometheus scrapes metrics: the eyes of the cluster.
- Grafana visualizes them in dashboards for clarity.
- HPA adds or removes pods: elastic workloads.
- VPA resizes pods: right‑sized resources.
- CA scales nodes: infrastructure elasticity.
- Lesson for engineers: Don’t fly blind; monitor and scale proactively.
- Hacker’s mindset: Treat monitoring as your radar and scaling as your autopilot. With them, Kubernetes runs globally without missing a beat.

Updated on Dec 30, 2025