Introduction

Kubernetes workloads are often overprovisioned because requests and limits are set conservatively and then rarely revisited. Static rules and community autoscaling tools help, but they tend to fall short when workload behavior varies by service, traffic pattern, seasonal demand, and cluster scale.

EcoScale uses observed workload behavior and policy-driven algorithms to generate CPU and memory recommendations for Kubernetes workloads. Each recommendation shows current resources, target resources, and cost impact, so teams can review the operational tradeoffs before anything changes. Once a stable workload passes policy and review gates, its recommendations can flow into automatic apply mode.

Example recommendation summary: compare current and recommended resources with the delta before applying a change.
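To make the comparison concrete, here is a minimal sketch of how a summary of current resources, target resources, and delta could be computed. The class names, field names, and unit prices are hypothetical and chosen for illustration only; they do not describe EcoScale's actual data model or pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu_millicores: int  # 500 means 0.5 CPU
    memory_mib: int


@dataclass(frozen=True)
class Recommendation:
    # Hypothetical shape of a per-container recommendation.
    container: str
    current: Resources
    target: Resources

    def delta(self) -> Resources:
        """Target minus current; negative values mean a reduction."""
        return Resources(
            self.target.cpu_millicores - self.current.cpu_millicores,
            self.target.memory_mib - self.current.memory_mib,
        )

    def monthly_cost_delta(self, cpu_core_price: float, gib_price: float) -> float:
        """Estimated monthly cost change for one replica.

        The unit prices are assumptions supplied by the caller.
        """
        d = self.delta()
        return (d.cpu_millicores / 1000) * cpu_core_price + (d.memory_mib / 1024) * gib_price


def summarize(rec: Recommendation, cpu_core_price: float, gib_price: float) -> str:
    """Render current, target, and delta on one line for review."""
    d = rec.delta()
    cost = rec.monthly_cost_delta(cpu_core_price, gib_price)
    return (
        f"{rec.container}: "
        f"cpu {rec.current.cpu_millicores}m -> {rec.target.cpu_millicores}m ({d.cpu_millicores:+d}m), "
        f"memory {rec.current.memory_mib}Mi -> {rec.target.memory_mib}Mi ({d.memory_mib:+d}Mi), "
        f"est. cost {cost:+.2f}/month per replica"
    )


# Example with made-up numbers: halve both CPU and memory.
example = Recommendation("api", Resources(1000, 2048), Resources(500, 1024))
print(summarize(example, cpu_core_price=20.0, gib_price=4.0))
```

A reviewer would read the signed deltas first: a negative value is a reduction, and the cost figure is only as good as the unit prices passed in.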

Optimization Workflow

EcoScale starts by discovering workloads in connected clusters. It evaluates CPU and memory requests and limits, then presents per-container recommendations in a review-first model before any workload is changed.

Recommendations can be tuned with scaling goals that reflect different operational needs, such as reducing cost, preserving performance headroom, or balancing efficiency and reliability.
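One way to picture a scaling goal is as a choice of how much observed usage to cover and how much headroom to add on top. The sketch below is an illustration under that assumption, not EcoScale's algorithm: the goal names, percentiles, and headroom percentages are all hypothetical.

```python
from enum import Enum


class ScalingGoal(Enum):
    # Hypothetical goal names mirroring the three needs described above.
    REDUCE_COST = "reduce_cost"
    BALANCED = "balanced"
    PRESERVE_HEADROOM = "preserve_headroom"


# Assumed tuning values: (usage percentile to cover, extra headroom in percent).
GOAL_PARAMS = {
    ScalingGoal.REDUCE_COST: (90, 5),
    ScalingGoal.BALANCED: (95, 15),
    ScalingGoal.PRESERVE_HEADROOM: (99, 30),
}


def percentile(samples: list[int], pct: int) -> int:
    """Nearest-rank percentile of a non-empty list of integer samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of pct/100 * n, clamped to at least rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def recommend_cpu_request(usage_millicores: list[int], goal: ScalingGoal) -> int:
    """Return a CPU request in millicores: covered percentile plus headroom, rounded up."""
    pct, headroom_pct = GOAL_PARAMS[goal]
    base = percentile(usage_millicores, pct)
    return (base * (100 + headroom_pct) + 99) // 100


# Example with synthetic usage samples of 1m through 100m.
usage = list(range(1, 101))
for goal in ScalingGoal:
    print(goal.value, recommend_cpu_request(usage, goal))
```

With the same usage history, a cost-focused goal yields the smallest request and a headroom-focused goal the largest, which is the tradeoff the goals are meant to expose.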

Most teams begin in preview mode, apply selected recommendations manually, and validate the result. After a workload has stable recommendations and predictable behavior, it can be promoted to autopilot.

Autopilot is only for trusted workloads. Enable it only when all of these gates are met:

Stable metric history.
Clear ownership and criticality review.
Manual recommendations already validated.
Cluster permissions explicitly aligned for apply mode.
No recent stability regressions in the workload review history.
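The gates above can be read as a checklist in which every item must pass. The following sketch shows that logic under the assumption that each gate has already been reduced to a simple value; the WorkloadReview type and its field names are hypothetical and are not part of EcoScale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadReview:
    # Hypothetical record of a workload's review state, one field per gate.
    stable_metric_history: bool
    ownership_and_criticality_reviewed: bool
    manual_recommendations_validated: bool
    apply_permissions_aligned: bool
    recent_stability_regressions: int


def unmet_gates(review: WorkloadReview) -> list[str]:
    """Return the names of gates that are not yet satisfied, in list order."""
    checks = [
        ("stable metric history", review.stable_metric_history),
        ("ownership and criticality review", review.ownership_and_criticality_reviewed),
        ("manual recommendations validated", review.manual_recommendations_validated),
        ("cluster permissions aligned for apply mode", review.apply_permissions_aligned),
        ("no recent stability regressions", review.recent_stability_regressions == 0),
    ]
    return [name for name, ok in checks if not ok]


def autopilot_eligible(review: WorkloadReview) -> bool:
    """A workload qualifies only when every gate is met."""
    return not unmet_gates(review)


# Example: everything passes except one recent regression.
review = WorkloadReview(True, True, True, True, recent_stability_regressions=1)
print(autopilot_eligible(review), unmet_gates(review))
```

Returning the list of unmet gates, rather than a bare yes or no, tells the workload owner what still needs attention before promotion.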

GPU-intensive workloads use the same controlled, review-first model. Enable autopilot only after the stability criteria are met and accelerator-specific behavior has been validated.