Customer overview
IOTAIRx is a healthcare AI company building diagnostic assistance tools for physicians. Their advanced AI models help improve patient outcomes and reduce specialist referral times across a growing range of clinical use cases.
The challenge
As IOTAIRx scaled their model development, they faced escalating infrastructure
expenses and energy consumption from long-running fine-tuning jobs on Google
Cloud. Their training stack — a deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
model fine-tuned on custom medical datasets — ran on a GKE cluster with
NVIDIA A100 GPUs in us-central1: powerful, but costly and energy-intensive.
The solution: Pebble EcoAgent
Pebble deployed EcoAgent, an intelligent optimization tool that manages power consumption and resource allocation during machine learning workloads. To make the impact unambiguous, the team ran a controlled comparison: a representative fine-tuning job was run for 20 epochs, with the first 10 serving as a baseline and the next 10 running with EcoAgent enabled.
EcoAgent is a drop-in optimizer — no model code changes, no retraining pipeline rewrites. It learns the workload's compute and memory profile and automatically tunes power and allocation to match.
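One generic mechanism behind this kind of tuning is dynamic GPU power capping: lower the card's power limit when utilization shows idle headroom, and raise it when the workload is compute-bound. The sketch below illustrates that general technique only; it is not EcoAgent's implementation, and the thresholds and wattage range are assumptions:

```python
# Generic dynamic power-capping policy (illustrative; not EcoAgent's code).
# Maps observed GPU utilization to a power cap. On NVIDIA hardware the cap
# would be applied with `nvidia-smi -i <gpu> -pl <watts>` (shown, not run here).

A100_MAX_W, A100_MIN_W = 400, 150  # approximate A100 power-limit range

def choose_power_limit(util_pct: float) -> int:
    """Scale the power cap with utilization, clamped to the card's range."""
    target = A100_MIN_W + (A100_MAX_W - A100_MIN_W) * (util_pct / 100.0)
    return int(max(A100_MIN_W, min(A100_MAX_W, target)))

# Fully busy GPUs keep full power; idle-heavy epochs get capped aggressively.
print(choose_power_limit(100.0))  # 400
print(choose_power_limit(20.0))   # 200
```

Because an idling GPU still draws substantial power at its default limit, a policy like this saves energy without slowing the busy phases, which is consistent with the "no regression" convergence result below.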
Results
| Metric | Baseline | With EcoAgent | Improvement |
|---|---|---|---|
| Infrastructure cost (per job) | — | — | −40% |
| Energy consumption (kWh) | — | — | −48% |
| CO₂ emissions | — | — | −48% |
| Time-to-converge | Baseline | Comparable | No regression |
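The matching −48% figures for energy and CO₂ are no coincidence: emissions are typically estimated as measured energy multiplied by a fixed grid carbon-intensity factor, so any energy reduction carries over to CO₂ one-for-one. The intensity factor and phase totals below are illustrative placeholders, not official figures for the us-central1 grid:

```python
# Estimate CO2e from measured energy via a fixed grid-intensity factor.
# 0.4 kg CO2e/kWh is a placeholder, not Google Cloud's published value.
GRID_KG_CO2E_PER_KWH = 0.4

def co2e_kg(energy_kwh: float) -> float:
    """Convert energy consumed into estimated emissions."""
    return energy_kwh * GRID_KG_CO2E_PER_KWH

baseline_kwh, ecoagent_kwh = 620.0, 322.0  # illustrative phase totals
reduction = 1 - co2e_kg(ecoagent_kwh) / co2e_kg(baseline_kwh)
print(f"CO2e reduction: -{reduction:.0%}")  # identical to the energy reduction
```

The constant factor cancels in the ratio, which is why the CO₂ row in the table mirrors the energy row exactly.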
Metrics improved immediately once EcoAgent was activated, and the gains held across every subsequent training epoch. Critically, model quality and convergence behavior were preserved: savings came from eliminating idle GPU cycles, not from cutting corners on training.
Business impact
The optimization enabled IOTAIRx to improve resource allocation efficiency, accelerate development cycles, and advance their diagnostic mission through environmentally and financially responsible engineering practices. Cost headroom freed by EcoAgent was redirected into additional fine-tuning runs, shortening time-to-clinical-validation.