A Cloud FinOps 90-Day Runbook for AWS, Azure, and GCP
Format: Playbook / Runbook
Sector: BFSI, healthcare, public sector
Service relevance: Cloud FinOps, analytics
Author: Vishal Shukla, VP of Technology
Most of the waste is recoverable. Holding the savings is the hard part.
Industry research in 2026 puts cloud waste between 25 and 35 percent of total cloud spend for the average enterprise: idle resources, over-provisioned instances, orphaned storage, dev environments running 24/7, and commitments that no longer match the workload. A well-run FinOps engagement closes most of that gap within twelve months.
This is a runbook, not a discussion. The phases run in sequence; activities inside each phase run in parallel. It assumes a delivery team of one FinOps lead, two cloud engineers, and a named finance partner on the client side, with executive sponsorship and a signed savings target. Savings ranges below reflect actual variance across estates - read the higher end as an upper bound, not a commitment.
- Industry research in 2026 puts cloud waste between 25 and 35 percent of total cloud spend for the average enterprise.
- First-quarter realised savings typically land in the 20 to 35 percent range; best-case engagements hit 30 to 50 percent.
- Three phases over thirteen weeks: quick wins, then governance and tagging, then operating model rollout. Phase 1 saves money, Phase 2 makes savings attributable, Phase 3 makes the discipline durable.
- Success at day 90 is measured by realised savings against the SOW, tagging compliance above 90 percent, and an operating cadence run by the client team that survives the build partner's exit.
Three phases, thirteen weeks
Phase 1
Quick Wins (Weeks 1 to 4)
Bank a defensible dollar figure inside the first month. Visible savings buy the program time, credibility, and budget for Phases 2 and 3.
Week 1. Estate audit and baseline
Expected savings: 0%. Ingest cost data from AWS Cost Explorer plus CUR, Azure Cost Management, and GCP Billing into one warehouse. Pull 30 days of utilisation (CloudWatch, Azure Monitor, Cloud Monitoring). Identify the top 20 cost drivers. Deliverable: a baseline report and a signed savings target in the SOW.
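As a rough illustration of the "top 20 cost drivers" step, the sketch below ranks cost rows once they have been loaded into one common shape. The row fields (`provider`, `service`, `cost_usd`) are assumptions for this example, not the export schema of any of the three clouds; a real pipeline would first map each provider's billing export onto them.

```python
from collections import defaultdict

def top_cost_drivers(rows, n=20):
    """Return the n largest (provider, service) pairs by total cost."""
    totals = defaultdict(float)
    for row in rows:
        # Hypothetical normalised fields; each cloud's export needs mapping first.
        totals[(row["provider"], row["service"])] += row["cost_usd"]
    # Highest cost first; ties broken by key so the order is deterministic.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

rows = [
    {"provider": "aws", "service": "EC2", "cost_usd": 1200.0},
    {"provider": "azure", "service": "Virtual Machines", "cost_usd": 800.0},
    {"provider": "aws", "service": "EC2", "cost_usd": 300.0},
    {"provider": "gcp", "service": "BigQuery", "cost_usd": 950.0},
]
for (provider, service), cost in top_cost_drivers(rows, n=2):
    print(provider, service, cost)
```

Grouping by service is only one possible cut; the same function works for account, project, or team once those fields exist in the warehouse.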
Week 2. Idle and orphaned resource cleanup
Expected savings: 5–12%. Eliminate idle compute (under 5 percent CPU for 14 days), orphaned storage, zombie load balancers, idle databases, unused NAT gateways, and unattached static IPs. Apply the standard cleanup decision tree.
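The idle-compute rule can be sketched as a simple check over daily utilisation figures. This version assumes "under 5 percent CPU for 14 days" means every one of the last 14 daily averages is below the threshold; an estate may prefer a different reading (for example, peak rather than average CPU), so treat the function and its parameters as illustrative.

```python
def is_idle(daily_avg_cpu, threshold_pct=5.0, window_days=14):
    """True if each of the last window_days daily CPU averages is below threshold_pct."""
    if len(daily_avg_cpu) < window_days:
        return False  # not enough history to make the call
    return all(value < threshold_pct for value in daily_avg_cpu[-window_days:])

# Hypothetical instances with daily average CPU percentages.
fleet = {
    "i-batch-old": [1.2] * 14,
    "i-web-01": [3.0] * 13 + [42.0],
    "i-new": [0.5] * 6,
}
idle = [name for name, series in fleet.items() if is_idle(series)]
print(idle)
```

Instances with too little history are deliberately left out rather than flagged, which keeps newly launched resources off the cleanup list.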
Week 3. Rightsizing on compute
Expected savings: 15–25%. Run AWS Compute Optimizer, Azure Advisor, and GCP Recommender against Week 1 data. Cross-check against the business calendar. Execute in dev and test first. Rightsize Kubernetes pod requests and limits (Kubecost as the cross-cloud option).
Week 4. Commitment optimisation
Expected savings: 30–60% on covered workloads. Audit existing Reserved Instances (RIs), Savings Plans, and committed use discounts (CUDs); any commitment below 90 percent utilisation is a candidate for review. Model the new posture against the rightsized workloads, not the old baseline. A defensible coverage target is 70 to 80 percent of the rightsized baseline.
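The two numeric rules in this step (the 90 percent utilisation floor and the 70 to 80 percent coverage target) can be sketched as follows. The record fields and the use of hourly spend as the unit are assumptions made for the example; each provider reports commitment utilisation in its own terms.

```python
def underused_commitments(commitments, floor=0.90):
    """Return ids of commitments whose utilisation falls below the floor."""
    return [
        c["id"]
        for c in commitments
        if c["used_hours"] / c["purchased_hours"] < floor
    ]

def coverage_band(rightsized_hourly_spend, low=0.70, high=0.80):
    """Return the (low, high) hourly spend to commit against the rightsized baseline."""
    return rightsized_hourly_spend * low, rightsized_hourly_spend * high

# Hypothetical commitments over a 720-hour month.
commitments = [
    {"id": "ri-legacy", "used_hours": 600, "purchased_hours": 720},
    {"id": "sp-core", "used_hours": 700, "purchased_hours": 720},
]
print(underused_commitments(commitments))
print(coverage_band(100.0))
```

Passing the rightsized figure rather than the pre-rightsizing baseline is the point of the exercise: committing against the old number locks in the waste that Week 3 removed.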
Phase 2
Governance and Tagging (Weeks 5 to 8)
Make Phase 1 last. Quick wins decay if the discipline that produced them is not encoded into governance. Without enforced tagging and assigned cost ownership, the recovered spend creeps back as new spend within two quarters.
Week 5. Tagging standard and enforcement architecture
Expected savings: 0% direct. Define the required tag schema (minimum: Owner, CostCentre, Environment, Project, DataClassification). Choose the enforcement layer (SCPs, Azure Policy, GCP Org Policy) and the audit layer. Document it as a single artifact.
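On the audit side, the minimum schema above reduces to a check like the one below. This is a sketch of an audit helper only, not of the enforcement layer itself (SCPs, Azure Policy, and GCP Org Policy each have their own policy language); the sample resource is hypothetical.

```python
REQUIRED_TAGS = ("Owner", "CostCentre", "Environment", "Project", "DataClassification")

def missing_tags(tags):
    """Return the required tag keys that are absent or blank on a resource."""
    return [key for key in REQUIRED_TAGS if not str(tags.get(key, "")).strip()]

# Hypothetical resource tags as exported from an inventory tool.
resource_tags = {
    "Owner": "payments-team",
    "CostCentre": "CC-1042",
    "Environment": "prod",
    "Project": " ",
}
print(missing_tags(resource_tags))
```

Treating blank values as missing matters in practice, since an empty tag satisfies a naive "key exists" check while attributing nothing.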
Week 6. Tagging rollout
Expected savings: indirect. Apply the schema across the estate. Auto-tag where possible, open tickets for the rest, and enforce for new resources through policy-as-code. Target 90 percent of in-scope resources tagged correctly by end of Week 8.
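Progress toward the 90 percent target can be tracked with a small roll-up such as the one below, which also lists the teams that still need tickets. The `team` and `compliant` fields are assumptions for this sketch; `compliant` would come from whatever audit check the estate uses.

```python
from collections import defaultdict

def compliance_report(resources, target=0.90):
    """Return (overall_rate, teams_below_target) for in-scope resources."""
    counts = defaultdict(lambda: [0, 0])  # team -> [compliant, total]
    for resource in resources:
        counts[resource["team"]][1] += 1
        if resource["compliant"]:
            counts[resource["team"]][0] += 1
    total = sum(t for _, t in counts.values())
    compliant = sum(c for c, _ in counts.values())
    overall = compliant / total if total else 0.0
    below = sorted(team for team, (c, t) in counts.items() if c / t < target)
    return overall, below

# Hypothetical inventory: payments is at 9 of 10, data at 1 of 2.
inventory = (
    [{"team": "payments", "compliant": True}] * 9
    + [{"team": "payments", "compliant": False}]
    + [{"team": "data", "compliant": True}, {"team": "data", "compliant": False}]
)
overall, below = compliance_report(inventory)
print(round(overall, 3), below)
```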
Week 7. Cost attribution and showback
Expected savings: 5–15% from visibility alone. Build a showback dashboard at team and workload level. Walk each named owner through it. Publish monthly to engineering leadership and finance.
Week 8. Commitment execution
Expected savings: 30–60%, landing by Week 12. Execute the Week 4 commitment plan now that the rightsized baseline has held for four weeks. Confirm utilisation tracking. Schedule the first quarterly review (Week 22 of the broader engagement).
Phase 3
Operating Model Rollout (Weeks 9 to 13)
Hand the program over to a sustainable operating cadence. Phase 1 saved money. Phase 2 made it attributable. Phase 3 makes the discipline durable.
Week 9. Operating rhythm design
Expected savings: compounding. Define the weekly FinOps and Engineering review, the monthly finance and engineering forecast review (variance over 10 percent triggers a deeper review), and the quarterly commitment and tag-schema review. Name the person who owns each cadence after the build partner leaves.
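The 10 percent variance trigger for the monthly forecast review might be expressed as below. The sketch assumes variance is measured relative to the forecast and counts overruns and underruns alike; the text does not specify either point, so confirm the definition with the finance partner.

```python
def needs_deeper_review(forecast, actual, threshold=0.10):
    """True if actual spend deviates from forecast by more than the threshold."""
    if forecast <= 0:
        raise ValueError("forecast must be positive")
    return abs(actual - forecast) / forecast > threshold

# Hypothetical monthly figures per team: (forecast, actual).
month = {
    "payments": (100_000, 112_000),
    "data": (100_000, 105_000),
    "platform": (100_000, 89_000),
}
flagged = [team for team, (f, a) in month.items() if needs_deeper_review(f, a)]
print(flagged)
```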
Week 10. Anomaly detection and alerting
Expected savings: indirect. Configure native anomaly detection across all three clouds. Set thresholds at service and team level. Route alerts to the named owner with escalation. Publish an anomaly response runbook.
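To make the threshold-and-routing idea concrete, here is a deliberately simple sketch. It compares today's spend with a trailing average and returns the named owner when the per-team, per-service threshold is exceeded. It is not how the providers' native detectors work internally, and the thresholds, team names, and addresses are all hypothetical.

```python
from statistics import mean

# Hypothetical thresholds (maximum tolerated increase) and named owners.
THRESHOLDS = {("payments", "EC2"): 0.30, ("data", "BigQuery"): 0.50}
OWNERS = {"payments": "payments-lead@example.com", "data": "data-lead@example.com"}

def is_spend_anomaly(trailing_daily_spend, today_spend, max_increase):
    """True if today's spend exceeds the trailing average by more than max_increase."""
    baseline = mean(trailing_daily_spend)
    if baseline == 0:
        return today_spend > 0
    return (today_spend - baseline) / baseline > max_increase

def alert_recipient(team, service, trailing_daily_spend, today_spend):
    """Return the named owner to alert, or None if spend is within threshold."""
    limit = THRESHOLDS[(team, service)]
    if is_spend_anomaly(trailing_daily_spend, today_spend, limit):
        return OWNERS[team]
    return None

print(alert_recipient("payments", "EC2", [100, 100, 100, 100], 140))
print(alert_recipient("data", "BigQuery", [100, 100, 100, 100], 140))
```

Escalation is left out here; in practice it would hang off the same owner mapping, with a second contact used when an alert goes unacknowledged.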
Week 11. Engineering enablement
Expected savings: 5–10% over two quarters. Run enablement sessions for cloud and platform teams. Add cost gates into the deployment pipeline (Infracost or native). Embed cost dashboards in the engineering workflow.
Week 12. AI workload hygiene
Expected savings: 10–25% on AI workloads. Address the highest-growth waste category in 2026. Audit GPU and inference utilisation (30 to 50 percent of AI compute is over-provisioned). Rightsize GPU types, move batch inference to spot, and review vector store and embedding pipeline sizing. Review monthly, not quarterly.
Week 13. Handover and retrospective
Expected savings: 20–35% realised vs SOW. Hold a handover session with the Week 9 owners. Run a retrospective. Produce a final savings report against the SOW. Confirm the transition: clean exit with a support window, or a managed FinOps retainer.
- Realised savings against the SOW target. Banked, not modelled.
- Tagging compliance above 90 percent of in-scope resources.
- An operating cadence run by the client team, with a named owner, that survives the build partner's exit.
If two of three are green, the engagement has succeeded. If only one is green, it produced point-in-time savings but not durable discipline - the most common failure mode of a FinOps program, and why Phases 2 and 3 are part of this runbook rather than optional follow-ons.
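The day-90 scorecard can be written down as a small function, which helps when the retrospective has to agree on what "green" means. The argument names are assumptions for this sketch; the thresholds follow the three criteria listed above.

```python
def day90_scorecard(realised_savings, sow_target, tagging_compliance, client_owns_cadence):
    """Return (green_count, succeeded) for the three day-90 criteria."""
    greens = [
        realised_savings >= sow_target,   # banked savings meet the SOW target
        tagging_compliance > 0.90,        # above 90 percent of in-scope resources
        bool(client_owns_cadence),        # named client owner runs the cadence
    ]
    count = sum(greens)
    # Two of three green counts as a successful engagement.
    return count, count >= 2

print(day90_scorecard(1_200_000, 1_000_000, 0.93, False))
print(day90_scorecard(1_200_000, 1_000_000, 0.70, False))
```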