What IAM Permissions Does a Cost-Optimization Tool Actually Need? An Honest Read-Only Scope You Can Defend in Security Review

A platform team starts a security review for a cost-optimization vendor. The vendor’s onboarding doc asks for a CloudFormation template that creates a role with arn:aws:iam::aws:policy/AdministratorAccess “to ensure all features work.” The security engineer reads this, closes the doc, and the procurement is paused for six weeks while a counter-proposal is drafted.

This is the most common procurement-stalling pattern in FinOps tooling. The vendor wants AdministratorAccess to avoid feature gaps. The customer wants read-only to defend the procurement. Both sides are right about their priorities. The honest answer is somewhere in between, and it is much closer to “read-only” than vendors usually let on.

This post lays out the IAM scope a cost-optimization or non-prod-scheduling tool actually needs: what read-only covers, which action permissions are genuinely required (and which are not), the deny-overlay pattern, and a 4-question checklist you can run on any vendor before signing. The pattern composes with read-only MCP servers for cloud infrastructure, closed-loop FinOps, and policy-aware governance.

The four levels of IAM scope a vendor might request

Not all vendor IAM requests are equal. Knowing the four shapes makes the negotiation precise.

Level	Permission scope	What it means	Acceptable for?
1	`ReadOnlyAccess` only	View state, no mutation	Visibility, reporting, dashboards
2	`ReadOnlyAccess` + a small allow-list	View + specific actions on specific resources	Scheduling, tagging, lifecycle
3	Hand-curated 50-100 permission custom role	Whatever the vendor’s PM asked for	Power-user products with broad action surface
4	`PowerUserAccess` or `AdministratorAccess`	Full mutation surface	Almost never acceptable

Level 1 is what every honest FinOps tool starts with. Level 2 is the right shape for tools that actually take actions (stop instances, retag, schedule shutdown). Level 3 is the answer when a vendor cannot articulate why they need each permission and just wants headroom. Level 4 is a procurement red flag.

The real negotiation is between Level 1 (the customer’s preference) and Level 2 (the vendor’s actual minimum for action features). Most procurement stalls happen because the vendor opened at Level 3 or 4 when Level 2 was achievable.
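
To make the four levels concrete, here is a minimal sketch that sorts a vendor request into a level. The function name, the 50-action threshold (taken from the low end of the “50-100 permission” row), and the rule that any wildcard counts as Level 3 are illustrative assumptions, not an AWS or vendor standard.

```python
# Managed policies that grant a broad mutation surface (Level 4 in the table).
BROAD_MANAGED = {"AdministratorAccess", "PowerUserAccess"}


def classify_request(managed_policies, inline_actions, large_threshold=50):
    """Return the level (1-4) of a vendor IAM request.

    managed_policies: names of AWS-managed policies the vendor asks for.
    inline_actions:   extra Allow actions, e.g. "ec2:StopInstances".
    large_threshold:  assumed cut-off for a "hand-curated" Level 3 role.
    """
    if BROAD_MANAGED & set(managed_policies):
        return 4
    # Assumption: a wildcard or a very long list signals "headroom", not need.
    has_wildcard = any("*" in action for action in inline_actions)
    if has_wildcard or len(inline_actions) >= large_threshold:
        return 3
    if inline_actions:
        return 2
    return 1


print(classify_request(["ReadOnlyAccess"], ["ec2:StopInstances", "ec2:StartInstances"]))
```

A request that lands on 3 or 4 is the cue to ask the vendor which named feature needs each extra permission.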

What `ReadOnlyAccess` actually covers

arn:aws:iam::aws:policy/ReadOnlyAccess is an AWS-managed policy with ~200 services’ worth of read APIs. It has been available since AWS introduced managed policies in 2015. Every cloud security engineer has seen it. It includes:

Cost & Usage: ce:Get*, cur:DescribeReportDefinitions
EC2 / VPC / EBS / ELB: ec2:Describe* (instances, volumes, security groups, NAT gateways), elasticloadbalancing:Describe* (load balancers)
EKS / ECS: eks:Describe*, eks:List*, ecs:Describe*
S3: s3:GetBucketLocation, s3:ListBucket, s3:GetEncryptionConfiguration, s3:GetBucketTagging
IAM: iam:Get*, iam:List* (read-only, cannot create/modify users or roles)
RDS / DynamoDB / ElastiCache: Describe*, List*
Lambda / API Gateway: lambda:Get*, lambda:List*
CloudWatch / CloudTrail: metrics, logs, lookup events
Tags: tag:Get*

What it does NOT include:

No Create*, Delete*, Put*, Modify*, Update*, Run*, Stop*, Start*, Terminate*, Reboot*
No iam:CreateUser, iam:DeleteUser, iam:AttachRolePolicy
No s3:PutObject, s3:DeleteObject, s3:DeleteBucket
No ec2:RunInstances, ec2:TerminateInstances, ec2:StopInstances

A FinOps tool whose only feature is “tell me what I am spending” needs only this policy. A tool that adds “and stop the idle ones” needs the deny + allow-list pattern below.

The two read-only permissions to be cautious about even within this policy:

s3:GetObject reads object content. If buckets contain PII, narrow to s3:ListBucket only or scope GetObject to specific prefixes.
ssm:GetParameter reads parameter values. If your team stores secrets in plaintext SSM (you should not), this exposes them. Add a deny on ssm:GetParameter or move secrets to Secrets Manager.
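
The prefix scoping mentioned for s3:GetObject can be sketched as a Deny statement that uses NotResource, so that object reads are blocked everywhere except the listed prefixes. The helper name and the bucket and prefix below are hypothetical placeholders.

```python
import json


def deny_get_object_outside(allowed_object_arns):
    """Build a Deny for s3:GetObject on every object except the given ARNs."""
    return {
        "Sid": "DenyGetObjectOutsideAllowedPrefixes",
        "Effect": "Deny",
        "Action": "s3:GetObject",
        # NotResource with Deny: the deny applies to everything NOT listed here.
        "NotResource": list(allowed_object_arns),
    }


# Hypothetical bucket holding cost and usage report exports.
statement = deny_get_object_outside(["arn:aws:s3:::example-billing-exports/cur/*"])
print(json.dumps(statement, indent=2))
```

Because this is a Deny, it narrows whatever ReadOnlyAccess grants without needing to edit the managed policy.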

A tighter version of ReadOnlyAccess for FinOps purposes is the AWS-managed ViewOnlyAccess (arn:aws:iam::aws:policy/job-function/ViewOnlyAccess), which excludes Get* calls that return content (no s3:GetObject, no dynamodb:GetItem). For dashboards-only tools, ViewOnlyAccess is the cleanest answer.

The action permissions that are actually required (and which are not)

A FinOps tool that schedules shutdowns or tags resources needs SOME write permissions. The honest list, by feature category:

For “stop idle / scheduled / non-prod resources”:

Resource	Permission required	Permission NOT required
EC2 instances	`ec2:StopInstances`, `ec2:StartInstances`	`ec2:TerminateInstances`
RDS instances	`rds:StopDBInstance`, `rds:StartDBInstance`	`rds:DeleteDBInstance`
RDS Aurora	`rds:StopDBCluster`, `rds:StartDBCluster`	`rds:DeleteDBCluster`
EKS managed nodegroup	`eks:UpdateNodegroupConfig`	`eks:DeleteNodegroup`
Auto Scaling groups	`autoscaling:UpdateAutoScalingGroup`	`autoscaling:DeleteAutoScalingGroup`
ElastiCache	`elasticache:ModifyReplicationGroupShardConfiguration` (for shard count)	`elasticache:DeleteReplicationGroup`

The pattern: every “stop” / “start” / “scale-down” action has a Terminate / Delete sibling. The FinOps tool needs the first; almost never the second. A vendor that asks for ec2:TerminateInstances “for cleanup” should be asked which feature requires it. Most “cleanup” features can be done by the customer’s own ops team after the FinOps tool flags the resource.
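
One way to apply this pattern mechanically is to check a vendor’s requested Allow list against the destructive siblings from the table. The sketch below is a hypothetical helper, not part of any AWS SDK; it treats IAM wildcards with simple glob matching, which is an approximation of how IAM expands `*` and `?`.

```python
from fnmatch import fnmatchcase

# Destructive siblings from the "Permission NOT required" column.
DESTRUCTIVE = [
    "ec2:TerminateInstances",
    "rds:DeleteDBInstance",
    "rds:DeleteDBCluster",
    "eks:DeleteNodegroup",
    "autoscaling:DeleteAutoScalingGroup",
    "elasticache:DeleteReplicationGroup",
]


def destructive_grants(requested_actions):
    """Return the destructive actions the requested Allow list would grant.

    Matching is case-insensitive and honours wildcards such as "ec2:*".
    """
    granted = []
    for destructive in DESTRUCTIVE:
        for pattern in requested_actions:
            if fnmatchcase(destructive.lower(), pattern.lower()):
                granted.append(destructive)
                break
    return granted


print(destructive_grants(["ec2:StopInstances", "ec2:StartInstances", "rds:Delete*"]))
```

An empty result means the request stays on the stop/start side of the line; anything else is a question for the vendor.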

For “auto-tagging” or “tag governance”:

Resource	Permission required
Most resources	`tag:TagResources`, `tag:UntagResources`
Specific resources	`ec2:CreateTags`, `s3:PutBucketTagging`, etc.

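The Resource Groups Tagging API (tag:TagResources) is the unified surface: one API call across most AWS resources. It does not replace the per-service tagging permissions, though. AWS requires the caller to hold both tag:TagResources and the tagging permission of the underlying service (for example ec2:CreateTags for EC2 resources), so grant the per-service permissions only for the services the tool actually tags.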

For “rightsizing / instance type changes”:

This is the line where action permissions get touchy. Modifying an instance type requires ec2:ModifyInstanceAttribute. Resizing an RDS instance requires rds:ModifyDBInstance. Both are write permissions with bigger blast radii than start/stop.

The honest answer: rightsizing should be a recommendation, not an action. The FinOps tool reads, computes, recommends. The customer’s CI/CD pipeline applies the change after a human approval. The vendor does not need ModifyInstanceAttribute. The customer’s pipeline does, and the customer already has it.

The deny-overlay pattern

The cleanest way to bound a vendor’s role is ReadOnlyAccess plus a small inline allow for the specific actions, plus a deny overlay that protects critical resources from even the allowed actions. The inline policy has three statement blocks:

Statement Sid	Effect	Action set	Resource / condition
`AllowScheduleActions`	Allow	`ec2:StartInstances`, `ec2:StopInstances`, `rds:StartDBInstance`, `rds:StopDBInstance`, `rds:StartDBCluster`, `rds:StopDBCluster`, `eks:UpdateNodegroupConfig`, `autoscaling:UpdateAutoScalingGroup`, `tag:TagResources`, `tag:UntagResources`	`*`
`DenyOnProductionAndCriticalContent`	Deny	`*`	`*`, with condition `aws:ResourceTag/Environment` = `production`
`DenySensitiveReads`	Deny	`s3:GetObject`, `ssm:GetParameter`, `ssm:GetParameters`, `secretsmanager:GetSecretValue`	`*`

Attach arn:aws:iam::aws:policy/ReadOnlyAccess plus this inline policy. The role can read state and take the listed write actions, but it cannot act on anything tagged Environment=production and cannot read secrets or sensitive object content. In IAM policy evaluation, an explicit Deny always overrides an Allow.
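
The three statement blocks above can be sketched as a policy document. This is a minimal sketch built as a Python dict so it can be dumped to JSON; the function name is hypothetical, and the tag condition assumes each targeted service supports the aws:ResourceTag condition key for the actions in question, which is worth verifying per service.

```python
import json

SCHEDULE_ACTIONS = [
    "ec2:StartInstances", "ec2:StopInstances",
    "rds:StartDBInstance", "rds:StopDBInstance",
    "rds:StartDBCluster", "rds:StopDBCluster",
    "eks:UpdateNodegroupConfig",
    "autoscaling:UpdateAutoScalingGroup",
    "tag:TagResources", "tag:UntagResources",
]

SENSITIVE_READS = [
    "s3:GetObject", "ssm:GetParameter", "ssm:GetParameters",
    "secretsmanager:GetSecretValue",
]


def build_overlay_policy(tag_key="Environment", tag_value="production"):
    """Build the inline policy: short allow list plus two explicit denies."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "AllowScheduleActions", "Effect": "Allow",
             "Action": SCHEDULE_ACTIONS, "Resource": "*"},
            # Assumption: the service evaluates aws:ResourceTag for the action.
            {"Sid": "DenyOnProductionAndCriticalContent", "Effect": "Deny",
             "Action": "*", "Resource": "*",
             "Condition": {"StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}}},
            {"Sid": "DenySensitiveReads", "Effect": "Deny",
             "Action": SENSITIVE_READS, "Resource": "*"},
        ],
    }


policy = build_overlay_policy()
print(json.dumps(policy, indent=2))
```

The printed JSON is what would go into the role’s inline policy, alongside the managed ReadOnlyAccess attachment.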

Three things make this defensible in a security review:

The deny on production tags is a hard floor. Even if the vendor’s product has a bug that targets the wrong resource, the production tag prevents action. This is the single most powerful guardrail for a FinOps tool whose blast radius is “non-prod only.” The floor is only as reliable as your tagging, though: a production resource that is missing the tag is not protected.
The action allow list is short and named. The security engineer can read each action and judge it. There is no *:* wildcard.
Sensitive reads are explicitly denied. Even though ReadOnlyAccess includes s3:GetObject, the deny overlay removes it. The vendor cannot exfiltrate object content even by accident.

This is the role shape that gets approved in 20-30 minutes instead of 6 weeks.

The 4-question vendor checklist

Run these four questions on any FinOps vendor’s IAM ask before signing.

Q1: Why do you need each Allow action? Map it to a feature.

The vendor should be able to say “we need ec2:StopInstances for the scheduling feature; tag:TagResources for the auto-tagging feature; iam:GetRole for the IAM-context feature.” Every action maps to a named feature. If the vendor cannot map an action to a feature, ask them to remove it.

A bad answer: “we need ec2:* because some advanced features may require additional permissions.” A good answer: “we need exactly these 8 actions; here is which feature uses each.”
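
Q1 can be run as a simple audit: take the vendor’s requested actions and the feature each one is supposed to serve, then list whatever is a wildcard or has no feature behind it. The helper and the feature names below are illustrative assumptions.

```python
def unjustified_actions(requested_actions, feature_map):
    """Return {action: finding} for wildcards and actions with no named feature."""
    findings = {}
    for action in requested_actions:
        if "*" in action:
            findings[action] = "wildcard: ask for the explicit action list"
        elif not feature_map.get(action):
            findings[action] = "no feature named: ask the vendor to remove it"
    return findings


# Hypothetical vendor answers to "which feature uses this action?"
feature_map = {
    "ec2:StopInstances": "scheduling",
    "tag:TagResources": "auto-tagging",
    "iam:GetRole": "IAM context",
}
requested = ["ec2:StopInstances", "tag:TagResources", "iam:GetRole",
             "ec2:*", "s3:DeleteObject"]

findings = unjustified_actions(requested, feature_map)
for action, finding in findings.items():
    print(f"{action}: {finding}")
```

An empty findings dict corresponds to the “good answer”: every action is explicit and mapped to a feature.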

Q2: Will the role work with a tag-based deny on production resources?

The vendor’s product should be able to operate on a subset of your fleet (non-prod only). If the product breaks when production resources are denied, the vendor is doing something on your prod resources you may not want.

A bad answer: “the deny will break our discovery.” A good answer: “yes, we already support tag-based scoping; here is the feature flag.”

Q3: Where does the vendor’s IAM role live, and does it cross account boundaries?

The role should live in YOUR account and be assumed by the vendor’s AWS principal via cross-account trust. The trust policy should require an external ID (a unique value per customer, checked with the sts:ExternalId condition key) to prevent confused-deputy attacks.

A bad answer: “we manage the role in our account; you give us API keys.” A good answer: “you create the role in your account, we provide the trust policy template, the external ID is <unique-string-per-tenant>.”
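
A trust policy of the kind the good answer describes might look like the sketch below. The vendor principal ARN and the external ID are placeholders; the real values come from the vendor’s onboarding material.

```python
import json


def build_trust_policy(vendor_principal_arn, external_id):
    """Trust policy for a role in the customer's account, assumable by the vendor."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": vendor_principal_arn},
            "Action": "sts:AssumeRole",
            # The external ID must match, which guards against a confused deputy.
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }


# Placeholder values for illustration only.
trust_policy = build_trust_policy(
    "arn:aws:iam::111122223333:role/example-vendor-role",
    "example-external-id-per-tenant",
)
print(json.dumps(trust_policy, indent=2))
```

The customer creates the role with this trust policy and can revoke access at any time by deleting the role or changing the external ID.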

Q4: What does the audit log of the vendor’s actions look like, and where is it stored?

The vendor should provide a per-action audit log, ideally streamable to your CloudWatch Logs / SIEM. Every action they take on your account should be logged with timestamp, action, resource, requesting user (if applicable), and the SaaS-side reason.

A bad answer: “you can see our actions in your CloudTrail.” (Technically true; not enough detail; CloudTrail is a haystack.) A good answer: “we ship a structured audit log per customer; here is a sample line; here is the destination configuration.”
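
For a sense of what a “sample line” could contain, here is a hypothetical record carrying the fields listed above (timestamp, action, resource, requesting user, reason). The schema and field names are assumptions for illustration, not any vendor’s actual format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class AuditRecord:
    timestamp: str               # ISO 8601, UTC
    action: str                  # IAM action the vendor invoked
    resource: str                # ARN of the affected resource
    requested_by: Optional[str]  # user who triggered it, or None if automated
    reason: str                  # SaaS-side explanation for the action


def to_log_line(record):
    """Serialize one record as a single JSON line for a log stream or SIEM."""
    return json.dumps(asdict(record), sort_keys=True)


record = AuditRecord(
    timestamp="2024-01-01T19:00:00Z",
    action="ec2:StopInstances",
    resource="arn:aws:ec2:us-east-1:123456789012:instance/i-0123456789abcdef0",
    requested_by=None,
    reason="schedule weeknights-off matched tag Environment=dev",
)
line = to_log_line(record)
print(line)
```

The reason field is what CloudTrail alone cannot provide: it records why the tool acted, not only that it did.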

The actual minimum scope for common features

For reference, the smallest defensible IAM scope for each common FinOps feature category:

Feature	AWS-managed policy	Inline allow	Inline deny
Visibility / dashboards / reports	`ViewOnlyAccess`	None	`s3:GetObject`, `secretsmanager:Get*`
Cost forensics + recommendations	`ReadOnlyAccess`	None	`s3:GetObject`, `ssm:GetParameter`
Non-prod scheduling (stop/start)	`ReadOnlyAccess`	6-8 specific start/stop actions	Tag-based deny on prod
Auto-tagging	`ReadOnlyAccess`	`tag:TagResources`, `tag:UntagResources`, plus per-service tagging actions for the services tagged (e.g. `ec2:CreateTags`)	Tag-based deny on prod
Rightsizing recommendations	`ReadOnlyAccess`	None (customer applies)	None needed
Closed-loop remediation	`ReadOnlyAccess`	Action allow list per remediation	Tag-based deny + critical-resource deny

A vendor whose feature is “visibility” should request ViewOnlyAccess. A vendor whose feature is “non-prod scheduling” should request ReadOnlyAccess plus 6-8 named start/stop actions. Any vendor request larger than that should be questioned.

This is the shape ZopNight’s scoped read-only IAM follows by default. The customer’s procurement gets a documented role template that maps each permission to a named feature, and the deny on production tags is enabled out of the box.

The closing call

The 6-week procurement stall on FinOps tools is almost always about the IAM ask. The vendor wants headroom. The customer wants a defensible scope. The honest answer is much closer to “read-only” than vendors usually present, and the 4-question checklist surfaces the gap in 20 minutes.

The two non-negotiables:

The role lives in your account. The vendor never holds a long-lived credential, only the temporary session credentials it gets by assuming the role.
The deny on production tags is the floor. Even if the action allow list is too generous, the deny prevents prod blast radius.

Most FinOps tooling can operate at Level 2 (ReadOnlyAccess + a short action allow-list + deny overlay). Vendors that can articulate this scope cleanly are the ones whose security reviews pass quickly. Vendors that cannot are the ones who lose the deal in week six.

Run the 4 questions on your next vendor evaluation. The role you end up with will be one inline policy plus one AWS-managed policy ARN. The review that used to take 6 weeks of back-and-forth becomes a one-meeting decision, and the savings start as soon as the role is in place.