Resource Ownership Derived From Activity Logs: Backfilling the Tag Your Team Never Set

A FinOps lead opens the cost report. A team-aggregate row shows $14,000 a month attributed to no team. The drill-down reveals 320 EC2 instances and 47 RDS databases with no owner tag. The lead opens a ticket asking “whose resources are these?” The ticket sits open for three weeks while six engineers each say “not mine.” At week four the team gives up and writes off the spend as platform overhead. The 320 instances continue running.

This is the most common cloud governance failure. The owner tag is the foundational input to chargeback, to right-sizing approvals, to scheduling decisions, and to incident routing. Despite explicit policy that every resource must carry an owner tag, the tag is missing on 30 to 60% of resources in a typical organisation. The cloud’s own metadata has the answer: every RunInstances or CreateBucket call captured by CloudTrail records the IAM principal who made the call. The information has always been there. The friction is that reading it for one resource takes 20 minutes (find the event, decode the principal ARN, resolve assumed-role chains, look up the human in IAM), and reading it for 1,800 untagged resources is a multi-week project nobody assigns.

ZopNight ships activity-log-derived ownership. CloudTrail, GCP Audit Logs, and Azure Activity Log are read on a schedule. For each resource the creation event is parsed, the IAM principal extracted, and the principal resolved to a human (or to a CI/CD pipeline, flagged distinctly). The derived owner appears as a column on every resource page. Companies that flip the feature on see 80 to 90% of their previously-untagged resources gain an owner overnight.

This post walks through what the derivation does, why tag vs derived owner is the right reconciliation shape, how IAM principals get resolved to humans across assumed-role chains, what happens to legacy resources older than the activity log retention window, and how the auto-tagging pipeline closes the loop.

Why “who owns this” is the most-asked unanswered question

The fallback path for finding ownership of an untagged resource is well-documented. The path exists in every cloud’s documentation. The path is almost never taken.

Step in the manual path	Time per resource	Failure mode
Find the CloudTrail event for resource creation	4-7 minutes	Event outside default 90-day retention; resource older than log
Decode the IAM principal ARN	1-2 minutes	Assumed-role chain not obvious from the ARN
Resolve assumed role to the actual user	5-10 minutes	Multi-hop chains, expired sessions
Map IAM user to a human (Slack, HR, etc.)	3-5 minutes	User no longer at the company
Document the finding	2-3 minutes	Goes into a spreadsheet nobody reads later

The 20-minute exercise on its own is not the problem. The problem is that the exercise has to be done for hundreds or thousands of resources, by an engineer who does not own the work and is not motivated to finish it. The result is that the answer to “who owns this” is institutionally unknown, even though it lives in a log file the cloud writes for free.
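As a rough check on the scale, the per-step ranges in the table can be summed and multiplied out. The sketch below uses the 1,800-resource figure quoted earlier; the function name is illustrative only.

```python
# Per-step time ranges in minutes, copied from the table above.
STEP_MINUTES = {
    "find creation event": (4, 7),
    "decode principal ARN": (1, 2),
    "resolve assumed role": (5, 10),
    "map IAM user to human": (3, 5),
    "document the finding": (2, 3),
}


def manual_effort_hours(resource_count: int) -> tuple:
    """Return (low, high) total engineer-hours for a manual ownership dig."""
    low = sum(lo for lo, _ in STEP_MINUTES.values())    # 15 minutes per resource
    high = sum(hi for _, hi in STEP_MINUTES.values())   # 27 minutes per resource
    return resource_count * low / 60, resource_count * high / 60


low_h, high_h = manual_effort_hours(1800)
print(f"{low_h:.0f} to {high_h:.0f} engineer-hours")  # 450 to 810 engineer-hours
```

At 15 to 27 minutes per resource, 1,800 untagged resources come to roughly 450 to 810 engineer-hours, which is why the dig rarely gets assigned.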

Automating the read changes the economics. The 20 minutes becomes a millisecond per resource because the lookup is batched. The accuracy improves because the parsing handles edge cases (assumed-role chains, federated identities, service accounts) the human dig usually gets wrong. The output is a column, not a spreadsheet.

What activity-log ownership actually derives

The derivation pipeline reads activity logs on a schedule (every hour for new events, with a one-time backfill on initial enablement). The relevant event types are the resource-creation events for each tracked resource kind: RunInstances for EC2, CreateDBInstance for RDS, CreateBucket for S3, and the equivalents in GCP Audit Logs and Azure Activity Log.

For each parsed event the pipeline records three things: the resource ID, the timestamp of creation, and the resolved owner. The owner is either a human (with a display name, an email, and a link to the IAM principal) or a tagged automation entity (with the pipeline name and a link back to the configured automation identity).
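A minimal sketch of that parsing step, for AWS only, might look like the following. It is not ZopNight's implementation. The `CreationRecord` type, the extractor table, and the sample event are hypothetical, and the JSON field paths are assumptions based on commonly seen CloudTrail records that should be checked against real events. Resolution of the principal to a human is deliberately left out here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CreationRecord:
    resource_id: str
    created_at: datetime
    principal_arn: str  # unresolved; mapping to a human is a later step


def _ec2_instance_ids(event: dict) -> list:
    response = event.get("responseElements") or {}
    items = (response.get("instancesSet") or {}).get("items") or []
    return [item["instanceId"] for item in items]


def _request_param(name: str):
    def extract(event: dict) -> list:
        value = (event.get("requestParameters") or {}).get(name)
        return [value] if value else []
    return extract


# Assumed field paths; verify against your own CloudTrail records.
EXTRACTORS = {
    "RunInstances": _ec2_instance_ids,
    "CreateDBInstance": _request_param("dBInstanceIdentifier"),
    "CreateBucket": _request_param("bucketName"),
}


def parse_creation_event(event: dict) -> list:
    """Return one CreationRecord per resource created by this event."""
    extractor = EXTRACTORS.get(event.get("eventName"))
    if extractor is None or event.get("errorCode"):
        return []  # not a tracked creation call, or the call failed
    created_at = datetime.strptime(
        event["eventTime"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    arn = (event.get("userIdentity") or {}).get("arn", "unknown")
    return [CreationRecord(rid, created_at, arn) for rid in extractor(event)]


# Hypothetical, heavily trimmed event for illustration.
sample = {
    "eventName": "RunInstances",
    "eventTime": "2024-03-01T12:00:00Z",
    "userIdentity": {"arn": "arn:aws:iam::123:user/alice"},
    "responseElements": {
        "instancesSet": {"items": [{"instanceId": "i-0aaa"}, {"instanceId": "i-0bbb"}]}
    },
}
records = parse_creation_event(sample)
```

One `RunInstances` call can launch several instances, so the parser returns a list instead of a single record.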

The ingest is incremental: after the initial backfill, only new events are processed. The backfill itself takes 10 minutes to a few hours depending on the retention window and the activity log volume; for most customers it completes in under an hour.
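One way to picture incremental ingest is a high-water mark plus a set of already-seen event IDs, so that overlapping hourly reads do not process an event twice. The sketch below is an assumption about how such a loop could be written, not a description of the product's internals; `IngestState` and `ingest` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class IngestState:
    # ISO-8601 UTC timestamp of the newest event processed so far.
    # Fixed-format UTC strings compare correctly as plain strings.
    high_water_mark: str = ""
    # A real pipeline would prune this set; it is unbounded in this sketch.
    seen_event_ids: set = field(default_factory=set)


def ingest(events: list, state: IngestState, handle) -> int:
    """Process only events not handled before; return how many were handled."""
    processed = 0
    for event in sorted(events, key=lambda e: e["eventTime"]):
        if event["eventTime"] < state.high_water_mark:
            continue  # older than anything still of interest
        if event["eventID"] in state.seen_event_ids:
            continue  # overlap between two scheduled reads
        handle(event)
        state.seen_event_ids.add(event["eventID"])
        state.high_water_mark = event["eventTime"]
        processed += 1
    return processed


state = IngestState()
handled = []
backfill = [
    {"eventID": "e1", "eventTime": "2024-03-01T10:00:00Z"},
    {"eventID": "e2", "eventTime": "2024-03-01T11:00:00Z"},
]
hourly = [
    {"eventID": "e2", "eventTime": "2024-03-01T11:00:00Z"},  # overlap
    {"eventID": "e3", "eventTime": "2024-03-01T11:30:00Z"},
]
first_run = ingest(backfill, state, handled.append)
second_run = ingest(hourly, state, handled.append)
```

The first call plays the role of the backfill; the second handles only the one event it has not seen.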

Tag vs derived owner: the reconciliation that matters

The owner tag and the derived owner are different signals. The tag is what the team claims; the derived owner is what the cloud’s own logs show. The product surfaces both as separate columns and reconciles them on a per-resource basis.

Tag	Derived owner	Reconciliation outcome
`alice@company.com`	`alice@company.com`	Match: nothing to do
`alice@company.com`	`bob@company.com`	Disagree: governance review
missing	`alice@company.com`	Tag missing: auto-tag candidate
`alice@company.com`	missing (resource older than retention)	Trust the tag; flag low-confidence
missing	missing	Unrecoverable: needs human escalation
`team-payments`	`alice@company.com`	Team-tag vs personal: usually OK if alice is on team-payments

The “disagree” outcome is the most interesting. The most common cause is a transfer: one engineer created the resource, another team took it over six months ago, and the tag was updated, but the creation event in the activity log still names the original engineer. This is the resource that needs governance review: not because the tag is wrong, but because the audit trail is now inconsistent with the team’s claim.

The “tag missing, derived owner present” outcome is the most numerous. These are the resources that gain an owner overnight when the feature is enabled. The governance review for these is typically a one-time bulk action: confirm the IAM-to-human mapping is correct for a given principal, then auto-tag every resource that principal created.
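The reconciliation table can be read as a small decision function. The sketch below is one possible encoding under stated assumptions: `None` stands for a missing value, and `team_members` is a hypothetical mapping from team tags to member emails that the source does not describe in detail.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    MATCH = "match: nothing to do"
    DISAGREE = "disagree: governance review"
    TAG_MISSING = "tag missing: auto-tag candidate"
    TRUST_TAG = "trust the tag; flag low-confidence"
    UNRECOVERABLE = "unrecoverable: needs human escalation"
    TEAM_TAG_OK = "team tag covers the derived owner"


def reconcile(
    tag: Optional[str],
    derived: Optional[str],
    team_members: Optional[dict] = None,
) -> Outcome:
    """Compare the owner tag with the log-derived owner for one resource."""
    team_members = team_members or {}
    if tag is None and derived is None:
        return Outcome.UNRECOVERABLE
    if tag is None:
        return Outcome.TAG_MISSING
    if derived is None:
        return Outcome.TRUST_TAG  # e.g. resource older than log retention
    if tag.lower() == derived.lower():
        return Outcome.MATCH
    if derived.lower() in team_members.get(tag, set()):
        return Outcome.TEAM_TAG_OK
    return Outcome.DISAGREE


teams = {"team-payments": {"alice@company.com"}}
print(reconcile(None, "alice@company.com").value)
print(reconcile("team-payments", "alice@company.com", teams).value)
```

Checking the missing cases first keeps the comparison logic simple: by the time the two values are compared, both are known to exist.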

Resolving IAM principals to humans

The IAM principal in the activity log event is not always a human. Four cases dominate the resolution logic.

Case	Principal in the log	Resolution
Direct IAM user	`arn:aws:iam::123:user/alice`	Map `alice` to `alice@company.com` via the IAM user metadata
Federated user / SSO	`arn:aws:sts::123:assumed-role/sso-developer/alice@company.com`	Extract email from session name
Role chain	`arn:aws:sts::123:assumed-role/team-engineer/i-0123abc`	Follow `sourceIdentity` and `sessionContext.sessionIssuer` up to 4 hops
Automation (CI/CD)	`arn:aws:sts::123:assumed-role/github-actions/run-12345`	Flag as `automation: github-actions:repo/<repo>`

The role-chain case is the one most manual-lookup attempts get wrong. When an engineer such as Alice assumes a role to provision a resource in a different account, the CloudTrail event records the role name as the principal, not Alice. The fix is to follow the sourceIdentity field and the sessionContext.sessionIssuer.userName field back through the chain. ZopNight’s resolver does this automatically; the manual-lookup engineer usually gives up after one hop.

The automation case matters because mis-resolving a service account to a human (because the role has a name like provisioner-role and an engineer once used that role) is the failure mode that erodes trust in the derived ownership. The resolver explicitly identifies known automation patterns (CI/CD runners, scheduled Lambda functions, autoscaling group provisioners) and tags those resources with the pipeline name, not a human owner. The team can then look up the pipeline’s owning team via a configured mapping.

When the resolver cannot determine whether a principal is a human or automation (rare, but happens for custom role naming conventions), the principal appears as principal:<role-name> and the governance team is prompted to classify it once. The classification persists; the next resource created by that principal resolves automatically.
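The four cases and the unclassified fallback can be sketched as a single classifier over CloudTrail's `userIdentity` element. This is a simplified illustration, not ZopNight's resolver: it handles only one hop via `sourceIdentity`, and the `IAM_USER_EMAILS` and `AUTOMATION_ROLES` lookups are hypothetical stand-ins for configured mappings.

```python
import re

ASSUMED_ROLE = re.compile(r"^arn:aws:sts::\d+:assumed-role/([^/]+)/(.+)$")
IAM_USER = re.compile(r"^arn:aws:iam::\d+:user/(.+)$")
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical configuration for the example.
IAM_USER_EMAILS = {"alice": "alice@company.com"}
AUTOMATION_ROLES = {"github-actions"}


def resolve_principal(user_identity: dict, iam_user_emails: dict, automation_roles: set) -> tuple:
    """Return (kind, value), where kind is human, automation, or unclassified."""
    arn = user_identity.get("arn", "")
    user = IAM_USER.match(arn)
    if user:
        email = iam_user_emails.get(user.group(1))
        return ("human", email) if email else ("unclassified", f"principal:{user.group(1)}")
    role = ASSUMED_ROLE.match(arn)
    if not role:
        return ("unclassified", f"principal:{arn}")
    role_name, session_name = role.groups()
    # Check automation first so a pipeline is never attributed to a person.
    if role_name in automation_roles:
        return ("automation", role_name)
    if EMAIL.match(session_name):
        return ("human", session_name)  # federated / SSO session name
    # Single hop only: a real resolver would keep following the chain.
    source = (user_identity.get("sessionContext") or {}).get("sourceIdentity", "")
    if EMAIL.match(source):
        return ("human", source)
    if source in iam_user_emails:
        return ("human", iam_user_emails[source])
    return ("unclassified", f"principal:{role_name}")


chained = {
    "arn": "arn:aws:sts::123:assumed-role/team-engineer/i-0123abc",
    "sessionContext": {"sourceIdentity": "alice"},
}
print(resolve_principal(chained, IAM_USER_EMAILS, AUTOMATION_ROLES))
```

The automation check runs before any human match on purpose, since attributing a pipeline's resources to a person is the error the text identifies as most damaging to trust.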

Backfilling legacy resources

The most-asked question after “what’s the feature” is “what does it do for resources we already have.” The answer depends on the activity log retention window.

Retention window	Resources older than window	Coverage rate
Default CloudTrail (90 days)	~30-50% of resources unrecoverable	50-70%
365 days	~5-10% unrecoverable	90-95%
Unlimited (S3-archived logs)	~0% unrecoverable	99%+
CloudTrail Lake (configurable)	0% if Lake covers all creates	99%+

A customer running default CloudTrail with the standard 90-day retention will see ownership derived for 50-70% of their untagged resources on initial enablement. For most teams that is a transformational improvement: from “30% of cost is attributed” to “85% of cost is attributed” overnight. The remaining 30-50% of untagged resources still need a manual call.

The recommendation surfaced inside the ownership page is to extend CloudTrail retention to 365 days (or longer). The cost is a marginal increase in S3 storage (CloudTrail logs at 365 days for a mid-size company cost on the order of hundreds of dollars a month, not thousands). The value is the ability to derive ownership for 90-95% of untagged resources, including resources that survive multiple ownership transfers.

For resources older than the log retention window, the derived ownership column simply shows “unknown — older than CloudTrail retention.” These resources need to be manually owned by their team, or to be cleaned up if no team claims them. The product is honest about its limits: the cloud cannot remember what it did not log.
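The dependence on retention can be made concrete with a small check: a resource's owner is derivable only if its creation date falls inside the log window. The sketch below uses made-up resource ages purely to show the mechanics; the numbers are not customer data, and the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def derivable(created_at: datetime, now: datetime, retention_days: int) -> bool:
    """True if the creation event should still be inside the log window."""
    return now - created_at <= timedelta(days=retention_days)


def coverage(created_dates: list, now: datetime, retention_days: int) -> float:
    """Fraction of resources whose creation event is still retained."""
    if not created_dates:
        return 0.0
    hits = sum(derivable(c, now, retention_days) for c in created_dates)
    return hits / len(created_dates)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
# Hypothetical fleet: resources created 10, 80, 200, and 400 days ago.
fleet = [now - timedelta(days=age) for age in (10, 80, 200, 400)]

print(coverage(fleet, now, 90))   # 0.5
print(coverage(fleet, now, 365))  # 0.75
```

Running the same check against a real inventory before enabling the feature gives a rough estimate of how many resources will land in the "unknown" bucket.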

Auto-tag on reconciliation: the destination

The derived ownership data is most valuable once it is fed back into the chargeback tag schema. The auto-tag flow is the destination.

Step	What happens
1	Derived ownership is computed for every untagged resource
2	Governance team reviews IAM-to-human mappings for each unique principal (one-time)
3	Team selects “auto-tag all resources owned by principal X with `owner = X@company.com`”
4	Tags are applied in bulk, with audit-trail entries showing the source (CloudTrail event abc, principal xyz)
5	Cost Reports, chargeback, and auto-remediation rules start seeing the resource as owned

The one-time governance review per principal is the safety gate. The reviewer confirms that “principal arn:…alice maps to alice@company.com” before any auto-tag is applied. This catches the cases where an IAM user was provisioned for a service account, where a person left the company and their IAM principal now refers to nobody, or where the IAM mapping changed mid-employment.

Once the review is done, the bulk tagging is one click. A typical mid-size team gains 200 to 800 newly-tagged resources in the first round. The next round (the residual disagree-with-tag set) is smaller and more nuanced; teams handle those one at a time.
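The review gate and the bulk step can be sketched as a function that turns derived records into a tagging plan, skipping any principal that has not been approved. This is an illustrative sketch only: the record fields, the `approved` mapping, and the plan shape are assumptions, and no cloud API is called.

```python
def build_tag_plan(derived: list, approved: dict) -> tuple:
    """Return (plan, skipped): tag operations for reviewed principals only."""
    plan, skipped = [], []
    for record in derived:
        owner = approved.get(record["principal"])
        if owner is None:
            skipped.append(record["resource_id"])  # principal not yet reviewed
            continue
        plan.append({
            "resource_id": record["resource_id"],
            "tags": {"owner": owner},
            # Audit entry: where the ownership claim came from.
            "audit": {
                "source_event": record["event_id"],
                "principal": record["principal"],
            },
        })
    return plan, skipped


# Hypothetical derived records and one reviewed mapping.
derived = [
    {"resource_id": "i-0aaa", "principal": "arn:aws:iam::123:user/alice", "event_id": "evt-1"},
    {"resource_id": "i-0bbb", "principal": "arn:aws:iam::123:user/alice", "event_id": "evt-2"},
    {"resource_id": "db-01", "principal": "arn:aws:iam::123:user/carol", "event_id": "evt-3"},
]
approved = {"arn:aws:iam::123:user/alice": "alice@company.com"}

plan, skipped = build_tag_plan(derived, approved)
print(len(plan), "to tag;", len(skipped), "awaiting review")
```

Keeping the plan separate from its execution means the reviewer can inspect exactly what will be tagged, and why, before anything is written.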

How to use it day to day

The day-one workflow is short.

Step	Action	Where
1	Enable activity-log ownership	Settings → Governance → Activity log ownership
2	Wait for backfill (typically 10 min - 1 hour)	Status page
3	Open the ownership page	Sidebar → Governance → Ownership
4	Filter to “tag missing, derived owner present”	Filter chip on the ownership page
5	Group by principal	Group-by selector
6	Review each principal’s IAM-to-human mapping	One-time per principal
7	Bulk auto-tag confirmed principals	Per-principal button

For ongoing operation the workflow is even shorter. New resources get derived ownership at creation time (the activity log ingest runs hourly). The ownership page surfaces “disagreements” between tag and derived owner each week, and the governance team handles them in a 10-minute review.

What ZopNight does not yet ship: ownership inference for resources created by Terraform / Pulumi / Crossplane (where the principal is the CI/CD service account and the actual human is in the pull request description), automatic ownership migration when a person leaves the company, and cross-account assume-role chains spanning more than 4 hops. Each is a future direction; the current deliverable is the foundational derivation pipeline plus the auto-tag closing of the loop.

The owner tag was always going to be partial. Engineers will forget; policies will be applied unevenly; ownership will transfer faster than the tag catches up. The cloud’s activity log, within its retention window, was always going to be complete. The work is to bridge the two, and to surface the disagreements where they actually matter. Derive once. Reconcile weekly. Auto-tag the easy cases. Escalate the ambiguous ones. That is what the pipeline is for.