ZopDev MCP Server: 7 Typed Tools for Claude, Cursor, and Codex

By Riya Mittal
Published: May 11, 2026 · 11 min read

A developer asks Claude Code at 2 AM: “this terraform plan is failing admission, fix the bucket so it deploys.” Claude reads the error, generates a slightly different bucket config, runs the plan again, hits the same admission rule, reads the new error, generates another shape, runs again. Three to five rounds of this and either the agent stumbles into a config that passes (often by accident) or gives up and tells the developer to look at the rule manually. Every round burns LLM tokens. None of the rounds taught the agent what the underlying policy actually was.

This is the loop every cloud-aware agent in 2026 runs without integration. The agent sees the failure but cannot see the rule. It sees the resource but cannot see the policy graph. It sees the recommendation but cannot see the drift events that produced it. The fix is not a better agent. The fix is exposing the cloud’s governance state to the agent through typed tools so the agent gets context before it acts, instead of after it fails.

ZopNight v2.0 ships an MCP server endpoint that does exactly this. Claude Code, Cursor, Codex, and any other MCP-aware client can read the live policy graph, resource state, ownership, drift events, exceptions, and audit history through seven typed tools. The agent composes those tools into answers. The human reads the answer.

The piece sits next to the existing work on read-only MCP servers (the foundational pattern), policy-aware MCP governance (how policy data composes with the MCP shape), and read-write MCP failure modes (why writes are gated by capability tier). This is the product post: what shipped, what each tool does, and how to use it.

Cloud agents in 2026 work blind

Compare the same triage task with and without an MCP-aware governance surface.

| Step | Agent without MCP | Agent with ZopNight MCP |
|---|---|---|
| Operator asks "why is this resource broken" | Agent reads the alert / error text | Agent reads the alert plus the policy graph via `list_policies` |
| Agent tries to fix | Generates a plausible config, retries | Calls `check_resource` to see which rule applies, generates the right config first time |
| Failure happens | Reads the error, retries with a shape variation | Reads `violation_history` to see why this fails and what passed historically |
| Operator gets the answer | After 5-15 minutes of agent back-and-forth | After one round-trip with 3-4 parallel tool calls |
| Cost | High token use, low signal-to-noise | Low token use, structured signal |

The token-cost gap is real. An agent burning through 5 to 15 rounds of trial and error on a single admission failure spends 8,000 to 20,000 tokens on orchestration alone, not counting the final correct config. With MCP integration, the same task lands in 1,000 to 3,000 tokens because the agent does not waste rounds on misconfigurations the policy graph would have ruled out.

The latency gap is bigger. Agent back-and-forth at 2 AM is the worst possible UX for incident triage; the on-call engineer is waiting on the LLM and the LLM is waiting on retry timeouts. With MCP, the agent’s first response includes the policy context, the ownership, the recent drift, and a candidate fix.

What ZopNight’s MCP server exposes

The MCP server ships with seven typed tools covering the four kinds of cloud-governance question the agent needs to answer.

| Tool | Returns | Used for |
|---|---|---|
| `list_policies(scope)` | Active policies on the scope, severity, owners | "Which rules apply to this resource" |
| `check_resource(arn, action)` | Allow / deny, reasoning, cited policy | "Should this change be made at all" |
| `resource_ownership(arn)` | Team, cost center, on-call, escalation | Routing questions, paging, ownership-driven approvals |
| `drift_events(scope, window)` | Recent drift detections (deploy, IAM, config) | "What changed lately around this resource" |
| `exception_status(policy, resource)` | Active exception, expiry, approver | "Is the override legitimate or expired" |
| `violation_history(scope)` | Recent denies, by frequency | Pattern detection for the agent's reasoning |
| `resource_topology(arn)` | Dependency graph (upstream and downstream) | "What depends on this resource" |

Each tool has a JSON schema input that the MCP client validates at call time. Each tool returns a structured object the agent can quote, summarise, or feed into its next reasoning step. Each call writes one audit-log line on the server side: agent identity, tool name, parameters, response status, latency.
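As an illustration of the shape, a tool definition for `check_resource` might look like the sketch below. The `name`/`inputSchema` structure follows the MCP tool format; the field names and descriptions inside are assumptions for illustration, not ZopNight's published schema.

```json
{
  "name": "check_resource",
  "description": "Evaluate whether a proposed action on a resource would be allowed",
  "inputSchema": {
    "type": "object",
    "properties": {
      "arn": { "type": "string", "description": "ARN of the resource to evaluate" },
      "action": { "type": "string", "description": "Proposed action to check" }
    },
    "required": ["arn", "action"]
  }
}
```

Because the schema is validated client-side at call time, a malformed call fails before it ever reaches the server, which is part of why the agent wastes fewer rounds.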

The seven tools were not chosen by guessing. Each maps to a question that came up repeatedly in customer incident postmortems and developer-workflow research. The list is small on purpose; the surface should be learnable in an afternoon and explainable on a single slide. Adding tools is easy; removing them once they ship is hard.

How an agent uses the MCP server

A typical incident triage flow with the MCP server wired in:

[Diagram 1: incident-triage flow with the MCP server wired in]

The operator types a natural-language question into Claude Code at 2:47 AM: “What is the context for the payments-prod EKS cluster, and is anything misconfigured or recently changed?”

The agent makes four MCP calls in parallel. Each takes 80 to 250 ms; the wall-clock latency is the longest one. The agent receives four structured objects and composes them into a one-paragraph synthesis:
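A minimal sketch of that fan-out, assuming a hypothetical `call_tool` helper standing in for the real MCP client; the point is that the four reads run concurrently, so wall-clock latency equals the slowest call:

```python
# Illustrative only: fan out four context reads concurrently, as the agent does.
import asyncio

async def call_tool(name: str, **params) -> dict:
    # Stand-in for a real MCP tool invocation; the network round-trip
    # (80-250 ms per call in the text above) would happen here.
    await asyncio.sleep(0)
    return {"tool": name, "params": params}

async def gather_context(arn: str) -> dict:
    results = await asyncio.gather(
        call_tool("resource_ownership", arn=arn),
        call_tool("drift_events", scope=arn, window="24h"),
        call_tool("list_policies", scope=arn),
        call_tool("exception_status", policy="*", resource=arn),
    )
    # The agent would synthesise these four structured objects into prose.
    return {r["tool"]: r for r in results}

context = asyncio.run(gather_context("arn:aws:eks:us-east-1:123:cluster/payments-prod"))
print(sorted(context))
```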

> The payments-prod EKS cluster is owned by team payments-platform, on-call is jen.li. There were three changes in the last 24 hours: a deploy at 01:23 UTC by jen.li (PR d7f3e2), an IAM role update by security-bot at 02:11, and a pod OOM restart at 02:31. Policy POL-RES-007 fired at 02:39: pod payments-api-3 is running without a memory limit, which is the same condition that triggered the OOM restart. There is no active exception. The most likely cause is the deploy at 01:23 changing the resource spec; check PR d7f3e2.

The operator reads the synthesis and goes straight to the PR. Total time from question to actionable context: 90 seconds. Pre-MCP, the same triage flow involves opening four different surfaces (PagerDuty for ownership, CloudTrail for drift, the GRC tool for policies, the runbook for exceptions) and mentally stitching the results.

Median triage time across ZopDev customers using the MCP integration drops 70 to 85% on this class of question. The cost is the seven tool calls; the savings is 12 to 30 minutes of operator attention per incident, plus the cognitive load of not having to remember which surface holds which piece of context.

Read-only is the right default

ZopNight’s MCP server ships read-only as the default surface. The seven tools above all read state and write nothing back to the cloud. Write capability exists (the platform can mutate cloud state through the same gRPC backend), but it lives behind a capability-tier gate that the customer enables per tool and the trust-score work decides per call.

The reasoning lines up with the read-write MCP failure modes work: most of the value an agent produces (faster triage, better context for decisions, runbook generation, recommendation reasoning) comes from reading state. Mutation is a smaller fraction of the agent’s actual workload and a much larger source of incidents. Read-only first; write capability second.

The capability tiers are exposed per MCP tool:

| Tier | Tool examples | Default availability |
|---|---|---|
| Read-only | `list_policies`, `check_resource`, `resource_ownership`, `drift_events`, `exception_status`, `violation_history`, `resource_topology` | Always enabled |
| Mutate-low-blast | Tag-correct, retention-adjust, non-prod-stop | Opt-in per customer, per tool |
| Mutate-high-blast | Production resource changes, cross-region operations | Hard-gated; effectively pages a human for approval |

Customers who want to grant write capability for low-blast operations enable it per tool. The audit log captures every mutation with the agent identity and the operator who authorised the agent's PAT. Customers who want no write capability at all can leave the tier-2 and tier-3 tools disabled; the read-only surface still produces most of the value.

Auth: PATs, per-user, one-click revocation

Authentication to the MCP server uses Personal Access Tokens (zn_pat_*). Each token is per-user, scoped to a single ZopNight organisation, and tied to the same RBAC policy graph that governs the dashboard.

[Diagram 2: PAT validation and RBAC-scoped proxying to the gRPC backend]

The MCP server itself holds no state. It validates the PAT, applies the RBAC scope to the request (the user’s policy scope determines which resources are visible), and proxies to the existing gRPC backend services (Config, Discoverer, Aggregator, Recommender). The same backend that powers the dashboard powers the MCP surface; the data is identical.

Token management is in the user’s Settings page. Generating a new PAT is one click; copying it to the agent’s config is one paste; revoking is one click. Token rotation can be automated through the same surface. The customer does not have to think about cross-account IAM trust policies the way they would for a vendor SaaS integration; the auth model is the same as their personal access tokens for any other tool.

Per-call audit logging is the system of record. Every MCP call (regardless of whether the underlying tool is read or write) writes one line: the agent’s PAT identity, the user the PAT belongs to, the tool name, the input parameters, the response status, the latency, the timestamp. The audit log is queryable from the dashboard; compliance teams can answer “show me every action this agent took in the last 90 days” with one filter.
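As a sketch of what one such line could contain (the field names here are assumptions for illustration, not ZopNight's internal schema):

```python
# Hedged sketch of the one-line-per-call audit record described above.
import json
from dataclasses import dataclass, asdict

@dataclass
class AuditRecord:
    agent_pat_id: str   # PAT identity the agent presented
    user: str           # user the PAT belongs to
    tool: str           # MCP tool name
    params: dict        # input parameters
    status: str         # response status
    latency_ms: int
    timestamp: str      # ISO-8601

record = AuditRecord(
    agent_pat_id="zn_pat_redacted",
    user="jen.li",
    tool="drift_events",
    params={"scope": "payments-prod", "window": "24h"},
    status="ok",
    latency_ms=142,
    timestamp="2026-05-11T02:47:03Z",
)
# One JSON line per call is what makes "every action this agent took
# in the last 90 days" a single filter over the log.
print(json.dumps(asdict(record)))
```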

Composition with auto-remediation

The MCP server composes naturally with auto-remediation. The agent can read the policy graph (via list_policies and violation_history), identify a recommendation that fits the customer’s stated intent, and propose the Remediate action. The customer still clicks the button; the agent’s job is context-gathering and proposal, not unilateral action.

A typical interaction:

| Step | Agent action | Cloud effect |
|---|---|---|
| 1 | Customer asks "why is my dev bill high this week" | Nothing yet |
| 2 | Agent calls `list_policies(scope=dev-account)` and `violation_history(scope=dev-account, window=7d)` | Reads only |
| 3 | Agent identifies 12 open idle-EC2 recommendations with combined savings of $1,840/month | Reads only |
| 4 | Agent surfaces the recommendations with a one-paragraph summary | Reads only |
| 5 | Customer clicks Remediate on 8 of the 12 | Auto-remediation path runs |
| 6 | Each remediation goes through precondition → action → validation | Cloud state changes |
| 7 | Agent reports the result back to the customer | Reads only |

The agent does not click Remediate on the customer's behalf. The capability-tier model enforces this even if the agent tries to call a tier-2 tool directly: the read-only PAT is rejected at the gateway. The pattern is "agent reads, agent recommends, customer authorises."
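The gate itself reduces to a simple check. This Python sketch uses assumed tier and tool names to show the shape; it is not ZopNight's real gateway code:

```python
# Sketch of the gateway-side capability check described above.
# Tool and tier names are assumptions for illustration.
READ_ONLY_TOOLS = {
    "list_policies", "check_resource", "resource_ownership", "drift_events",
    "exception_status", "violation_history", "resource_topology",
}

def gate(pat_tier: str, tool: str) -> bool:
    """Return True if this PAT may invoke this tool."""
    if tool in READ_ONLY_TOOLS:
        return True  # tier-1 reads are always enabled
    # Any mutating tool requires an explicitly opted-in tier on the PAT.
    return pat_tier in {"mutate-low-blast", "mutate-high-blast"}

print(gate("read-only", "tag_correct"))  # a read-only PAT cannot mutate
```

The check runs before the request reaches the backend, so a misbehaving or confused agent fails fast rather than partially executing.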

For customers who want the agent to be more autonomous, the capability-tier model has the opt-in path: enable tier-2 tools per agent, set the trust-score threshold, let the agent execute on safe rules without human approval. Most customers do not need this; the read + propose pattern handles the majority of the agent’s actual usage.

How to enable and use the MCP server

Setup is three steps.

| Step | Where | Result |
|---|---|---|
| 1. Enable MCP for your org | Settings → Integrations → MCP | Org-level toggle |
| 2. Generate a PAT | Settings → Personal Access Tokens → New | `zn_pat_*` token |
| 3. Install the MCP config in your agent | Claude Code: `~/.config/claude/mcp.json`; Cursor: Settings → MCP; Codex: similar | Agent can call the seven tools |
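For Claude Code, a minimal `mcp.json` entry could look like the following sketch. The server URL and header name here are assumptions, not documented values; use whatever your Settings page shows after enabling the integration.

```json
{
  "mcpServers": {
    "zopnight": {
      "url": "https://your-org.zop.dev/mcp",
      "headers": {
        "Authorization": "Bearer zn_pat_your_token_here"
      }
    }
  }
}
```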

After step 3, ask the agent a natural-language question that requires cloud context. Typical first prompts:

| Prompt | What the agent does |
|---|---|
| "What is the context for [resource ARN]?" | Calls `resource_ownership` + `drift_events` + `list_policies` + `exception_status` in parallel |
| "Which policies apply to my prod accounts?" | Calls `list_policies(scope=prod-*)` and summarises |
| "What changed in the last 24h on the recommendation engine cluster?" | Calls `drift_events(scope=rec-engine, window=24h)` |
| "Why is my dev bill high this week?" | Calls `violation_history`, open recommendations, and cost data |

The agent’s first response includes the tool calls it made (visible in the agent UI) so the operator can audit the reasoning. The audit log on ZopNight’s side captures the same calls for the security and compliance teams.

Most customers see useful agent answers within the first session. The pattern that produces the best results is to ask focused questions about specific resources rather than open-ended "tell me about my cloud" prompts; the seven tools are sharp on specific scopes, and the agent's synthesis quality is highest when the input scope is small.

What’s next for the MCP surface

The seven tools cover the highest-volume context questions. Future surface additions follow customer signal rather than speculation.

| Coming work | What it adds |
|---|---|
| Per-org tool subset | Customers can disable specific tools (e.g., hide `violation_history`) for compliance reasons |
| Streaming MCP responses | Long-running calls (e.g., topology of a 5,000-resource account) stream incrementally |
| Saved agent workflows | Operators can save "context for this resource" as a one-click workflow callable from the dashboard |
| Bidirectional events | The MCP server pushes notifications to the agent (e.g., a new policy violation just landed) |

The read-only surface is the foundation. Everything else layers on top without changing the existing seven tools.

If you have Claude Code, Cursor, or Codex installed and a ZopNight account connected, the MCP integration is a 5-minute setup. Generate a PAT, drop the config into your agent, ask the first context question. The synthesis you get back is the same data the dashboard has, in the shape the agent can act on. That is the difference between an agent that retries blindly and one that reads the policy graph first.

Written by Riya Mittal, Engineer at Zop.Dev
