Anthropic's Code Execution Release Makes MCP Hubs Smarter

If you've ever watched an agent spend more tokens loading tool manifests than actually doing useful work, you know the pain. Every MCP server you add is another JSON schema the model has to digest. And the more tools you pile on, the more your context window looks like a crowded subway at rush hour.
Anthropic just fixed that. Sort of.
The upgrade you've been waiting for
What they shipped is elegantly simple: agents can now generate and run code against MCP servers, complete with persistent files and reusable "skills." Instead of juggling dozens of tool schemas in-context, the agent treats tools like importable functions—pull only what it needs, filter the results before they hit the conversation.
If you read my earlier post on why I built a three-tool MCP hub, you already know the pain this solves. If you haven't, here's the TL;DR: the more plugins you bolt on, the worse the latency, and the harder it becomes to experiment with smaller local models that already run tight on tokens.
[!note] This is part of a series on MCP tooling. Read Building an MCP Hub That Doesn't Drown Your Agent first if you want the full context.
What Anthropic actually shipped
The new workflow lets an agent write a script that calls MCP tools as if they were filesystem resources. That script executes in a sandboxed environment, so loops, conditionals, and retries happen in code instead of across multiple LLM messages.
Think about that for a second. Instead of:
- "Use the tool"
- "Now use it again with different params"
- "Oops, hit rate limit, try again"
You get:
- Agent writes Python script
- Script calls MCP tools
- Returns a tidy summary
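A minimal sketch of that loop-in-code pattern. Everything here is hypothetical: `invoke_tool` is a stub standing in for a real MCP client session, and the transient "rate limited" failure is simulated so the retry path runs.

```python
import time

# Hypothetical stand-in for an MCP tool call; a real script would go
# through an MCP client session instead of this stub. The first call
# fails on purpose to simulate the "oops, hit rate limit" turn.
def invoke_tool(name, params, _attempts={"n": 0}):
    _attempts["n"] += 1
    if _attempts["n"] == 1:
        raise RuntimeError("rate limited")
    return {"items": list(range(params["limit"]))}

def call_with_retry(name, params, retries=3, backoff=0.01):
    """Retries happen here, in code, not across LLM messages."""
    for attempt in range(retries):
        try:
            return invoke_tool(name, params)
        except RuntimeError:
            time.sleep(backoff * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"{name} failed after {retries} attempts")

result = call_with_retry("search_issues", {"limit": 5})
print(len(result["items"]))
```

The retry loop that used to cost two or three LLM round trips now costs zero tokens.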
Tool definitions become on-demand imports rather than manifest dumps. The agent can stash intermediate data to disk or upgrade it into a named Skill for future runs. And here's the kicker: massive tool outputs can be filtered, aggregated, or summarized before they ever touch the model's context window.
That's the part that makes me unreasonably excited.
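Here's what that filtering step looks like in practice. The `raw` payload below is a fabricated ten-thousand-row tool result; only the summary dict would ever be handed back to the model.

```python
import json

# Hypothetical oversized tool output: 10,000 rows that would flood
# the context window if returned verbatim.
raw = {"rows": [{"id": i, "status": "open" if i % 3 else "closed"}
                for i in range(10_000)]}

# Filter and aggregate in code; the model only ever sees the summary.
open_rows = [r for r in raw["rows"] if r["status"] == "open"]
summary = {
    "total": len(raw["rows"]),
    "open": len(open_rows),
    "sample_ids": [r["id"] for r in open_rows[:3]],
}
print(json.dumps(summary))
```

A megabyte of JSON collapses into a few dozen tokens, and the agent keeps its reasoning budget.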
Why this pairs perfectly with an MCP hub
My hub already funnels tool discovery through search_tools, describe_tool, and invoke_tool—agents only fetch schemas for tools they actually use. Anthropic's execution layer pushes that idea further by keeping tool chatter outside the LLM altogether.
The hub remains the registry and permission gate. But now the heavy lifting happens in code:
- Call search_tools to find candidates
- Grab the schema for what you need
- Write a quick transform
- Let the script return a tiny summary instead of a megabyte of JSON
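The steps above can be sketched end to end. The in-memory `REGISTRY` and its lambda handler are stand-ins; the real hub exposes search_tools, describe_tool, and invoke_tool over MCP rather than as local functions.

```python
# Hypothetical in-memory hub registry. In the real hub these three
# functions are MCP tools, not Python calls.
REGISTRY = {
    "github.list_issues": {
        "description": "List open issues in a repo",
        "schema": {"repo": "string"},
        "fn": lambda params: [{"title": f"bug #{i}"} for i in range(250)],
    },
}

def search_tools(query):
    return [name for name, t in REGISTRY.items()
            if query in t["description"].lower()]

def describe_tool(name):
    return REGISTRY[name]["schema"]

def invoke_tool(name, params):
    return REGISTRY[name]["fn"](params)

candidates = search_tools("issues")            # 1. find candidates
schema = describe_tool(candidates[0])          # 2. fetch just this schema
issues = invoke_tool(candidates[0], {"repo": "demo"})  # 3. invoke
# 4. return a tiny summary, not the full payload
print(f"{len(issues)} issues, returning {min(3, len(issues))} titles")
```

Only one schema ever gets fetched, and only the summary line would cross back into the conversation.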
Less context burned. More reasoning budget saved for the actual task. It's the difference between bringing the entire toolbox to the bench "just in case" versus grabbing the one wrench you actually need.
Skills are the missing legos
The part I'm most excited about is the Skills concept. Once the agent writes a useful routine—say, reconcile GitHub issues with Notion docs—it can save that snippet and reuse it like any other tool.
Combine that with the hub's provider modules and you suddenly have a virtuous loop: new API integrations still register inside the hub, but the workflows evolve in code that's shareable, reviewable, and versioned.
You could, theoretically:
- Write a skill that searches your Obsidian vault
- Save it as "vault-search"
- Call it from any agent, forever
That's the dream, anyway.
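To make the idea concrete, here is one way a "vault-search" skill could be persisted and re-imported. The file layout and function name are my own invention, not Anthropic's Skills format; the point is just that a saved routine is ordinary code on disk.

```python
import importlib.util
import pathlib
import tempfile

# Hypothetical skill source. A real Skill would live in the agent's
# persistent workspace, not a temp directory.
SKILL = '''\
def vault_search(notes, term):
    """Return titles of notes whose body contains the term."""
    return [title for title, body in notes.items() if term in body]
'''

skills_dir = pathlib.Path(tempfile.mkdtemp())
path = skills_dir / "vault_search.py"
path.write_text(SKILL)

# Later run: reload the saved skill and call it like any other tool.
spec = importlib.util.spec_from_file_location("vault_search", path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

notes = {"mcp-hub": "three-tool hub design", "groceries": "milk, eggs"}
print(mod.vault_search(notes, "hub"))
```

Because the skill is just a file, it's diffable, reviewable, and versionable like the rest of your code.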
What to watch next
Running arbitrary code means we need better sandboxing, observability, and policy enforcement. Metrics from the hub become even more valuable when scripts can fan out across multiple providers.
I'm already experimenting with:
- Semantic search inside
search_toolsso scripts locate the right capability faster - Per-agent allowlists so a rogue skill can't nuke production data
- Better error handling so one failed tool call doesn't cascade into a debugging nightmare
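The allowlist idea is the simplest of the three to sketch. The agent names and `authorize` gate below are hypothetical; in the hub this check would sit in front of every tool dispatch.

```python
# Hypothetical per-agent allowlists: a skill running as "research-agent"
# can discover tools but never invoke them.
ALLOWLISTS = {
    "research-agent": {"search_tools", "describe_tool"},
    "ops-agent": {"search_tools", "describe_tool", "invoke_tool"},
}

def authorize(agent, tool):
    """Gate every tool dispatch against the caller's allowlist."""
    allowed = ALLOWLISTS.get(agent, set())  # unknown agents get nothing
    if tool not in allowed:
        raise PermissionError(f"{agent} may not call {tool}")
    return True

print(authorize("ops-agent", "invoke_tool"))
try:
    authorize("research-agent", "invoke_tool")
except PermissionError as e:
    print("blocked:", e)
```

Denying by default for unknown agents means a rogue skill fails closed instead of open.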
Anthropic's release proves MCP doesn't have to be verbose or brittle. We just have to keep pushing tool discovery and execution into purpose-built layers.
The future of agents isn't bigger context windows. It's smarter pipelines.
Want to see how this plays out in practice? I run the MCP hub daily—hit me up if you want a demo.