Turn AI into an individual execution system, Claude's latest Managed Agents Best Practices Guide

By: blockbeats|2026/04/09 18:00:03

Original Article Title: Launching Claude Managed Agents
Original Author: Lance Martin
Translation: Peggy, BlockBeats

Editor's Note: This article introduces the Managed Agents launched by Claude. It provides a software form closer to the future: intelligent agents are no longer interfaces that respond to requests once but are execution systems that can be configured, deployed, scheduled, and run for the long term.

By thoroughly decoupling "intelligence" (model and runtime framework), "execution" (tools and sandbox), and "process" (session and log), Claude Managed Agents transform the agent from "logic in code" to an independent infrastructure unit. This design not only enhances the system's stability and security in long-running tasks but also allows agents to continuously expand as the model's capabilities evolve, unconstrained by existing frameworks.

Within this framework, common usage patterns have also changed: from event-triggered and scheduled execution to "trigger-on-delivery" automatic execution, and to complex tasks spanning days or even weeks, agents truly acquire the ability to "work continuously." This means that the value of AI is no longer only reflected in the quality of a single answer but in its ability to accumulate and compound over time.

If past APIs allowed developers to "invoke intelligence," Managed Agents are now attempting to answer another question: how to make intelligence a system that can be hosted, scheduled, and run continuously. In this sense, agents are no longer just tools but closer to a new computing primitive.

The original article is as follows:

TL;DR

Claude Managed Agents are a pre-built, configurable agent execution framework (agent harness) that runs on managed infrastructure. You only need to define an agent as a template—including tools, skills, file/code repositories, etc.—and the rest of the runtime framework and infrastructure are provided by the system. This system is designed to keep pace with Claude's rapidly growing intelligence levels and support long-running tasks.

Claude Managed Agents

Why Claude Managed Agents are Needed

Claude's messages API is fundamentally an entry point to interact directly with the model: input a message, get back a content block. Intelligent agents built on the messages API need to rely on a "runtime framework" to handle tool invocation routing, context management, and more. However, this poses several challenges:

1. The runtime framework needs to keep up with Claude's evolving capabilities
I recently wrote a blog post on how to build an agent based on Claude API's underlying capabilities for handling tool orchestration and context management. However, the issue is that the agent's runtime framework often implies some assumptions about "what Claude cannot do." As Claude's capabilities grow, these assumptions quickly become outdated and may even become performance bottlenecks. Therefore, the runtime framework must be continually updated to keep pace with Claude's rate of evolution.

2. Claude's task lifecycles are becoming longer
The span of tasks that Claude can handle is growing exponentially, exceeding 10 hours of human work in METR benchmark tests. This places higher demands on the agent's underlying infrastructure: it must have security, stability in long-running scenarios (handling various infrastructure failures), and scalability (e.g., supporting multiple teams of agents running simultaneously).

-- Price

Why These Challenges Matter

Addressing the challenges mentioned above is crucial because we anticipate that future versions of Claude will be able to operate continuously over periods of days, weeks, or even months, tackling humanity's most complex problems.

The Claude Agent SDK is the first step in this direction, providing a general-purpose, easy-to-use intelligent agent runtime framework. Meanwhile, Claude Managed Agents take it a step further: building on this foundation, they provide a complete runtime framework + managed infrastructure specifically designed to support secure, reliable task execution over long time spans.

Getting Started

A simple way to get started is by using our open-source claude-api skill, which can be used out of the box in Claude Code. Simply install the latest version of Claude Code, then run the following subcommand to complete the initialization configuration of Claude Managed Agents.

I personally have a strong preference for the "skills" approach to integrating new functionality, and I extensively use this skill in practice.

Turn AI into an individual execution system, Claude's latest Managed Agents Best Practices Guide

Additionally, you can refer to our documentation to quickly get started with the SDK or CLI and prototype your agents in the Claude Console.

Use Cases

You can find many interesting use cases in the Claude official blog. Combining these cases with my own practical experience, I have observed some common usage patterns:

1. Event-Triggered
Task execution by a Managed Agent triggered by a service.
For example, when a bug is detected in the system, an automated call to a managed agent is made to write a patch and submit a PR. No human intervention is required between the "issue identified" and "patch applied" stages.

2. Scheduled Execution
Scheduling tasks for a Managed Agent to execute.
For instance, many people, including myself, use this method to generate daily briefings (such as a summary of activities on Platform X or GitHub, or a team's progress report generated by an agent). Below is an example of my daily summary of activities on Platform X.

3. Fire-and-Forget
Task execution by a Managed Agent triggered by a human but requiring no ongoing follow-up. For instance, assigning tasks to a managed agent via Slack or Teams, which then autonomously completes the task and delivers the results (such as tables, slides, or even applications).

4. Long-horizon Tasks
A long-running task, which I consider one of the particularly valuable scenarios for Managed Agents.
I have conducted some experiments based on Andrej Karpathy's auto-research repo, exploring different ways of application. For example, I recently took _chenglou's pretext library as input and had a Managed Agent research how to apply it to our engineering blog content.

Core Concepts

There are three core concepts to understand in the onboarding process:

1. Agent
A version-controlled configuration that defines the "identity" of the agent: including the model, system prompt, tools, skills, MCP server, etc. Once created, it can be invoked repeatedly via ID.

2. Environment
A template used to describe the sandbox environment provided for the agent tool to run (e.g., runtime type, network policy, dependency package configuration, etc.).

3. Session
A stateful running instance launched based on a preconfigured agent and environment. It will create a brand-new sandbox from the environment template, mount the resources needed for this run (such as files, GitHub repositories), and securely store authentication information in a keystore (like MCP credentials).

You can think of it this way:

· Agent = The configuration itself

· Environment = The sandbox template required for agent operation

· Session = One specific execution process

One Agent can correspond to multiple Sessions.

Usage

Refer to the documentation for details. The overall usage is divided into two categories:

1. SDK (Code-Oriented)
Integrate the SDK into your application to drive sessions at runtime. Currently, Managed Agents support 6 languages: Python, TypeScript, Java, Go, Ruby, PHP.

2. CLI (Command Line Interface)
Interact with all API resources via the command line, including agents, environments, sessions, vaults, skills, files, etc. Each type of resource has corresponding subcommands.

Common Practice:
Usually, the CLI is used for configuration and initialization, while the SDK is used for runtime logic.
An agent template is persistent—you can create a template (e.g., defining the model, system prompt, tools, MCP server, skills in YAML), store it in Git, and apply it during the deployment process via the CLI.

Workflow

I co-authored an Anthropic engineering blog post with @mc_anthropic, @gcemaj, and @jkeatn, which provided a detailed explanation of the construction of Claude Managed Agents. A key conclusion in the article was that enabling agents to scale with Claude's intelligence level is fundamentally an "infrastructure problem," not just a runtime framework design issue.

This means that the real challenge lies not in "how to write a smarter agent," but in how to build a system that can run stably in the long term, be scalable, and be evolvable, allowing the agent to undertake increasingly complex and long-term tasks.

Based on this philosophy, we did not design a fixed agent runtime framework (harness) as we anticipated its continuous evolution. Instead, we "decoupled" several key parts of the system:

“Brain” (Claude and its runtime framework)

“Hands” (sandbox and tool performing concrete actions)

“Session” (records event logs of execution)

These three were designed as independent interfaces with minimal assumptions about each other. Each part can fail or be replaced independently without affecting the overall system.

In the article, we also shared how this architecture brings higher reliability, security, and flexibility—while also leaving room for future integration of new runtime frameworks, sandboxes, or infrastructure hosting sessions.

Conclusion

I am very excited about projects exploring Multi-Agent Orchestration or long-horizon tasks. One thing that has always frustrated me in the past is how the agent's execution framework struggles to keep up with the evolving capabilities of the model.

The significance of Claude Managed Agents is that it takes care of the execution framework and infrastructure layer for you, allowing you to focus on a higher level—treating the "agent" itself as a new foundational primitive in the Claude API, enabling further exploration and development on top of it.

[Original Post Link]