Agents#

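AutoGen AgentChat provides a set of preset agents, each of which responds to messages differently. All agents share the following attributes and methods:

name: The unique name of the agent.
description: A text description of the agent.
run: Runs the agent on a task given as a string or a list of messages, and returns a TaskResult. **Agents are expected to be stateful, so call this method with new messages only, not the complete history.**
run_stream: Same as run(), but returns an iterator of messages that subclass BaseAgentEvent or BaseChatMessage, followed by a TaskResult as the last item.

See autogen_agentchat.messages for more information on AgentChat message types.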
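The point about statefulness can be illustrated with a toy sketch. ToyStatefulAgent below is a hypothetical stand-in written with the standard library only; it is not an AutoGen class and does not call a model. It only shows the calling pattern: each call to run() receives the new message, while earlier turns stay in the agent's own history.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyStatefulAgent:
    """Hypothetical stand-in for an agent; not part of AutoGen."""

    name: str
    history: List[str] = field(default_factory=list)

    def run(self, task: Optional[str] = None) -> List[str]:
        # Only the new message is passed in; earlier turns are already stored.
        if task is not None:
            self.history.append(f"user: {task}")
        reply = f"{self.name}: seen {len(self.history)} message(s) so far"
        self.history.append(reply)
        # Return only the messages that belong to this call.
        return self.history[-2:] if task is not None else self.history[-1:]


toy_agent = ToyStatefulAgent(name="assistant")
first = toy_agent.run("Hello")
second = toy_agent.run("What did I just say?")  # "Hello" is not resent
print(second[-1])
```

Resending the full history on every call would duplicate earlier turns in the stored history, which is why the new-messages-only convention matters.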

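Assistant Agent#

AssistantAgent is a built-in agent that uses a language model and can use tools.

Warning

AssistantAgent is a "kitchen sink" agent for prototyping and educational purposes; it is very general. Make sure you read the documentation and implementation to understand the design choices. Once you fully understand the design, you may want to implement your own agent. See Custom Agents.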

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import StructuredMessage
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Define a tool that searches the web for information.
# For simplicity, we will use a mock function here that returns a static string.
async def web_search(query: str) -> str:
    """Find information on the web"""
    return "AutoGen is a programming framework for building multi-agent applications."


# Create an agent that uses an OpenAI model (gpt-4.1-nano in this example).
model_client = OpenAIChatCompletionClient(
    model="gpt-4.1-nano",
    # api_key="YOUR_API_KEY",
)
agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
    tools=[web_search],
    system_message="Use tools to solve tasks.",
)

Getting Results#

We can use the run() method to run the agent on a given task and get the result.

# Use asyncio.run(agent.run(...)) when running in a script.
result = await agent.run(task="Find information on AutoGen")
print(result.messages)

[TextMessage(source='user', models_usage=None, metadata={}, content='Find information on AutoGen', type='TextMessage'), ToolCallRequestEvent(source='assistant', models_usage=RequestUsage(prompt_tokens=61, completion_tokens=16), metadata={}, content=[FunctionCall(id='call_703i17OLXfztkuioUbkESnea', arguments='{"query":"AutoGen"}', name='web_search')], type='ToolCallRequestEvent'), ToolCallExecutionEvent(source='assistant', models_usage=None, metadata={}, content=[FunctionExecutionResult(content='AutoGen is a programming framework for building multi-agent applications.', name='web_search', call_id='call_703i17OLXfztkuioUbkESnea', is_error=False)], type='ToolCallExecutionEvent'), ToolCallSummaryMessage(source='assistant', models_usage=None, metadata={}, content='AutoGen is a programming framework for building multi-agent applications.', type='ToolCallSummaryMessage')]

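The call to run() returns a TaskResult whose messages attribute holds a list of messages: the agent's "thought process" followed by its final response.

Note

It is important to note that run() updates the agent's internal state: it adds the messages to the agent's message history. You can also call run() without a task to have the agent generate a response from its current state.

Note

Unlike in v0.2 AgentChat, the tools are executed by the same agent, directly within the same call to run(). By default, the agent returns the result of the tool call as the final response.

Streaming Messages#

We can also stream each message as the agent generates it by using the run_stream() method, and use Console to print the messages to the console as they appear.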

async def assistant_run_stream() -> None:
    # Option 1: read each message from the stream.
    # async for message in agent.run_stream(task="Find information on AutoGen"):
    #     print(message)

    # Option 2: use Console to print all messages as they appear.
    await Console(
        agent.run_stream(task="Find information on AutoGen"),
        output_stats=True,  # Enable stats printing.
    )


# Use asyncio.run(assistant_run_stream()) when running in a script.
await assistant_run_stream()

---------- TextMessage (user) ----------
Find information on AutoGen
---------- ToolCallRequestEvent (assistant) ----------
[FunctionCall(id='call_HOTRhOzXCBm0zSqZCFbHD7YP', arguments='{"query":"AutoGen"}', name='web_search')]
[Prompt tokens: 61, Completion tokens: 16]
---------- ToolCallExecutionEvent (assistant) ----------
[FunctionExecutionResult(content='AutoGen is a programming framework for building multi-agent applications.', name='web_search', call_id='call_HOTRhOzXCBm0zSqZCFbHD7YP', is_error=False)]
---------- ToolCallSummaryMessage (assistant) ----------
AutoGen is a programming framework for building multi-agent applications.
---------- Summary ----------
Number of messages: 4
Finish reason: None
Total prompt tokens: 61
Total completion tokens: 16
Duration: 0.52 seconds

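The run_stream() method returns an asynchronous generator that yields each message generated by the agent, followed by a TaskResult as the last item.

From the messages, you can observe that the assistant agent used the web_search tool to gather information and responded based on the search results.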
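The "messages first, result last" shape of the stream can be sketched with a plain async generator. Everything in this block is hypothetical (ToyResult, toy_run_stream, and the hard-coded messages); it does not use AutoGen and only shows how a consumer might separate streamed items from the final aggregate.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Union


@dataclass
class ToyResult:
    """Hypothetical stand-in for TaskResult."""

    messages: List[str]


async def toy_run_stream(task: str) -> AsyncIterator[Union[str, ToyResult]]:
    produced: List[str] = []
    for message in (f"user: {task}", "assistant: calling web_search", "assistant: done"):
        produced.append(message)
        yield message  # each message is yielded as soon as it exists
    yield ToyResult(messages=produced)  # the last item is the aggregate result


async def consume(task: str) -> ToyResult:
    result = None
    async for item in toy_run_stream(task):
        if isinstance(item, ToyResult):
            result = item  # final item: keep it instead of printing it
        else:
            print(item)
    if result is None:
        raise RuntimeError("stream ended without a final result")
    return result


stream_result = asyncio.run(consume("Find information on AutoGen"))
```

Checking the type of each item, as consume() does, is one simple way to tell the final result apart from intermediate messages.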

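Using Tools and Workbench#

Large Language Models (LLMs) are typically limited to generating text or code responses. However, many complex tasks benefit from the ability to use external tools that perform specific actions, such as fetching data from APIs or databases.

To address this limitation, modern LLMs can accept a list of available tool schemas (descriptions of tools and their arguments) and generate a tool call message. This capability is known as **Tool Calling** or **Function Calling**, and it is becoming a popular pattern for building intelligent agent-based applications. For more information on tool calling in LLMs, refer to the documentation from OpenAI and Anthropic.

In AgentChat, AssistantAgent can use tools to perform specific actions. The web_search tool is one such tool; it allows the assistant agent to search the web for information. A single custom tool can be a Python function or a subclass of BaseTool.

A Workbench, on the other hand, is a collection of tools that share state and resources.

Note

For how to use model clients directly with tools and workbenches, refer to the Tools and Workbench sections in the Core User Guide.

By default, when AssistantAgent executes a tool, it returns the tool's output as a string in a ToolCallSummaryMessage. If your tool does not return a well-formed string in natural language, you can add a reflection step in which the model summarizes the tool's output, by setting reflect_on_tool_use=True in the AssistantAgent constructor.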

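Built-in Tools and Workbench#

AutoGen Extension provides a set of built-in tools that can be used with the assistant agent. Head over to the API documentation for all the available tools under the autogen_ext.tools namespace. For example, you can find the following tools:

graphrag: Tools for using a GraphRAG index.
http: Tools for making HTTP requests.
langchain: Adapter for using LangChain tools.
mcp: Tools and workbench for using Model Context Protocol (MCP) servers.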

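Function Tool#

AssistantAgent automatically converts a Python function into a FunctionTool, which the agent can use as a tool, and automatically generates the tool schema from the function signature and docstring.

The web_search_func tool is an example of a function tool. The schema is generated automatically.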

from autogen_core.tools import FunctionTool


# Define a tool using a Python function.
async def web_search_func(query: str) -> str:
    """Find information on the web"""
    return "AutoGen is a programming framework for building multi-agent applications."


# This step is automatically performed inside the AssistantAgent if the tool is a Python function.
web_search_function_tool = FunctionTool(web_search_func, description="Find information on the web")
# The schema is provided to the model during AssistantAgent's on_messages call.
web_search_function_tool.schema

{'name': 'web_search_func',
 'description': 'Find information on the web',
 'parameters': {'type': 'object',
  'properties': {'query': {'description': 'query',
    'title': 'Query',
    'type': 'string'}},
  'required': ['query'],
  'additionalProperties': False},
 'strict': False}
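To make the idea of deriving a schema from a signature and docstring more concrete, here is a minimal standard-library sketch. It is not how FunctionTool is implemented; sketch_schema, the type mapping, and the search_docs function are all assumptions made for illustration, and the output is a simplified subset of the schema shown above.

```python
import inspect
from typing import Any, Callable, Dict, List

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_schema(func: Callable[..., Any]) -> Dict[str, Any]:
    """Build a simplified tool schema from a function's signature and docstring."""
    properties: Dict[str, Dict[str, str]] = {}
    required: List[str] = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # parameters without defaults are required
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


# Hypothetical tool function used only to exercise the sketch.
async def search_docs(query: str, max_results: int = 3) -> str:
    """Find information in the documentation"""
    return "..."


toy_schema = sketch_schema(search_docs)
print(toy_schema)
```

The takeaway is that clear parameter names, type annotations, and a docstring are what the model ultimately sees, so they are worth writing carefully.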

Model Context Protocol (MCP) Workbench#

AssistantAgent can also use tools served by a Model Context Protocol (MCP) server, by using McpWorkbench().

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import TextMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.tools.mcp import McpWorkbench, StdioServerParams

# Get the fetch tool from mcp-server-fetch.
fetch_mcp_server = StdioServerParams(command="uvx", args=["mcp-server-fetch"])

# Create an MCP workbench which provides a session to the mcp server.
async with McpWorkbench(fetch_mcp_server) as workbench:  # type: ignore
    # Create an agent that can use the fetch tool.
    model_client = OpenAIChatCompletionClient(model="gpt-4.1-nano")
    fetch_agent = AssistantAgent(
        name="fetcher", model_client=model_client, workbench=workbench, reflect_on_tool_use=True
    )

    # Let the agent fetch the content of a URL and summarize it.
    result = await fetch_agent.run(task="Summarize the content of https://en.wikipedia.org/wiki/Seattle")
    assert isinstance(result.messages[-1], TextMessage)
    print(result.messages[-1].content)

    # Close the connection to the model client.
    await model_client.close()

Seattle is a major city located in the state of Washington, United States. It was founded on November 13, 1851, and incorporated as a town on January 14, 1865, and later as a city on December 2, 1869. The city is named after Chief Seattle. It covers an area of approximately 142 square miles, with a population of around 737,000 as of the 2020 Census, and an estimated 755,078 residents in 2023. Seattle is known by nicknames such as The Emerald City, Jet City, and Rain City, and has mottos including The City of Flowers and The City of Goodwill. The city operates under a mayor–council government system, with Bruce Harrell serving as mayor. Key landmarks include the Space Needle, Pike Place Market, Amazon Spheres, and the Seattle Great Wheel. It is situated on the U.S. West Coast, with a diverse urban and metropolitan area that extends to a population of over 4 million in the greater metropolitan region.

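Parallel Tool Calls#

Some models support parallel tool calls, which is useful for tasks that require calling multiple tools simultaneously. By default, if the model client produces multiple tool calls, AssistantAgent will call the tools in parallel.

You may want to disable parallel tool calls when the tools have side effects that may interfere with each other, or when agent behavior needs to be consistent across different models. This should be done at the model client level.

For OpenAIChatCompletionClient and AzureOpenAIChatCompletionClient, set parallel_tool_calls=False to disable parallel tool calls.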

model_client_no_parallel_tool_call = OpenAIChatCompletionClient(
    model="gpt-4o",
    parallel_tool_calls=False,  # type: ignore
)
agent_no_parallel_tool_call = AssistantAgent(
    name="assistant",
    model_client=model_client_no_parallel_tool_call,
    tools=[web_search],
    system_message="Use tools to solve tasks.",
)
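The concern about interfering side effects can be seen in a small asyncio sketch. The block below does not use AutoGen and does not model what parallel_tool_calls does inside the client; toy_tool and the shared call_log are hypothetical and only show how concurrently running tools interleave, while sequential calls do not.

```python
import asyncio
from typing import List

call_log: List[str] = []  # shared state that both toy tools write to


async def toy_tool(name: str, delay: float) -> str:
    call_log.append(f"start {name}")
    await asyncio.sleep(delay)  # stands in for I/O such as an HTTP request
    call_log.append(f"end {name}")
    return name


async def run_parallel() -> List[str]:
    # Both tools start before either finishes, so their side effects interleave.
    return list(await asyncio.gather(toy_tool("a", 0.05), toy_tool("b", 0.01)))


async def run_sequential() -> List[str]:
    # Each tool finishes before the next one starts.
    return [await toy_tool("a", 0.05), await toy_tool("b", 0.01)]


parallel_results = asyncio.run(run_parallel())
parallel_log = list(call_log)
call_log.clear()
sequential_results = asyncio.run(run_sequential())
sequential_log = list(call_log)
print(parallel_log)
print(sequential_log)
```

If two tools write to the same file, database row, or in-memory object, the interleaved ordering is the case to worry about.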

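Running an Agent in a Loop#

AssistantAgent executes one step at a time: one model call, followed by one tool call (or parallel tool calls), and then an optional reflection.

To run it in a loop, for example, running it until it stops producing tool calls, refer to Single-Agent Team.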
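The control flow of "repeat until there are no more tool calls" can be sketched without any model at all. The block below is a conceptual illustration only: scripted_steps plays the role of a model, and none of the names come from AutoGen. For real use, the Single-Agent Team pattern referenced above is the documented route.

```python
from typing import Dict, List

# Hypothetical scripted "model": each step is either a tool call or a final text answer.
scripted_steps: List[Dict[str, str]] = [
    {"type": "tool_call", "content": "web_search('AutoGen')"},
    {"type": "tool_call", "content": "web_search('AgentChat')"},
    {"type": "text", "content": "Here is a summary of what I found."},
]


def run_until_no_tool_call(max_steps: int = 10) -> List[Dict[str, str]]:
    """Keep stepping until a step is not a tool call, or the step budget runs out."""
    transcript: List[Dict[str, str]] = []
    for step_index in range(min(max_steps, len(scripted_steps))):
        step = scripted_steps[step_index]
        transcript.append(step)
        if step["type"] != "tool_call":
            break  # a plain text answer ends the loop
    return transcript


loop_transcript = run_until_no_tool_call()
print(loop_transcript[-1]["content"])
```

The max_steps bound is a design choice worth keeping in any real loop, since a model may otherwise keep requesting tools indefinitely.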

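Structured Output#

Structured output allows models to return structured JSON text with a pre-defined schema provided by the application. Unlike JSON mode, the schema can be provided as a Pydantic BaseModel class, which can also be used to validate the output.

Once you specify the base model class in the output_content_type parameter of the AssistantAgent constructor, the agent responds with a StructuredMessage whose content is an instance of that base model class.

This way, you can integrate the agent's response directly into your application and use the model's output as a structured object.

Note

When output_content_type is set, by default the agent reflects on the tool use and returns a structured output message based on the tool call result. You can disable this behavior by setting reflect_on_tool_use=False explicitly.

Structured output is also useful for incorporating chain-of-thought reasoning in the agent's responses. See the example below for how to use structured output with the assistant agent.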

from typing import Literal

from pydantic import BaseModel


# The response format for the agent as a Pydantic base model.
class AgentResponse(BaseModel):
    thoughts: str
    response: Literal["happy", "sad", "neutral"]


# Create an agent that uses the OpenAI GPT-4o model.
model_client = OpenAIChatCompletionClient(model="gpt-4o")
agent = AssistantAgent(
    "assistant",
    model_client=model_client,
    system_message="Categorize the input as happy, sad, or neutral following the JSON format.",
    # Define the output content type of the agent.
    output_content_type=AgentResponse,
)

result = await Console(agent.run_stream(task="I am happy."))

# Check the last message in the result, validate its type, and print the thoughts and response.
assert isinstance(result.messages[-1], StructuredMessage)
assert isinstance(result.messages[-1].content, AgentResponse)
print("Thought: ", result.messages[-1].content.thoughts)
print("Response: ", result.messages[-1].content.response)
await model_client.close()

---------- user ----------
I am happy.

---------- assistant ----------
{
  "thoughts": "The user explicitly states they are happy.",
  "response": "happy"
}
Thought:  The user explicitly states they are happy.
Response:  happy

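Streaming Tokens#

You can stream the tokens generated by the model client by setting model_client_stream=True. This causes the agent to yield ModelClientStreamingChunkEvent messages in run_stream().

The underlying model API must support streaming tokens for this to work. Check with your model provider to see whether this is supported.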

model_client = OpenAIChatCompletionClient(model="gpt-4o")

streaming_assistant = AssistantAgent(
    name="assistant",
    model_client=model_client,
    system_message="You are a helpful assistant.",
    model_client_stream=True,  # Enable streaming tokens.
)

# Use an async function and asyncio.run() in a script.
async for message in streaming_assistant.run_stream(task="Name two cities in South America"):  # type: ignore
    print(message)

source='user' models_usage=None metadata={} content='Name two cities in South America' type='TextMessage'
source='assistant' models_usage=None metadata={} content='Two' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' cities' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' in' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' South' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' America' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' are' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' Buenos' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' Aires' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' in' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' Argentina' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' and' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' São' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' Paulo' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' in' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content=' Brazil' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None metadata={} content='.' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=RequestUsage(prompt_tokens=0, completion_tokens=0) metadata={} content='Two cities in South America are Buenos Aires in Argentina and São Paulo in Brazil.' type='TextMessage'
messages=[TextMessage(source='user', models_usage=None, metadata={}, content='Name two cities in South America', type='TextMessage'), TextMessage(source='assistant', models_usage=RequestUsage(prompt_tokens=0, completion_tokens=0), metadata={}, content='Two cities in South America are Buenos Aires in Argentina and São Paulo in Brazil.', type='TextMessage')] stop_reason=None

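You can see the streaming chunks in the output above. The chunks are generated by the model client and are yielded by the agent as they are received. The final response, the concatenation of all the chunks, is yielded right after the last chunk.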
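The relationship between the chunks and the final response can be shown with a few lines of plain Python. toy_token_stream and collect are hypothetical helpers, and the word-level splitting is an assumption chosen to resemble the output above; real token boundaries depend on the model.

```python
from typing import Iterator, List, Tuple


def toy_token_stream(text: str) -> Iterator[str]:
    # Split on spaces but keep the leading space on every chunk after the first.
    for index, word in enumerate(text.split(" ")):
        yield word if index == 0 else " " + word


def collect(text: str) -> Tuple[List[str], str]:
    chunks: List[str] = []
    for chunk in toy_token_stream(text):
        chunks.append(chunk)  # a UI would render each chunk here as it arrives
    return chunks, "".join(chunks)  # the final response is the concatenation


stream_chunks, final_text = collect("Two cities in South America")
print(stream_chunks)
print(final_text)
```

A consumer that renders chunks incrementally should therefore skip or replace its partial text when the final message arrives, to avoid showing the response twice.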

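Using Model Context#

AssistantAgent has a model_context parameter that can be used to pass in a ChatCompletionContext object. This allows the agent to use different model contexts, such as BufferedChatCompletionContext to limit the context sent to the model.

By default, AssistantAgent uses UnboundedChatCompletionContext, which sends the full conversation history to the model. To limit the context to the last n messages, you can use BufferedChatCompletionContext. To limit the context by token count, you can use TokenLimitedChatCompletionContext.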

from autogen_core.model_context import BufferedChatCompletionContext

# Create an agent that uses only the last 5 messages in the context to generate responses.
agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
    tools=[web_search],
    system_message="Use tools to solve tasks.",
    model_context=BufferedChatCompletionContext(buffer_size=5),  # Only use the last 5 messages in the context.
)
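The effect of a "last n messages" buffer can be sketched independently of AutoGen. ToyBufferedContext below is a hypothetical class, not the library's BufferedChatCompletionContext; it only shows the idea that the full history is kept while a bounded window is handed to the model.

```python
from typing import List


class ToyBufferedContext:
    """Hypothetical illustration only; not AutoGen's BufferedChatCompletionContext."""

    def __init__(self, buffer_size: int) -> None:
        if buffer_size <= 0:
            raise ValueError("buffer_size must be positive")
        self._buffer_size = buffer_size
        self._full_history: List[str] = []

    def add_message(self, message: str) -> None:
        self._full_history.append(message)  # nothing is discarded from storage

    def get_messages(self) -> List[str]:
        # Only the most recent buffer_size messages would be sent to the model.
        return self._full_history[-self._buffer_size:]


toy_context = ToyBufferedContext(buffer_size=5)
for index in range(8):
    toy_context.add_message(f"m{index}")
window = toy_context.get_messages()
print(window)
```

A small window keeps prompts short and cheap, at the cost of the model losing sight of anything older than the window.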

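Other Preset Agents#

The following preset agents are available:

UserProxyAgent: An agent that takes user input and returns it as a response.
CodeExecutorAgent: An agent that can execute code.
OpenAIAssistantAgent: An agent that is backed by an OpenAI Assistant, with the ability to use custom tools.
MultimodalWebSurfer: A multimodal agent that can search the web and visit web pages for information.
FileSurfer: An agent that can search and browse local files for information.
VideoSurfer: An agent that can watch videos for information.

Next Steps#

Having explored the usage of AssistantAgent, we can now proceed to the next section to learn about the teams feature in AgentChat.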
