Migration Guide for v0.2 to v0.4#

This is a migration guide for users of autogen-agentchat v0.2.* migrating to v0.4, which introduces a new set of APIs and features and some breaking changes. Please read this guide carefully. We will continue to maintain v0.2 in the 0.2 branch; however, we strongly recommend upgrading to v0.4.

Note

We no longer have admin access to the pyautogen PyPI package, and releases of that package since version 0.2.34 no longer come from Microsoft. To continue using v0.2 of AutoGen, install it with autogen-agentchat~=0.2. Please read our clarification statement regarding forks.

What is v0.4?#

Since the release of AutoGen in 2023, we have intensively listened to and gathered feedback from our community and from users at small startups and large enterprises. Based on that feedback, we built AutoGen v0.4, a from-the-ground-up rewrite adopting an asynchronous, event-driven architecture to address issues such as observability, flexibility, interactive control, and scale.

The v0.4 API is layered: the Core API is the foundation layer, offering a scalable, event-driven actor framework for creating agentic workflows; the AgentChat API is built on the Core API, offering a task-driven, high-level framework for building interactive agentic applications. It is a replacement for AutoGen v0.2.

Most of this guide focuses on v0.4's AgentChat API; however, you can also build your own high-level framework using just the Core API.

New to AutoGen?#

Jump straight to the AgentChat Tutorial to get started with v0.4.

What is in this guide?#

We provide a detailed guide on how to migrate your existing codebase from v0.2 to v0.4.

See each feature below for details on how to migrate.

The following features currently in v0.2 will be provided in future releases of v0.4.*:

  • Model client cost #4835

  • Teachable Agent

  • RAG Agent

We will update this guide when the missing features become available.

Model Client#

In v0.2, you configure the model client as follows, and create the OpenAIWrapper object.

from autogen.oai import OpenAIWrapper

config_list = [
    {"model": "gpt-4o", "api_key": "sk-xxx"},
    {"model": "gpt-4o-mini", "api_key": "sk-xxx"},
]

model_client = OpenAIWrapper(config_list=config_list)

Note: In AutoGen 0.2, the OpenAI client would try configurations in the list until one worked. In 0.4, however, a specific model configuration is expected to be chosen.

In v0.4, we offer two ways to create a model client.

Use component config#

AutoGen 0.4 has a generic component configuration system. Model clients are a great use case for this. See below for how to create an OpenAI chat completion client.


from autogen_core.models import ChatCompletionClient

config = {
    "provider": "OpenAIChatCompletionClient",
    "config": {
        "model": "gpt-4o",
        "api_key": "sk-xxx" # os.environ["...']
    }
}

model_client = ChatCompletionClient.load_component(config)

Use model client class directly#

OpenAI

from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(model="gpt-4o", api_key="sk-xxx")

Azure OpenAI

from autogen_ext.models.openai import AzureOpenAIChatCompletionClient

model_client = AzureOpenAIChatCompletionClient(
    azure_deployment="gpt-4o",
    azure_endpoint="https://<your-endpoint>.openai.azure.com/",
    model="gpt-4o",
    api_version="2024-09-01-preview",
    api_key="sk-xxx",
)

Read more about OpenAIChatCompletionClient.

Model Client for OpenAI-Compatible APIs#

You can use OpenAIChatCompletionClient to connect to an OpenAI-compatible API, but you need to specify the base_url and model_info.

from autogen_ext.models.openai import OpenAIChatCompletionClient

custom_model_client = OpenAIChatCompletionClient(
    model="custom-model-name",
    base_url="https://custom-model.com/rest/of/the/path",
    api_key="placeholder",
    model_info={
        "vision": True,
        "function_calling": True,
        "json_output": True,
        "family": "unknown",
        "structured_output": True,
    },
)

Note: We don't test all OpenAI-compatible APIs, and many of them work differently from the OpenAI API even though they may claim to support it. Please test them before using them.

Read about Model Clients in the AgentChat Tutorial and more detail in the Core API documentation.

Support for other hosted models will be added in the future.

Model Client Cache#

In v0.2, you could set the cache seed through the cache_seed parameter in the LLM config. The cache was enabled by default.

llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
    "cache_seed": 42,
}

In v0.4, the cache is not enabled by default; to use it, you need to wrap the model client in a ChatCompletionCache.

You can use a DiskCacheStore or RedisStore to store the cache.

pip install -U "autogen-ext[openai, diskcache, redis]"

Here is an example of using diskcache for local caching.

import asyncio
import tempfile

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.models.cache import ChatCompletionCache, CHAT_CACHE_VALUE_TYPE
from autogen_ext.cache_store.diskcache import DiskCacheStore
from diskcache import Cache


async def main():
    with tempfile.TemporaryDirectory() as tmpdirname:
        # Initialize the original client
        openai_model_client = OpenAIChatCompletionClient(model="gpt-4o")

        # Then initialize the CacheStore, in this case with diskcache.Cache.
        # You can also use redis like:
        # from autogen_ext.cache_store.redis import RedisStore
        # import redis
        # redis_instance = redis.Redis()
        # cache_store = RedisStore[CHAT_CACHE_VALUE_TYPE](redis_instance)
        cache_store = DiskCacheStore[CHAT_CACHE_VALUE_TYPE](Cache(tmpdirname))
        cache_client = ChatCompletionCache(openai_model_client, cache_store)

        response = await cache_client.create([UserMessage(content="Hello, how are you?", source="user")])
        print(response)  # Should print response from OpenAI
        response = await cache_client.create([UserMessage(content="Hello, how are you?", source="user")])
        print(response)  # Should print cached response
        await openai_model_client.close()


asyncio.run(main())

Assistant Agent#

In v0.2, you created an assistant agent as follows:

from autogen.agentchat import AssistantAgent

llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
}

assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config=llm_config,
)

In v0.4, it is similar, but you need to specify model_client instead of llm_config.

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(model="gpt-4o", api_key="sk-xxx", seed=42, temperature=0)

assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    model_client=model_client,
)

However, the usage is somewhat different. In v0.4, instead of calling assistant.send, you call assistant.on_messages or assistant.on_messages_stream to handle incoming messages. Furthermore, the on_messages and on_messages_stream methods are asynchronous, and the latter returns an async generator to stream the inner thoughts of the agent.

Here is how you can call the assistant agent in v0.4 directly, continuing from the above example:

import asyncio
from autogen_agentchat.messages import TextMessage
from autogen_agentchat.agents import AssistantAgent
from autogen_core import CancellationToken
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)

    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant.",
        model_client=model_client,
    )

    cancellation_token = CancellationToken()
    response = await assistant.on_messages([TextMessage(content="Hello!", source="user")], cancellation_token)
    print(response)

    await model_client.close()

asyncio.run(main())

The CancellationToken can be used to cancel the request asynchronously when you call cancellation_token.cancel(), which will cause the await on the on_messages call to raise a CancelledError.
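
For example, here is a minimal sketch of cancelling an in-flight request, reusing the assistant and token names from the example above:

import asyncio
from autogen_agentchat.messages import TextMessage

async def cancel_request(assistant, cancellation_token) -> None:
    # Issue the request as a background task.
    task = asyncio.create_task(
        assistant.on_messages([TextMessage(content="Write a long story.", source="user")], cancellation_token)
    )
    # Cancel the request asynchronously; the awaited call raises CancelledError.
    cancellation_token.cancel()
    try:
        await task
    except asyncio.CancelledError:
        print("Request was cancelled.")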

Read more about the Agent Tutorial and AssistantAgent.

Multi-Modal Agent#

The AssistantAgent in v0.4 supports multi-modal inputs if the model client supports them. The vision capability of the model client is used to determine whether the agent supports multi-modal input.

import asyncio
from pathlib import Path
from autogen_agentchat.messages import MultiModalMessage
from autogen_agentchat.agents import AssistantAgent
from autogen_core import CancellationToken, Image
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)

    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant.",
        model_client=model_client,
    )

    cancellation_token = CancellationToken()
    message = MultiModalMessage(
        content=["Here is an image:", Image.from_file(Path("test.png"))],
        source="user",
    )
    response = await assistant.on_messages([message], cancellation_token)
    print(response)

    await model_client.close()

asyncio.run(main())

User Proxy#

In v0.2, you created a user proxy as follows:

from autogen.agentchat import UserProxyAgent

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    code_execution_config=False,
    llm_config=False,
)

This user proxy takes input from the user through the console, and terminates if the incoming message ends with "TERMINATE".

In v0.4, a user proxy is simply an agent that takes user input only; no other special configuration is needed. You can create a user proxy as follows:

from autogen_agentchat.agents import UserProxyAgent

user_proxy = UserProxyAgent("user_proxy")

See UserProxyAgent for more details and how to customize the input function with timeout.
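
As a sketch of customizing the input function, assuming input_func accepts an async callable taking the prompt and an optional cancellation token (check the UserProxyAgent reference for the exact signature):

import asyncio
from typing import Optional
from autogen_core import CancellationToken
from autogen_agentchat.agents import UserProxyAgent

async def timed_input(prompt: str, cancellation_token: Optional[CancellationToken] = None) -> str:
    # Wrap the blocking `input` call and give the user 60 seconds to respond.
    loop = asyncio.get_event_loop()
    try:
        return await asyncio.wait_for(loop.run_in_executor(None, input, prompt), timeout=60)
    except asyncio.TimeoutError:
        return "TERMINATE"  # Treat silence as a request to stop.

user_proxy = UserProxyAgent("user_proxy", input_func=timed_input)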

RAG Agent#

In v0.2, there was the concept of teachable agents and RAG agents that could take a database configuration.

from autogen.agentchat import ConversableAgent
from autogen.agentchat.contrib.capabilities.teachability import Teachability

teachable_agent = ConversableAgent(
    name="teachable_agent",
    llm_config=llm_config
)

# Instantiate a Teachability object. Its parameters are all optional.
teachability = Teachability(
    reset_db=False,
    path_to_db_dir="./tmp/interactive/teachability_db"
)

teachability.add_to_agent(teachable_agent)

In v0.4, you can implement a RAG agent using the Memory class. Specifically, you can define a memory store class and pass it as a parameter to the assistant agent. See the Memory tutorial for more details.

This clear separation of concerns allows you to implement a memory store that uses any database or storage system (you must subclass the Memory class) and use it with an assistant agent. The example below shows how to use a ChromaDB vector memory store with the assistant agent. Furthermore, your application logic should determine when and how to add content to the memory store. For example, you may choose to call memory.add for every response of the assistant agent, or use a separate LLM call to determine if content should be added to the memory store.


# ...
# example of a ChromaDBVectorMemory class
chroma_user_memory = ChromaDBVectorMemory(
    config=PersistentChromaDBVectorMemoryConfig(
        collection_name="preferences",
        persistence_path=os.path.join(str(Path.home()), ".chromadb_autogen"),
        k=2,  # Return top k results
        score_threshold=0.4,  # Minimum similarity score
    )
)

# you can add logic such as a document indexer that adds content to the memory store

assistant_agent = AssistantAgent(
    name="assistant_agent",
    model_client=OpenAIChatCompletionClient(
        model="gpt-4o",
    ),
    tools=[get_weather],
    memory=[chroma_user_memory],
)
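
For example, a sketch of adding content to the memory store above, using MemoryContent and MemoryMimeType from autogen_core.memory; when to call this is up to your application logic:

from autogen_core.memory import MemoryContent, MemoryMimeType

async def remember_preference() -> None:
    # For example, call this after a user states a preference.
    await chroma_user_memory.add(
        MemoryContent(content="The user prefers temperatures in Celsius.", mime_type=MemoryMimeType.TEXT)
    )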

Conversable Agent and Register Reply#

In v0.2, you could create a conversable agent and register a reply function as follows:

from typing import Any, Dict, List, Optional, Tuple, Union
from autogen.agentchat import Agent, ConversableAgent

llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
}

conversable_agent = ConversableAgent(
    name="conversable_agent",
    system_message="You are a helpful assistant.",
    llm_config=llm_config,
    code_execution_config={"work_dir": "coding"},
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
)

def reply_func(
    recipient: ConversableAgent,
    messages: Optional[List[Dict]] = None,
    sender: Optional[Agent] = None,
    config: Optional[Any] = None,
) -> Tuple[bool, Union[str, Dict, None]]:
    # Custom reply logic here
    return True, "Custom reply"

# Register the reply function
conversable_agent.register_reply([ConversableAgent], reply_func, position=0)

# NOTE: An async reply function will only be invoked with async send.

In v0.4, rather than guessing what the reply_func does, all of its parameters, and what the position should be, we can simply create a custom agent and implement the on_messages, on_reset, and produced_message_types methods.

from typing import Sequence
from autogen_core import CancellationToken
from autogen_agentchat.agents import BaseChatAgent
from autogen_agentchat.messages import TextMessage, BaseChatMessage
from autogen_agentchat.base import Response

class CustomAgent(BaseChatAgent):
    async def on_messages(self, messages: Sequence[BaseChatMessage], cancellation_token: CancellationToken) -> Response:
        return Response(chat_message=TextMessage(content="Custom reply", source=self.name))

    async def on_reset(self, cancellation_token: CancellationToken) -> None:
        pass

    @property
    def produced_message_types(self) -> Sequence[type[BaseChatMessage]]:
        return (TextMessage,)

You can then use the custom agent in the same way as an AssistantAgent. See the Custom Agent Tutorial for more details.

Save and Load Agent State#

In v0.2, there was no built-in way to save and load an agent's state: you needed to implement it yourself by exporting the chat_messages attribute of ConversableAgent and importing it back through the chat_messages parameter.

In v0.4, you can call the save_state and load_state methods on agents to save and load their state.

import asyncio
import json
from autogen_agentchat.messages import TextMessage
from autogen_agentchat.agents import AssistantAgent
from autogen_core import CancellationToken
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)

    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant.",
        model_client=model_client,
    )

    cancellation_token = CancellationToken()
    response = await assistant.on_messages([TextMessage(content="Hello!", source="user")], cancellation_token)
    print(response)

    # Save the state.
    state = await assistant.save_state()

    # (Optional) Write state to disk.
    with open("assistant_state.json", "w") as f:
        json.dump(state, f)

    # (Optional) Load it back from disk.
    with open("assistant_state.json", "r") as f:
        state = json.load(f)
        print(state) # Inspect the state, which contains the chat history.

    # Carry on the chat.
    response = await assistant.on_messages([TextMessage(content="Tell me a joke.", source="user")], cancellation_token)
    print(response)

    # Load the state, causing the agent to revert to the previous state before the last message.
    await assistant.load_state(state)

    # Carry on the same chat again.
    response = await assistant.on_messages([TextMessage(content="Tell me a joke.", source="user")], cancellation_token)
    # Close the connection to the model client.
    await model_client.close()

asyncio.run(main())

You can also call save_state and load_state on any team, such as RoundRobinGroupChat, to save and load the state of the entire team.

Two-Agent Chat#

In v0.2, you could create a two-agent chat for code execution as follows:

from autogen.coding import LocalCommandLineCodeExecutor
from autogen.agentchat import AssistantAgent, UserProxyAgent

llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
}

assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant. Write all code in python. Reply only 'TERMINATE' if the task is done.",
    llm_config=llm_config,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    code_execution_config={"code_executor": LocalCommandLineCodeExecutor(work_dir="coding")},
    llm_config=False,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
)

chat_result = user_proxy.initiate_chat(assistant, message="Write a python script to print 'Hello, world!'")
# Intermediate messages are printed to the console directly.
print(chat_result)

To get the same behavior in v0.4, you can use the AssistantAgent and CodeExecutorAgent together in a RoundRobinGroupChat.

import asyncio
from autogen_agentchat.agents import AssistantAgent, CodeExecutorAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination, MaxMessageTermination
from autogen_agentchat.ui import Console
from autogen_ext.code_executors.local import LocalCommandLineCodeExecutor
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)

    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant. Write all code in python. Reply only 'TERMINATE' if the task is done.",
        model_client=model_client,
    )

    code_executor = CodeExecutorAgent(
        name="code_executor",
        code_executor=LocalCommandLineCodeExecutor(work_dir="coding"),
    )

    # The termination condition is a combination of text termination and max message termination, either of which will cause the chat to terminate.
    termination = TextMentionTermination("TERMINATE") | MaxMessageTermination(10)

    # The group chat will alternate between the assistant and the code executor.
    group_chat = RoundRobinGroupChat([assistant, code_executor], termination_condition=termination)

    # `run_stream` returns an async generator to stream the intermediate messages.
    stream = group_chat.run_stream(task="Write a python script to print 'Hello, world!'")
    # `Console` is a simple UI to display the stream.
    await Console(stream)
    
    # Close the connection to the model client.
    await model_client.close()

asyncio.run(main())

Tool Use#

In v0.2, to create a tool use chatbot, you must have two agents, one for calling the tool and one for executing the tool, and you need to initiate a two-agent chat for every user request.

from autogen.agentchat import AssistantAgent, UserProxyAgent, register_function

llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
}

tool_caller = AssistantAgent(
    name="tool_caller",
    system_message="You are a helpful assistant. You can call tools to help user.",
    llm_config=llm_config,
    max_consecutive_auto_reply=1, # Set to 1 so that we return to the application after each assistant reply as we are building a chatbot.
)

tool_executor = UserProxyAgent(
    name="tool_executor",
    human_input_mode="NEVER",
    code_execution_config=False,
    llm_config=False,
)

def get_weather(city: str) -> str:
    return f"The weather in {city} is 72 degree and sunny."

# Register the tool function to the tool caller and executor.
register_function(get_weather, caller=tool_caller, executor=tool_executor)

while True:
    user_input = input("User: ")
    if user_input == "exit":
        break
    chat_result = tool_executor.initiate_chat(
        tool_caller,
        message=user_input,
        summary_method="reflection_with_llm", # To let the model reflect on the tool use, set to "last_msg" to return the tool call result directly.
    )
    print("Assistant:", chat_result.summary)

In v0.4, you only need a single agent, the AssistantAgent, to handle both the tool calling and the tool execution.

import asyncio
from autogen_core import CancellationToken
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import TextMessage

def get_weather(city: str) -> str: # Async tool is possible too.
    return f"The weather in {city} is 72 degree and sunny."

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)
    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant. You can call tools to help user.",
        model_client=model_client,
        tools=[get_weather],
        reflect_on_tool_use=True, # Set to True to have the model reflect on the tool use, set to False to return the tool call result directly.
    )
    while True:
        user_input = input("User: ")
        if user_input == "exit":
            break
        response = await assistant.on_messages([TextMessage(content=user_input, source="user")], CancellationToken())
        print("Assistant:", response.chat_message.to_text())
    await model_client.close()

asyncio.run(main())

When using tool-equipped agents in a group chat such as RoundRobinGroupChat, you simply add the tools to the agents in the same way as above and create a group chat with the agents.

Chat Result#

In v0.2, you get a ChatResult object from the initiate_chat method. For example:

chat_result = tool_executor.initiate_chat(
    tool_caller,
    message=user_input,
    summary_method="reflection_with_llm",
)
print(chat_result.summary) # Get LLM-reflected summary of the chat.
print(chat_result.chat_history) # Get the chat history.
print(chat_result.cost) # Get the cost of the chat.
print(chat_result.human_input) # Get the human input solicited by the chat.

Refer to the ChatResult Docs for more details.

In v0.4, you get a TaskResult object from a run or run_stream method. The TaskResult object contains the messages, which is the message history of the chat, including both agents' private (tool calls, etc.) and public messages.

There are some notable differences between TaskResult and ChatResult:

  • The messages list in TaskResult uses a different message format than the ChatResult.chat_history list.

  • There is no summary field. It is up to the application to decide how to summarize the chat using the messages list.

  • human_input is not provided in the TaskResult object, as the user input can be extracted from the messages list by filtering on the source field (see the sketch after this list).

  • cost is not provided in the TaskResult object; however, you can calculate the cost based on token usage. It would be a great community extension to add cost calculation. See community extensions.
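
A sketch of such post-processing, assuming the task was run by a team containing a user proxy named "user_proxy":

from autogen_agentchat.base import TaskResult

def post_process(result: TaskResult) -> None:
    # Recover the human input by filtering on the `source` field.
    user_inputs = [m.to_text() for m in result.messages if m.source == "user_proxy"]
    print("User inputs:", user_inputs)
    # Use the last message as a simple stand-in for v0.2's `summary`.
    print("Last message:", result.messages[-1].to_text())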

Conversion between v0.2 and v0.4 Messages#

You can use the following conversion functions to convert between a v0.4 message in autogen_agentchat.base.TaskResult.messages and a v0.2 message in ChatResult.chat_history.

from typing import Any, Dict, List, Literal

from autogen_agentchat.messages import (
    BaseAgentEvent,
    BaseChatMessage,
    HandoffMessage,
    MultiModalMessage,
    StopMessage,
    TextMessage,
    ToolCallExecutionEvent,
    ToolCallRequestEvent,
    ToolCallSummaryMessage,
)
from autogen_core import FunctionCall, Image
from autogen_core.models import FunctionExecutionResult


def convert_to_v02_message(
    message: BaseAgentEvent | BaseChatMessage,
    role: Literal["assistant", "user", "tool"],
    image_detail: Literal["auto", "high", "low"] = "auto",
) -> Dict[str, Any]:
    """Convert a v0.4 AgentChat message to a v0.2 message.

    Args:
        message (BaseAgentEvent | BaseChatMessage): The message to convert.
        role (Literal["assistant", "user", "tool"]): The role of the message.
        image_detail (Literal["auto", "high", "low"], optional): The detail level of image content in multi-modal message. Defaults to "auto".

    Returns:
        Dict[str, Any]: The converted AutoGen v0.2 message.
    """
    v02_message: Dict[str, Any] = {}
    if isinstance(message, TextMessage | StopMessage | HandoffMessage | ToolCallSummaryMessage):
        v02_message = {"content": message.content, "role": role, "name": message.source}
    elif isinstance(message, MultiModalMessage):
        v02_message = {"content": [], "role": role, "name": message.source}
        for modal in message.content:
            if isinstance(modal, str):
                v02_message["content"].append({"type": "text", "text": modal})
            elif isinstance(modal, Image):
                v02_message["content"].append(modal.to_openai_format(detail=image_detail))
            else:
                raise ValueError(f"Invalid multimodal message content: {modal}")
    elif isinstance(message, ToolCallRequestEvent):
        v02_message = {"tool_calls": [], "role": "assistant", "content": None, "name": message.source}
        for tool_call in message.content:
            v02_message["tool_calls"].append(
                {
                    "id": tool_call.id,
                    "type": "function",
                    "function": {"name": tool_call.name, "args": tool_call.arguments},
                }
            )
    elif isinstance(message, ToolCallExecutionEvent):
        tool_responses: List[Dict[str, str]] = []
        for tool_result in message.content:
            tool_responses.append(
                {
                    "tool_call_id": tool_result.call_id,
                    "role": "tool",
                    "content": tool_result.content,
                }
            )
        content = "\n\n".join([response["content"] for response in tool_responses])
        v02_message = {"tool_responses": tool_responses, "role": "tool", "content": content}
    else:
        raise ValueError(f"Invalid message type: {type(message)}")
    return v02_message


def convert_to_v04_message(message: Dict[str, Any]) -> BaseAgentEvent | BaseChatMessage:
    """Convert a v0.2 message to a v0.4 AgentChat message."""
    if "tool_calls" in message:
        tool_calls: List[FunctionCall] = []
        for tool_call in message["tool_calls"]:
            tool_calls.append(
                FunctionCall(
                    id=tool_call["id"],
                    name=tool_call["function"]["name"],
                    arguments=tool_call["function"]["args"],
                )
            )
        return ToolCallRequestEvent(source=message["name"], content=tool_calls)
    elif "tool_responses" in message:
        tool_results: List[FunctionExecutionResult] = []
        for tool_response in message["tool_responses"]:
            tool_results.append(
                FunctionExecutionResult(
                    call_id=tool_response["tool_call_id"],
                    content=tool_response["content"],
                    is_error=False,
                    name=tool_response["name"],
                )
            )
        return ToolCallExecutionEvent(source="tools", content=tool_results)
    elif isinstance(message["content"], list):
        content: List[str | Image] = []
        for modal in message["content"]:  # type: ignore
            if modal["type"] == "text":  # type: ignore
                content.append(modal["text"])  # type: ignore
            else:
                content.append(Image.from_uri(modal["image_url"]["url"]))  # type: ignore
        return MultiModalMessage(content=content, source=message["name"])
    elif isinstance(message["content"], str):
        return TextMessage(content=message["content"], source=message["name"])
    else:
        raise ValueError(f"Unable to convert message: {message}")

Group Chat#

In v0.2, you needed to create a GroupChat class and pass it into a GroupChatManager, and have one of the participants act as a user proxy to initiate the chat. For a simple scenario of a writer and a critic, you could do the following:

from autogen.agentchat import AssistantAgent, GroupChat, GroupChatManager

llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
}

writer = AssistantAgent(
    name="writer",
    description="A writer.",
    system_message="You are a writer.",
    llm_config=llm_config,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("APPROVE"),
)

critic = AssistantAgent(
    name="critic",
    description="A critic.",
    system_message="You are a critic, provide feedback on the writing. Reply only 'APPROVE' if the task is done.",
    llm_config=llm_config,
)

# Create a group chat with the writer and critic.
groupchat = GroupChat(agents=[writer, critic], messages=[], max_round=12)

# Create a group chat manager to manage the group chat, use round-robin selection method.
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config, speaker_selection_method="round_robin")

# Initiate the chat from the writer; intermediate messages are printed to the console directly.
result = writer.initiate_chat(
    manager,
    message="Write a short story about a robot that discovers it has feelings.",
)
print(result.summary)

In v0.4, you can use a RoundRobinGroupChat to achieve the same behavior.

import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)

    writer = AssistantAgent(
        name="writer",
        description="A writer.",
        system_message="You are a writer.",
        model_client=model_client,
    )

    critic = AssistantAgent(
        name="critic",
        description="A critic.",
        system_message="You are a critic, provide feedback on the writing. Reply only 'APPROVE' if the task is done.",
        model_client=model_client,
    )

    # The termination condition is a text termination, which will cause the chat to terminate when the text "APPROVE" is received.
    termination = TextMentionTermination("APPROVE")

    # The group chat will alternate between the writer and the critic.
    group_chat = RoundRobinGroupChat([writer, critic], termination_condition=termination, max_turns=12)

    # `run_stream` returns an async generator to stream the intermediate messages.
    stream = group_chat.run_stream(task="Write a short story about a robot that discovers it has feelings.")
    # `Console` is a simple UI to display the stream.
    await Console(stream)
    # Close the connection to the model client.
    await model_client.close()

asyncio.run(main())

For LLM-based speaker selection, you can use SelectorGroupChat instead. See the Selector Group Chat Tutorial and SelectorGroupChat for more details.

Note: In v0.4, you do not need to register functions on a user proxy to use tools in a group chat. You can simply pass the tool functions to the AssistantAgent as shown in the Tool Use section. The agent will automatically call the tools when needed. If your tool doesn't return well-formed output, you can use the reflect_on_tool_use parameter to let the model reflect on the tool use.

Group Chat with Resume#

In v0.2, resuming a group chat was a bit complicated. You needed to explicitly save the group chat messages and load them back when you wanted to resume the chat. See Resuming Group Chat in v0.2 for more details.

In v0.4, you can simply call run or run_stream again with the same group chat object to resume the chat. To export and load the state, you can use the save_state and load_state methods.

import asyncio
import json
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient

def create_team(model_client : OpenAIChatCompletionClient) -> RoundRobinGroupChat:
    writer = AssistantAgent(
        name="writer",
        description="A writer.",
        system_message="You are a writer.",
        model_client=model_client,
    )

    critic = AssistantAgent(
        name="critic",
        description="A critic.",
        system_message="You are a critic, provide feedback on the writing. Reply only 'APPROVE' if the task is done.",
        model_client=model_client,
    )

    # The termination condition is a text termination, which will cause the chat to terminate when the text "APPROVE" is received.
    termination = TextMentionTermination("APPROVE")

    # The group chat will alternate between the writer and the critic.
    group_chat = RoundRobinGroupChat([writer, critic], termination_condition=termination)

    return group_chat


async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)
    # Create team.
    group_chat = create_team(model_client)

    # `run_stream` returns an async generator to stream the intermediate messages.
    stream = group_chat.run_stream(task="Write a short story about a robot that discovers it has feelings.")
    # `Console` is a simple UI to display the stream.
    await Console(stream)

    # Save the state of the group chat and all participants.
    state = await group_chat.save_state()
    with open("group_chat_state.json", "w") as f:
        json.dump(state, f)

    # Create a new team with the same participants configuration.
    group_chat = create_team(model_client)

    # Load the state of the group chat and all participants.
    with open("group_chat_state.json", "r") as f:
        state = json.load(f)
    await group_chat.load_state(state)

    # Resume the chat.
    stream = group_chat.run_stream(task="Translate the story into Chinese.")
    await Console(stream)

    # Close the connection to the model client.
    await model_client.close()

asyncio.run(main())

Save and Load Group Chat State#

In v0.2, you needed to explicitly save the group chat messages and load them back when you wanted to resume the chat.

In v0.4, you can simply call the save_state and load_state methods on the group chat object. See Group Chat with Resume for an example.

Group Chat with Tool Use#

In a v0.2 group chat, when tools are involved, you needed to register the tool functions on a user proxy and include the user proxy in the group chat. Tool calls made by other agents were routed to the user proxy to execute.

We have observed numerous issues with this approach, such as the tool call routing not working as expected, and tool call requests and results not being accepted by models without function-calling support.

In v0.4, there is no need to register the tool functions on a user proxy, as the tools are executed directly within the AssistantAgent, which publishes the responses from the tools to the group chat. So the group chat manager does not need to be involved in routing tool calls.

See the Selector Group Chat Tutorial for an example of using tools in a group chat.

Group Chat with Custom Selector (Stateflow)#

In a v0.2 group chat, when the speaker_selection_method is set to a custom function, it can override the default selection method. This is useful for implementing a state-based selection method. For more details, see Custom Speaker Selection in v0.2.

In v0.4, you can use a SelectorGroupChat with selector_func to achieve the same behavior. The selector_func is a function that takes the current message thread of the group chat and returns the next speaker's name. If None is returned, the LLM-based selection method will be used.

Here is an example of using the state-based selection method to implement a web search/analysis scenario.

import asyncio
from typing import Sequence
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import MaxMessageTermination, TextMentionTermination
from autogen_agentchat.messages import BaseAgentEvent, BaseChatMessage
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Note: This example uses mock tools instead of real APIs for demonstration purposes
def search_web_tool(query: str) -> str:
    if "2006-2007" in query:
        return """Here are the total points scored by Miami Heat players in the 2006-2007 season:
        Udonis Haslem: 844 points
        Dwayne Wade: 1397 points
        James Posey: 550 points
        ...
        """
    elif "2007-2008" in query:
        return "The number of total rebounds for Dwayne Wade in the Miami Heat season 2007-2008 is 214."
    elif "2008-2009" in query:
        return "The number of total rebounds for Dwayne Wade in the Miami Heat season 2008-2009 is 398."
    return "No data found."


def percentage_change_tool(start: float, end: float) -> float:
    return ((end - start) / start) * 100

def create_team(model_client : OpenAIChatCompletionClient) -> SelectorGroupChat:
    planning_agent = AssistantAgent(
        "PlanningAgent",
        description="An agent for planning tasks, this agent should be the first to engage when given a new task.",
        model_client=model_client,
        system_message="""
        You are a planning agent.
        Your job is to break down complex tasks into smaller, manageable subtasks.
        Your team members are:
            Web search agent: Searches for information
            Data analyst: Performs calculations

        You only plan and delegate tasks - you do not execute them yourself.

        When assigning tasks, use this format:
        1. <agent> : <task>

        After all tasks are complete, summarize the findings and end with "TERMINATE".
        """,
    )

    web_search_agent = AssistantAgent(
        "WebSearchAgent",
        description="A web search agent.",
        tools=[search_web_tool],
        model_client=model_client,
        system_message="""
        You are a web search agent.
        Your only tool is search_tool - use it to find information.
        You make only one search call at a time.
        Once you have the results, you never do calculations based on them.
        """,
    )

    data_analyst_agent = AssistantAgent(
        "DataAnalystAgent",
        description="A data analyst agent. Useful for performing calculations.",
        model_client=model_client,
        tools=[percentage_change_tool],
        system_message="""
        You are a data analyst.
        Given the tasks you have been assigned, you should analyze the data and provide results using the tools provided.
        """,
    )

    # The termination condition is a combination of text mention termination and max message termination.
    text_mention_termination = TextMentionTermination("TERMINATE")
    max_messages_termination = MaxMessageTermination(max_messages=25)
    termination = text_mention_termination | max_messages_termination

    # The selector function is a function that takes the current message thread of the group chat
    # and returns the next speaker's name. If None is returned, the LLM-based selection method will be used.
    def selector_func(messages: Sequence[BaseAgentEvent | BaseChatMessage]) -> str | None:
        if messages[-1].source != planning_agent.name:
            return planning_agent.name # Always return to the planning agent after the other agents have spoken.
        return None

    team = SelectorGroupChat(
        [planning_agent, web_search_agent, data_analyst_agent],
        model_client=OpenAIChatCompletionClient(model="gpt-4o-mini"), # Use a smaller model for the selector.
        termination_condition=termination,
        selector_func=selector_func,
    )
    return team

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")
    team = create_team(model_client)
    task = "Who was the Miami Heat player with the highest points in the 2006-2007 season, and what was the percentage change in his total rebounds between the 2007-2008 and 2008-2009 seasons?"
    await Console(team.run_stream(task=task))

asyncio.run(main())

Nested Chat#

Nested chat allows you to nest a whole team or another agent inside an agent. This is useful for creating a hierarchy of agents or "information silos", as the nested agents cannot communicate directly with other agents outside of the same group.

In v0.2, nested chat is supported by using the register_nested_chats method on the ConversableAgent class. You need to specify the nested sequence of agents using dictionaries. See Nested Chats in v0.2 for more details.

In v0.4, nested chat is an implementation detail of a custom agent. You can create a custom agent that takes a team or another agent as a parameter and implements the on_messages method to trigger the nested team or agent. It is up to the application to decide how to pass or transform the messages from and to the nested team or agent.

The following example shows a simple nested chat that counts numbers.

import asyncio
from typing import Sequence
from autogen_core import CancellationToken
from autogen_agentchat.agents import BaseChatAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.messages import TextMessage, BaseChatMessage
from autogen_agentchat.base import Response

class CountingAgent(BaseChatAgent):
    """An agent that returns a new number by adding 1 to the last number in the input messages."""
    async def on_messages(self, messages: Sequence[BaseChatMessage], cancellation_token: CancellationToken) -> Response:
        if len(messages) == 0:
            last_number = 0 # Start from 0 if no messages are given.
        else:
            assert isinstance(messages[-1], TextMessage)
            last_number = int(messages[-1].content) # Otherwise, start from the last number.
        return Response(chat_message=TextMessage(content=str(last_number + 1), source=self.name))

    async def on_reset(self, cancellation_token: CancellationToken) -> None:
        pass

    @property
    def produced_message_types(self) -> Sequence[type[BaseChatMessage]]:
        return (TextMessage,)

class NestedCountingAgent(BaseChatAgent):
    """An agent that increments the last number in the input messages
    multiple times using a nested counting team."""
    def __init__(self, name: str, counting_team: RoundRobinGroupChat) -> None:
        super().__init__(name, description="An agent that counts numbers.")
        self._counting_team = counting_team

    async def on_messages(self, messages: Sequence[BaseChatMessage], cancellation_token: CancellationToken) -> Response:
        # Run the inner team with the given messages and returns the last message produced by the team.
        result = await self._counting_team.run(task=messages, cancellation_token=cancellation_token)
        # To stream the inner messages, implement `on_messages_stream` and use that to implement `on_messages`.
        assert isinstance(result.messages[-1], TextMessage)
        return Response(chat_message=result.messages[-1], inner_messages=result.messages[len(messages):-1])

    async def on_reset(self, cancellation_token: CancellationToken) -> None:
        # Reset the inner team.
        await self._counting_team.reset()

    @property
    def produced_message_types(self) -> Sequence[type[BaseChatMessage]]:
        return (TextMessage,)

async def main() -> None:
    # Create a team of two counting agents as the inner team.
    counting_agent_1 = CountingAgent("counting_agent_1", description="An agent that counts numbers.")
    counting_agent_2 = CountingAgent("counting_agent_2", description="An agent that counts numbers.")
    counting_team = RoundRobinGroupChat([counting_agent_1, counting_agent_2], max_turns=5)
    # Create a nested counting agent that takes the inner team as a parameter.
    nested_counting_agent = NestedCountingAgent("nested_counting_agent", counting_team)
    # Run the nested counting agent with a message starting from 1.
    response = await nested_counting_agent.on_messages([TextMessage(content="1", source="user")], CancellationToken())
    assert response.inner_messages is not None
    for message in response.inner_messages:
        print(message)
    print(response.chat_message)

asyncio.run(main())

You should see the following output:

source='counting_agent_1' models_usage=None content='2' type='TextMessage'
source='counting_agent_2' models_usage=None content='3' type='TextMessage'
source='counting_agent_1' models_usage=None content='4' type='TextMessage'
source='counting_agent_2' models_usage=None content='5' type='TextMessage'
source='counting_agent_1' models_usage=None content='6' type='TextMessage'

You can take a look at SocietyOfMindAgent for a more complex implementation.

Sequential Chat#

In v0.2, sequential chat is supported by using the initiate_chats function. It takes a list of dictionary configurations as input, one for each step of the sequence. See Sequential Chats in v0.2 for more details.

Based on feedback from the community, the initiate_chats function was too opinionated and not flexible enough to support the diverse set of scenarios that users wanted to implement. We often found users struggling to get it to work when they could easily glue the steps together with basic Python code. Therefore, in v0.4, we do not provide a built-in function for sequential chat in the AgentChat API.

Instead, you can create an event-driven sequential workflow using the Core API and use the other components provided by the AgentChat API to implement each step of the workflow. See the Sequential Workflow example in the Core API Tutorial.
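
If your steps are simple, plain Python over AgentChat agents may be all you need. Here is a minimal sketch where each step's final message becomes the next step's task; the agent names and prompts are illustrative:

import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")
    writer = AssistantAgent("writer", system_message="You are a writer.", model_client=model_client)
    editor = AssistantAgent("editor", system_message="You are an editor; improve the given text.", model_client=model_client)

    # Step 1: write a draft. Step 2: edit the draft.
    draft = await writer.run(task="Write a two-sentence story about a robot.")
    edited = await editor.run(task=draft.messages[-1].to_text())
    print(edited.messages[-1].to_text())

    await model_client.close()

asyncio.run(main())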

We recognize that the concept of workflow is at the heart of many applications, and we will provide more built-in support for workflows in the future.

GPTAssistantAgent#

In v0.2, GPTAssistantAgent was a special agent class backed by the OpenAI Assistant API.

In v0.4, the equivalent is the OpenAIAssistantAgent class. It supports the same set of features as the GPTAssistantAgent in v0.2 and adds more, such as customizable threads and file uploads. See OpenAIAssistantAgent for more details.
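
As a hedged sketch of creating one (the constructor parameters shown here are assumptions; check the OpenAIAssistantAgent reference for the exact signature):

from openai import AsyncOpenAI
from autogen_ext.agents.openai import OpenAIAssistantAgent

client = AsyncOpenAI()  # Reads OPENAI_API_KEY from the environment.
assistant = OpenAIAssistantAgent(
    name="assistant",
    description="An agent backed by the OpenAI Assistant API.",
    client=client,
    model="gpt-4o",
    instructions="You are a helpful assistant.",
)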

Long Context Handling#

In v0.2, long context that overflows the model's context window can be handled by using the transforms capability, which is added to a ConversableAgent after it is constructed.

The feedback from our community has led us to believe this feature is essential and should be a built-in component of AssistantAgent, and usable by every custom agent.

In v0.4, we introduce the ChatCompletionContext base class, which manages the message history and provides a virtual view of the history. Applications can use built-in implementations such as BufferedChatCompletionContext to limit the message history sent to the model, or provide their own implementations that create different virtual views.

To use BufferedChatCompletionContext in an AssistantAgent in a chatbot scenario:

import asyncio
from autogen_agentchat.messages import TextMessage
from autogen_agentchat.agents import AssistantAgent
from autogen_core import CancellationToken
from autogen_core.model_context import BufferedChatCompletionContext
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)

    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant.",
        model_client=model_client,
        model_context=BufferedChatCompletionContext(buffer_size=10), # Model can only view the last 10 messages.
    )
    while True:
        user_input = input("User: ")
        if user_input == "exit":
            break
        response = await assistant.on_messages([TextMessage(content=user_input, source="user")], CancellationToken())
        print("Assistant:", response.chat_message.to_text())
    
    await model_client.close()

asyncio.run(main())

In this example, the chatbot can only read the last 10 messages in the history.

Observability and Control#

In v0.4 AgentChat, you can observe agents by using the on_messages_stream method, which returns an async generator to stream the inner thoughts and actions of the agent. For teams, you can use the run_stream method to stream the inner conversation among the agents in the team. Your application can use these streams to observe the agents and teams in real time.

Both the on_messages_stream and run_stream methods take a CancellationToken as a parameter, which can be used to cancel the output stream asynchronously and stop the agent or team. For teams, you can also use termination conditions to stop the team when a certain condition is met. See the Termination Condition Tutorial for more details.

v0.2 附带的特殊日志模块不同,v0.4 API 仅使用 Python 的 logging 模块来记录事件,例如模型客户端调用。有关详细信息,请参阅核心 API 文档中的日志记录

Code Executors#

The code executors in v0.2 and v0.4 are nearly identical, except that the v0.4 executors support an async API. You can also use a CancellationToken to cancel a code execution if it takes too long. See the Command Line Code Executors Tutorial in the Core API documentation.

We have also added ACADynamicSessionsCodeExecutor, which can use Azure Container Apps (ACA) dynamic sessions for code execution. See the ACA Dynamic Sessions Code Executor Docs.