Migration Guide for v0.2 to v0.4#
This is a migration guide for users of the autogen-agentchat v0.2.* versions to v0.4, which introduces a new set of APIs and features along with some breaking changes. Please read this guide carefully. We still maintain v0.2 in the 0.2 branch; however, we highly recommend that you upgrade to v0.4.
Note
We no longer have admin access to the pyautogen PyPI package, and the releases from that package are no longer from Microsoft since version 0.2.34. To continue using the v0.2 version of AutoGen, install it using autogen-agentchat~=0.2. Please read our clarification statement regarding forks.
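For example, with pip:
pip install "autogen-agentchat~=0.2"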
What is v0.4?#
Since the release of AutoGen in 2023, we have listened closely to our community and to users from small startups and large enterprises, gathering a wealth of feedback. Based on that feedback, we built AutoGen v0.4, a from-the-ground-up rewrite adopting an asynchronous, event-driven architecture to address issues such as observability, flexibility, interactive control, and scale.
The v0.4 API is layered: the Core API is the foundation layer, offering a scalable, event-driven actor framework for creating agentic workflows; the AgentChat API is built on top of the Core API, offering a task-driven, high-level framework for building interactive agentic applications. It is a replacement for AutoGen v0.2.
Most of this guide focuses on v0.4's AgentChat API; however, you can also build your own high-level framework using just the Core API.
New to AutoGen?#
Jump straight to the AgentChat Tutorial to get started with v0.4.
What's in this guide?#
We provide a detailed guide on how to migrate your existing codebase from v0.2 to v0.4.
See each feature below for details on how to migrate.
The following features currently in v0.2 will be provided in future releases of v0.4.*:
Model Client Cost #4835
Teachable Agent
RAG Agent
We will update this guide when the missing features become available.
Model Client#
In v0.2, you configure the model client as follows, and create an OpenAIWrapper object.
from autogen.oai import OpenAIWrapper
config_list = [
    {"model": "gpt-4o", "api_key": "sk-xxx"},
    {"model": "gpt-4o-mini", "api_key": "sk-xxx"},
]
model_client = OpenAIWrapper(config_list=config_list)
Note: In AutoGen 0.2, the OpenAI client tries the configs in the list until one works, whereas 0.4 expects a specific model config to be chosen.
In v0.4, we offer two ways to create a model client.
Using the Component Config#
AutoGen 0.4 has a generic component configuration system. Model clients are a great use case for this. See below for how to create an OpenAI chat completion client.
from autogen_core.models import ChatCompletionClient
config = {
    "provider": "OpenAIChatCompletionClient",
    "config": {
        "model": "gpt-4o",
        "api_key": "sk-xxx",  # os.environ["..."]
    },
}
model_client = ChatCompletionClient.load_component(config)
Using the Model Client Class Directly#
OpenAI
from autogen_ext.models.openai import OpenAIChatCompletionClient
model_client = OpenAIChatCompletionClient(model="gpt-4o", api_key="sk-xxx")
Azure OpenAI
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient
model_client = AzureOpenAIChatCompletionClient(
    azure_deployment="gpt-4o",
    azure_endpoint="https://<your-endpoint>.openai.azure.com/",
    model="gpt-4o",
    api_version="2024-09-01-preview",
    api_key="sk-xxx",
)
Read more about the OpenAIChatCompletionClient.
Model Client for OpenAI-Compatible APIs#
You can use the OpenAIChatCompletionClient to connect to an OpenAI-compatible API, but you need to specify the base_url and model_info.
from autogen_ext.models.openai import OpenAIChatCompletionClient
custom_model_client = OpenAIChatCompletionClient(
    model="custom-model-name",
    base_url="https://custom-model.com/rest/of/the/path",
    api_key="placeholder",
    model_info={
        "vision": True,
        "function_calling": True,
        "json_output": True,
        "family": "unknown",
        "structured_output": True,
    },
)
Note: We don't test all the OpenAI-compatible APIs, and many of them work differently from the OpenAI API even though they may claim to support it. Please test them before using them.
Read about Model Clients in the AgentChat Tutorial, and find more detail in the Core API documentation.
Support for other hosted models will be added in the future.
Model Client Cache#
In v0.2, you can set the cache seed through the cache_seed parameter in the LLM config. The cache is enabled by default.
llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
    "cache_seed": 42,
}
In v0.4, the cache is not enabled by default; to use it, you need to wrap the model client in a ChatCompletionCache.
You can use a DiskCacheStore or RedisStore to store the cache.
pip install -U "autogen-ext[openai, diskcache, redis]"
Here's an example of using diskcache for local caching:
import asyncio
import tempfile
from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.models.cache import ChatCompletionCache, CHAT_CACHE_VALUE_TYPE
from autogen_ext.cache_store.diskcache import DiskCacheStore
from diskcache import Cache
async def main():
    with tempfile.TemporaryDirectory() as tmpdirname:
        # Initialize the original client.
        openai_model_client = OpenAIChatCompletionClient(model="gpt-4o")
        # Then initialize the CacheStore, in this case with diskcache.Cache.
        # You can also use redis like:
        # from autogen_ext.cache_store.redis import RedisStore
        # import redis
        # redis_instance = redis.Redis()
        # cache_store = RedisStore[CHAT_CACHE_VALUE_TYPE](redis_instance)
        cache_store = DiskCacheStore[CHAT_CACHE_VALUE_TYPE](Cache(tmpdirname))
        cache_client = ChatCompletionCache(openai_model_client, cache_store)
        response = await cache_client.create([UserMessage(content="Hello, how are you?", source="user")])
        print(response)  # Should print response from OpenAI
        response = await cache_client.create([UserMessage(content="Hello, how are you?", source="user")])
        print(response)  # Should print cached response
        await openai_model_client.close()
asyncio.run(main())
Assistant Agent#
In v0.2, you create an assistant agent as follows:
from autogen.agentchat import AssistantAgent
llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
}
assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config=llm_config,
)
In v0.4, it is similar, but you specify a model_client instead of an llm_config.
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
model_client = OpenAIChatCompletionClient(model="gpt-4o", api_key="sk-xxx", seed=42, temperature=0)
assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    model_client=model_client,
)
However, the usage is somewhat different. In v0.4, instead of calling assistant.send, you call assistant.on_messages or assistant.on_messages_stream to handle incoming messages. Furthermore, the on_messages and on_messages_stream methods are asynchronous, and the latter returns an async generator to stream the inner thoughts of the agent.
Here is how you can call the assistant agent in v0.4 directly, continuing from the example above:
import asyncio
from autogen_agentchat.messages import TextMessage
from autogen_agentchat.agents import AssistantAgent
from autogen_core import CancellationToken
from autogen_ext.models.openai import OpenAIChatCompletionClient
async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)
    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant.",
        model_client=model_client,
    )
    cancellation_token = CancellationToken()
    response = await assistant.on_messages([TextMessage(content="Hello!", source="user")], cancellation_token)
    print(response)
    await model_client.close()
asyncio.run(main())
The CancellationToken can be used to cancel the request asynchronously when you call cancellation_token.cancel(), which will cause the await on the on_messages call to raise a CancelledError.
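As a minimal sketch of the streaming variant mentioned above, inside the same main function you could replace the on_messages call with an iteration over on_messages_stream; inner events are yielded first and the final item is the Response:
    async for item in assistant.on_messages_stream(
        [TextMessage(content="Hello!", source="user")], cancellation_token
    ):
        print(item)  # Inner events stream first; the last item is the final Response.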
Read more in the Agents Tutorial and about the AssistantAgent.
Multi-Modal Agent#
The AssistantAgent in v0.4 supports multi-modal inputs if the model client does. The vision capability of the model client is used to determine whether the agent supports multi-modal input.
import asyncio
from pathlib import Path
from autogen_agentchat.messages import MultiModalMessage
from autogen_agentchat.agents import AssistantAgent
from autogen_core import CancellationToken, Image
from autogen_ext.models.openai import OpenAIChatCompletionClient
async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)
    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant.",
        model_client=model_client,
    )
    cancellation_token = CancellationToken()
    message = MultiModalMessage(
        content=["Here is an image:", Image.from_file(Path("test.png"))],
        source="user",
    )
    response = await assistant.on_messages([message], cancellation_token)
    print(response)
    await model_client.close()
asyncio.run(main())
User Proxy#
In v0.2, you create a user proxy as follows:
from autogen.agentchat import UserProxyAgent
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    code_execution_config=False,
    llm_config=False,
)
This user proxy would take input from the user through the console, and would terminate if the incoming message ends with "TERMINATE".
In v0.4, a user proxy is simply an agent that takes user input only; no other special configuration is needed. You can create a user proxy as follows:
from autogen_agentchat.agents import UserProxyAgent
user_proxy = UserProxyAgent("user_proxy")
See UserProxyAgent for more details and how to customize the input function with a timeout.
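As a minimal sketch of a custom input function with a timeout (the 10-second limit is an illustrative choice), you can pass an async callable through the input_func parameter:
import asyncio
from typing import Optional
from autogen_core import CancellationToken
from autogen_agentchat.agents import UserProxyAgent
async def timed_input(prompt: str, cancellation_token: Optional[CancellationToken] = None) -> str:
    # Run the blocking input() in a thread and give the user 10 seconds to respond.
    try:
        return await asyncio.wait_for(asyncio.to_thread(input, prompt), timeout=10)
    except asyncio.TimeoutError:
        return "TERMINATE"  # Treat a timeout as a request to stop.
user_proxy = UserProxyAgent("user_proxy", input_func=timed_input)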
RAG Agent#
In v0.2, there were the concepts of a teachable agent and RAG agents that accept a database config.
from autogen.agentchat import ConversableAgent
from autogen.agentchat.contrib.capabilities.teachability import Teachability
teachable_agent = ConversableAgent(
    name="teachable_agent",
    llm_config=llm_config,
)
# Instantiate a Teachability object. Its parameters are all optional.
teachability = Teachability(
    reset_db=False,
    path_to_db_dir="./tmp/interactive/teachability_db",
)
teachability.add_to_agent(teachable_agent)
In v0.4, you can implement a RAG agent using the Memory class. Specifically, you define a memory store class and pass it as a parameter to the assistant agent. See the Memory tutorial for details.
This clear separation of concerns allows you to implement a memory store backed by any database or storage system (you must inherit from the Memory class) and use it with an assistant agent. The example below shows how to use a ChromaDB vector memory store with the assistant agent. In addition, your application logic should decide how and when to add content to the memory store. For example, you may choose to call memory.add for every response from the assistant agent, or use a separate LLM call to determine whether content should be added to the memory store.
# Assumes a `get_weather` tool like the one defined in the Tool Use section below.
import os
from pathlib import Path
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.memory.chromadb import ChromaDBVectorMemory, PersistentChromaDBVectorMemoryConfig
from autogen_ext.models.openai import OpenAIChatCompletionClient
# ...
# Example of a ChromaDBVectorMemory store.
chroma_user_memory = ChromaDBVectorMemory(
    config=PersistentChromaDBVectorMemoryConfig(
        collection_name="preferences",
        persistence_path=os.path.join(str(Path.home()), ".chromadb_autogen"),
        k=2,  # Return top k results
        score_threshold=0.4,  # Minimum similarity score
    )
)
# You can add logic such as a document indexer that adds content to the memory store.
assistant_agent = AssistantAgent(
    name="assistant_agent",
    model_client=OpenAIChatCompletionClient(
        model="gpt-4o",
    ),
    tools=[get_weather],
    memory=[chroma_user_memory],
)
Conversable Agent and Register Reply#
In v0.2, you can create a conversable agent and register a reply function as follows:
from typing import Any, Dict, List, Optional, Tuple, Union
from autogen.agentchat import Agent, ConversableAgent
llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
}
conversable_agent = ConversableAgent(
    name="conversable_agent",
    system_message="You are a helpful assistant.",
    llm_config=llm_config,
    code_execution_config={"work_dir": "coding"},
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
)
def reply_func(
    recipient: ConversableAgent,
    messages: Optional[List[Dict]] = None,
    sender: Optional[Agent] = None,
    config: Optional[Any] = None,
) -> Tuple[bool, Union[str, Dict, None]]:
    # Custom reply logic here
    return True, "Custom reply"
# Register the reply function.
conversable_agent.register_reply([ConversableAgent], reply_func, position=0)
# NOTE: An async reply function will only be invoked with async send.
In v0.4, rather than guessing what the reply_func does, all of its parameters, and what the position should be, we can simply create a custom agent and implement the on_messages, on_reset, and produced_message_types methods.
from typing import Sequence
from autogen_core import CancellationToken
from autogen_agentchat.agents import BaseChatAgent
from autogen_agentchat.messages import TextMessage, BaseChatMessage
from autogen_agentchat.base import Response
class CustomAgent(BaseChatAgent):
    async def on_messages(self, messages: Sequence[BaseChatMessage], cancellation_token: CancellationToken) -> Response:
        return Response(chat_message=TextMessage(content="Custom reply", source=self.name))
    async def on_reset(self, cancellation_token: CancellationToken) -> None:
        pass
    @property
    def produced_message_types(self) -> Sequence[type[BaseChatMessage]]:
        return (TextMessage,)
You can then use the custom agent in the same way as an AssistantAgent. See the Custom Agents tutorial for more details.
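For instance, here is a minimal sketch of calling the custom agent above directly; the agent name, description, and message text are illustrative:
import asyncio
async def main() -> None:
    agent = CustomAgent(name="custom_agent", description="A custom agent.")
    response = await agent.on_messages([TextMessage(content="Hi", source="user")], CancellationToken())
    print(response.chat_message.to_text())
asyncio.run(main())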
Save and Load Agent State#
In v0.2, there was no built-in way to save and load an agent's state: you had to implement it yourself by exporting the chat_messages attribute of ConversableAgent and importing it back through the chat_messages parameter.
In v0.4, you can call the save_state and load_state methods on agents to save and load their state.
import asyncio
import json
from autogen_agentchat.messages import TextMessage
from autogen_agentchat.agents import AssistantAgent
from autogen_core import CancellationToken
from autogen_ext.models.openai import OpenAIChatCompletionClient
async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)
    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant.",
        model_client=model_client,
    )
    cancellation_token = CancellationToken()
    response = await assistant.on_messages([TextMessage(content="Hello!", source="user")], cancellation_token)
    print(response)
    # Save the state.
    state = await assistant.save_state()
    # (Optional) Write state to disk.
    with open("assistant_state.json", "w") as f:
        json.dump(state, f)
    # (Optional) Load it back from disk.
    with open("assistant_state.json", "r") as f:
        state = json.load(f)
    print(state)  # Inspect the state, which contains the chat history.
    # Carry on the chat.
    response = await assistant.on_messages([TextMessage(content="Tell me a joke.", source="user")], cancellation_token)
    print(response)
    # Load the state, causing the agent to revert to the state before the last message.
    await assistant.load_state(state)
    # Carry on the same chat again.
    response = await assistant.on_messages([TextMessage(content="Tell me a joke.", source="user")], cancellation_token)
    # Close the connection to the model client.
    await model_client.close()
asyncio.run(main())
You can also call save_state and load_state on any team, such as RoundRobinGroupChat, to save and load the state of the entire team.
Two-Agent Chat#
In v0.2, you can create a two-agent chat for code execution as follows:
from autogen.coding import LocalCommandLineCodeExecutor
from autogen.agentchat import AssistantAgent, UserProxyAgent
llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
}
assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant. Write all code in python. Reply only 'TERMINATE' if the task is done.",
    llm_config=llm_config,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
)
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    code_execution_config={"code_executor": LocalCommandLineCodeExecutor(work_dir="coding")},
    llm_config=False,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
)
chat_result = user_proxy.initiate_chat(assistant, message="Write a python script to print 'Hello, world!'")
# Intermediate messages are printed to the console directly.
print(chat_result)
To get the same behavior in v0.4, you can use an AssistantAgent and a CodeExecutorAgent together in a RoundRobinGroupChat.
import asyncio
from autogen_agentchat.agents import AssistantAgent, CodeExecutorAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination, MaxMessageTermination
from autogen_agentchat.ui import Console
from autogen_ext.code_executors.local import LocalCommandLineCodeExecutor
from autogen_ext.models.openai import OpenAIChatCompletionClient
async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)
    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant. Write all code in python. Reply only 'TERMINATE' if the task is done.",
        model_client=model_client,
    )
    code_executor = CodeExecutorAgent(
        name="code_executor",
        code_executor=LocalCommandLineCodeExecutor(work_dir="coding"),
    )
    # The termination condition is a combination of text termination and max message termination, either of which will cause the chat to terminate.
    termination = TextMentionTermination("TERMINATE") | MaxMessageTermination(10)
    # The group chat will alternate between the assistant and the code executor.
    group_chat = RoundRobinGroupChat([assistant, code_executor], termination_condition=termination)
    # `run_stream` returns an async generator to stream the intermediate messages.
    stream = group_chat.run_stream(task="Write a python script to print 'Hello, world!'")
    # `Console` is a simple UI to display the stream.
    await Console(stream)
    # Close the connection to the model client.
    await model_client.close()
asyncio.run(main())
Tool Use#
In v0.2, to create a tool-use chatbot you needed two agents, one to call the tools and one to execute them, and you had to initiate a two-agent chat for every user request.
from autogen.agentchat import AssistantAgent, UserProxyAgent, register_function
llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
}
tool_caller = AssistantAgent(
    name="tool_caller",
    system_message="You are a helpful assistant. You can call tools to help user.",
    llm_config=llm_config,
    max_consecutive_auto_reply=1,  # Set to 1 so that we return to the application after each assistant reply as we are building a chatbot.
)
tool_executor = UserProxyAgent(
    name="tool_executor",
    human_input_mode="NEVER",
    code_execution_config=False,
    llm_config=False,
)
def get_weather(city: str) -> str:
    return f"The weather in {city} is 72 degree and sunny."
# Register the tool function with the tool caller and executor.
register_function(get_weather, caller=tool_caller, executor=tool_executor)
while True:
    user_input = input("User: ")
    if user_input == "exit":
        break
    chat_result = tool_executor.initiate_chat(
        tool_caller,
        message=user_input,
        summary_method="reflection_with_llm",  # To let the model reflect on the tool use, set to "last_msg" to return the tool call result directly.
    )
    print("Assistant:", chat_result.summary)
In v0.4, you only need one agent, the AssistantAgent, to handle both the tool calling and the tool execution.
import asyncio
from autogen_core import CancellationToken
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import TextMessage
def get_weather(city: str) -> str:  # Async tool is possible too.
    return f"The weather in {city} is 72 degree and sunny."
async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)
    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant. You can call tools to help user.",
        model_client=model_client,
        tools=[get_weather],
        reflect_on_tool_use=True,  # Set to True to have the model reflect on the tool use, set to False to return the tool call result directly.
    )
    while True:
        user_input = input("User: ")
        if user_input == "exit":
            break
        response = await assistant.on_messages([TextMessage(content=user_input, source="user")], CancellationToken())
        print("Assistant:", response.chat_message.to_text())
    await model_client.close()
asyncio.run(main())
When using tool-equipped agents in a group chat such as RoundRobinGroupChat, you simply add the tools to the agents in the same way as above and create a group chat with the agents, as sketched below.
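A minimal sketch, reusing the get_weather tool from above; the second agent and the agent names are illustrative:
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient
model_client = OpenAIChatCompletionClient(model="gpt-4o")
weather_agent = AssistantAgent(
    name="weather_agent",
    model_client=model_client,
    tools=[get_weather],  # Tool function defined above.
)
summarizer = AssistantAgent(
    name="summarizer",
    model_client=model_client,
    system_message="Summarize the weather report in one sentence.",
)
team = RoundRobinGroupChat([weather_agent, summarizer], termination_condition=MaxMessageTermination(4))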
Chat Result#
In v0.2, you get a ChatResult object from the initiate_chat method. For example:
chat_result = tool_executor.initiate_chat(
    tool_caller,
    message=user_input,
    summary_method="reflection_with_llm",
)
print(chat_result.summary) # Get LLM-reflected summary of the chat.
print(chat_result.chat_history) # Get the chat history.
print(chat_result.cost) # Get the cost of the chat.
print(chat_result.human_input) # Get the human input solicited by the chat.
See the ChatResult Docs for more details.
In v0.4, you get a TaskResult object from a run or run_stream method. The TaskResult object contains the messages, which is the message history of the chat, including both the agents' private (tool calls, etc.) and public messages.
There are some notable differences between TaskResult and ChatResult:
The messages list in TaskResult uses a different message format than the ChatResult.chat_history list.
There is no summary field. It is up to the application to decide how to summarize the chat from the messages list.
human_input is not provided in the TaskResult object, as the user input can be extracted from the messages list by filtering on the source field.
cost is not provided in the TaskResult object; however, you can calculate the cost based on token usage, as sketched below. Adding cost calculation would be a great community extension. See community extensions.
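Here is a minimal sketch of estimating cost from the token usage recorded on each message; the per-1k-token prices are illustrative placeholders, not real pricing:
from autogen_agentchat.base import TaskResult
def estimate_cost(result: TaskResult, prompt_price_per_1k: float = 0.005, completion_price_per_1k: float = 0.015) -> float:
    total = 0.0
    for message in result.messages:
        usage = message.models_usage  # None if the message did not involve a model call.
        if usage is not None:
            total += usage.prompt_tokens / 1000 * prompt_price_per_1k
            total += usage.completion_tokens / 1000 * completion_price_per_1k
    return total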
Conversion Between v0.2 and v0.4 Messages#
You can use the following conversion functions to convert between a v0.4 message in autogen_agentchat.base.TaskResult.messages and a v0.2 message in ChatResult.chat_history.
from typing import Any, Dict, List, Literal
from autogen_agentchat.messages import (
BaseAgentEvent,
BaseChatMessage,
HandoffMessage,
MultiModalMessage,
StopMessage,
TextMessage,
ToolCallExecutionEvent,
ToolCallRequestEvent,
ToolCallSummaryMessage,
)
from autogen_core import FunctionCall, Image
from autogen_core.models import FunctionExecutionResult
def convert_to_v02_message(
    message: BaseAgentEvent | BaseChatMessage,
    role: Literal["assistant", "user", "tool"],
    image_detail: Literal["auto", "high", "low"] = "auto",
) -> Dict[str, Any]:
    """Convert a v0.4 AgentChat message to a v0.2 message.
    Args:
        message (BaseAgentEvent | BaseChatMessage): The message to convert.
        role (Literal["assistant", "user", "tool"]): The role of the message.
        image_detail (Literal["auto", "high", "low"], optional): The detail level of image content in multi-modal message. Defaults to "auto".
    Returns:
        Dict[str, Any]: The converted AutoGen v0.2 message.
    """
    v02_message: Dict[str, Any] = {}
    if isinstance(message, TextMessage | StopMessage | HandoffMessage | ToolCallSummaryMessage):
        v02_message = {"content": message.content, "role": role, "name": message.source}
    elif isinstance(message, MultiModalMessage):
        v02_message = {"content": [], "role": role, "name": message.source}
        for modal in message.content:
            if isinstance(modal, str):
                v02_message["content"].append({"type": "text", "text": modal})
            elif isinstance(modal, Image):
                v02_message["content"].append(modal.to_openai_format(detail=image_detail))
            else:
                raise ValueError(f"Invalid multimodal message content: {modal}")
    elif isinstance(message, ToolCallRequestEvent):
        v02_message = {"tool_calls": [], "role": "assistant", "content": None, "name": message.source}
        for tool_call in message.content:
            v02_message["tool_calls"].append(
                {
                    "id": tool_call.id,
                    "type": "function",
                    "function": {"name": tool_call.name, "args": tool_call.arguments},
                }
            )
    elif isinstance(message, ToolCallExecutionEvent):
        tool_responses: List[Dict[str, str]] = []
        for tool_result in message.content:
            tool_responses.append(
                {
                    "tool_call_id": tool_result.call_id,
                    "role": "tool",
                    "content": tool_result.content,
                }
            )
        content = "\n\n".join([response["content"] for response in tool_responses])
        v02_message = {"tool_responses": tool_responses, "role": "tool", "content": content}
    else:
        raise ValueError(f"Invalid message type: {type(message)}")
    return v02_message
def convert_to_v04_message(message: Dict[str, Any]) -> BaseAgentEvent | BaseChatMessage:
    """Convert a v0.2 message to a v0.4 AgentChat message."""
    if "tool_calls" in message:
        tool_calls: List[FunctionCall] = []
        for tool_call in message["tool_calls"]:
            tool_calls.append(
                FunctionCall(
                    id=tool_call["id"],
                    name=tool_call["function"]["name"],
                    arguments=tool_call["function"]["args"],
                )
            )
        return ToolCallRequestEvent(source=message["name"], content=tool_calls)
    elif "tool_responses" in message:
        tool_results: List[FunctionExecutionResult] = []
        for tool_response in message["tool_responses"]:
            tool_results.append(
                FunctionExecutionResult(
                    call_id=tool_response["tool_call_id"],
                    content=tool_response["content"],
                    is_error=False,
                    name=tool_response.get("name", ""),  # v0.2 tool responses may not include the function name.
                )
            )
        return ToolCallExecutionEvent(source="tools", content=tool_results)
    elif isinstance(message["content"], list):
        content: List[str | Image] = []
        for modal in message["content"]:  # type: ignore
            if modal["type"] == "text":  # type: ignore
                content.append(modal["text"])  # type: ignore
            else:
                content.append(Image.from_uri(modal["image_url"]["url"]))  # type: ignore
        return MultiModalMessage(content=content, source=message["name"])
    elif isinstance(message["content"], str):
        return TextMessage(content=message["content"], source=message["name"])
    else:
        raise ValueError(f"Unable to convert message: {message}")
Group Chat#
In v0.2, you needed to create a GroupChat class and pass it into a GroupChatManager, and have a participant that is a user proxy to initiate the chat. For a simple scenario of a writer and a critic, you can do the following:
from autogen.agentchat import AssistantAgent, GroupChat, GroupChatManager
llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": "sk-xxx"}],
    "seed": 42,
    "temperature": 0,
}
writer = AssistantAgent(
    name="writer",
    description="A writer.",
    system_message="You are a writer.",
    llm_config=llm_config,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("APPROVE"),
)
critic = AssistantAgent(
    name="critic",
    description="A critic.",
    system_message="You are a critic, provide feedback on the writing. Reply only 'APPROVE' if the task is done.",
    llm_config=llm_config,
)
# Create a group chat with the writer and critic.
groupchat = GroupChat(agents=[writer, critic], messages=[], max_round=12)
# Create a group chat manager to manage the group chat, use round-robin selection method.
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config, speaker_selection_method="round_robin")
# Initiate the chat with the writer; intermediate messages are printed to the console directly.
result = writer.initiate_chat(
    manager,
    message="Write a short story about a robot that discovers it has feelings.",
)
print(result.summary)
In v0.4, you can use a RoundRobinGroupChat to achieve the same behavior.
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)
    writer = AssistantAgent(
        name="writer",
        description="A writer.",
        system_message="You are a writer.",
        model_client=model_client,
    )
    critic = AssistantAgent(
        name="critic",
        description="A critic.",
        system_message="You are a critic, provide feedback on the writing. Reply only 'APPROVE' if the task is done.",
        model_client=model_client,
    )
    # The termination condition is a text termination, which will cause the chat to terminate when the text "APPROVE" is received.
    termination = TextMentionTermination("APPROVE")
    # The group chat will alternate between the writer and the critic.
    group_chat = RoundRobinGroupChat([writer, critic], termination_condition=termination, max_turns=12)
    # `run_stream` returns an async generator to stream the intermediate messages.
    stream = group_chat.run_stream(task="Write a short story about a robot that discovers it has feelings.")
    # `Console` is a simple UI to display the stream.
    await Console(stream)
    # Close the connection to the model client.
    await model_client.close()
asyncio.run(main())
For LLM-based speaker selection, you can use SelectorGroupChat instead. See the Selector Group Chat Tutorial and SelectorGroupChat for more details.
Note: In v0.4, you do not need to register functions on a user proxy to use tools in a group chat. You can simply pass the tool functions to an AssistantAgent as shown in the Tool Use section. The agent will automatically call the tools when needed. If your tool does not return a well-formed string, you can use the reflect_on_tool_use parameter to have the model reflect on the tool use.
Group Chat with Resume#
In v0.2, resuming a group chat was a bit complicated. You needed to explicitly save the group chat messages and load them back when you wanted to resume the chat. See Resuming a Group Chat in v0.2 for more details.
In v0.4, you can simply call run or run_stream again with the same group chat object to resume the chat. To export and load the state, you can use the save_state and load_state methods.
import asyncio
import json
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
def create_team(model_client: OpenAIChatCompletionClient) -> RoundRobinGroupChat:
    writer = AssistantAgent(
        name="writer",
        description="A writer.",
        system_message="You are a writer.",
        model_client=model_client,
    )
    critic = AssistantAgent(
        name="critic",
        description="A critic.",
        system_message="You are a critic, provide feedback on the writing. Reply only 'APPROVE' if the task is done.",
        model_client=model_client,
    )
    # The termination condition is a text termination, which will cause the chat to terminate when the text "APPROVE" is received.
    termination = TextMentionTermination("APPROVE")
    # The group chat will alternate between the writer and the critic.
    group_chat = RoundRobinGroupChat([writer, critic], termination_condition=termination)
    return group_chat
async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)
    # Create team.
    group_chat = create_team(model_client)
    # `run_stream` returns an async generator to stream the intermediate messages.
    stream = group_chat.run_stream(task="Write a short story about a robot that discovers it has feelings.")
    # `Console` is a simple UI to display the stream.
    await Console(stream)
    # Save the state of the group chat and all participants.
    state = await group_chat.save_state()
    with open("group_chat_state.json", "w") as f:
        json.dump(state, f)
    # Create a new team with the same participants configuration.
    group_chat = create_team(model_client)
    # Load the state of the group chat and all participants.
    with open("group_chat_state.json", "r") as f:
        state = json.load(f)
    await group_chat.load_state(state)
    # Resume the chat.
    stream = group_chat.run_stream(task="Translate the story into Chinese.")
    await Console(stream)
    # Close the connection to the model client.
    await model_client.close()
asyncio.run(main())
Save and Load Group Chat State#
In v0.2, you needed to explicitly save the group chat messages and load them back when you wanted to resume the chat.
In v0.4, you can simply call the save_state and load_state methods on the group chat object. See Group Chat with Resume for an example.
Group Chat with Tool Use#
In a v0.2 group chat, when tools were involved, you needed to register the tool functions on a user proxy and include the user proxy in the group chat. Tool calls made by other agents would be routed to the user proxy for execution.
We have observed numerous issues with this approach, such as the tool call routing not working as expected, and tool call requests and results not being accepted by models that lack support for function calling.
In v0.4, there is no need to register the tool functions on a user proxy, as the tools are executed directly within the AssistantAgent, which publishes the tool's response to the group chat. So the group chat manager does not need to be involved in routing tool calls.
See the Selector Group Chat Tutorial for an example of using tools in a group chat.
Group Chat with Custom Selector (Stateflow)#
In a v0.2 group chat, when the speaker_selection_method is set to a custom function, it can override the default selection method. This is useful for implementing a state-based selection method. For more details, see Custom Speaker Selection in v0.2.
In v0.4, you can use a SelectorGroupChat with a selector_func to achieve the same behavior. The selector_func is a function that takes the current message thread of the group chat and returns the next speaker's name. If None is returned, the LLM-based selection method is used.
Here is an example of using the state-based selection method to implement a web search/analysis scenario.
import asyncio
from typing import Sequence
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import MaxMessageTermination, TextMentionTermination
from autogen_agentchat.messages import BaseAgentEvent, BaseChatMessage
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
# Note: This example uses mock tools instead of real APIs for demonstration purposes
def search_web_tool(query: str) -> str:
    if "2006-2007" in query:
        return """Here are the total points scored by Miami Heat players in the 2006-2007 season:
        Udonis Haslem: 844 points
        Dwayne Wade: 1397 points
        James Posey: 550 points
        ...
        """
    elif "2007-2008" in query:
        return "The number of total rebounds for Dwayne Wade in the Miami Heat season 2007-2008 is 214."
    elif "2008-2009" in query:
        return "The number of total rebounds for Dwayne Wade in the Miami Heat season 2008-2009 is 398."
    return "No data found."
def percentage_change_tool(start: float, end: float) -> float:
    return ((end - start) / start) * 100
def create_team(model_client: OpenAIChatCompletionClient) -> SelectorGroupChat:
    planning_agent = AssistantAgent(
        "PlanningAgent",
        description="An agent for planning tasks, this agent should be the first to engage when given a new task.",
        model_client=model_client,
        system_message="""
        You are a planning agent.
        Your job is to break down complex tasks into smaller, manageable subtasks.
        Your team members are:
            Web search agent: Searches for information
            Data analyst: Performs calculations
        You only plan and delegate tasks - you do not execute them yourself.
        When assigning tasks, use this format:
        1. <agent> : <task>
        After all tasks are complete, summarize the findings and end with "TERMINATE".
        """,
    )
    web_search_agent = AssistantAgent(
        "WebSearchAgent",
        description="A web search agent.",
        tools=[search_web_tool],
        model_client=model_client,
        system_message="""
        You are a web search agent.
        Your only tool is search_tool - use it to find information.
        You make only one search call at a time.
        Once you have the results, you never do calculations based on them.
        """,
    )
    data_analyst_agent = AssistantAgent(
        "DataAnalystAgent",
        description="A data analyst agent. Useful for performing calculations.",
        model_client=model_client,
        tools=[percentage_change_tool],
        system_message="""
        You are a data analyst.
        Given the tasks you have been assigned, you should analyze the data and provide results using the tools provided.
        """,
    )
    # The termination condition is a combination of text mention termination and max message termination.
    text_mention_termination = TextMentionTermination("TERMINATE")
    max_messages_termination = MaxMessageTermination(max_messages=25)
    termination = text_mention_termination | max_messages_termination
    # The selector function is a function that takes the current message thread of the group chat
    # and returns the next speaker's name. If None is returned, the LLM-based selection method will be used.
    def selector_func(messages: Sequence[BaseAgentEvent | BaseChatMessage]) -> str | None:
        if messages[-1].source != planning_agent.name:
            return planning_agent.name  # Always return to the planning agent after the other agents have spoken.
        return None
    team = SelectorGroupChat(
        [planning_agent, web_search_agent, data_analyst_agent],
        model_client=OpenAIChatCompletionClient(model="gpt-4o-mini"),  # Use a smaller model for the selector.
        termination_condition=termination,
        selector_func=selector_func,
    )
    return team
async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")
    team = create_team(model_client)
    task = "Who was the Miami Heat player with the highest points in the 2006-2007 season, and what was the percentage change in his total rebounds between the 2007-2008 and 2008-2009 seasons?"
    await Console(team.run_stream(task=task))
asyncio.run(main())
Nested Chat#
Nested chat allows you to nest a whole team or another agent inside an agent. This is useful for creating a hierarchy of agents or "information silos", as the nested agents cannot communicate directly with other agents outside of the same group.
In v0.2, nested chat is supported by using the register_nested_chats method on the ConversableAgent class. You need to specify the nested sequence of agents using dictionaries; see Nested Chats in v0.2 for more details.
In v0.4, nested chat is an implementation detail of a custom agent. You can create a custom agent that takes a team or another agent as a parameter and implements the on_messages method to trigger the nested team or agent. It is up to the application to decide how to pass or transform the messages from and to the nested team or agent.
The following example shows a simple nested chat that counts numbers.
import asyncio
from typing import Sequence
from autogen_core import CancellationToken
from autogen_agentchat.agents import BaseChatAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.messages import TextMessage, BaseChatMessage
from autogen_agentchat.base import Response
class CountingAgent(BaseChatAgent):
    """An agent that returns a new number by adding 1 to the last number in the input messages."""
    async def on_messages(self, messages: Sequence[BaseChatMessage], cancellation_token: CancellationToken) -> Response:
        if len(messages) == 0:
            last_number = 0  # Start from 0 if no messages are given.
        else:
            assert isinstance(messages[-1], TextMessage)
            last_number = int(messages[-1].content)  # Otherwise, start from the last number.
        return Response(chat_message=TextMessage(content=str(last_number + 1), source=self.name))
    async def on_reset(self, cancellation_token: CancellationToken) -> None:
        pass
    @property
    def produced_message_types(self) -> Sequence[type[BaseChatMessage]]:
        return (TextMessage,)
class NestedCountingAgent(BaseChatAgent):
    """An agent that increments the last number in the input messages
    multiple times using a nested counting team."""
    def __init__(self, name: str, counting_team: RoundRobinGroupChat) -> None:
        super().__init__(name, description="An agent that counts numbers.")
        self._counting_team = counting_team
    async def on_messages(self, messages: Sequence[BaseChatMessage], cancellation_token: CancellationToken) -> Response:
        # Run the inner team with the given messages and return the last message produced by the team.
        result = await self._counting_team.run(task=messages, cancellation_token=cancellation_token)
        # To stream the inner messages, implement `on_messages_stream` and use that to implement `on_messages`.
        assert isinstance(result.messages[-1], TextMessage)
        return Response(chat_message=result.messages[-1], inner_messages=result.messages[len(messages):-1])
    async def on_reset(self, cancellation_token: CancellationToken) -> None:
        # Reset the inner team.
        await self._counting_team.reset()
    @property
    def produced_message_types(self) -> Sequence[type[BaseChatMessage]]:
        return (TextMessage,)
async def main() -> None:
    # Create a team of two counting agents as the inner team.
    counting_agent_1 = CountingAgent("counting_agent_1", description="An agent that counts numbers.")
    counting_agent_2 = CountingAgent("counting_agent_2", description="An agent that counts numbers.")
    counting_team = RoundRobinGroupChat([counting_agent_1, counting_agent_2], max_turns=5)
    # Create a nested counting agent that takes the inner team as a parameter.
    nested_counting_agent = NestedCountingAgent("nested_counting_agent", counting_team)
    # Run the nested counting agent with a message starting from 1.
    response = await nested_counting_agent.on_messages([TextMessage(content="1", source="user")], CancellationToken())
    assert response.inner_messages is not None
    for message in response.inner_messages:
        print(message)
    print(response.chat_message)
asyncio.run(main())
You should see the following output:
source='counting_agent_1' models_usage=None content='2' type='TextMessage'
source='counting_agent_2' models_usage=None content='3' type='TextMessage'
source='counting_agent_1' models_usage=None content='4' type='TextMessage'
source='counting_agent_2' models_usage=None content='5' type='TextMessage'
source='counting_agent_1' models_usage=None content='6' type='TextMessage'
You can take a look at SocietyOfMindAgent for a more complex implementation.
Sequential Chat#
In v0.2, sequential chat is supported by using the initiate_chats function. It takes as input a list of dictionary configurations for each step of the sequence. See Sequential Chats in v0.2 for more details.
Based on feedback from the community, the initiate_chats function was too opinionated and not flexible enough to support the diverse scenarios that users wanted to implement. We often found users struggling to get initiate_chats to work when they could easily glue the steps together with basic Python code. Therefore, in v0.4 we do not provide a built-in function for sequential chats in the AgentChat API.
Instead, you can create an event-driven sequential workflow using the Core API, and use the other components provided by the AgentChat API to implement each step of the workflow. See the sequential workflow example in the Core API Tutorial. A plain-Python sketch of chaining two agents is shown below.
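This is a minimal sketch of the "glue it together with Python" approach mentioned above; the two agents and the task are illustrative:
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")
    writer = AssistantAgent(name="writer", system_message="You are a writer.", model_client=model_client)
    editor = AssistantAgent(name="editor", system_message="You are an editor. Improve the text you are given.", model_client=model_client)
    # Step 1: write a draft.
    draft = await writer.run(task="Write a haiku about autumn.")
    # Step 2: pass the last message of step 1 to the next agent.
    edited = await editor.run(task=draft.messages[-1].to_text())
    print(edited.messages[-1].to_text())
    await model_client.close()
asyncio.run(main())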
We recognize that the concept of a workflow is at the heart of many applications, and we will provide more built-in support for workflows in the future.
GPTAssistantAgent#
In v0.2, GPTAssistantAgent is a special agent class backed by the OpenAI Assistant API.
In v0.4, the equivalent is the OpenAIAssistantAgent class. It supports the same set of features as the GPTAssistantAgent in v0.2, plus more, such as customizable threads and file uploads. See OpenAIAssistantAgent for more details.
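Here is a minimal sketch of creating one, assuming an OpenAI API key in the environment; the name, model, and instructions are illustrative:
from openai import AsyncOpenAI
from autogen_ext.agents.openai import OpenAIAssistantAgent
client = AsyncOpenAI()  # Reads OPENAI_API_KEY from the environment.
assistant = OpenAIAssistantAgent(
    name="assistant",
    description="An agent backed by the OpenAI Assistant API.",
    client=client,
    model="gpt-4o",
    instructions="You are a helpful assistant.",
)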
Long-Context Handling#
In v0.2, long context that overflows the model's context window can be handled with the transforms capability, which is added to a ConversableAgent after it is constructed.
Feedback from our community has led us to believe this feature is essential and should be a built-in component of AssistantAgent, usable with every custom agent.
In v0.4, we introduce the ChatCompletionContext base class, which manages the message history and provides a virtual view of the history. Applications can use built-in implementations such as BufferedChatCompletionContext to limit the history sent to the model, or provide their own implementations that create different virtual views.
To use a BufferedChatCompletionContext in an AssistantAgent in a chatbot scenario:
import asyncio
from autogen_agentchat.messages import TextMessage
from autogen_agentchat.agents import AssistantAgent
from autogen_core import CancellationToken
from autogen_core.model_context import BufferedChatCompletionContext
from autogen_ext.models.openai import OpenAIChatCompletionClient
async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", seed=42, temperature=0)
    assistant = AssistantAgent(
        name="assistant",
        system_message="You are a helpful assistant.",
        model_client=model_client,
        model_context=BufferedChatCompletionContext(buffer_size=10),  # Model can only view the last 10 messages.
    )
    while True:
        user_input = input("User: ")
        if user_input == "exit":
            break
        response = await assistant.on_messages([TextMessage(content=user_input, source="user")], CancellationToken())
        print("Assistant:", response.chat_message.to_text())
    await model_client.close()
asyncio.run(main())
In this example, the chatbot can only read the last 10 messages in the history.
Observability and Control#
In v0.4 AgentChat, you can observe agents by using the on_messages_stream method, which returns an async generator to stream the agent's inner thoughts and actions. For teams, you can use the run_stream method to stream the inner conversation among the agents in the team. Your application can use these streams to observe the agents and teams in real time.
Both the on_messages_stream and run_stream methods take a CancellationToken as a parameter, which can be used to cancel the output stream asynchronously and stop the agent or team. For teams, you can also use termination conditions to stop the team when a desired condition is met. See the Termination Condition Tutorial for more details. A cancellation sketch follows.
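A minimal sketch of cancelling a run from elsewhere in the application, assuming a team built as in the examples above:
import asyncio
from autogen_core import CancellationToken
async def run_and_cancel(team) -> None:
    cancellation_token = CancellationToken()
    run = asyncio.create_task(team.run(task="Write a short story.", cancellation_token=cancellation_token))
    await asyncio.sleep(1.0)  # Let the run proceed for a moment.
    cancellation_token.cancel()  # Cancel the run asynchronously.
    try:
        await run
    except asyncio.CancelledError:
        print("The run was cancelled.")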
Unlike v0.2, which comes with a special logging module, the v0.4 API simply uses Python's logging module to log events such as model client calls. See Logging in the Core API documentation for more details.
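For example, here is a minimal sketch of turning on AgentChat's event logs with the standard logging module:
import logging
from autogen_agentchat import EVENT_LOGGER_NAME
logging.basicConfig(level=logging.WARNING)
logging.getLogger(EVENT_LOGGER_NAME).setLevel(logging.INFO)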
Code Executors#
The code executors in v0.2 and v0.4 are nearly identical, except that the v0.4 executors support an async API. You can also use a CancellationToken to cancel a code execution if it takes too long. See the Command Line Code Executors Tutorial in the Core API documentation. A short sketch of the async API follows.
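A minimal sketch of the async execution API, assuming a local working directory named coding:
import asyncio
from autogen_core import CancellationToken
from autogen_core.code_executor import CodeBlock
from autogen_ext.code_executors.local import LocalCommandLineCodeExecutor
async def main() -> None:
    executor = LocalCommandLineCodeExecutor(work_dir="coding")
    result = await executor.execute_code_blocks(
        [CodeBlock(language="python", code="print('Hello, world!')")],
        CancellationToken(),
    )
    print(result.output)
asyncio.run(main())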
We also added ACADynamicSessionsCodeExecutor, which can use Azure Container Apps (ACA) dynamic sessions for code execution. See the ACA Dynamic Sessions Code Executor Docs for more details.