autogen_ext.tools.graphrag#

class GlobalSearchTool(token_encoder: Encoding, model: ChatModel, data_config: GlobalDataConfig, context_config: GlobalContextConfig = _default_context_config, mapreduce_config: MapReduceConfig = _default_mapreduce_config)[source]#

Bases: BaseTool[GlobalSearchToolArgs, GlobalSearchToolReturn]

Enables running GraphRAG global search queries as an AutoGen tool.

This tool allows you to perform semantic search over a document corpus using the GraphRAG framework. The search combines graph-based document relationships with semantic embeddings to find relevant information.

Note

This tool requires the graphrag extra for the autogen-ext package.

To install:

pip install -U "autogen-agentchat" "autogen-ext[graphrag]"

Before using this tool, you must complete the GraphRAG setup and indexing process:

  1. Follow the GraphRAG documentation to initialize your project and settings

  2. Configure and tune your prompts for your specific use case

  3. Run the indexing process to generate the required data files

  4. Make sure you have the settings.yaml file produced during setup

For detailed instructions on completing these prerequisite steps, please refer to the [GraphRAG documentation](https://msdocs.cn/graphrag/).
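
As a rough sketch of those prerequisite steps (the exact commands may differ between GraphRAG versions, so verify them against the GraphRAG documentation; the ./ragtest directory is illustrative):

graphrag init --root ./ragtest
graphrag index --root ./ragtest

This produces the parquet data tables and the settings.yaml file that the search tools below read.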

Example usage with AssistantAgent:

import asyncio
from pathlib import Path
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.ui import Console
from autogen_ext.tools.graphrag import GlobalSearchTool
from autogen_agentchat.agents import AssistantAgent


async def main():
    # Initialize the OpenAI client
    openai_client = OpenAIChatCompletionClient(
        model="gpt-4o-mini",
        api_key="<api-key>",
    )

    # Set up global search tool
    global_tool = GlobalSearchTool.from_settings(root_dir=Path("./"), config_filepath=Path("./settings.yaml"))

    # Create assistant agent with the global search tool
    assistant_agent = AssistantAgent(
        name="search_assistant",
        tools=[global_tool],
        model_client=openai_client,
        system_message=(
            "You are a tool selector AI assistant using the GraphRAG framework. "
            "Your primary task is to determine the appropriate search tool to call based on the user's query. "
            "For broader, abstract questions requiring a comprehensive understanding of the dataset, call the 'global_search' function."
        ),
    )

    # Run a sample query
    query = "What is the overall sentiment of the community reports?"
    await Console(assistant_agent.run_stream(task=query))


if __name__ == "__main__":
    asyncio.run(main())

async run(args: GlobalSearchToolArgs, cancellation_token: CancellationToken) GlobalSearchToolReturn[source]#
classmethod from_settings(root_dir: str | Path, config_filepath: str | Path | None = None) GlobalSearchTool[source]#

Create a GlobalSearchTool instance from a GraphRAG settings file.

Parameters:
  • root_dir – Path to the GraphRAG root directory

  • config_filepath – Path to the GraphRAG settings file (optional)

Returns:

An initialized GlobalSearchTool instance
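
The tool can also be invoked directly, without wrapping it in an agent. A minimal sketch, assuming the GraphRAG index referenced by settings.yaml has already been built (the query string is illustrative):

import asyncio
from pathlib import Path
from autogen_core import CancellationToken
from autogen_ext.tools.graphrag import GlobalSearchTool, GlobalSearchToolArgs


async def query_directly():
    # Load the tool from an existing GraphRAG project
    global_tool = GlobalSearchTool.from_settings(root_dir=Path("./"), config_filepath=Path("./settings.yaml"))
    # run() takes the validated arguments model and a cancellation token
    result = await global_tool.run(GlobalSearchToolArgs(query="Summarize the main themes."), CancellationToken())
    print(result.answer)


asyncio.run(query_directly())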

class LocalSearchTool(token_encoder: Encoding, model: ChatModel, embedder: EmbeddingModel, data_config: LocalDataConfig, context_config: LocalContextConfig = _default_context_config, search_config: SearchConfig = _default_search_config)[source]#

Bases: BaseTool[LocalSearchToolArgs, LocalSearchToolReturn]

Enables running GraphRAG local search queries as an AutoGen tool.

This tool allows you to perform semantic search over a document corpus using the GraphRAG framework. The search combines local document context with semantic embeddings to find relevant information.

Note

This tool requires the graphrag extra for the autogen-ext package. To install:

pip install -U "autogen-agentchat" "autogen-ext[graphrag]"

Before using this tool, you must complete the GraphRAG setup and indexing process:

  1. Follow the GraphRAG documentation to initialize your project and settings

  2. Configure and tune your prompts for your specific use case

  3. Run the indexing process to generate the required data files

  4. Make sure you have the settings.yaml file produced during setup

For detailed instructions on completing these prerequisite steps, please refer to the [GraphRAG documentation](https://msdocs.cn/graphrag/).

Example usage with AssistantAgent:

import asyncio
from pathlib import Path
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.ui import Console
from autogen_ext.tools.graphrag import LocalSearchTool
from autogen_agentchat.agents import AssistantAgent


async def main():
    # Initialize the OpenAI client
    openai_client = OpenAIChatCompletionClient(
        model="gpt-4o-mini",
        api_key="<api-key>",
    )

    # Set up local search tool
    local_tool = LocalSearchTool.from_settings(root_dir=Path("./"), config_filepath=Path("./settings.yaml"))

    # Create assistant agent with the local search tool
    assistant_agent = AssistantAgent(
        name="search_assistant",
        tools=[local_tool],
        model_client=openai_client,
        system_message=(
            "You are a tool selector AI assistant using the GraphRAG framework. "
            "Your primary task is to determine the appropriate search tool to call based on the user's query. "
            "For specific, detailed information about particular entities or relationships, call the 'local_search' function."
        ),
    )

    # Run a sample query
    query = "What does the station-master say about Dr. Becher?"
    await Console(assistant_agent.run_stream(task=query))


if __name__ == "__main__":
    asyncio.run(main())

Parameters:
  • token_encoder (tiktoken.Encoding) – The tokenizer used for text encoding

  • model – The chat model to use for search (a GraphRAG ChatModel)

  • embedder – The text embedding model to use (a GraphRAG EmbeddingModel)

  • data_config (DataConfig) – Configuration for data source locations and settings

  • context_config (LocalContextConfig, optional) – Configuration for context building. Defaults to the default configuration.

  • search_config (SearchConfig, optional) – Configuration for search operations. Defaults to the default configuration.

async run(args: LocalSearchToolArgs, cancellation_token: CancellationToken) LocalSearchToolReturn[source]#
classmethod from_settings(root_dir: Path, config_filepath: Path | None = None) LocalSearchTool[source]#

Create a LocalSearchTool instance from a GraphRAG settings file.

Parameters:
  • root_dir – Path to the GraphRAG root directory

  • config_filepath – Path to the GraphRAG settings file (optional)

Returns:

An initialized LocalSearchTool instance

pydantic model GlobalDataConfig[source]#

Bases: DataConfig

JSON schema:
{
   "title": "GlobalDataConfig",
   "type": "object",
   "properties": {
      "input_dir": {
         "title": "Input Dir",
         "type": "string"
      },
      "entity_table": {
         "default": "entities",
         "title": "Entity Table",
         "type": "string"
      },
      "entity_embedding_table": {
         "default": "entities",
         "title": "Entity Embedding Table",
         "type": "string"
      },
      "community_table": {
         "default": "communities",
         "title": "Community Table",
         "type": "string"
      },
      "community_level": {
         "default": 2,
         "title": "Community Level",
         "type": "integer"
      },
      "community_report_table": {
         "default": "community_reports",
         "title": "Community Report Table",
         "type": "string"
      }
   },
   "required": [
      "input_dir"
   ]
}

Fields:
  • community_report_table (str)

field community_report_table: str = 'community_reports'#
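
Grounded in the schema above, a GlobalDataConfig only requires input_dir; the table names and community_level have defaults. A minimal sketch (the path is illustrative):

from autogen_ext.tools.graphrag import GlobalDataConfig

# Point at the parquet output directory produced by GraphRAG indexing
data_config = GlobalDataConfig(input_dir="./output", community_level=1)
print(data_config.community_report_table)  # "community_reports" by default
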
pydantic model LocalDataConfig[source]#

Bases: DataConfig

JSON schema:
{
   "title": "LocalDataConfig",
   "type": "object",
   "properties": {
      "input_dir": {
         "title": "Input Dir",
         "type": "string"
      },
      "entity_table": {
         "default": "entities",
         "title": "Entity Table",
         "type": "string"
      },
      "entity_embedding_table": {
         "default": "entities",
         "title": "Entity Embedding Table",
         "type": "string"
      },
      "community_table": {
         "default": "communities",
         "title": "Community Table",
         "type": "string"
      },
      "community_level": {
         "default": 2,
         "title": "Community Level",
         "type": "integer"
      },
      "relationship_table": {
         "default": "relationships",
         "title": "Relationship Table",
         "type": "string"
      },
      "text_unit_table": {
         "default": "text_units",
         "title": "Text Unit Table",
         "type": "string"
      }
   },
   "required": [
      "input_dir"
   ]
}

Fields:
  • relationship_table (str)

  • text_unit_table (str)

field relationship_table: str = 'relationships'#
field text_unit_table: str = 'text_units'#
pydantic model GlobalContextConfig[source]#

Bases: ContextConfig

JSON schema:
{
   "title": "GlobalContextConfig",
   "type": "object",
   "properties": {
      "max_data_tokens": {
         "default": 12000,
         "title": "Max Data Tokens",
         "type": "integer"
      },
      "use_community_summary": {
         "default": false,
         "title": "Use Community Summary",
         "type": "boolean"
      },
      "shuffle_data": {
         "default": true,
         "title": "Shuffle Data",
         "type": "boolean"
      },
      "include_community_rank": {
         "default": true,
         "title": "Include Community Rank",
         "type": "boolean"
      },
      "min_community_rank": {
         "default": 0,
         "title": "Min Community Rank",
         "type": "integer"
      },
      "community_rank_name": {
         "default": "rank",
         "title": "Community Rank Name",
         "type": "string"
      },
      "include_community_weight": {
         "default": true,
         "title": "Include Community Weight",
         "type": "boolean"
      },
      "community_weight_name": {
         "default": "occurrence weight",
         "title": "Community Weight Name",
         "type": "string"
      },
      "normalize_community_weight": {
         "default": true,
         "title": "Normalize Community Weight",
         "type": "boolean"
      }
   }
}

Fields:
  • community_rank_name (str)

  • community_weight_name (str)

  • include_community_rank (bool)

  • include_community_weight (bool)

  • max_data_tokens (int)

  • min_community_rank (int)

  • normalize_community_weight (bool)

  • shuffle_data (bool)

  • use_community_summary (bool)

field use_community_summary: bool = False#
field shuffle_data: bool = True#
field include_community_rank: bool = True#
field min_community_rank: int = 0#
field community_rank_name: str = 'rank'#
field include_community_weight: bool = True#
field community_weight_name: str = 'occurrence weight'#
field normalize_community_weight: bool = True#
field max_data_tokens: int = 12000#
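
These context settings are passed to GlobalSearchTool through its context_config parameter. A short sketch of overriding a few of the defaults listed above (the values are illustrative):

from autogen_ext.tools.graphrag import GlobalContextConfig

# Override a few defaults from the schema above
context_config = GlobalContextConfig(
    max_data_tokens=8000,
    use_community_summary=True,
    min_community_rank=1,
)
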
pydantic model GlobalSearchToolArgs[source]#

Bases: BaseModel

JSON schema:
{
   "title": "GlobalSearchToolArgs",
   "type": "object",
   "properties": {
      "query": {
         "description": "The user query to perform global search on.",
         "title": "Query",
         "type": "string"
      }
   },
   "required": [
      "query"
   ]
}

Fields:
  • query (str)

field query: str [Required]#

The user query to perform global search on.

pydantic model GlobalSearchToolReturn[source]#

Bases: BaseModel

JSON schema:
{
   "title": "GlobalSearchToolReturn",
   "type": "object",
   "properties": {
      "answer": {
         "title": "Answer",
         "type": "string"
      }
   },
   "required": [
      "answer"
   ]
}

Fields:
  • answer (str)

field answer: str [Required]#
pydantic model LocalContextConfig[source]#

Bases: ContextConfig

JSON schema:
{
   "title": "LocalContextConfig",
   "type": "object",
   "properties": {
      "max_data_tokens": {
         "default": 8000,
         "title": "Max Data Tokens",
         "type": "integer"
      },
      "text_unit_prop": {
         "default": 0.5,
         "title": "Text Unit Prop",
         "type": "number"
      },
      "community_prop": {
         "default": 0.25,
         "title": "Community Prop",
         "type": "number"
      },
      "include_entity_rank": {
         "default": true,
         "title": "Include Entity Rank",
         "type": "boolean"
      },
      "rank_description": {
         "default": "number of relationships",
         "title": "Rank Description",
         "type": "string"
      },
      "include_relationship_weight": {
         "default": true,
         "title": "Include Relationship Weight",
         "type": "boolean"
      },
      "relationship_ranking_attribute": {
         "default": "rank",
         "title": "Relationship Ranking Attribute",
         "type": "string"
      }
   }
}

Fields:
  • community_prop (float)

  • include_entity_rank (bool)

  • include_relationship_weight (bool)

  • rank_description (str)

  • relationship_ranking_attribute (str)

  • text_unit_prop (float)

field text_unit_prop: float = 0.5#
field community_prop: float = 0.25#
field include_entity_rank: bool = True#
field rank_description: str = 'number of relationships'#
field include_relationship_weight: bool = True#
field relationship_ranking_attribute: str = 'rank'#
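
As with the global variant, a LocalContextConfig can be passed to LocalSearchTool via context_config. A sketch adjusting the split between text units and community content, based only on the fields listed above (values are illustrative):

from autogen_ext.tools.graphrag import LocalContextConfig

# Shift more of the context budget toward raw text units and less toward community reports
local_context_config = LocalContextConfig(text_unit_prop=0.6, community_prop=0.2, max_data_tokens=6000)
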
pydantic model LocalSearchToolArgs[source]#

Bases: BaseModel

JSON schema:
{
   "title": "LocalSearchToolArgs",
   "type": "object",
   "properties": {
      "query": {
         "description": "The user query to perform local search on.",
         "title": "Query",
         "type": "string"
      }
   },
   "required": [
      "query"
   ]
}

Fields:
  • query (str)

field query: str [Required]#

The user query to perform local search on.

pydantic model LocalSearchToolReturn[source]#

Bases: BaseModel

JSON schema:
{
   "title": "LocalSearchToolReturn",
   "type": "object",
   "properties": {
      "answer": {
         "description": "The answer to the user query.",
         "title": "Answer",
         "type": "string"
      }
   },
   "required": [
      "answer"
   ]
}

Fields:
  • answer (str)

field answer: str [Required]#

The answer to the user query.

pydantic model MapReduceConfig[source]#

Bases: BaseModel

JSON schema:
{
   "title": "MapReduceConfig",
   "type": "object",
   "properties": {
      "map_max_tokens": {
         "default": 1000,
         "title": "Map Max Tokens",
         "type": "integer"
      },
      "map_temperature": {
         "default": 0.0,
         "title": "Map Temperature",
         "type": "number"
      },
      "reduce_max_tokens": {
         "default": 2000,
         "title": "Reduce Max Tokens",
         "type": "integer"
      },
      "reduce_temperature": {
         "default": 0.0,
         "title": "Reduce Temperature",
         "type": "number"
      },
      "allow_general_knowledge": {
         "default": false,
         "title": "Allow General Knowledge",
         "type": "boolean"
      },
      "json_mode": {
         "default": false,
         "title": "Json Mode",
         "type": "boolean"
      },
      "response_type": {
         "default": "multiple paragraphs",
         "title": "Response Type",
         "type": "string"
      }
   }
}

Fields:
  • allow_general_knowledge (bool)

  • json_mode (bool)

  • map_max_tokens (int)

  • map_temperature (float)

  • reduce_max_tokens (int)

  • reduce_temperature (float)

  • response_type (str)

field map_max_tokens: int = 1000#
field map_temperature: float = 0.0#
field reduce_max_tokens: int = 2000#
field reduce_temperature: float = 0.0#
field allow_general_knowledge: bool = False#
field json_mode: bool = False#
field response_type: str = 'multiple paragraphs'#
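
MapReduceConfig controls the map and reduce LLM calls of global search and is passed to GlobalSearchTool as mapreduce_config. A minimal override sketch (the values are illustrative):

from autogen_ext.tools.graphrag import MapReduceConfig

# Allow answers to draw on general knowledge and request a shorter reduce-step response
mapreduce_config = MapReduceConfig(allow_general_knowledge=True, reduce_max_tokens=1500, response_type="single paragraph")
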
pydantic model SearchConfig[source]#

Bases: BaseModel

JSON schema:
{
   "title": "SearchConfig",
   "type": "object",
   "properties": {
      "max_tokens": {
         "default": 1500,
         "title": "Max Tokens",
         "type": "integer"
      },
      "temperature": {
         "default": 0.0,
         "title": "Temperature",
         "type": "number"
      },
      "response_type": {
         "default": "multiple paragraphs",
         "title": "Response Type",
         "type": "string"
      }
   }
}

Fields:
  • max_tokens (int)

  • response_type (str)

  • temperature (float)

field max_tokens: int = 1500#
field temperature: float = 0.0#
field response_type: str = 'multiple paragraphs'#
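
SearchConfig plays the analogous role for LocalSearchTool and is passed as search_config. A short sketch (values are illustrative):

from autogen_ext.tools.graphrag import SearchConfig

# Raise the response token budget and temperature for the local search call
search_config = SearchConfig(max_tokens=2000, temperature=0.2)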