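Session

A Session is a conversation instance between the user and UFO. It is a continuous interaction that starts when the user initiates a request and ends when the request is completed. UFO supports multiple requests within the same Session. Each request is processed sequentially by one Round of interaction until the user's request is fulfilled. The figure below shows the relationship between a Session and its Rounds.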

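Session Lifecycle

The lifecycle of a Session is as follows:

1. Session Initialization

A Session is initialized when the user starts a conversation with UFO. The Session object is created and the first Round of interaction is started. At this stage, the user's request is processed by the HostAgent to determine the appropriate application for fulfilling it. A Context object is also created to store the conversation state shared across all Rounds of the Session.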

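2. Session Processing

Once the Session is initialized, a Round of interaction starts. The Round completes a single user request by orchestrating the HostAgent and the AppAgent.

3. Next Round

After the first Round is completed, the Session asks the user for the next request to start the next Round of interaction. This process continues until the user has no more requests. The core logic of a Session is shown below: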

def run(self) -> None:
    """
    Run the session.
    """

    while not self.is_finished():

        round = self.create_new_round()
        if round is None:
            break
        round.run()

    if self.application_window is not None:
        self.capture_last_snapshot()

    if self._should_evaluate and not self.is_error():
        self.evaluation()

    self.print_cost()

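4. Session Termination

If the user has no more requests or decides to end the conversation, the Session is terminated and the conversation ends. If an EvaluationAgent is configured, it evaluates the completeness of the Session, unless the Session ended in an error state.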
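
The four stages above can be sketched with a toy model. This is a minimal illustration, not UFO's implementation: `ToyRound`, `ToySession`, and the `handled` key are hypothetical names, and each round only records its request in the shared context instead of coordinating the HostAgent and the AppAgent.

```python
from typing import Dict, List, Optional


class ToyRound:
    """Hypothetical stand-in for a Round: handles exactly one request."""

    def __init__(self, request: str, context: Dict[str, List[str]]) -> None:
        self.request = request
        self.context = context  # the same dict object is shared by all rounds

    def run(self) -> None:
        # A real Round would coordinate agents; here we only record the request.
        self.context.setdefault("handled", []).append(self.request)


class ToySession:
    """Hypothetical stand-in for a Session: owns the context and the rounds."""

    def __init__(self, requests: List[str]) -> None:
        # 1. Initialization: create the shared context and the request queue.
        self.context: Dict[str, List[str]] = {}
        self.rounds: Dict[int, ToyRound] = {}
        self._pending = list(requests)

    def create_new_round(self) -> Optional[ToyRound]:
        # 3. Next round: fetch the next request, or signal that none is left.
        if not self._pending:
            return None
        new_round = ToyRound(self._pending.pop(0), self.context)
        self.rounds[len(self.rounds)] = new_round
        return new_round

    def run(self) -> None:
        while True:
            new_round = self.create_new_round()
            if new_round is None:
                break  # 4. Termination: no more requests.
            new_round.run()  # 2. Processing: one round per request.


session = ToySession(["open the report", "export it as PDF"])
session.run()
```

Because every round receives the same context object, state written in one round is visible to the next, which is the role the shared Context plays in a Session.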

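Reference

Bases: ABC

A basic session in UFO. A session consists of multiple rounds of interactions and conversations.

Initialize a session.

Parameters: `task` (`str`) – The name of the current task. `should_evaluate` (`bool`) – Whether to evaluate the session. `id` (`int`) – The ID of the session.

Source code in module/basic.py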

def __init__(self, task: str, should_evaluate: bool, id: int) -> None:
    """
    Initialize a session.
    :param task: The name of current task.
    :param should_evaluate: Whether to evaluate the session.
    :param id: The id of the session.
    """

    self._should_evaluate = should_evaluate
    self._id = id

    # Logging-related properties
    self.log_path = f"logs/{task}/"
    utils.create_folder(self.log_path)

    self._rounds: Dict[int, BaseRound] = {}

    self._context = Context()
    self._init_context()
    self._finish = False
    self._results = {}

    self._host_agent: HostAgent = AgentFactory.create_agent(
        "host",
        "HostAgent",
        configs["HOST_AGENT"]["VISUAL_MODE"],
        configs["HOSTAGENT_PROMPT"],
        configs["HOSTAGENT_EXAMPLE_PROMPT"],
        configs["API_PROMPT"],
    )

`application_window` `property` `writable`

Get the application of the session. Returns: The application of the session.

`context` `property`

Get the context of the session. Returns: The context of the session.

`cost` `property` `writable`

Get the cost of the session. Returns: The cost of the session.

`current_round` `property`

Get the current round of the session. Returns: The current round of the session.

`evaluation_logger` `property`

Get the logger for evaluation. Returns: The logger for evaluation.

`id` `property`

Get the ID of the session. Returns: The ID of the session.

`results` `property` `writable`

Get the evaluation results of the session. Returns: The evaluation results of the session.

`rounds` `property`

Get the rounds of the session. Returns: The rounds of the session.

`session_type` `property`

Get the class name of the session. Returns: The class name of the session.

`step` `property`

Get the step of the session. Returns: The step of the session.

`total_rounds` `property`

Get the total number of rounds in the session. Returns: The total number of rounds in the session.

`add_round(id, round)`

Add a round to the session.

Parameters: `id` (`int`) – The ID of the round. `round` (`BaseRound`) – The round to be added.

Source code in module/basic.py

def add_round(self, id: int, round: BaseRound) -> None:
    """
    Add a round to the session.
    :param id: The id of the round.
    :param round: The round to be added.
    """
    self._rounds[id] = round
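
How `add_round` relates to the `rounds`, `total_rounds`, and `current_round` properties can be sketched as follows. The property bodies are not shown in this reference, so the sketch assumes that rounds are stored under consecutive integer IDs starting at 0 and that the current round is the most recently added one. `RoundRegistry` is a hypothetical name used only for illustration.

```python
from typing import Dict, Generic, Optional, TypeVar

R = TypeVar("R")


class RoundRegistry(Generic[R]):
    """Hypothetical container mirroring the round bookkeeping of a session."""

    def __init__(self) -> None:
        self._rounds: Dict[int, R] = {}

    def add_round(self, id: int, round: R) -> None:
        # Same statement as in the documented method body.
        self._rounds[id] = round

    @property
    def total_rounds(self) -> int:
        # Assumption: every added round counts, regardless of its state.
        return len(self._rounds)

    @property
    def current_round(self) -> Optional[R]:
        # Assumption: IDs are 0, 1, 2, ... so the last ID is total_rounds - 1.
        if self.total_rounds == 0:
            return None
        return self._rounds[self.total_rounds - 1]


registry: RoundRegistry[str] = RoundRegistry()
first_lookup = registry.current_round  # None before any round exists
registry.add_round(0, "round-0")
registry.add_round(1, "round-1")
```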

`capture_last_snapshot()`

Capture the last snapshot of the application, including the screenshot and the XML file if configured.

Source code in module/basic.py

def capture_last_snapshot(self) -> None:
    """
    Capture the last snapshot of the application, including the screenshot and the XML file if configured.
    """

    # Capture the final screenshot
    screenshot_save_path = self.log_path + "action_step_final.png"

    if self.application_window is not None:

        try:
            PhotographerFacade().capture_app_window_screenshot(
                self.application_window, save_path=screenshot_save_path
            )

        except Exception as e:
            utils.print_with_color(
                f"Warning: The last snapshot capture failed, due to the error: {e}",
                "yellow",
            )

        if configs.get("SAVE_UI_TREE", False):
            step_ui_tree = ui_tree.UITree(self.application_window)

            ui_tree_path = os.path.join(self.log_path, "ui_trees")

            ui_tree_file_name = "ui_tree_final.json"

            step_ui_tree.save_ui_tree_to_json(
                os.path.join(
                    ui_tree_path,
                    ui_tree_file_name,
                )
            )

        if configs.get("SAVE_FULL_SCREEN", False):

            desktop_save_path = self.log_path + "desktop_final.png"

            # Capture the desktop screenshot for all screens.
            PhotographerFacade().capture_desktop_screen_screenshot(
                all_screens=True, save_path=desktop_save_path
            )

        # Save the final XML file
        if configs["LOG_XML"]:
            log_abs_path = os.path.abspath(self.log_path)
            xml_save_path = os.path.join(log_abs_path, "xml/action_step_final.xml")

            app_agent = self._host_agent.get_active_appagent()
            if app_agent is not None:
                app_agent.Puppeteer.save_to_xml(xml_save_path)
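
Which files the final snapshot produces depends on the configuration flags. The helper below is a hypothetical sketch (`final_snapshot_paths` is not part of UFO) that only computes the target paths used in the method above, without capturing anything. It treats a missing `LOG_XML` key as false, which is an assumption; the method above indexes `configs["LOG_XML"]` directly.

```python
import os
from typing import Dict, Mapping


def final_snapshot_paths(log_path: str, configs: Mapping[str, object]) -> Dict[str, str]:
    """Return the files that a final snapshot would write for these settings."""
    # The application screenshot is always attempted.
    paths = {"screenshot": os.path.join(log_path, "action_step_final.png")}
    if configs.get("SAVE_UI_TREE", False):
        paths["ui_tree"] = os.path.join(log_path, "ui_trees", "ui_tree_final.json")
    if configs.get("SAVE_FULL_SCREEN", False):
        paths["desktop"] = os.path.join(log_path, "desktop_final.png")
    # Assumption: a missing LOG_XML key counts as False in this sketch.
    if configs.get("LOG_XML", False):
        paths["xml"] = os.path.join(
            os.path.abspath(log_path), "xml", "action_step_final.xml"
        )
    return paths


paths = final_snapshot_paths("logs/demo/", {"SAVE_UI_TREE": True, "LOG_XML": False})
```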

`create_following_round()`

Create a following round. Returns: The following round.

Source code in module/basic.py

def create_following_round(self) -> BaseRound:
    """
    Create a following round.
    return: The following round.
    """
    pass

`create_new_round()` `abstractmethod`

Create a new round.

Source code in module/basic.py

@abstractmethod
def create_new_round(self) -> Optional[BaseRound]:
    """
    Create a new round.
    """
    pass

`evaluation()`

Evaluate the session.

Source code in module/basic.py

def evaluation(self) -> None:
    """
    Evaluate the session.
    """
    utils.print_with_color("Evaluating the session...", "yellow")

    is_visual = configs.get("EVALUATION_AGENT", {}).get("VISUAL_MODE", True)

    evaluator = EvaluationAgent(
        name="eva_agent",
        app_root_name=self.context.get(ContextNames.APPLICATION_ROOT_NAME),
        is_visual=is_visual,
        main_prompt=configs["EVALUATION_PROMPT"],
        example_prompt="",
        api_prompt=configs["API_PROMPT"],
    )

    requests = self.request_to_evaluate()

    # Evaluate the session, first use the default setting, if failed, then disable the screenshot evaluation.
    try:
        result, cost = evaluator.evaluate(
            request=requests,
            log_path=self.log_path,
            eva_all_screenshots=configs.get("EVA_ALL_SCREENSHOTS", True),
        )
    except Exception as e:
        result, cost = evaluator.evaluate(
            request=requests,
            log_path=self.log_path,
            eva_all_screenshots=False,
        )

    # Add additional information to the evaluation result.
    additional_info = {"level": "session", "request": requests, "id": 0}
    result.update(additional_info)

    self.results = result

    self.cost += cost

    evaluator.print_response(result)

    self.evaluation_logger.info(json.dumps(result))
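
The retry logic in `evaluation()` follows a simple pattern: try the configured setting first and, if that raises, retry with screenshot evaluation disabled. The sketch below isolates that pattern. `evaluate_with_fallback` and `flaky_evaluate` are hypothetical names, and the fake evaluator stands in for `EvaluationAgent.evaluate`, whose real behavior is not reproduced here.

```python
from typing import Callable, Dict, Tuple

EvalResult = Tuple[Dict[str, object], float]


def evaluate_with_fallback(
    evaluate: Callable[[bool], EvalResult], eva_all_screenshots: bool = True
) -> EvalResult:
    """Try the configured setting first, then retry without screenshots."""
    try:
        return evaluate(eva_all_screenshots)
    except Exception:
        # Second attempt; if this one raises too, the error propagates.
        return evaluate(False)


def flaky_evaluate(eva_all_screenshots: bool) -> EvalResult:
    """Hypothetical evaluator that fails whenever all screenshots are sent."""
    if eva_all_screenshots:
        raise RuntimeError("payload too large")
    return {"complete": "yes"}, 0.02


result, cost = evaluate_with_fallback(flaky_evaluate)
# Attach the same session-level metadata as the method above.
result.update({"level": "session", "request": "demo request", "id": 0})
```

Note that the fallback is attempted only once, so a failure in the second call is not caught.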

`experience_saver()`

Save the current trajectory as agent experience.

Source code in module/basic.py

def experience_saver(self) -> None:
    """
    Save the current trajectory as agent experience.
    """
    utils.print_with_color(
        "Summarizing and saving the execution flow as experience...", "yellow"
    )

    summarizer = ExperienceSummarizer(
        configs["APP_AGENT"]["VISUAL_MODE"],
        configs["EXPERIENCE_PROMPT"],
        configs["APPAGENT_EXAMPLE_PROMPT"],
        configs["API_PROMPT"],
    )
    experience = summarizer.read_logs(self.log_path)
    summaries, cost = summarizer.get_summary_list(experience)

    experience_path = configs["EXPERIENCE_SAVED_PATH"]
    utils.create_folder(experience_path)
    summarizer.create_or_update_yaml(
        summaries, os.path.join(experience_path, "experience.yaml")
    )
    summarizer.create_or_update_vector_db(
        summaries, os.path.join(experience_path, "experience_db")
    )

    self.cost += cost
    utils.print_with_color("The experience has been saved.", "magenta")

`initialize_logger(log_path, log_filename, mode='a', configs=configs)` `staticmethod`

Initialize logging. Parameters: `log_path` – The path of the log file. `log_filename` – The name of the log file. Returns: The logger.

Source code in module/basic.py

@staticmethod
def initialize_logger(
    log_path: str, log_filename: str, mode="a", configs=configs
) -> logging.Logger:
    """
    Initialize logging.
    log_path: The path of the log file.
    log_filename: The name of the log file.
    return: The logger.
    """
    # Code for initializing logging
    logger = logging.Logger(log_filename)

    if not configs["PRINT_LOG"]:
        # Remove existing handlers if PRINT_LOG is False
        logger.handlers = []

    log_file_path = os.path.join(log_path, log_filename)
    file_handler = logging.FileHandler(log_file_path, mode=mode, encoding="utf-8")
    formatter = logging.Formatter("%(message)s")
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)
    logger.setLevel(configs["LOG_LEVEL"])

    return logger
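
The logging setup can be tried in isolation with the standard library alone. The sketch below is a hypothetical reduction (`make_file_logger` is not a UFO function) that replaces the `configs` lookups with a plain `level` argument and writes to a temporary directory.

```python
import logging
import os
import tempfile


def make_file_logger(
    log_path: str, log_filename: str, mode: str = "a", level: int = logging.INFO
) -> logging.Logger:
    """Create a standalone logger that writes bare messages to one file."""
    # Instantiating Logger directly keeps it out of the global logger registry.
    logger = logging.Logger(log_filename)
    handler = logging.FileHandler(
        os.path.join(log_path, log_filename), mode=mode, encoding="utf-8"
    )
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


log_dir = tempfile.mkdtemp()
logger = make_file_logger(log_dir, "evaluation.log")
logger.info('{"complete": "yes"}')

# Close the handlers so the file is flushed before it is read back.
for handler in logger.handlers:
    handler.close()

with open(os.path.join(log_dir, "evaluation.log"), encoding="utf-8") as log_file:
    logged = log_file.read()
```

With the `%(message)s` format, each log call becomes exactly one line in the file, which suits JSON-per-line records such as the evaluation result.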

`is_error()`

Check if the session is in error state. Returns: True if the session is in error state, otherwise False.

Source code in module/basic.py

def is_error(self) -> bool:
    """
    Check if the session is in error state.
    return: True if the session is in error state, otherwise False.
    """
    if self.current_round is not None:
        return self.current_round.state.name() == AgentStatus.ERROR.value
    return False

`is_finished()`

Check if the session is ended. Returns: True if the session is ended, otherwise False.

Source code in module/basic.py

def is_finished(self) -> bool:
    """
    Check if the session is ended.
    return: True if the session is ended, otherwise False.
    """
    if (
        self._finish
        or self.step >= configs["MAX_STEP"]
        or self.total_rounds >= configs["MAX_ROUND"]
    ):
        return True

    if self.is_error():
        return True

    return False
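
The termination rule combines four independent conditions: an explicit finish flag, a step budget, a round budget, and the error state. The function below is a sketch that restates the rule as a pure function so each condition can be checked separately. `session_is_finished` and the limit values are hypothetical; in UFO the limits come from `configs`.

```python
from typing import Mapping


def session_is_finished(
    finish_flag: bool,
    step: int,
    total_rounds: int,
    in_error: bool,
    limits: Mapping[str, int],
) -> bool:
    """Return True as soon as any one of the stop conditions holds."""
    if (
        finish_flag
        or step >= limits["MAX_STEP"]
        or total_rounds >= limits["MAX_ROUND"]
    ):
        return True
    # An error in the current round also ends the session.
    return in_error


limits = {"MAX_STEP": 30, "MAX_ROUND": 3}  # illustrative values only
checks = {
    "running": session_is_finished(False, 5, 1, False, limits),
    "step_budget_spent": session_is_finished(False, 30, 1, False, limits),
    "round_budget_spent": session_is_finished(False, 5, 3, False, limits),
    "error": session_is_finished(False, 5, 1, True, limits),
}
```

Both budgets use `>=`, so a session stops when a counter reaches its limit rather than after exceeding it.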

`next_request()` `abstractmethod`

Get the next request of the session. Returns: The request of the session.

Source code in module/basic.py

@abstractmethod
def next_request(self) -> str:
    """
    Get the next request of the session.
    return: The request of the session.
    """
    pass

`print_cost()`

Print the total cost of the session.

Source code in module/basic.py

def print_cost(self) -> None:
    """
    Print the total cost of the session.
    """

    if isinstance(self.cost, float) and self.cost > 0:
        formatted_cost = "${:.2f}".format(self.cost)
        utils.print_with_color(
            # formatted_cost already carries the "$" prefix.
            f"Total request cost of the session: {formatted_cost}", "yellow"
        )
    else:
        utils.print_with_color(
            "Cost is not available for the model {host_model} or {app_model}.".format(
                host_model=configs["HOST_AGENT"]["API_MODEL"],
                app_model=configs["APP_AGENT"]["API_MODEL"],
            ),
            "yellow",
        )

`request_to_evaluate()` `abstractmethod`

Get the request to evaluate. Returns: The request(s) to evaluate.

Source code in module/basic.py

@abstractmethod
def request_to_evaluate(self) -> str:
    """
    Get the request to evaluate.
    return: The request(s) to evaluate.
    """
    pass
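
The three abstract methods (`create_new_round`, `next_request`, and `request_to_evaluate`) are the hooks a concrete session must supply. The sketch below shows that contract with a reduced, hypothetical base class: `SessionContract` and `ScriptedSession` are illustrative names, rounds are represented by plain strings, and joining all served requests for evaluation is an assumption rather than UFO's documented behavior.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class SessionContract(ABC):
    """Hypothetical reduction of the base class to its three abstract hooks."""

    @abstractmethod
    def create_new_round(self) -> Optional[str]: ...

    @abstractmethod
    def next_request(self) -> str: ...

    @abstractmethod
    def request_to_evaluate(self) -> str: ...


class ScriptedSession(SessionContract):
    """Serves a fixed list of requests instead of prompting a user."""

    def __init__(self, requests: List[str]) -> None:
        self._pending = list(requests)
        self._served: List[str] = []

    def next_request(self) -> str:
        request = self._pending.pop(0)
        self._served.append(request)
        return request

    def create_new_round(self) -> Optional[str]:
        # Rounds are represented by their request string in this sketch.
        return self.next_request() if self._pending else None

    def request_to_evaluate(self) -> str:
        # Assumption: evaluation covers every request served so far.
        return "; ".join(self._served)


try:
    SessionContract()  # abstract classes cannot be instantiated
    base_is_abstract = False
except TypeError:
    base_is_abstract = True

session = ScriptedSession(["open the report", "export it as PDF"])
while session.create_new_round() is not None:
    pass
summary = session.request_to_evaluate()
```

Returning `None` from `create_new_round` is what lets the `run()` loop break out once no requests remain.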

`run()`

Run the session.

Source code in module/basic.py

def run(self) -> None:
    """
    Run the session.
    """

    while not self.is_finished():

        round = self.create_new_round()
        if round is None:
            break
        round.run()

    if self.application_window is not None:
        self.capture_last_snapshot()

    if self._should_evaluate and not self.is_error():
        self.evaluation()

    if configs.get("LOG_TO_MARKDOWN", True):

        file_path = self.log_path
        trajectory = Trajectory(file_path)
        trajectory.to_markdown(file_path + "/output.md")

    self.print_cost()
