Speculative Multi-Action Execution

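UFO² introduces a new feature called speculative multi-action execution. It lets an agent bundle several predicted steps into a single LLM call and then validate those steps against the live application state. Compared with inferring each step separately, this approach can reduce the number of LLM queries by up to 51%. The agent first predicts a batch of likely actions, then validates them against the live UIA state in a single pass. The figure below illustrates speculative multi-action execution.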
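The predict-then-validate loop described above can be sketched roughly as follows. This is a minimal illustration, not the UFO² implementation: `PredictedAction`, `run_speculative_batch`, the `live_controls` mapping, and the example function names are all hypothetical, and stopping at the first action that fails validation is an assumption about how a batch would be cut short.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class PredictedAction:
    """Hypothetical stand-in for one predicted step (not the UFO² class)."""

    function: str
    control_label: str
    args: Dict[str, Any] = field(default_factory=dict)


def run_speculative_batch(
    actions: List[PredictedAction],
    live_controls: Dict[str, bool],
    execute: Callable[[PredictedAction], Any],
) -> List[str]:
    """Run predicted actions in order until one fails validation."""
    executed: List[str] = []
    for action in actions:
        # Validate the prediction against the live state before acting.
        if not live_controls.get(action.control_label, False):
            # Assumption: the rest of the batch is dropped and the agent re-plans.
            break
        execute(action)
        executed.append(action.function)
    return executed


# One LLM call is assumed to have produced this whole batch.
batch = [
    PredictedAction("click_input", "1"),
    PredictedAction("set_edit_text", "2", {"text": "hello"}),
    PredictedAction("click_input", "9"),  # control "9" is unavailable
    PredictedAction("click_input", "1"),
]
live = {"1": True, "2": True, "9": False}

done = run_speculative_batch(batch, live, execute=lambda a: None)
print(done)  # ['click_input', 'set_edit_text']
```

In this sketch, two actions are carried out from a single prediction round, and the batch stops as soon as the live state no longer matches what was predicted.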

Configuration

To enable speculative multi-action execution, set `ACTION_SEQUENCE` to `True` in the `config_dev.yaml` file.

ACTION_SEQUENCE: True

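Reference

Speculative multi-action execution is implemented in the ufo/agents/processors/actions.py file. The following class is used for speculative multi-action execution.

Source code in agents/processors/actions.py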

def __init__(
    self,
    function: str = "",
    args: Dict[str, Any] = {},
    control_label: str = "",
    control_text: str = "",
    after_status: str = "",
    results: Optional[ActionExecutionLog] = None,
    configs=Config.get_instance().config_data,
):
    self._function = function
    self._args = args
    self._control_label = control_label
    self._control_text = control_text
    self._after_status = after_status
    self._results = ActionExecutionLog() if results is None else results
    self._configs = configs
    self._control_log = BaseControlLog()

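`after_status` `property`

Get the status.

Returns	`str` – The status.

`args` `property`

Get the arguments.

Returns	`Dict[str, Any]` – The arguments.

`command_string` `property`

Generate the function call string.

Returns	`str` – The function call string.

`control_label` `property`

Get the control label.

Returns	`str` – The control label.

`control_log` `property` `writable`

Get the control log.

Returns	`BaseControlLog` – The control log.

`control_text` `property`

Get the control text.

Returns	`str` – The control text.

`function` `property`

Get the function name.

Returns	`str` – The function name.

`results` `property` `writable`

Get the results.

Returns	`ActionExecutionLog` – The results.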

`action_flow(puppeteer, control_dict, application_window)`

Execute the action flow.

Parameters	`puppeteer` (`AppPuppeteer`) – The puppeteer that controls the application. `control_dict` (`Dict[str, UIAWrapper]`) – The control dictionary. `application_window` (`UIAWrapper`) – The application window where the control is located.

Returns	`Tuple[ActionExecutionLog, BaseControlLog]` – The action execution log.

Source code in agents/processors/actions.py

def action_flow(
    self,
    puppeteer: AppPuppeteer,
    control_dict: Dict[str, UIAWrapper],
    application_window: UIAWrapper,
) -> Tuple[ActionExecutionLog, BaseControlLog]:
    """
    Execute the action flow.
    :param puppeteer: The puppeteer that controls the application.
    :param control_dict: The control dictionary.
    :param application_window: The application window where the control is located.
    :return: The action execution log.
    """
    control_selected: UIAWrapper = control_dict.get(self.control_label, None)

    # If the control is selected, but not available, return an error.
    if control_selected is not None and not self._control_validation(
        control_selected
    ):
        self.results = ActionExecutionLog(
            status="error",
            traceback="Control is not available.",
            error="Control is not available.",
        )
        self._control_log = BaseControlLog()

        return self.results

    # Create the control receiver.
    puppeteer.receiver_manager.create_ui_control_receiver(
        control_selected, application_window
    )

    if self.function:

        if self._configs.get("SHOW_VISUAL_OUTLINE_ON_SCREEN", True):
            if control_selected:
                control_selected.draw_outline(colour="red", thickness=3)
                time.sleep(self._configs.get("RECTANGLE_TIME", 0))

        self._control_log = self._get_control_log(
            control_selected=control_selected, application_window=application_window
        )

        try:
            return_value = self.execute(puppeteer=puppeteer)
            if not utils.is_json_serializable(return_value):
                return_value = ""

            self.results = ActionExecutionLog(
                status="success",
                return_value=return_value,
            )

        except Exception as e:

            import traceback

            self.results = ActionExecutionLog(
                status="error",
                traceback=traceback.format_exc(),
                error=str(e),
            )
        return self.results
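
The validate-then-execute pattern in `action_flow` can be tried in isolation with a simplified sketch. `ExecLog` and `run_action` below are hypothetical stand-ins, not UFO² classes; only the three outcomes (unavailable control, success, exception) mirror the source above.

```python
from dataclasses import dataclass
from traceback import format_exc
from typing import Any, Callable


@dataclass
class ExecLog:
    """Hypothetical, simplified stand-in for ActionExecutionLog."""

    status: str = ""
    return_value: Any = None
    error: str = ""
    traceback: str = ""


def run_action(control_ok: bool, command: Callable[[], Any]) -> ExecLog:
    """Validate first, then execute and record the outcome."""
    if not control_ok:
        # Same message the source uses for an unavailable control.
        message = "Control is not available."
        return ExecLog(status="error", error=message, traceback=message)
    try:
        return ExecLog(status="success", return_value=command())
    except Exception as exc:
        # Failures are captured in the log instead of propagating.
        return ExecLog(status="error", error=str(exc), traceback=format_exc())


ok = run_action(True, lambda: "clicked")
unavailable = run_action(False, lambda: "clicked")
failed = run_action(True, lambda: 1 / 0)

print(ok.status, ok.return_value)
print(unavailable.status, unavailable.error)
print(failed.status, failed.error)
```

Because every path returns a log object, a caller can inspect `status` uniformly without wrapping each action in its own `try` block.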

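`count_repeat_times(previous_actions)`

Get the number of times the same action occurs consecutively at the end of the previous actions.

Parameters	`previous_actions` (`List[Dict[str, Any]]`) – The previous actions.

Returns	`int` – The number of consecutive occurrences of the same action at the end of the previous actions.

Source code in agents/processors/actions.py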

def count_repeat_times(self, previous_actions: List[Dict[str, Any]]) -> int:
    """
    Get the times of the same action in the previous actions.
    :param previous_actions: The previous actions.
    :return: The times of the same action in the previous actions.
    """

    count = 0
    for action in previous_actions[::-1]:
        if self.is_same_action(action):
            count += 1
        else:
            break
    return count
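
Note that the loop counts only the unbroken run of matching actions at the end of the history, not every match. A minimal standalone sketch follows, using plain dictionaries in place of action objects. The helper names `is_same` and `count_trailing_repeats` are hypothetical; the compared keys follow the `is_same_action` source on this page.

```python
from typing import Any, Dict, List

Action = Dict[str, Any]


def is_same(current: Action, other: Action) -> bool:
    """Compare on the three keys that is_same_action checks."""
    keys = ("Function", "Args", "ControlText")
    return all(current.get(key) == other.get(key) for key in keys)


def count_trailing_repeats(current: Action, history: List[Action]) -> int:
    """Count matching actions from the end of history until the first mismatch."""
    count = 0
    for past in reversed(history):
        if not is_same(current, past):
            break
        count += 1
    return count


click_ok = {"Function": "click_input", "Args": {"button": "left"}, "ControlText": "OK"}
type_hi = {"Function": "set_edit_text", "Args": {"text": "hi"}, "ControlText": "Edit"}
history = [click_ok, type_hi, click_ok, click_ok]

print(count_trailing_repeats(click_ok, history))  # 2
print(count_trailing_repeats(type_hi, history))   # 0
```

The earliest `click_ok` entry is not counted because `type_hi` interrupts the run, so the result is 2 rather than 3.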

`execute(puppeteer)`

Execute the action.

Parameters	`puppeteer` (`AppPuppeteer`) – The puppeteer that controls the application.

Source code in agents/processors/actions.py

def execute(self, puppeteer: AppPuppeteer) -> Any:
    """
    Execute the action.
    :param puppeteer: The puppeteer that controls the application.
    """
    return puppeteer.execute_command(self.function, self.args)

`get_operation_point_list()`

Get the operation points of the action.

Returns	`List[Tuple[int]]` – The operation points of the action.

Source code in agents/processors/actions.py

def get_operation_point_list(self) -> List[Tuple[int]]:
    """
    Get the operation points of the action.
    :return: The operation points of the action.
    """

    if "path" in self.args:
        return [(point["x"], point["y"]) for point in self.args["path"]]
    elif "x" in self.args and "y" in self.args:
        return [(self.args["x"], self.args["y"])]
    else:
        return []
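
The three branches map to three shapes of `args`. The sketch below reproduces the same logic as a free function so it can be run without UFO²; the function name `operation_points` and the sample argument values are illustrative assumptions.

```python
from typing import Any, Dict, List, Tuple


def operation_points(args: Dict[str, Any]) -> List[Tuple[int, int]]:
    """Extract (x, y) points from an action's arguments."""
    if "path" in args:
        # Multi-point action: one point per path entry.
        return [(point["x"], point["y"]) for point in args["path"]]
    if "x" in args and "y" in args:
        # Single-point action.
        return [(args["x"], args["y"])]
    # No coordinates in the arguments.
    return []


drag = operation_points({"path": [{"x": 10, "y": 20}, {"x": 30, "y": 40}]})
click = operation_points({"x": 5, "y": 6})
text = operation_points({"text": "hello"})

print(drag)   # [(10, 20), (30, 40)]
print(click)  # [(5, 6)]
print(text)   # []
```

A `path` key takes precedence: if both `path` and `x`/`y` are present, only the path points are returned.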

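`is_same_action(action_to_compare)`

Check whether the two actions are the same.

Parameters	`action_to_compare` (`Dict[str, Any]`) – The action to compare with the current action.

Returns	`bool` – Whether the two actions are the same.

Source code in agents/processors/actions.py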

def is_same_action(self, action_to_compare: Dict[str, Any]) -> bool:
    """
    Check whether the two actions are the same.
    :param action_to_compare: The action to compare with the current action.
    :return: Whether the two actions are the same.
    """

    return (
        self.function == action_to_compare.get("Function")
        and self.args == action_to_compare.get("Args")
        and self.control_text == action_to_compare.get("ControlText")
    )

`print_result()`

Print the action execution result.

Source code in agents/processors/actions.py

def print_result(self) -> None:
    """
    Print the action execution result.
    """

    utils.print_with_color(
        "Selected item🕹️: {control_text}, Label: {label}".format(
            control_text=self.control_text, label=self.control_label
        ),
        "yellow",
    )
    utils.print_with_color(
        "Action applied⚒️: {action}".format(action=self.command_string), "blue"
    )

    result_color = "red" if self.results.status != "success" else "green"

    utils.print_with_color(
        "Execution result📜: {result}".format(result=asdict(self.results)),
        result_color,
    )

`to_dict(previous_actions)`

Convert the action to a dictionary.

Parameters	`previous_actions` (`Optional[List[Dict[str, Any]]]`) – The previous actions.

Returns	`Dict[str, Any]` – The dictionary of the action.

Source code in agents/processors/actions.py

def to_dict(
    self, previous_actions: Optional[List[Dict[str, Any]]]
) -> Dict[str, Any]:
    """
    Convert the action to a dictionary.
    :param previous_actions: The previous actions.
    :return: The dictionary of the action.
    """

    action_dict = {
        "Function": self.function,
        "Args": self.args,
        "ControlLabel": self.control_label,
        "ControlText": self.control_text,
        "Status": self.after_status,
        "Results": asdict(self.results),
    }

    # Add the repetitive times of the same action in the previous actions if the previous actions are provided.
    if previous_actions:
        action_dict["RepeatTimes"] = self.count_repeat_times(previous_actions)

    return action_dict

`to_string(previous_actions)`

Convert the action to a string.

Parameters	`previous_actions` (`Optional[List[OneStepAction]]`) – The previous actions.

Returns	`str` – The string of the action.

Source code in agents/processors/actions.py

def to_string(self, previous_actions: Optional[List["OneStepAction"]]) -> str:
    """
    Convert the action to a string.
    :param previous_actions: The previous actions.
    :return: The string of the action.
    """
    return json.dumps(self.to_dict(previous_actions), ensure_ascii=False)
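
Two details of the serialization are easy to miss: `RepeatTimes` is added only when the history is non-empty, and `ensure_ascii=False` keeps non-ASCII control text readable. The sketch below illustrates both with plain dictionaries. `to_record` is a hypothetical helper, and it compares whole dictionaries for simplicity rather than the three fields `is_same_action` uses.

```python
import json
from typing import Any, Dict, List, Optional

Action = Dict[str, Any]


def to_record(action: Action, previous_actions: Optional[List[Action]]) -> Action:
    """Copy the action and add RepeatTimes only for a non-empty history."""
    record = dict(action)
    if previous_actions:  # None and [] both skip the field
        repeats = 0
        for past in reversed(previous_actions):
            if past != action:  # simplification: whole-dict equality
                break
            repeats += 1
        record["RepeatTimes"] = repeats
    return record


action = {"Function": "click_input", "ControlText": "保存"}

without_history = to_record(action, None)
with_history = to_record(action, [action, action])

readable = json.dumps(with_history, ensure_ascii=False)
escaped = json.dumps(with_history)  # default ensure_ascii=True

print(readable)
print(escaped)
```

With `ensure_ascii=False` the control text is written as-is, whereas the default setting replaces it with `\uXXXX` escape sequences.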
