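Icon Filter

The icon control filter is a method for filtering controls by the similarity between each control's icon image and the agent's plan, computed with image/text embeddings.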

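Configuration

To activate icon control filtering, add ICON to the CONTROL_FILTER list in the config_dev.yaml file. The icon control filter settings in config_dev.yaml are:

  • CONTROL_FILTER: The list of filtering methods to apply to controls. To activate icon control filtering, add ICON to this list.
  • CONTROL_FILTER_TOP_K_ICON: The number of controls to keep after filtering.
  • CONTROL_FILTER_MODEL_ICON_NAME: The name of the control filter model used for icon similarity. The default is "clip-ViT-B-32".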
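
As an illustration of how these three keys fit together, the following minimal sketch mirrors them in a plain Python dictionary and reads them back. The helper name icon_filter_settings and the value 15 for CONTROL_FILTER_TOP_K_ICON are assumptions made for this example; only the key names and the default model name come from the configuration described above.

```python
# Hypothetical stand-in for the parsed contents of config_dev.yaml.
config = {
    "CONTROL_FILTER": ["ICON"],
    "CONTROL_FILTER_TOP_K_ICON": 15,  # illustrative value, not a documented default
    "CONTROL_FILTER_MODEL_ICON_NAME": "clip-ViT-B-32",
}


def icon_filter_settings(cfg):
    """Return (top_k, model_name) if ICON filtering is enabled, else None.

    Hypothetical helper for illustration; not part of the documented API.
    """
    if "ICON" not in cfg.get("CONTROL_FILTER", []):
        return None
    return cfg["CONTROL_FILTER_TOP_K_ICON"], cfg["CONTROL_FILTER_MODEL_ICON_NAME"]


settings = icon_filter_settings(config)
print(settings)
```

If ICON is missing from CONTROL_FILTER, the helper returns None, which matches the idea that the other two keys only matter once the icon filter is enabled.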

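Reference

Bases: BasicControlFilter

A class that represents the icon model used for control filtering.

control_filter(control_dicts, cropped_icons_dict, plans, top_k)

Filters control items by their scores and returns the top-k items.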

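Parameters
  • control_dicts

    The dictionary of all control items.

  • cropped_icons_dict

    The dictionary of the cropped icons.

  • plans

    The plans to compare the control icons against.

  • top_k

    The number of top items to return.

Returns
  • The top-k control items by score, returned as a dictionary keyed by label.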

Source code in automator/ui_control/control_filter.py
def control_filter(self, control_dicts, cropped_icons_dict, plans, top_k):
    """
    Filters control items based on their scores and returns the top-k items.
    :param control_dicts: The dictionary of all control items.
    :param cropped_icons_dict: The dictionary of the cropped icons.
    :param plans: The plans to compare the control icons against.
    :param top_k: The number of top items to return.
    :return: The dictionary of top-k control items based on their scores.
    """

    # Score every cropped icon against the plans.
    scores_items = []
    for label, cropped_icon in cropped_icons_dict.items():
        score = self.control_filter_score(cropped_icon, plans)
        scores_items.append((score, label))

    # Keep the labels of the top_k highest-scoring icons (uses the heapq module).
    topk_scores_items = heapq.nlargest(top_k, scores_items, key=lambda x: x[0])
    topk_labels = {label for _, label in topk_scores_items}

    # Build the result in the original order of control_dicts.
    filtered_control_dict = {}
    for label, control_item in control_dicts.items():
        if label in topk_labels:
            filtered_control_dict[label] = control_item
    return filtered_control_dict
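
The selection step can be tried in isolation. The sketch below is a standalone approximation, assuming the scores have already been computed; the function name select_top_k and the sample controls and scores are made up for illustration and are not part of the class.

```python
import heapq


def select_top_k(control_dicts, scores, top_k):
    """Keep the top_k highest-scoring controls, in their original order.

    Hypothetical helper: `scores` maps each label to a precomputed score.
    """
    scored = [(score, label) for label, score in scores.items()]
    top = heapq.nlargest(top_k, scored, key=lambda x: x[0])
    top_labels = {label for _, label in top}
    return {label: item for label, item in control_dicts.items() if label in top_labels}


# Made-up controls and scores, for illustration only.
controls = {"1": "Save button", "2": "Open button", "3": "Close button"}
scores = {"1": 0.31, "2": 0.12, "3": 0.27}

filtered = select_top_k(controls, scores, top_k=2)
print(filtered)  # {'1': 'Save button', '3': 'Close button'}
```

Note that the result follows the ordering of the input dictionary, not the ranking by score, because the final pass iterates over the controls rather than over the ranked labels.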

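control_filter_score(control_icon, plans)

Calculates the score of a control icon based on its similarity to the given plans.

Parameters
  • control_icon

    The control icon image.

  • plans

    The plans to compare the control icon against.

Returns
  • The maximum similarity score between the control icon and the plans.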

Source code in automator/ui_control/control_filter.py
def control_filter_score(self, control_icon, plans):
    """
    Calculates the score of a control icon based on its similarity to the given plans.
    :param control_icon: The control icon image.
    :param plans: The plans to compare the control icon against.
    :return: The maximum similarity score between the control icon and the plans.
    """

    # Embed the plans and the icon image.
    plans_embedding = self.get_embedding(plans)
    control_icon_embedding = self.get_embedding(control_icon)
    # Take the first row of the similarity matrix (icon vs. each plan) and keep the best match.
    return max(self.cos_sim(control_icon_embedding, plans_embedding).tolist()[0])
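
To make the scoring rule concrete, here is a minimal pure-Python sketch of "maximum cosine similarity between one icon embedding and several plan embeddings". It assumes the embeddings are already available as lists of floats; the two-dimensional vectors and the function names cosine_similarity and max_plan_similarity are invented for this example, whereas the class itself obtains embeddings through get_embedding and compares them with cos_sim.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def max_plan_similarity(icon_embedding, plan_embeddings):
    """Best similarity between the icon and any one of the plans."""
    return max(cosine_similarity(icon_embedding, p) for p in plan_embeddings)


# Toy 2-D embeddings, for illustration only.
icon_embedding = [1.0, 0.0]
plan_embeddings = [[0.0, 1.0], [3.0, 4.0]]

best = max_plan_similarity(icon_embedding, plan_embeddings)
print(best)
```

Taking the maximum rather than the mean means an icon scores highly as soon as it matches any single plan step, even if it is unrelated to the others.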