Everywhere

Visual Context

Empower your assistant with the ability to perceive and interact with on-screen UI elements.

Even when this plugin is disabled, Everywhere still attempts to retrieve the visual context whenever a message sent from the chat window includes a visual element. The plugin itself only provides the additional functions.

Functions

| Function | Description | Permission(s) |
| --- | --- | --- |
| List Windows | Lists all windows on the screen. | Read Screen |
| Capture UI Element | Captures a screenshot of a visual element. | Read Screen |
| Automate Actions | Executes a set of automated actions, such as clicks, inputs, or sending shortcuts. | Access Screen |
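
To make "a set of automated actions" more concrete, the following is a minimal sketch of how such a set could be modeled as an ordered list of clicks, inputs, and shortcuts. It is an illustration only: the class names (`Click`, `Input`, `Shortcut`), the `element_id` field, and the `describe` helper are all hypothetical and are not taken from Everywhere's actual function schema.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


# All types below are hypothetical illustrations, not Everywhere's real schema.
@dataclass(frozen=True)
class Click:
    element_id: str  # assumed identifier of the target UI element


@dataclass(frozen=True)
class Input:
    element_id: str  # assumed identifier of the target UI element
    text: str        # text to type into that element


@dataclass(frozen=True)
class Shortcut:
    keys: Tuple[str, ...]  # key combination, e.g. ("Ctrl", "S")


Action = Union[Click, Input, Shortcut]


def describe(actions: List[Action]) -> List[str]:
    """Return one human-readable line per action, in execution order."""
    lines = []
    for step, action in enumerate(actions, start=1):
        if isinstance(action, Click):
            lines.append(f"{step}. click {action.element_id}")
        elif isinstance(action, Input):
            lines.append(f"{step}. type {action.text!r} into {action.element_id}")
        elif isinstance(action, Shortcut):
            lines.append(f"{step}. press {'+'.join(action.keys)}")
        else:
            raise TypeError(f"unsupported action: {action!r}")
    return lines


plan = [Click("save_button"), Input("name_field", "report"), Shortcut(("Ctrl", "S"))]
summary = describe(plan)
```

Printing such a summary before anything runs is one way to review what an action set would do, which fits the advice to use this function with caution.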

Notes

Automate Actions

The "Automate Actions" feature is experimental and may not work as expected. Use it with caution.

The large language model itself decides whether to execute actions on UI elements. In most cases the model is reluctant to perform such actions, and the results of those it does perform may be unsatisfactory.

Software Compatibility

Because the visual context is obtained via UI automation, content cannot be retrieved from software that does not support accessibility features (such as WeChat). Games and similar applications are also not supported.
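
The limitation can be illustrated with a small sketch of reading text from an accessibility tree. The `Element` type and the `collect_text` helper are hypothetical stand-ins, not Everywhere's implementation or any real UI automation API. The point is only that an application exposing no accessible children leaves nothing to read.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical model of an accessibility tree node (illustration only).
@dataclass
class Element:
    role: str
    name: str = ""
    children: List["Element"] = field(default_factory=list)


def collect_text(root: Element) -> List[str]:
    """Depth-first walk that gathers every non-empty element name."""
    texts = [root.name] if root.name else []
    for child in root.children:
        texts.extend(collect_text(child))
    return texts


# An application that exposes its controls through accessibility features.
accessible_app = Element("window", "Editor", [
    Element("button", "Save"),
    Element("text", "Hello"),
])

# An application that exposes only an opaque, unnamed window.
opaque_app = Element("window")

accessible_text = collect_text(accessible_app)
opaque_text = collect_text(opaque_app)
```

Under these assumptions, the walk over the opaque window returns an empty list even though the application may be drawing plenty of content on screen.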

Real-time Capability

Visual context acquisition works like a snapshot rather than a real-time stream, so tasks such as real-time translation of YouTube subtitles are not possible.
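
The difference between a snapshot and a stream can be shown with a short sketch. `FakeScreen` and `take_snapshot` are hypothetical names used only for this illustration. They do not reflect how Everywhere captures content internally.

```python
class FakeScreen:
    """Hypothetical stand-in for the screen; holds the currently visible text."""

    def __init__(self, text: str) -> None:
        self.text = text


def take_snapshot(screen: FakeScreen) -> dict:
    """Return a point-in-time copy of what is visible right now."""
    return {"text": screen.text}


screen = FakeScreen("first subtitle line")
snapshot = take_snapshot(screen)

# The screen keeps changing after the snapshot was taken...
screen.text = "second subtitle line"

# ...but the snapshot still holds the old content until a new one is taken.
stale_text = snapshot["text"]
fresh_text = take_snapshot(screen)["text"]
```

In this model, following changing content would mean taking a new snapshot for every change, which is why continuously updating content such as live subtitles falls outside what a snapshot can cover.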
