UiPath Automation Hub

Make RPA bots read dynamic screens like a human

UiPath AI Computer Vision helps UiPath Robots recognize on-screen elements when selectors break. It’s aimed at teams automating VDIs, legacy apps, PDFs, images, and other hard-to-target interfaces.

iOS
Vision
B2B
Computer Use
Cloud Hosted
Hybrid
For Teams

About

What It Is

UiPath AI Computer Vision is a computer vision capability for UiPath’s RPA platform. It is designed for automation teams and RPA developers who need robots to interact with dynamic interfaces, virtual desktops, and applications where traditional selectors are unreliable.

According to UiPath, you set it up from within the UiPath automation ecosystem and can deploy it as SaaS, on-premises on Linux or Windows, or through a desktop local server pack. It works with UiPath Robots and supports automation across desktop and web environments, especially in VDI setups such as Citrix, VMware, Microsoft RDP, and VNC.

What to Know

UiPath AI Computer Vision looks strongest when you need vision-based automation on interfaces that are difficult to handle with standard RPA techniques. UiPath says it uses a neural network with custom Screen OCR, text matching, and a multi-anchoring system to identify UI elements, which should make automations more resilient on remote desktops and mixed UI types.
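To make the idea of text matching plus multiple anchors concrete, here is a minimal Python sketch. It is not UiPath's implementation or API: the `Element` type, the function names, the 0.75 similarity threshold, and the distance-based scoring are all hypothetical, and it assumes some detector and OCR step has already produced a list of elements with bounding boxes. It only shows how fuzzy text matching can find anchor labels despite OCR noise, and how several anchors together can pick out one target control.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from math import hypot


@dataclass(frozen=True)
class Element:
    """Hypothetical detected UI element (not a UiPath type)."""
    kind: str   # e.g. "label", "input", "button"
    text: str   # OCR text, possibly noisy; empty for unlabeled controls
    x: float    # left edge
    y: float    # top edge
    w: float    # width
    h: float    # height

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def text_score(a, b):
    """Case-insensitive similarity in [0, 1]; tolerates small OCR errors."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_text_match(elements, wanted, threshold=0.75):
    """Return the element whose text best matches `wanted`, or None."""
    scored = [(text_score(e.text, wanted), e) for e in elements if e.text]
    if not scored:
        return None
    score, elem = max(scored, key=lambda pair: pair[0])
    return elem if score >= threshold else None


def find_by_anchors(elements, target_kind, anchor_texts):
    """Pick the `target_kind` element closest, in total, to all anchors.

    Returns None if any anchor text cannot be located on screen or if
    no element of the requested kind exists.
    """
    anchors = [best_text_match(elements, t) for t in anchor_texts]
    if any(a is None for a in anchors):
        return None
    candidates = [e for e in elements if e.kind == target_kind]
    if not candidates:
        return None

    def total_distance(e):
        cx, cy = e.center
        return sum(hypot(cx - a.center[0], cy - a.center[1]) for a in anchors)

    return min(candidates, key=total_distance)
```

For example, calling `find_by_anchors(screen, "input", ["Password", "Sign in"])` on a login form would return the input box nearest both the "Password" label and the "Sign in" button, even when the two input boxes look identical. Summed center-to-center distance is a deliberately simple scoring rule chosen for this sketch; a production system would likely use richer spatial relations.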

The tradeoff is that this is not a general-purpose AI agent or a standalone computer-use product; it is a specialized part of UiPath’s automation stack. Pricing was not publicly available on the page, and the content does not clearly spell out model provider details, MCP support, or whether any local/offline model is involved. If you are not already using UiPath, or you need a simple chat-based assistant rather than RPA infrastructure, this is probably not the right fit.

Key Features
  • Recognizes on-screen UI elements using computer vision
  • Supports VDI environments such as Citrix, VMware, Microsoft RDP, and VNC
  • Works on desktop and web applications
  • Handles interfaces where selectors are unreliable
  • Supports Flash, Silverlight, PDFs, images, and other non-standard elements

Use Cases
  • Automating legacy desktop applications with unstable selectors
  • Running RPA workflows inside virtual desktop environments
  • Interacting with PDFs, images, and other non-traditional UI elements

Agenticness: Reactive Tool

Responds to prompts but takes no autonomous action.

High evidence
Last evaluated: Mar 28, 2026

Dimension Breakdown

Action Capability, Autonomy, Adaptation, State & Memory, Safety

Pricing
  • Pricing not publicly available: The page promotes a free trial, but does not list public product pricing.
Details
Added: January 16, 2026
Refreshed: March 28, 2026
Quick Facts
Deployment: Hybrid (cloud + self-hosted)
Autonomy: Semi-autonomous
Model support: Single model
Open source: No
Team support: Enterprise
Pricing model: Subscription
Interface: GUI, API, desktop
Last updated March 30, 2026
Related Tools

Stagehand is an open-source browser automation SDK built for developers and LLM-powered agents. It combines code-based browser control with natural-language actions so you can build web workflows that are more resilient to page changes.

Open Source
Web
API

Fireflies.ai connects to Webex to record, transcribe, and summarize meetings automatically. It also extracts action items and makes notes searchable and shareable across team tools.

Paid
Web
Voice

browser-use helps you build agents that interact with websites, fill forms, and complete web tasks. It supports both a self-hosted open-source library and a cloud option for faster setup and scaling.

Open Source
iOS
Web Browsing

Perplexity Computer orchestrates 19 different AI models simultaneously, routing each subtask to the model best suited to it (Claude for reasoning, Gemini for deep research, GPT for long context). Tasks run in isolated Firecracker microVMs with 400+ app integrations and can persist for hours, days, or months. Requires Perplexity Max ($200/month).

iOS
API
Web Browsing