Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Cua is open-source infrastructure for building, testing, and deploying AI agents that control full desktop environments. Often described as "Docker for Computer-Use Agents," Cua provides isolated VM sandboxes, a unified Agent SDK, and standardized benchmarks in a single platform backed by Y Combinator.

## CuaBot: Agent Sandbox CLI

CuaBot is the primary developer entry point. Running `npx cuabot` launches a multi-agent computer-use sandbox in which individual application windows appear natively on the desktop with H.265 video, shared clipboard, and audio. The sandbox therefore behaves much like a real computer, which matters when testing agents designed to interact with real GUIs. CuaBot supports agent-browser tasks (web automation) and agent-device tasks (iOS/Android control), covering both web and mobile computer-use scenarios.

## Cua Agent SDK

The Agent SDK provides a programmatic interface for connecting frontier AI models, including Anthropic Claude Sonnet, OpenAI Codex CLI, and custom models, to virtual computers. Agents built with the SDK can see screen contents, move the mouse, click buttons, type text, and execute code in fully isolated environments. The interface is consistent regardless of the underlying OS or virtualization technology.

## Lume: Near-Native macOS Virtualization

For Apple Silicon developers, Lume uses Apple's Virtualization.framework directly to run macOS and Linux VMs at up to 97% of native CPU speed, a significant performance advantage over traditional x86 emulation. Linux environments run via Docker; Windows is supported via QEMU.

## Cua-Bench: Standardized Evaluation

Cua-Bench integrates the OSWorld, ScreenSpot, and Windows Arena benchmark suites for standardized agent evaluation, and it also supports custom task definitions.
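
The Cua-Bench task schema is not shown here, so the following is only a minimal sketch of what a custom task definition and scoring loop could look like. Every name in it (`Task`, `Result`, `run_task`, `apply_action`, `scripted_agent`) is hypothetical, and a toy in-memory dictionary stands in for the VM desktop. The point is the general shape: a task pairs an instruction with a success check, and a harness runs an agent for a bounded number of steps.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types for illustration only; this is not the Cua-Bench API.
State = dict[str, str]
Action = tuple[str, str]  # (kind, argument), e.g. ("type", "hello")


@dataclass
class Task:
    instruction: str
    is_success: Callable[[State], bool]
    max_steps: int = 5


@dataclass
class Result:
    success: bool
    steps: int
    actions: list[Action] = field(default_factory=list)


def apply_action(state: State, action: Action) -> State:
    """Toy environment: 'click' moves focus, 'type' appends text to the focused field."""
    kind, arg = action
    new_state = dict(state)
    if kind == "click":
        new_state["focus"] = arg
    elif kind == "type":
        target = new_state.get("focus", "")
        if target:
            new_state[target] = new_state.get(target, "") + arg
    return new_state


def run_task(task: Task, agent: Callable[[str, State], Action]) -> Result:
    """Run the agent until the success check passes or the step budget runs out."""
    state: State = {"focus": ""}
    actions: list[Action] = []
    for step in range(1, task.max_steps + 1):
        action = agent(task.instruction, state)
        actions.append(action)
        state = apply_action(state, action)
        if task.is_success(state):
            return Result(True, step, actions)
    return Result(False, task.max_steps, actions)


def scripted_agent(instruction: str, state: State) -> Action:
    """Stand-in for a model: focus the search box, then type the query."""
    if state.get("focus") != "search_box":
        return ("click", "search_box")
    return ("type", "cua")


task = Task(
    instruction="Type 'cua' into the search box",
    is_success=lambda s: s.get("search_box") == "cua",
)
result = run_task(task, scripted_agent)
print(result.success, result.steps)  # True 2
```

In a real harness the scripted agent would be replaced by a model call and the dictionary by screenshots and input events from a sandboxed VM; the success check and step budget would keep the same role.
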
Trajectory data can be exported for model fine-tuning, creating a pipeline from agent deployment to training-data generation, which is a compounding advantage for teams building computer-use models.

## OmniParser Integration

Cua integrates Microsoft's OmniParser for UI element detection from screenshots. This bridges the gap between raw pixel data and structured UI understanding, enabling agents to reliably identify and interact with interface components across different operating systems and applications.

## Technical Architecture

The codebase spans Python (63%), Swift (12.4%), and TypeScript (12.2%), reflecting its cross-platform scope. Python handles agent logic and benchmarking; Swift powers macOS virtualization; TypeScript manages the web interface. The project has logged 2,903 commits across 51 contributors with 373+ releases, indicating an active development pace. Latest release: @trycua/playground v0.2.2 (February 19, 2026). Python 3.12+ is required.
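
The trajectory export mentioned under Cua-Bench can be illustrated with a small sketch. The actual export format is not specified in this description, so the code below assumes a hypothetical `Step` record and writes JSON Lines, a common layout for fine-tuning datasets. The function names `export_trajectory` and `load_trajectory` are likewise illustrative and not part of Cua.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Step:
    # Hypothetical fields; the real trajectory schema may differ.
    observation: str  # e.g. a screenshot path or a UI description
    action: str       # e.g. "click(120, 340)"
    success: bool     # whether the task was solved after this step


def export_trajectory(steps: list[Step], path: Path, task_id: str) -> int:
    """Write one JSON object per step (JSON Lines) and return the record count."""
    with path.open("w", encoding="utf-8") as f:
        for index, step in enumerate(steps):
            record = {"task_id": task_id, "step": index, **asdict(step)}
            f.write(json.dumps(record) + "\n")
    return len(steps)


def load_trajectory(path: Path) -> list[dict]:
    """Read the JSON Lines file back into a list of dictionaries."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


steps = [
    Step("screen_000.png", "click(search_box)", False),
    Step("screen_001.png", "type('cua')", True),
]

# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "trajectory.jsonl"
    written = export_trajectory(steps, out_path, task_id="demo-task")
    loaded = load_trajectory(out_path)

print(written, loaded[-1]["action"])
```

One record per line keeps the file appendable and streamable, which is convenient when many agent runs are collected into a single training set.
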