Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
UI-TARS Desktop is an open-source multimodal AI agent stack developed by ByteDance that connects AI models with agent infrastructure for GUI automation. The project comprises two main components. Agent TARS is a general-purpose multimodal agent offering one-click CLI and Web UI deployment, hybrid browser control that combines GUI agents with DOM interactions, an event stream protocol for context engineering, and MCP server integration. UI-TARS Desktop is a native GUI automation tool that provides screenshot analysis, precise mouse and keyboard control, and cross-platform support across Windows, macOS, and browser environments. The framework integrates with multiple AI model providers, including ByteDance's Doubao models, Anthropic's Claude family, ByteDance's UI-TARS models (1.5 and 1.6), and Seed vision-language models. Built with TypeScript as a Node.js monorepo managed by pnpm, it supports both headful and headless execution modes and processes data locally for privacy. With 29,100+ stars and active development, UI-TARS Desktop has become a leading open-source option for building multimodal AI agents that interact with graphical user interfaces.
hacksider
Real-time AI face swap and one-click video deepfake with only a single image
harry0703
AI-powered short video generator that automates scripting, footage sourcing, subtitles, and composition, with support for 10+ LLM providers and batch production.