Screenpipe - Open Source | Evermx

Screenpipe

Screenpipe (YC S26) | MIT

STT | 18.8K Stars | 1.7K Forks | 242 views

Screenpipe is an open-source, local-first "AI memory" daemon that continuously records what its user sees, says, and hears. It transcribes audio locally with OpenAI Whisper, OCRs the screen, and indexes everything into a searchable SQLite database that AI agents can query. A Y Combinator S26 company, Screenpipe is published under the MIT license and has reached 18,837 GitHub stars and 1,739 forks. It supports macOS, Windows 10/11, and Linux.

## Continuous Local Speech Recognition

For the speech-to-text portion of the pipeline, Screenpipe runs OpenAI Whisper locally on the device, so no audio leaves the machine. Both system audio (anything routed through the speakers, including Zoom and Meet calls) and microphone input are captured and fed through the transcription engine in parallel streams. Speaker identification and diarization are layered on top so that the resulting transcript labels who said what. The diarized text is written into the same SQLite store as screen content and accessibility-tree captures.

## Event-Driven Capture, Not Brute-Force Recording

A naive "record everything" tool would melt the CPU. Screenpipe instead captures screenshots only when meaningful changes occur (window focus shifts, accessibility-tree mutations, scroll events) and falls back to OCR when the accessibility tree is missing or incomplete. The result is documented resource use of roughly 5-10 percent CPU and 0.5-3 GB of RAM, low enough to run all day on a laptop without thermal complaints.

## The Pipes Agent System

Where Screenpipe goes beyond "a personal Rewind clone" is its Pipes architecture. A Pipe is a markdown-defined scheduled agent that can read from the local index, call a local or cloud LLM, and write back results. Each Pipe has its own deterministic data permissions, declared in YAML and enforced across three layers (filesystem, API, and runtime). A user can therefore grant a daily-summary pipe access to meeting transcripts while denying it the ability to read browser history. The MCP-server interface lets external tools like Claude Code, Cursor, and Continue query the index directly.

## Architecture and Tech Stack

The core daemon is written in Rust, with Tauri providing the cross-platform UI shell and TypeScript powering the front-end. A local REST API on `localhost:3030` exposes search and ingest endpoints, backed by SQLite with FTS5 for full-text search across transcripts and OCR text. The Rust foundation helps explain why a 24/7 recorder can stay inside a 5-10 percent CPU envelope, while the Tauri-plus-TypeScript layer makes the UI fast to iterate on.

## Use Cases Beyond Personal Memory

Knowledge workers use Screenpipe as a perfect-recall layer for research, with semantic search over months of meetings and tabs. Developers wire it into Cursor or Claude Code as long-term context. Users with ADHD report value from being able to reconstruct lost trains of thought without note-taking discipline. Remote teams deploy it with centralized configuration management for shared retrieval, and enterprise customers can self-host the entire pipeline with optional disk encryption.

## Trade-offs

The always-recording posture demands serious user trust in the permission model, and the project addresses that with both open source code and three-layer permission enforcement. The Whisper-based transcription, while local, is not yet as accurate on noisy multi-speaker audio as the largest cloud transcribers. The macOS build is the most polished; Linux requires more manual setup. While the daemon itself is MIT-licensed, the commercial cloud-sync offering operates on a one-time lifetime-license model rather than a subscription, which some users will see as refreshing and others as up-front friction.

For anyone who has wanted a transparent, hackable alternative to Microsoft Recall, Rewind, or Limitless, Screenpipe is the most mature option in 2026.
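
To make the SQLite-plus-FTS5 index mentioned under Architecture and Tech Stack more concrete, here is a minimal Python sketch of one full-text table that holds both transcript and OCR text. The table name `captures`, its columns, and the sample rows are hypothetical and chosen for illustration only; Screenpipe's actual schema is not described in this article and is likely different. The sketch also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3

# Hypothetical schema: a single FTS5 table for transcripts and OCR text.
# Names are illustrative, not Screenpipe's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE captures USING fts5(source, app, text)")

# Made-up sample rows standing in for diarized audio and screen OCR.
rows = [
    ("audio", "Zoom", "Alice: the quarterly budget review moves to Friday"),
    ("ocr", "Chrome", "Quarterly budget spreadsheet, draft 3"),
    ("ocr", "Terminal", "cargo build --release finished in 42s"),
]
conn.executemany("INSERT INTO captures VALUES (?, ?, ?)", rows)


def search(query: str, limit: int = 10) -> list[tuple[str, str, str]]:
    """Return (source, app, text) rows matching an FTS5 query, best match first."""
    cur = conn.execute(
        "SELECT source, app, text FROM captures "
        "WHERE captures MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()


# Both terms must appear in a row for it to match (implicit AND).
hits = search("quarterly budget")
for source, app, text in hits:
    print(f"[{source}/{app}] {text}")
```

Keeping audio and screen text in one full-text table is one way to let a single query span "what was said" and "what was on screen"; a real system might instead use separate tables joined by timestamp.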

## Key Features

- Continuous local audio transcription via OpenAI Whisper with system + microphone capture
- Speaker identification and real-time diarization layered onto the local pipeline
- Event-driven screen capture with OCR fallback, holding CPU to 5-10% and RAM to 0.5-3 GB
- SQLite + FTS5 backed search index combining transcripts, OCR, and accessibility-tree data
- Pipes agent system: markdown-defined scheduled jobs with three-layer YAML permissions
- MCP server interface for direct querying from Claude Code, Cursor, and Continue
- REST API on localhost:3030 for integration with custom scripts and tools
- Cross-platform Tauri + Rust + TypeScript stack covering macOS, Windows 10/11, and Linux
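
As a rough sketch of how a custom script might talk to the local REST API listed above, the Python snippet below builds a search request against `localhost:3030`. Only the host and port come from the feature list; the `/search` path, the `q`, `content_type`, and `limit` parameter names, and the JSON response shape are assumptions to check against the project's own API documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:3030"  # port taken from the feature list


def build_search_url(query: str, content_type: str = "all", limit: int = 5) -> str:
    """Build a search URL. The /search path and parameter names are assumed."""
    params = {"q": query, "content_type": content_type, "limit": limit}
    return f"{BASE_URL}/search?{urlencode(params)}"


def run_search(query: str) -> dict:
    """Fetch and decode JSON results. Requires a running local daemon;
    the response being a JSON object is an assumption."""
    with urlopen(build_search_url(query), timeout=5) as resp:
        return json.load(resp)


# Building the URL needs no daemon, so it is safe to run anywhere.
url = build_search_url("budget review", content_type="audio", limit=3)
print(url)
```

Separating URL construction from the network call keeps the request shape easy to inspect and test without the daemon running; `run_search` is only meaningful on a machine where the service is listening.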