Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Memvid has established itself as a compelling alternative to traditional vector databases and complex RAG pipelines, reaching over 13,300 GitHub stars with its single-file memory architecture for AI agents. Rewritten from Python to Rust for 10-100x performance improvements, Memvid packages data, embeddings, search indexes, and metadata into a single portable .mv2 file that agents can carry anywhere without infrastructure dependencies.

## Why Memvid Matters

Building AI agents with long-term memory traditionally requires running vector databases like Pinecone, Weaviate, or ChromaDB, along with embedding pipelines and retrieval infrastructure. Memvid eliminates this entire stack by providing a self-contained memory file that supports both vector similarity search and full-text search with sub-millisecond latency. For developers building agent systems, this means zero infrastructure overhead and fully offline-capable memory.

## Key Features

### Single-File Architecture

Memvid stores everything in a single .mv2 file: raw content, vector embeddings, HNSW search indexes, BM25 lexical indexes, and metadata. This file is portable, versioned, and crash-safe thanks to an append-only design. There are no databases to manage, no servers to maintain, and no network connections required.

### Ultra-Low Latency Retrieval

Performance benchmarks demonstrate exceptional speed:

| Metric | Value |
|--------|-------|
| P50 Latency | 0.025ms |
| P99 Latency | 0.075ms |
| Throughput | 1,372x higher than standard approaches |
| LoCoMo Benchmark | +35% over SOTA |

This sub-millisecond retrieval is faster than most network round-trips to remote databases, making local memory access essentially instant.

### Smart Frame Architecture

Memvid draws inspiration from video encoding to organize memory as an append-only sequence of Smart Frames. Each frame is an immutable unit storing content with timestamps, checksums, and metadata.
Frames are grouped for efficient compression, indexing, and parallel reads, enabling timeline-style memory inspection and time-travel debugging.

### Multi-Modal Support

Beyond text, Memvid handles images, audio (via Whisper transcription), and structured documents:

- PDF extraction with layout preservation
- XLSX structured extraction with table detection and OOXML metadata parsing
- Image processing with embedding generation
- Audio transcription and indexing

### Hybrid Search

Memvid combines two complementary search strategies:

- **Vector similarity search** via HNSW (Hierarchical Navigable Small World) for semantic matching
- **Full-text search** via BM25 ranking for keyword-based retrieval

Supported embedding models include BGE-small (384D, default), BGE-base (768D), Nomic, and GTE-large (1024D).

### Encryption and Security

Memory files can be encrypted at rest, making Memvid suitable for applications handling sensitive data. The append-only architecture also provides natural audit trail capabilities.

## Multi-Language SDK Support

Memvid provides official SDKs across multiple languages:

```bash
# Rust (core library)
cargo add memvid-core

# Python
pip install memvid-sdk

# Node.js
npm install @memvid/sdk

# CLI tool
cargo install memvid-cli
```

## Practical Applications

Memvid has found adoption across several use cases:

- **Agent memory**: Persistent context for chatbots and coding assistants
- **RAG replacement**: Eliminating vector database infrastructure for retrieval-augmented generation
- **Document indexing**: Offline search across large document collections
- **Edge deployment**: Running memory-intensive AI on devices without cloud connectivity

## Development History

Originally written in Python, Memvid underwent a complete rewrite in Rust that delivered dramatic performance improvements. The latest release (v2.0.157, February 15, 2026) added structured XLSX extraction, improved metadata parsing, and security fixes for the Node SDK.
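As an aside on the hybrid search described above: one common way to combine a vector result list with a BM25 result list is reciprocal rank fusion. The sketch below is a generic illustration of that technique; the article does not specify how Memvid actually fuses the two signals:

```python
def reciprocal_rank_fusion(vector_hits, keyword_hits, k=60):
    """Fuse two ranked result lists (doc ids, best-first) by
    reciprocal rank fusion: score(d) = sum over lists of 1/(k + rank)."""
    scores = {}
    for hits in (vector_hits, keyword_hits):
        for rank, doc_id in enumerate(hits, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


# A document ranked well by both semantic and keyword search ("b")
# beats one that appears in only a single list.
fused = reciprocal_rank_fusion(["a", "b", "c"], ["b", "c", "d"])
print(fused[0])  # → b
```

The constant `k` damps the influence of top ranks so that agreement between the two retrievers matters more than any single first-place hit.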
## Community

With over 1,100 forks and an Apache 2.0 license, Memvid welcomes community contributions. The project maintains active development with regular releases and responsive issue tracking.

## Conclusion

Memvid represents a paradigm shift in how AI agents handle memory and retrieval. By replacing complex database infrastructure with a single portable file, it dramatically lowers the barrier to building agents with persistent, searchable memory. For developers tired of managing vector database deployments, Memvid offers an elegant alternative that trades infrastructure complexity for raw performance and simplicity.