Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
MemPalace is a local-first AI memory system that has crossed 54,600 GitHub stars by doing the unfashionable thing: storing conversation history as verbatim text, retrieving it with plain semantic search over local embeddings, and refusing to make a single API call to do either. The headline number (96.6% R@5 on LongMemEval with raw semantic search, climbing to 98.4% with hybrid tuning and ≥99% with LLM reranking) is reported on a 500-question benchmark and demonstrates that the simple, transparent approach beats the expectations set by the more elaborate "memory layer" projects that dominated 2025.

## What MemPalace Is For

The project targets one specific job: giving an LLM persistent memory across sessions without sending anything to a cloud service, without an API key, and without a separate hosted vector database to operate. The system is designed to be the local memory backend for a personal assistant, a coding agent, an MCP-connected workflow, or any application that needs to remember conversation history, project context, or topical knowledge over weeks and months. Because it is local-first, the same install can serve a single user on a laptop, a homelab, or an air-gapped deployment where cloud memory is simply not an option.

## The Memory Palace Metaphor

The naming choice is doing real work. Memory is organized hierarchically: individuals and projects are "wings," topics within those are "rooms," and the original content lives in "drawers." This structure is not decorative: it lets retrieval be scoped to a wing or a room rather than running every query against a flat corpus, which is what makes scaling to long histories tractable. A query about "the authentication refactor in Project X" can be scoped to the Project X wing and the authentication room, so the search runs against a few hundred drawers rather than the entire memory.

## Pluggable Backends, Verbatim Storage

The retrieval layer is intentionally pluggable.
ChromaDB is the default backend because it is the simplest local-first option, but the same memory layer runs on top of SQLite, Qdrant, and PostgreSQL+pgvector for teams that want richer query capabilities or already operate one of those systems. The crucial design choice is that all content is stored verbatim: MemPalace does not summarize, paraphrase, or compress memories before storage. This avoids the information loss that summarization-based memory systems suffer and means that retrieval returns the actual original text rather than a model's earlier interpretation of it.

## The Benchmark Numbers

The README reports four major benchmark results. On LongMemEval, the 500-question session-level benchmark that has become the standard for long-term memory evaluation, raw semantic search hits 96.6% R@5, meaning the correct supporting passage is in the top 5 results 96.6% of the time. Hybrid tuning lifts that to 98.4%, and adding an LLM reranking step takes it past 99%. On LoCoMo the system reports 60.3% R@10 at session level across 1,986 questions, on ConvoMem 92.9% average recall across categories, and on MemBench (ACL 2025) 80.3% R@5. The README is explicit that these numbers are not directly comparable to Mem0, Mastra, or Zep because those projects publish on different metric definitions, a refreshingly honest framing in a space that is full of cherry-picked head-to-head claims.

## Embedding Models and Disk Footprint

The default embedding setup uses gemma-300m for multilingual deployments or MiniLM-L6-v2 for English-only setups, with a total disk footprint of about 300 MB for the embedding model. Python 3.9 or higher is the only software prerequisite. No API key is required for baseline functionality. For most users the resulting install is small enough to live alongside the agent or assistant it serves rather than being a separate piece of infrastructure.

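
To make the R@k figures in the benchmark section concrete, here is a minimal sketch of the metric as the article describes it: a question counts as a hit when a correct supporting passage appears in the top k results. The function names and toy data are hypothetical, and the benchmarks' official scoring scripts may define recall differently, so treat this as an illustration of the idea rather than the evaluation code behind the reported numbers.

```python
from typing import Iterable, Sequence, Set, Tuple


def hit_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 5) -> bool:
    """True if at least one relevant passage appears in the top-k results."""
    return any(pid in relevant_ids for pid in ranked_ids[:k])


def recall_at_k(runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 5) -> float:
    """Fraction of questions with a hit in the top-k (hit-rate reading of R@k)."""
    runs = list(runs)
    if not runs:
        raise ValueError("need at least one question")
    hits = sum(hit_at_k(ranked, relevant, k) for ranked, relevant in runs)
    return hits / len(runs)


# Toy data (made up): (ranked retrieval results, gold supporting passages) per question.
toy_runs = [
    (["d3", "d7", "d1", "d9", "d2"], {"d1"}),        # hit at rank 3
    (["d4", "d8", "d6", "d5", "d0", "d2"], {"d2"}),  # gold at rank 6: miss at k=5
    (["d2", "d1"], {"d2", "d5"}),                    # hit at rank 1
    (["d9", "d8", "d7", "d6", "d5"], {"d5"}),        # hit at rank 5
]

print(recall_at_k(toy_runs, k=5))   # 0.75
print(recall_at_k(toy_runs, k=10))  # 1.0
```

Under this reading, 96.6% R@5 on a 500-question set corresponds to 483 questions with a supporting passage in the top five.
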
## Limitations

The verbatim-storage choice is principled and produces strong retrieval numbers, but it means memory size grows linearly with conversation volume; long-running power users may want to pair MemPalace with periodic external archival. The wings/rooms/drawers structure is powerful only when it is curated; MemPalace can auto-classify on ingest but a misclassified memory is harder to retrieve until corrected. The benchmarks reported are excellent on the established public datasets but real-world deployment quality depends on the embedding model's match to the user's domain, and switching from MiniLM to a stronger multilingual model has measurable effects. Finally, while ChromaDB is the easiest default backend, it is also the lightest, so teams running MemPalace at millions of memories should plan to move to Qdrant or pgvector before scaling pressure shows up.

Within those bounds, MemPalace in mid-2026 is the strongest open-source answer to "give my LLM persistent memory without sending anything to a cloud," and the benchmark numbers earn the project the right to call itself best-in-class on the public memory evals.
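
The scoping idea behind wings, rooms, and drawers, and the misclassification risk noted in the limitations, can be illustrated with a toy in-memory sketch. Everything here is hypothetical: the `Drawer` class, the `search` function, and the token-overlap score (a stand-in for embedding similarity) are illustrative assumptions and not MemPalace's actual API or ranking method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Drawer:
    wing: str  # person or project
    room: str  # topic within the wing
    text: str  # verbatim original content


def overlap_score(query: str, text: str) -> int:
    """Stand-in for semantic similarity: number of shared lowercase tokens."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def search(drawers: List[Drawer], query: str,
           wing: Optional[str] = None, room: Optional[str] = None,
           k: int = 5) -> List[Drawer]:
    """Filter to the requested wing/room first, then rank only those drawers."""
    candidates = [
        d for d in drawers
        if (wing is None or d.wing == wing) and (room is None or d.room == room)
    ]
    ranked = sorted(candidates, key=lambda d: overlap_score(query, d.text), reverse=True)
    return ranked[:k]


# This drawer is about authentication but was filed under "billing" (misclassified).
misfiled = Drawer("project-x", "billing",
                  "auth refactor decision: drop session cookies in favour of tokens")

drawers = [
    Drawer("project-x", "authentication",
           "we moved the login flow to token refresh with rotating keys"),
    Drawer("project-x", "billing", "invoices are generated nightly by the billing job"),
    Drawer("project-y", "authentication", "project y still uses session cookies for login"),
    misfiled,
]

# Scoped query: only drawers in project-x / authentication are ranked.
scoped = search(drawers, "login token refresh", wing="project-x", room="authentication")
print([d.text for d in scoped])

# The misfiled drawer is invisible to the scoped query but found by an unscoped one.
print(misfiled in scoped)
print(misfiled in search(drawers, "session cookies tokens", k=10))
```

The scoped call ranks a single candidate instead of the whole corpus, which is the efficiency argument made for the hierarchy. The misfiled drawer shows the flip side: until it is moved to the right room, only a broader search can surface it.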