Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
OpenMed is an open-source, on-device healthcare AI platform that runs clinical NLP (disease detection, medication identification, anatomical recognition, and HIPAA-grade PII de-identification) entirely on the user's own hardware, with no cloud connectivity. Released under Apache 2.0 by Maziyar Panahi, with 1,912 GitHub stars and a homepage at openmed.life, it directly addresses the most uncomfortable trade-off in healthcare AI: the best models are cloud-hosted, but the data they need is exactly the data that legally and ethically cannot leave the hospital network.

## Why On-Device Healthcare AI Matters

Patient data is regulated more strictly than almost any other category of information. HIPAA in the United States, GDPR in Europe, and equivalent regulations elsewhere all impose hard limits on where protected health information (PHI) can be processed, by whom, and under what audit trail. As a result, most production healthcare AI deployments are either (1) cloud-based and limited to non-PHI use cases, or (2) on-device but built on small, generic models that underperform. OpenMed argues, and demonstrates, that with the right combination of specialized models, modern on-device inference frameworks, and aggressive privacy filtering, neither compromise is necessary.

## Privacy Filter Family

The centerpiece is a Privacy Filter family that detects and de-identifies all 18 HIPAA Safe Harbor identifiers across 13 languages (English, Chinese, Spanish, French, German, Italian, Portuguese, Dutch, Arabic, Hindi, Telugu, Japanese, and Turkish) using 247 PII detection checkpoints. The filter offers four anonymization strategies: simple masking; faker-backed replacement with locale-aware formatting, so a German address gets a plausible German fake address rather than a US one; cryptographic hashing, which maps each identifier to a stable pseudonym so records stay linkable (a hash is one-way, so reversing it requires a separately stored mapping); and temporal date-shifting, which preserves time deltas while obscuring absolute dates.
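
To make the date-shifting idea concrete, here is a minimal sketch of one common way to do it: derive a fixed per-patient offset from a keyed hash and add it to every date in that patient's record. This is not OpenMed's API. The function names, the `secret` key, and the `max_days` window are all assumptions made for illustration.

```python
import hashlib
import hmac
from datetime import date, timedelta


def patient_offset(patient_id: str, secret: bytes, max_days: int = 365) -> timedelta:
    """Derive a stable day offset in [-max_days, +max_days] for one patient.

    Hypothetical helper: a keyed hash (HMAC-SHA256) makes the offset
    reproducible for whoever holds the key and hard to guess otherwise.
    """
    digest = hmac.new(secret, patient_id.encode("utf-8"), hashlib.sha256).digest()
    days = int.from_bytes(digest[:4], "big") % (2 * max_days + 1) - max_days
    return timedelta(days=days)


def shift_dates(dates, patient_id: str, secret: bytes, max_days: int = 365):
    """Shift every date of one patient by the same offset.

    Because the offset is constant within a patient, the interval between
    any two dates is unchanged; only the absolute dates move.
    """
    offset = patient_offset(patient_id, secret, max_days)
    return [d + offset for d in dates]


if __name__ == "__main__":
    admit, discharge = date(2024, 3, 1), date(2024, 3, 8)
    shifted = shift_dates([admit, discharge], "patient-001", b"demo-secret")
    # The length of stay is the same before and after shifting.
    print(shifted[1] - shifted[0] == discharge - admit)
```

Using one offset per patient, rather than one per date, is what keeps intervals such as length of stay intact. A production scheme would also need key management and a policy for offsets that happen to be zero.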

Smart entity merging rejoins dates and IDs that subword tokenization would otherwise split across multiple tokens, which is the failure mode that lets PHI leak through naive NER pipelines. The family has three variants (an OpenAI Privacy Filter baseline, a Nemotron-tuned variant, and a multilingual OpenMed version), all sharing the same sparse mixture-of-experts transformer architecture with local attention, RoPE+YaRN positional encoding, and tiktoken tokenization. Each variant auto-selects MLX on Apple Silicon, PyTorch on CUDA, and a CPU fallback elsewhere, behind a single model name that doesn't change across platforms.

## Clinical Intelligence Layer

Beyond PII handling, OpenMed ships 1,000+ specialized medical models covering disease detection, pharmaceutical identification, anatomical recognition, and genetic information extraction. These are not general-purpose LLMs prompted into the medical domain; they are transformer models trained or fine-tuned specifically on clinical corpora, which is why they fit on-device while still outperforming much larger general models on clinical NER and entity linking benchmarks. The catalog is organized by clinical domain, so a discharge-summary pipeline can compose disease, medication, and anatomy models rather than running one giant model that does everything poorly.

## MLX and the iOS/Swift Story

The performance story is built around MLX, Apple's array framework for Apple Silicon. OpenMed reports a 24–33x speedup on Apple Silicon versus CPU PyTorch, fast enough that real-time clinical extraction inside a native iOS or iPadOS app becomes feasible. The OpenMedKit Swift framework wraps the same models for native iOS, iPadOS, and macOS development, which is unusual in the open-source clinical-NLP space and tells you who the project's user is: hospital IT teams that have invested in Apple hardware for clinical-facing tablets and want to keep PHI on the device.
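
The per-platform backend choice described for the Privacy Filter variants (MLX on Apple Silicon, PyTorch on CUDA, CPU otherwise) can be sketched in a few lines of Python. This is an illustrative guess at the selection logic, not OpenMed's actual implementation. `pick_backend` and the returned labels are hypothetical names, and the optional parameters exist only so the logic can be exercised without the real hardware.

```python
import importlib.util
import platform


def _cuda_available() -> bool:
    """Return True only if PyTorch is installed and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the sketch runs without PyTorch

    return torch.cuda.is_available()


def pick_backend(system=None, machine=None, has_mlx=None, has_cuda=None) -> str:
    """Choose an inference backend label for the current machine.

    Hypothetical helper. Any argument left as None is probed from the
    running environment; passing explicit values makes the rule testable.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    if has_mlx is None:
        has_mlx = importlib.util.find_spec("mlx") is not None
    if has_cuda is None:
        has_cuda = _cuda_available()

    # Apple Silicon reports itself as Darwin/arm64.
    if system == "Darwin" and machine == "arm64" and has_mlx:
        return "mlx"
    if has_cuda:
        return "torch-cuda"
    return "cpu"


if __name__ == "__main__":
    print(pick_backend())
```

Keeping the probe behind one function is what lets calling code use a single model name everywhere: the caller asks for a model, and the platform decides which runtime loads it.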

The Python and FastAPI deployment paths cover the server-side use cases (batch processing, integration with EHR systems, REST endpoints for distributed architectures) without requiring different model artifacts.

## Deployment and Air-Gapped Operation

Installation is split into profiles (core only, HuggingFace runtime, MLX acceleration, or complete service deployment) so the dependency footprint matches the actual deployment target. For the strictest environments, models can be pointed at local directories, bypassing HuggingFace Hub connectivity entirely, which is what makes air-gapped clinical networks a supported deployment mode rather than an afterthought.

## Why It Matters in 2026

With healthcare AI procurement under intense regulatory scrutiny, and with several high-profile cloud breaches in the past year, demand for on-device, auditable, open-source medical NLP has grown faster than supply. OpenMed is currently the most complete open-source answer in the space: Apache 2.0 licensing, specialized clinical models, comprehensive PII handling, native Apple Silicon performance, and air-gapped deployment. For any healthcare team evaluating clinical NLP in 2026, it's the project to benchmark against.