NexaSDK - Open Source | Evermx

Trending

NexaSDK

Qualcomm · Apache-2.0

Inference · 8.1K Stars · 1.0K Forks · 61 views

NexaSDK is a high-performance on-device inference framework, developed by Nexa AI and hosted under Qualcomm's GitHub organization, that runs the latest multimodal AI models locally across NPU, GPU, and CPU. With a few lines of code, developers can serve LLMs and multimodal models on Android, Windows, and Linux devices, frequently with day-0 support for new releases weeks or months ahead of other runtimes.

## Why NexaSDK Matters

Most local inference engines focus on CPU and GPU execution, leaving the increasingly powerful neural processing units (NPUs) in modern devices underused. NexaSDK treats the NPU as a first-class target, enabling efficient, low-energy inference on edge hardware such as Snapdragon-powered devices. This makes it possible to run capable models on phones, laptops, and IoT devices without sending data to the cloud. The project has earned over 8,000 GitHub stars and was featured multiple times in Qualcomm's official developer blogs for its work on the Hexagon NPU.

## Day-0 Model Support

A defining strength of NexaSDK is how quickly it supports new models. The framework has shipped support for models including Qwen3-VL, DeepSeek-OCR, and Gemma3n vision variants ahead of most competing runtimes, letting developers experiment with cutting-edge architectures on-device almost immediately after release.

## Multiple Interfaces

NexaSDK meets developers where they work. It offers a command-line interface for quick experimentation, a Python SDK for application integration, a native Android SDK for mobile apps, and a Linux Docker path for server and IoT deployments. A simple `nexa infer` command is enough to start chatting with a model or running multimodal inputs.

## Broad Task and Format Coverage

Beyond text generation, NexaSDK handles multimodal understanding, automatic speech recognition, OCR, reranking, object detection, image generation, and embeddings. It supports the widely used GGUF format alongside its own optimized NEXA format, giving teams flexibility across hardware backends.

## On-Device, Energy-Efficient AI

By emphasizing minimal energy consumption and local execution, NexaSDK targets the practical constraints of edge deployment, where battery life, latency, and privacy all matter. It is a compelling option for teams building offline-capable, privacy-preserving AI features.
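For teams that want to drive the `nexa infer` command mentioned under Multiple Interfaces from a script, a thin wrapper is one option. The following is a minimal sketch, not NexaSDK's Python SDK: it only shells out to the CLI. It assumes the `nexa` executable is on `PATH` and that the model identifier is passed as a single positional argument; the helper names `build_infer_command` and `run_infer` are hypothetical.

```python
import shutil
import subprocess


def build_infer_command(model: str) -> list[str]:
    """Build the argv for `nexa infer <model>`.

    Assumption: the model identifier is a single positional argument.
    """
    if not model.strip():
        raise ValueError("model must be a non-empty string")
    return ["nexa", "infer", model]


def run_infer(model: str) -> int:
    """Launch an interactive `nexa infer` session and return its exit code."""
    # Fail early with a clear message if the CLI is not installed.
    if shutil.which("nexa") is None:
        raise FileNotFoundError("the `nexa` CLI was not found on PATH")
    # Inherit stdin/stdout so the chat session stays interactive.
    return subprocess.run(build_infer_command(model)).returncode
```

Calling `run_infer("<model-id>")` with a model identifier supported by your installation would hand the terminal over to the CLI session. Keeping command construction separate from execution makes the wrapper easy to test on machines where the CLI is not installed.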

Key Features

On-device inference across NPU, GPU, and CPU on Android, Windows, and Linux
Day-0 support for the latest models such as Qwen3-VL, DeepSeek-OCR, and Gemma3n
Multiple interfaces: CLI, Python SDK, native Android SDK, and Linux Docker
Broad task coverage including LLM, multimodal, ASR, OCR, rerank, object detection, image generation, and embeddings
Support for both GGUF and the optimized NEXA model format
Optimized for Qualcomm Hexagon NPU and Snapdragon hardware
Minimal energy footprint designed for edge and mobile deployment
Privacy-preserving local execution with no cloud dependency
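Because two model formats are supported, a deployment script may want to confirm what kind of file it is holding before handing it to a runtime. The sketch below checks only for GGUF, whose files begin with the four ASCII bytes `GGUF` followed by a 32-bit version number. It assumes a little-endian file, and it does not attempt to recognize the NEXA format, whose layout is not described here. The function name `read_gguf_version` is hypothetical.

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def read_gguf_version(path):
    """Return the GGUF version number, or None if the file is not GGUF.

    Assumption: the version field is stored little-endian.
    """
    with Path(path).open("rb") as f:
        header = f.read(8)
    # A valid header needs the 4-byte magic plus a 4-byte version.
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

A return value of `None` only means the file is not GGUF; it says nothing about whether the file is a valid model in another format.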

Related Projects

Ollama (ollama) · Trending · Inference · MIT · 165.0K Stars · 15.0K Forks · 388 views

Ollama

llama.cpp

vLLM

Unsloth