Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
GLiNER2 is a compact yet powerful information extraction model that unifies Named Entity Recognition (NER), Text Classification, Structured Data Extraction, and Relation Extraction into a single 205M-parameter model. Developed by Fastino AI, it achieves performance competitive with GPT-4o on several extraction benchmarks while running entirely on CPU, without any external API dependencies. The project has reached 1,079 GitHub stars and is gaining traction as a privacy-focused, cost-effective alternative to API-based extraction pipelines.

## Why GLiNER2 Matters

Information extraction is a foundational capability for AI applications ranging from document processing to knowledge graph construction. Traditional approaches require a separate specialized model for each task, resulting in complex pipelines that are expensive to deploy and maintain. GLiNER2 collapses this complexity into a single model that can be installed with `pip install gliner2` and runs on any machine with a CPU.

## Key Features

### Unified Multi-Task Architecture

GLiNER2 handles four distinct extraction tasks through a single schema-based interface:

| Task | Description | Example |
|------|-------------|---------|
| Entity Extraction | Identify named entities with confidence scores and spans | People, organizations, locations |
| Text Classification | Single- and multi-label classification with configurable thresholds | Sentiment, intent, topic |
| Structured Data Extraction | Parse complex JSON structures from unstructured text | Forms, invoices, specifications |
| Relation Extraction | Extract directional relationships between entities | "Tim Cook CEO-of Apple" |

All four tasks can be composed in a single forward pass through unified schemas, eliminating the need for multiple model calls.

### CPU-Optimized Inference

The base model runs at 205M parameters, small enough for efficient CPU inference.
This design choice makes GLiNER2 deployable on standard servers, edge devices, and development machines without GPU requirements. The larger 340M-parameter variant offers enhanced accuracy for applications where precision is critical.

```python
from gliner2 import GLiNER2

extractor = GLiNER2.from_pretrained("fastino/gliner2-base-v1")
result = extractor.extract_entities(
    "Apple CEO Tim Cook announced iPhone 15 in Cupertino.",
    ["company", "person", "product", "location"],
)
```

### Schema-Flexible Configuration

Rather than hardcoding entity types, GLiNER2 accepts custom schemas with entity descriptions that improve extraction accuracy for domain-specific tasks. This flexibility means a single model deployment can serve multiple use cases simply by changing the schema definition.

### Privacy-First Design

All processing happens locally, with zero data leaving the machine. This makes GLiNER2 suitable for sensitive domains like healthcare, legal, and financial document processing, where data privacy is non-negotiable.

## Benchmark Performance

GLiNER2 closely matches GPT-4o in overall F1 score:

| Metric | GLiNER2 | GPT-4o |
|--------|---------|--------|
| Overall F1 | 0.590 | 0.599 |
| AI Domain F1 | 0.547 | 0.526 |
| Literature F1 | 0.564 | 0.561 |
| SNIPS Intent | 0.83 | - |
| Banking77 Intent | 0.70 | - |

For intent classification specifically, GLiNER2 scores 0.83 on SNIPS and 0.70 on Banking77, outperforming DeBERTa's 0.77 and 0.42 respectively.

## Available Models

- **fastino/gliner2-base-v1**: 205M parameters, general extraction and classification
- **fastino/gliner2-large-v1**: 340M parameters, enhanced accuracy
- **fastino/gliner2-multi-v1**: Multilingual support
- **GLiNER XL 1B**: Cloud API for maximum performance

## Customization and Fine-Tuning

GLiNER2 supports LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning on domain-specific data.
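Why LoRA keeps fine-tuning cheap is easy to see with a toy numerical sketch (illustrative dimensions only, not GLiNER2 internals): the pretrained weight matrix is frozen, and only two small low-rank factors are trained, whose product is added to it.

```python
import numpy as np

d, r = 768, 8  # hidden size and LoRA rank (illustrative values)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                 # zero-initialized, so adaptation starts as a no-op

W_eff = W + B @ A                    # effective weight after adaptation

full_params = d * d                  # parameters in a full fine-tune of W
lora_params = d * r + r * d          # parameters LoRA actually trains
ratio = lora_params / full_params    # roughly 2% of the full count at rank 8
```

Because only `A` and `B` change per domain, several adapters can share one frozen base model, which is what makes the adapter-switching setup described below practical.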
The adapter-switching capability allows maintaining multiple domain-specific adapters on a single base model, and custom regex validators can filter extracted spans based on pattern matching.

## Practical Applications

- **Document Processing**: Extract structured data from invoices, contracts, and reports
- **Knowledge Graph Construction**: Identify entities and relationships for graph databases
- **Content Moderation**: Classify text and extract problematic entities
- **Customer Support**: Intent classification and entity extraction from support tickets
- **Research**: Named entity recognition for academic papers and datasets

## Conclusion

GLiNER2 represents a significant step toward making production-grade information extraction accessible without cloud API dependencies. Its unified multi-task architecture eliminates pipeline complexity, while CPU-efficient inference makes deployment straightforward. For teams that need NER, classification, and relation extraction without sending data to external services, GLiNER2 offers a compelling alternative to API-based solutions at a fraction of the operational cost.
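To make the regex-validation idea from the customization section concrete, here is a minimal post-processing sketch in plain Python. The `filter_spans` helper and the span dictionaries are hypothetical illustrations, not part of the gliner2 API: the point is simply that extracted spans can be dropped when their text fails a pattern check.

```python
import re

def filter_spans(spans, pattern):
    """Keep only extracted spans whose text fully matches the given regex."""
    rx = re.compile(pattern)
    return [s for s in spans if rx.fullmatch(s["text"])]

# Example: validate that "invoice_number" extractions look like INV-1234.
spans = [
    {"text": "INV-1042", "label": "invoice_number", "score": 0.93},
    {"text": "October 5", "label": "invoice_number", "score": 0.41},
]
valid = filter_spans(spans, r"INV-\d{4}")  # only the INV-1042 span survives
```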

Shubhamsaboo
Collection of 100+ production-ready LLM apps with AI agents, RAG, voice agents, and MCP using OpenAI, Anthropic, Gemini, and open-source models
infiniflow
Leading open-source RAG engine with deep document understanding, grounded citations, and agent capabilities, with 73K+ GitHub stars.