ToolShelf
KREUZBERG
A polyglot document intelligence framework with a Rust core. Extract text, metadata, and structured information from ...

13 · Emerging · Unknown
License
MIT
Updated
Today

What it does

Extract text and metadata from a wide range of file formats (75+), generate embeddings, and post-process at native speeds without needing a GPU.

- Extensible architecture – Plugin system for custom OCR backends, validators, post-processors, and document extractors
- Polyglot – Native bindings for Rust, Python, TypeScript/Node.js, Ruby, Go, Java, C#, PHP, Elixir, R, and C
- 75+ file formats – PDF,
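To make the "plugin system" idea more concrete, here is a minimal sketch in Python of how a post-processor registry could work in principle. Everything in it (`ExtractionResult`, `PluginRegistry`, `register_post_processor`, and the two example processors) is a hypothetical name invented for illustration. It is not Kreuzberg's actual API, which may be organized quite differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical types for illustration only; NOT Kreuzberg's real API.
@dataclass
class ExtractionResult:
    content: str
    metadata: Dict[str, str] = field(default_factory=dict)


PostProcessor = Callable[[ExtractionResult], ExtractionResult]


class PluginRegistry:
    """Stores post-processors and applies them in registration order."""

    def __init__(self) -> None:
        self._post_processors: List[PostProcessor] = []

    def register_post_processor(self, fn: PostProcessor) -> PostProcessor:
        self._post_processors.append(fn)
        return fn  # returning fn lets this method double as a decorator

    def run(self, result: ExtractionResult) -> ExtractionResult:
        for fn in self._post_processors:
            result = fn(result)
        return result


registry = PluginRegistry()


@registry.register_post_processor
def collapse_whitespace(result: ExtractionResult) -> ExtractionResult:
    # Normalize runs of spaces and newlines into single spaces.
    return ExtractionResult(" ".join(result.content.split()), dict(result.metadata))


@registry.register_post_processor
def add_word_count(result: ExtractionResult) -> ExtractionResult:
    # Attach a simple derived metadata field.
    metadata = dict(result.metadata, word_count=str(len(result.content.split())))
    return ExtractionResult(result.content, metadata)


# Pretend this text came out of a document extractor.
processed = registry.run(ExtractionResult("  Hello   from\n a  PDF  "))
print(processed.content)
print(processed.metadata)
```

In this sketch the order of registration matters: the word count runs on the already-normalized text. A real plugin system would likely also cover the other extension points listed above (OCR backends, validators, document extractors), which are omitted here.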

Getting Started

git
git clone https://github.com/kreuzberg-dev/kreuzberg

Platforms

🪟 Windows · 🍎 macOS · 🐧 Linux

Install Difficulty

moderate

Built With

html

Community Reactions