Kreuzberg is an open-source (MIT-licensed) polyglot document intelligence framework written in Rust, with bindings for Python, TypeScript/JavaScript (Node/Bun/WASM), PHP, Ruby, Java, C#, Go, and Elixir. It's also available as a Docker image and as a standalone CLI tool you can install via Homebrew.
Kreuzberg lets users extract text from 75+ formats (the "+" is there because we keep adding more), perform OCR, create embeddings, and quite a bit more. These capabilities are needed by many AI applications, data pipelines, machine learning workflows, and essentially any use case that processes documents and images as sources of text.