LLMware
What is llmware?
llmware is a unified framework for developing LLM-based application patterns, including Retrieval Augmented Generation (RAG). The project provides an integrated set of tools that anyone can use – from beginner to the most sophisticated AI developer – to rapidly build industrial-grade, knowledge-based enterprise LLM applications, with a specific focus on making it easy to integrate open source small specialized models and to connect enterprise knowledge safely and securely to LLMs in a private cloud.
llmware Key Features
llmware is an integrated framework composed of four major components:
Retrieval: Assemble and Query a Knowledge Base
- High-performance document parsers to rapidly ingest, text-chunk, and index common document types.
- Comprehensive, intuitive querying methods: semantic, text, and hybrid retrieval with integrated metadata.
- Ranking and filtering strategies to enable semantic search and rapid retrieval of information.
- Web scrapers, Wikipedia integration, and Yahoo Finance API integration.
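As a quick sketch of the retrieval flow (folder path, library name, and query are illustrative; result fields are worth verifying against your installed version):

```python
from llmware.library import Library
from llmware.retrieval import Query

# create a library and ingest a folder of documents (path is illustrative)
lib = Library().create_new_library("my_library")
lib.add_files(input_folder_path="/path/to/documents")

# run a text query against the parsed blocks - each result carries metadata
results = Query(lib).text_query("termination clause", result_count=10)
for r in results:
    print(r["file_source"], "-", r["text"][:100])
```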
Prompt: Simple, Unified Abstraction Across 50+ Models
- Connect Models: Simple high-level interface with support for 50+ models out of the box.
- Prompts with Sources: Powerful abstraction to easily package a wide range of materials into prompts.
- Post Processing: Tools for evidence verification, classification of a response, and fact-checking.
- Human in the Loop: Ability to enable user ratings, feedback, and corrections of AI responses.
- Auditability: A flexible state mechanism to analyze and audit the LLM prompt lifecycle.
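A minimal sketch of the "prompt with sources" pattern (the model name, file path, and question are illustrative):

```python
from llmware.prompts import Prompt

# load any supported model by name (name below is illustrative)
prompter = Prompt().load_model("llmware/bling-1b-0.1")

# package a source document into the prompt, filtered by an optional query
prompter.add_source_document("/path/to/docs", "contract.pdf", query="termination")

# run the question against the packaged source, then clean up
responses = prompter.prompt_with_source("What is the notice period for termination?")
prompter.clear_source_materials()
```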
Vector Embeddings: Swappable Embedding Models and Vector Databases
- Industry BERT: Out-of-the-box, industry-finetuned open source Sentence Transformers.
- Wide Model Support: Custom-trained HuggingFace and Sentence Transformers embedding models, as well as leading commercial models.
- Mix-and-match among multiple options to find the right solution for any particular application.
- Out-of-the-box support for 7 vector databases: Milvus, Postgres (pgvector), Redis, FAISS, Qdrant, Pinecone, and MongoDB Atlas.
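For example, installing an embedding model on an existing library and running a semantic query (the model and database names are illustrative picks from the supported options):

```python
from llmware.library import Library
from llmware.retrieval import Query

lib = Library().load_library("my_library")

# build vector embeddings for every block in the library
lib.install_new_embedding(embedding_model_name="industry-bert-contracts",
                          vector_db="milvus")

# semantic (vector) retrieval against the new embeddings
results = Query(lib).semantic_query("limitation of liability", result_count=10)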
Parsing and Text Chunking: Scalable Ingestion
- Integrated high-speed parsers for PDF, PowerPoint, Word, Excel, HTML, text, WAV, and AWS Transcribe transcripts.
- Text-chunking tools to separate information and associated metadata into a consistent block format.
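Parsing and chunking happen automatically on ingestion; a sketch of the flow (the folder path is illustrative, and the library card's exact fields are worth verifying against your installed version):

```python
from llmware.library import Library

# point add_files at a folder of mixed document types - the integrated
# parsers text-chunk everything into the library's consistent block format
lib = Library().create_new_library("parsing_demo")
lib.add_files(input_folder_path="/path/to/mixed_documents")

# the library card tracks document and block counts after ingestion
card = lib.get_library_card()
print(card["documents"], "documents parsed into", card["blocks"], "blocks")
```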
📚 Explore additional llmware capabilities and 🎬 Check out these videos on how to quickly get started with RAG:
- Use small LLMs for RAG for Contract Analysis (feat. LLMWare)
- RAG using CPU-based (No-GPU required) Hugging Face Models with LLMWare on your laptop
🌱 Getting Started
1. Install llmware:
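A minimal sketch, assuming the package is published on PyPI as llmware (either form works):

```bash
pip3 install llmware
# or
python3 -m pip install llmware
```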
See Working with llmware for other options to get up and running.
2. MongoDB and Milvus
MongoDB and Milvus are optional and used to provide production-grade database and vector embedding capabilities. The fastest way to get started is to use the provided Docker Compose file (note: requires Docker Compose / Docker Desktop to be installed), which takes care of running them both:
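For example, assuming the compose file is published at the root of the llmware repo (the URL is illustrative – check the repo for the current location):

```bash
curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose.yaml
```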
and then run the containers:
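With Docker Compose v2 (use docker-compose up -d for the standalone v1 binary):

```bash
docker compose up -d
```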
Not ready to install MongoDB or Milvus? Check out what you can do without them in our examples section.
See Running MongoDB and Milvus for other options to get up and running with these optional dependencies.
3. 🔥 Start coding - Quick Start for RAG 🔥
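A compact end-to-end sketch of the pattern (the model name, paths, and question are illustrative; the calls follow llmware's documented Library/Query/Prompt usage):

```python
from llmware.library import Library
from llmware.prompts import Prompt
from llmware.retrieval import Query

# 1. create a knowledge base from a folder of documents
lib = Library().create_new_library("rag_quickstart")
lib.add_files(input_folder_path="/path/to/documents")

# 2. retrieve the passages most relevant to the question
question = "What are the payment terms?"
results = Query(lib).text_query(question, result_count=3)
context = "\n".join(r["text"] for r in results)

# 3. load a small open source model and answer using the retrieved context
prompter = Prompt().load_model("llmware/bling-1b-0.1")
response = prompter.prompt_main(question, context=context)
print(response["llm_response"])
```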
📚 See 50+ llmware examples for more RAG examples and other code samples and ideas.
4. Accessing LLMs and setting up API keys & secrets
To use llmware, you do not need to use any proprietary LLM – we encourage you to experiment with BLING, DRAGON, Industry-BERT, the GGUF examples, along with bringing in your favorite models from HuggingFace and Sentence Transformers.
If you would like to use a proprietary model, you will need to provide your own API keys. API keys and secrets for models, AWS, and Pinecone can be set up as environment variables or passed directly to method calls.
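A sketch of both options (the environment variable name shown is illustrative – use the variable your provider integration expects):

```python
import os
from llmware.prompts import Prompt

# option 1: set the key in the environment before loading the model
os.environ["USER_MANAGED_OPENAI_API_KEY"] = "<your-api-key>"  # variable name is illustrative

# option 2: pass the key directly to the method call
prompter = Prompt().load_model("gpt-4", api_key="<your-api-key>")
```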
🔹 Alternate options for running MongoDB and Milvus
There are several options for getting MongoDB running:
🐳 A. Run mongo container with docker
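A typical invocation, assuming the official mongo image and a named volume for persistence (the container and volume names are illustrative):

```bash
docker run -d --name mongodb -p 27017:27017 -v mongodb-volume:/data/db mongo
```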
🐳 B. Run container with docker compose
Create a docker-compose.yaml file with the content:
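A minimal compose file for MongoDB alone (the version, port, and volume name are illustrative):

```yaml
version: "3.5"

services:
  mongodb:
    container_name: mongodb
    image: mongo
    ports:
      - "27017:27017"
    volumes:
      - mongodb-volume:/data/db

volumes:
  mongodb-volume:
```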
and then run:
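From the directory containing the file:

```bash
docker compose up -d
```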
📖 C. Install MongoDB natively
See the Official MongoDB Installation Guide
🔗 D. Connect to an existing MongoDB deployment
You can connect to an existing MongoDB deployment by setting the connection string in the environment variable COLLECTION_DB_URI. See the example script, Using Mongo Atlas, for detailed information on how to use MongoDB Atlas as the NoSQL and/or vector database for llmware.
Additional information on finding and formatting connection strings can be found in the MongoDB Connection Strings Documentation.
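For example, for a hosted Atlas cluster (the connection string is illustrative):

```bash
export COLLECTION_DB_URI="mongodb+srv://<user>:<password>@<cluster>.mongodb.net"
```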
✍️ Working with the llmware GitHub repository
The llmware repo can be pulled locally to get access to all the examples, or to work directly with the latest version of the llmware code.
Pull the repo locally
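For example, over HTTPS:

```bash
git clone https://github.com/llmware-ai/llmware.git
```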
or download/extract a zip of the llmware repository
Run llmware natively
Update the local copy of the repository:
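From inside your local checkout:

```bash
git pull
```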
Download the shared llmware native libraries and dependencies by running the load_native_libraries.sh script. This pulls the right wheel for your platform and extracts the llmware native libraries and dependencies into the proper place in the local repository.
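Assuming the script sits under scripts/dev/ in the repo (adjust the path if your checkout places it elsewhere):

```bash
sh scripts/dev/load_native_libraries.sh
```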
At the top level of the llmware repository run the following command:
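This is presumably a local pip install of the package from source, e.g.:

```bash
pip install .
```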