03 May, 2024

What is llmware?

llmware is a unified framework for developing LLM-based application patterns, including Retrieval Augmented Generation (RAG). It provides an integrated set of tools that anyone, from a beginner to the most sophisticated AI developer, can use to rapidly build industrial-grade, knowledge-based enterprise LLM applications. Its specific focus is on making it easy to integrate small, specialized open source models and to connect enterprise knowledge safely and securely to LLMs in a private cloud.

llmware Key Features

llmware is an integrated framework composed of four major components:

Retrieval: Assemble and Query Knowledge Base

  • High-performance document parsers to rapidly parse, text-chunk, and ingest common document types.

  • Comprehensive, intuitive querying methods: semantic, text, and hybrid retrieval with integrated metadata.

  • Ranking and filtering strategies to enable semantic search and rapid retrieval of information.

  • Web scrapers, Wikipedia integration, and Yahoo Finance API integration.

Prompt: Simple, Unified Abstraction across 50+ Models

  • Connect Models: Simple high-level interface with support for 50+ models out of the box.

  • Prompts with Sources: Powerful abstraction to easily package a wide range of materials into prompts.

  • Post Processing: tools for evidence verification, classification of a response, and fact-checking.

  • Human in the Loop: Ability to enable user ratings, feedback, and corrections of AI responses.

  • Auditability: A flexible state mechanism to analyze and audit the LLM prompt lifecycle.

Vector Embeddings: swappable embedding models and vector databases

  • Industry-BERT: out-of-the-box, industry fine-tuned open source Sentence Transformers.

  • Wide Model Support: Custom-trained HuggingFace and Sentence Transformers embedding models, plus leading commercial models.

  • Mix-and-match among multiple options to find the right solution for any particular application.

  • Out-of-the-box support for 7 vector databases: Milvus, Postgres (PG Vector), Redis, FAISS, Qdrant, Pinecone, and Mongo Atlas.

Parsing and Text Chunking: Scalable Ingestion

  • Integrated High-Speed Parsers for: PDF, PowerPoint, Word, Excel, HTML, Text, WAV, and AWS Transcribe transcripts.

  • Text-chunking tools to separate information and associated metadata into a consistent block format.
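
To make the idea of a "consistent block format" concrete, here is a minimal sketch in plain Python. It is not llmware's parser or its actual block schema: the function name `chunk_text` and the keys `block_id`, `source`, and `text` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterator


def chunk_text(text: str, source: str, max_chars: int = 400) -> Iterator[Dict]:
    """Yield blocks in one fixed format: the chunk text plus metadata about its origin."""
    current, block_id = [], 0
    for word in text.split():
        # Start a new block when adding this word would exceed the size limit.
        # (A single word longer than max_chars still becomes its own block.)
        if current and len(" ".join(current)) + 1 + len(word) > max_chars:
            yield {"block_id": block_id, "source": source, "text": " ".join(current)}
            current, block_id = [], block_id + 1
        current.append(word)
    if current:
        yield {"block_id": block_id, "source": source, "text": " ".join(current)}


sample = "The quick brown fox jumps over the lazy dog. " * 20
blocks = list(chunk_text(sample, source="example.txt", max_chars=100))
print(len(blocks), "blocks; first block:", blocks[0])
```

Because every block carries the same keys, downstream steps such as querying or embedding can treat text from any document type uniformly.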

📚 Explore additional llmware capabilities, and 🎬 check out the videos on how to quickly get started with RAG.

🌱 Getting Started

1. Install llmware:

pip install llmware

or

python3 -m pip install llmware

See Working with llmware for other options to get up and running.
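
After installing, a quick way to confirm that the package is visible to your Python interpreter is to query its installed version. The sketch below uses only the standard library; the helper name `installed_version` is ours, not part of llmware.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("llmware")
print("llmware version:", version if version else "not installed")
```

If this reports "not installed", check that `pip` and `python3` point at the same environment.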

2. MongoDB and Milvus

MongoDB and Milvus are optional and provide production-grade database and vector embedding capabilities. The fastest way to get started is to use the provided Docker Compose file, which takes care of running them both (note: this requires Docker Compose / Docker Desktop to be installed):

curl -o docker-compose.yaml https://raw.githubusercontent.com/llmware-ai/llmware/main/docker-compose.yaml

and then run the containers:

docker compose up -d

Not ready to install MongoDB or Milvus? Check out what you can do without them in our examples section.

See Running MongoDB and Milvus for other options to get up and running with these optional dependencies.
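
Once the containers are up, you may want to check that both services are accepting connections before running any examples. The following is a rough sketch using only the standard library. The port numbers are assumptions: 27017 is the MongoDB port used elsewhere in this document, and 19530 is assumed to be the Milvus port, so adjust both to match your compose file.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed ports: 27017 for MongoDB, 19530 for Milvus.
services = {"MongoDB": 27017, "Milvus": 19530}
for name, port in services.items():
    status = "reachable" if port_is_open("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {status}")
```

An open port only shows that something is listening; it does not prove the service has finished starting up.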

3. 🔥 Start coding - Quick Start for RAG 🔥

# This example illustrates a simple contract analysis
# using a small RAG-optimized LLM running locally

import os
import re
from llmware.prompts import Prompt, HumanInTheLoop
from llmware.setup import Setup
from llmware.configs import LLMWareConfig


def contract_analysis_on_laptop(model_name):

    # Load the llmware sample files
    print("\n > Loading the llmware sample files...")
    sample_files_path = Setup().load_sample_files()
    contracts_path = os.path.join(sample_files_path, "Agreements")

    # query list: topic key -> question to ask about that topic
    query_list = {"executive employment agreement": "What are the names of the two parties?",
                  "base salary": "What is the executive's base salary?",
                  "governing law": "What is the governing law?"}

    print(f"\n > Loading model {model_name}...")
    prompter = Prompt().load_model(model_name)

    for i, contract in enumerate(os.listdir(contracts_path)):

        # excluding Mac file artifact
        if contract != ".DS_Store":

            print("\nAnalyzing contract: ", str(i+1), contract)
            print("LLM Responses:")

            for key, value in query_list.items():

                # contract is parsed, text-chunked, and then filtered by topic key
                source = prompter.add_source_document(contracts_path, contract, query=key)

                # calling the LLM with 'source' information from the contract automatically packaged into the prompt
                responses = prompter.prompt_with_source(value, prompt_name="just_the_facts", temperature=0.3)

                for r, response in enumerate(responses):
                    print(key, ":", re.sub("[\n]", " ", response["llm_response"]).strip())

                # We're done with this contract, clear the source from the prompt
                prompter.clear_source_materials()

    # Save jsonl report to the /prompt_history folder
    print("\nPrompt state saved at: ", os.path.join(LLMWareConfig.get_prompt_path(), prompter.prompt_id))
    prompter.save_state()

    # Save csv report that includes the model, response, prompt, and evidence for human-in-the-loop review
    csv_output = HumanInTheLoop(prompter).export_current_interaction_to_csv()
    print("csv output saved at: ", csv_output)


if __name__ == "__main__":

    # use local cpu model - smallest, fastest (use larger BLING models for higher accuracy)
    model = "llmware/bling-1b-0.1"

    contract_analysis_on_laptop(model)

📚 See 50+ llmware examples for more RAG examples and other code samples and ideas.

4. Accessing LLMs and setting up API keys & secrets

To use llmware, you do not need to use any proprietary LLM. We encourage you to experiment with BLING, DRAGON, Industry-BERT, and the GGUF examples, along with bringing in your favorite models from HuggingFace and Sentence Transformers.

If you would like to use a proprietary model, you will need to provide your own API keys. API keys and secrets for models, AWS, and Pinecone can be set up as environment variables or passed directly to method calls.
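
As a rough sketch of the environment-variable approach, the helper below reads a key and fails with a clear message when it is missing, rather than passing an empty string on to a model call. The variable name `EXAMPLE_LLM_API_KEY` and the helper `require_env` are hypothetical; consult the llmware examples for the exact variable names each model provider expects.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


KEY_NAME = "EXAMPLE_LLM_API_KEY"  # hypothetical name, for illustration only

try:
    api_key = require_env(KEY_NAME)
    print("API key loaded from the environment")
except RuntimeError as err:
    print(err)
```

Keeping keys in the environment, rather than in source files, avoids committing secrets to version control.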

🔹 Alternate options for running MongoDB and Milvus

There are several options for getting MongoDB running:

🐳 A. Run mongo container with docker

docker run -d -p 27017:27017 -v mongodb-volume:/data/db --name=mongodb mongo:latest

🐳 B. Run container with docker compose

Create a docker-compose.yaml file with the content:

version: "3"

services:
  mongodb:
    container_name: mongodb
    image: 'mongo:latest'
    volumes:
      - mongodb-volume:/data/db
    ports:
      - '27017:27017'

volumes:
  mongodb-volume:
    driver: local

and then run:

docker compose up

📖 C. Install MongoDB natively

See the Official MongoDB Installation Guide

🔗 D. Connect to an existing MongoDB deployment

You can connect to an existing MongoDB deployment by setting the COLLECTION_DB_URI environment variable to its connection string. See the example script, Using Mongo Atlas, for detailed information on how to use Mongo Atlas as the NoSQL and/or Vector Database for llmware.

Additional information on finding and formatting connection strings can be found in the MongoDB Connection Strings Documentation.
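
The sketch below shows one way to build a standard `mongodb://` connection string and place it in COLLECTION_DB_URI from Python. The helper `mongo_uri` and the host and credential values are illustrative only. Setting the variable before importing llmware is a cautious assumption about when the library reads it, not documented behavior.

```python
import os
from urllib.parse import quote_plus


def mongo_uri(host: str, port: int = 27017, user: str = "", password: str = "") -> str:
    """Build a mongodb:// connection string, percent-encoding any credentials."""
    if user:
        credentials = f"{quote_plus(user)}:{quote_plus(password)}@"
    else:
        credentials = ""
    return f"mongodb://{credentials}{host}:{port}/"


# Assumption: set the variable before importing llmware so it is visible when the library reads it.
os.environ["COLLECTION_DB_URI"] = mongo_uri("localhost")
print(os.environ["COLLECTION_DB_URI"])  # mongodb://localhost:27017/
```

Percent-encoding matters when a username or password contains characters such as `@` or `/`, which would otherwise be misread as part of the URI structure.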

✍️ Working with the llmware GitHub repository

The llmware repo can be pulled locally to get access to all the examples, or to work directly with the latest version of the llmware code.

Pull the repo locally

git clone git@github.com:llmware-ai/llmware.git

or download/extract a zip of the llmware repository

Run llmware natively

Update the local copy of the repository:

git pull

Download the shared llmware native libraries and dependencies by running the load_native_libraries.sh script. This pulls the right wheel for your platform and extracts the llmware native libraries and dependencies into the proper place in the local repository.


At the top level of the llmware repository, run the following command:

pip install .