Ludwig


LLMs Framework

03 May, 2024

Python

What is Ludwig?

Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks.

Ludwig Features

🛠 Build custom models with ease: a declarative YAML configuration file is all you need to train a state-of-the-art LLM on your data. Support for multi-task and multi-modality learning. Comprehensive config validation detects invalid parameter combinations and prevents runtime failures.
⚡ Optimized for scale and efficiency: automatic batch size selection, distributed training (DDP, DeepSpeed), parameter-efficient fine-tuning (PEFT), 4-bit quantization (QLoRA), paged and 8-bit optimizers, and larger-than-memory datasets.
📐 Expert-level control: retain full control of your models down to the activation functions. Support for hyperparameter optimization, explainability, and rich metric visualizations.
🧱 Modular and extensible: experiment with different model architectures, tasks, features, and modalities with just a few parameter changes in the config. Think building blocks for deep learning.
🚢 Engineered for production: prebuilt Docker containers, native support for running with Ray on Kubernetes, export models to Torchscript and Triton, upload to HuggingFace with one command.

Ludwig is hosted by the Linux Foundation AI & Data.

💾 Installation

Install from PyPI. Be aware that Ludwig requires Python 3.8+.

pip install ludwig

Or install with all optional dependencies:

pip install ludwig[full]

Please see contributing for more detailed installation instructions.

🚂 Getting Started

Want to take a quick peek at some of the Ludwig 0.8 features? Check out this Colab Notebook 🚀

Looking to fine-tune Llama-2 or Mistral? Check out these notebooks:

Fine-Tune Llama-2-7b: Open In Colab
Fine-Tune Llama-2-13b: Open In Colab
Fine-Tune Mistral-7b: Open In Colab

For a full tutorial, check out the official getting started guide, or take a look at end-to-end Examples.

Large Language Model Fine-Tuning


Let’s fine-tune a pretrained Llama-2-7b large language model to follow instructions like a chatbot (“instruction tuning”).

Prerequisites

HuggingFace API Token
Access approval to Llama2-7b-hf
GPU with at least 12 GiB of VRAM (in our tests, we used an Nvidia T4)

Running

We’ll use the Stanford Alpaca dataset, which will be formatted as a table-like file that looks like this:

instruction	input	output
Give three tips for staying healthy.		1. Eat a balanced diet and make sure to include…
Arrange the items given below in the order to …	cake, me, eating	I eating cake.
Write an introductory paragraph about a famous…	Michelle Obama	Michelle Obama is an inspirational woman who r…
…	…	…

Create a YAML config file named model.yaml with the following:

model_type: llm
base_model: meta-llama/Llama-2-7b-hf

quantization:
  bits: 4

adapter:
  type: lora

prompt:
  template: |
    Below is an instruction that describes a task, paired with an input that may provide further context.
    Write a response that appropriately completes the request.

    ### Instruction:
    {instruction}

    ### Input:
    {input}

    ### Response:

input_features:
  - name: prompt
    type: text

output_features:
  - name: output
    type: text

trainer:
  type: finetune
  learning_rate: 0.0001
  batch_size: 1
  gradient_accumulation_steps: 16
  epochs: 3
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.01

preprocessing:
  sample_ratio: 0.1

backend:
  type: local
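To see what the model receives as input, here is a minimal Python sketch of how a prompt template like the one above could be filled from a single dataset row. It is illustrative only: Ludwig performs this substitution internally and its exact whitespace handling may differ. `PROMPT_TEMPLATE` and `render_prompt` are hypothetical names, not part of Ludwig's API.

```python
# Illustrative sketch only: Ludwig fills the template internally, and its
# exact behavior (for example whitespace handling) may differ from this.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that may provide further context.\n"
    "Write a response that appropriately completes the request.\n"
    "\n"
    "### Instruction:\n"
    "{instruction}\n"
    "\n"
    "### Input:\n"
    "{input}\n"
    "\n"
    "### Response:\n"
)


def render_prompt(row: dict) -> str:
    """Fill each {placeholder} with the dataset column of the same name."""
    return PROMPT_TEMPLATE.format(
        instruction=row["instruction"],
        input=row["input"],
    )


# First row of the sample table; its "input" column is empty.
row = {"instruction": "Give three tips for staying healthy.", "input": ""}
prompt = render_prompt(row)
print(prompt)
```

The rendered text ends right after `### Response:`, so the model's task is to continue from there with the content of the `output` column.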

And now let’s train the model:

export HUGGING_FACE_HUB_TOKEN="<api_token>"

ludwig train --config model.yaml --dataset "ludwig://alpaca"
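Two of the trainer settings in the config are easier to reason about with a little arithmetic. The sketch below computes the effective batch size implied by `batch_size` and `gradient_accumulation_steps`, and shows a common textbook form of linear warmup followed by cosine decay. The schedule shape, the `lr_at` helper, and the `total_steps` value are assumptions for illustration; Ludwig's own scheduler may differ in its details.

```python
import math

# Values copied from the trainer section of the config above.
BASE_LR = 0.0001
BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 16
WARMUP_FRACTION = 0.01

# Gradients from 16 batches of size 1 are accumulated before each optimizer
# step, so every weight update is based on 16 examples.
effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS


def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup, then cosine decay toward zero (assumed schedule shape)."""
    warmup_steps = max(1, round(total_steps * WARMUP_FRACTION))
    if step < warmup_steps:
        # Ramp up linearly until the base learning rate is reached.
        return BASE_LR * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical number of optimizer steps
print("effective batch size:", effective_batch_size)
for step in (0, 9, 10, 505, 999):
    print(step, f"{lr_at(step, total_steps):.2e}")
```

A small `batch_size` keeps memory usage low on a single GPU, while accumulation recovers the smoother updates of a larger batch.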

Supervised ML

Let’s build a neural network that predicts whether a given movie critic’s review on Rotten Tomatoes was positive or negative.

Our dataset will be a CSV file that looks like this:

movie_title	content_rating	genres	runtime	top_critic	review_content	recommended
Deliver Us from Evil	R	Action & Adventure, Horror	117.0	TRUE	Director Scott Derrickson and his co-writer, Paul Harris Boardman, deliver a routine procedural with unremarkable frights.	0
Barbara	PG-13	Art House & International, Drama	105.0	FALSE	Somehow, in this stirring narrative, Barbara manages to keep hold of her principles, and her humanity and courage, and battles to save a dissident teenage girl whose life the Communists are trying to destroy.	1
Horrible Bosses	R	Comedy	98.0	FALSE	These bosses cannot justify either murder or lasting comic memories, fatally compromising a farce that could have been great but ends up merely mediocre.	0
…	…	…	…	…	…	…

Download a sample of the dataset from here.

wget https://ludwig.ai/latest/data/rotten_tomatoes.csv

Next create a YAML config file named model.yaml with the following:

input_features:
  - name: genres
    type: set
    preprocessing:
      tokenizer: comma
  - name: content_rating
    type: category
  - name: top_critic
    type: binary
  - name: runtime
    type: number
  - name: review_content
    type: text
    encoder:
      type: embed

output_features:
  - name: recommended
    type: binary

That’s it! Now let’s train the model:

ludwig train --config model.yaml --dataset rotten_tomatoes.csv
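The `genres` feature in the config is declared as a `set` with a `comma` tokenizer. As a rough illustration of what that means for a cell such as `Action & Adventure, Horror`, here is a minimal Python sketch. `comma_tokenize` is a hypothetical helper, and Ludwig's real tokenizer may treat whitespace or casing differently.

```python
def comma_tokenize(value: str) -> set:
    """Split a comma-separated cell into a set of stripped, non-empty tokens."""
    return {token.strip() for token in value.split(",") if token.strip()}


# Genre cells taken from the sample rows shown earlier.
for cell in ("Action & Adventure, Horror", "Comedy"):
    print(cell, "->", sorted(comma_tokenize(cell)))
```

Treating the column as a set means the order of genres within a cell carries no meaning; only membership does.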

Happy modeling

Try applying Ludwig to your data. Reach out if you have any questions.

❓ Why you should use Ludwig

Minimal machine learning boilerplate

Ludwig takes care of the engineering complexity of machine learning out of the box, enabling research scientists to focus on building models at the highest level of abstraction. Data preprocessing, hyperparameter optimization, device management, and distributed training for torch.nn.Module models come completely free.

Easily build your benchmarks

Creating a state-of-the-art baseline and comparing it with a new model is a simple config change.

Easily apply new architectures to multiple problems and datasets

Apply new models across the extensive set of tasks and datasets that Ludwig supports. Ludwig includes a full benchmarking toolkit accessible to any user, for running experiments with multiple models across multiple datasets with just a simple configuration.

Highly configurable data preprocessing, modeling, and metrics

Any and all aspects of the model architecture, training loop, hyperparameter search, and backend infrastructure can be modified as additional fields in the declarative configuration to customize the pipeline to meet your requirements. For details on what can be configured, check out the Ludwig Configuration docs.

Multi-modal, multi-task learning out-of-the-box

Mix and match tabular data, text, images, and even audio into complex model configurations without writing code.

Rich model exporting and tracking

Automatically track all trials and metrics with tools like Tensorboard, Comet ML, Weights & Biases, MLFlow, and Aim Stack.

Automatically scale training to multi-GPU, multi-node clusters

Go from training on your local machine to the cloud without code changes.

Low-code interface for state-of-the-art models, including pre-trained Huggingface Transformers

Ludwig also natively integrates with pre-trained models, such as the ones available in Huggingface Transformers. Users can choose from a vast collection of state-of-the-art pre-trained PyTorch models to use without needing to write any code at all. For example, training a BERT-based sentiment analysis model with Ludwig is as simple as:

  ludwig train --dataset sst5 --config_str "{input_features: [{name: sentence, type: text, encoder: bert}], output_features: [{name: label, type: category}]}"
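Hand-writing an inline config string gets error-prone as it grows. One possible approach, sketched below, is to build the same config as a Python dict and serialize it. This assumes `--config_str` accepts a JSON-serialized string as well as YAML, which is worth verifying against your installed Ludwig version. The snippet only prints the command; it does not run it.

```python
import json
import shlex

# Same configuration as the inline string above, expressed as a Python dict.
config = {
    "input_features": [{"name": "sentence", "type": "text", "encoder": "bert"}],
    "output_features": [{"name": "label", "type": "category"}],
}

# Assumption: the CLI parses --config_str as YAML, and this JSON text is
# also valid YAML flow syntax.
config_str = json.dumps(config)

command = ["ludwig", "train", "--dataset", "sst5", "--config_str", config_str]

# shlex.join applies shell quoting so the command can be pasted into a terminal.
print(shlex.join(command))
```

Keeping the config as a dict also makes it easy to vary a single field, such as the encoder, when comparing models.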

Low-code interface for AutoML

Ludwig AutoML

allows users to obtain trained models by providing just a dataset, the

target column, and a time budget.

  import ludwig.automl

  # my_dataset_df and target_column_name are placeholders for your own data.
  auto_train_results = ludwig.automl.auto_train(dataset=my_dataset_df, target=target_column_name, time_limit_s=7200)

Easy productionisation

Ludwig makes it easy to serve deep learning models, including on GPUs.

Launch a REST API for your trained Ludwig model.

  ludwig serve --model_path=/path/to/model

Ludwig supports exporting models to efficient Torchscript bundles.

  ludwig export_torchscript --model_path=/path/to/model

📚 Tutorials

🔬 Example Use Cases

💡 More Information

Read our publications on Ludwig, declarative ML, and Ludwig’s SoTA benchmarks.

Learn more about how Ludwig works, how to get started, and work through more examples.

If you are interested in contributing, have questions, comments, or thoughts to share, or if you just want to be in the know, please consider joining the Ludwig Slack and follow us on Twitter!
