SLM Training Data Generator

/ DATA-CENTRIC AI

Build the training set
before you train the model.

A purpose-built generator for JSONL conversation pairs used to LoRA-fine-tune a small language model, convert to GGUF, and serve through Ollama.

How to train a Small Language Model for Ollama

STEP 01

Collect conversation pairs

Each training example is one JSON object on one line — a JSONL file. The schema follows the OpenAI / Hugging Face chat template: a messages array with role/content turns.

// One example per line (wrapped here for readability)
{"messages":[
  {"role":"user","content":"What is LoRA?"},
  {"role":"assistant","content":"Low-Rank Adaptation..."}
]}

Aim for 100–1000+ high-quality pairs. Diversity matters more than volume.
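Before training, it can help to check that every line really matches the schema above. The following is a minimal sketch using only the standard library; the function names, the allowed-role set, and the rule that the last turn should come from the assistant are illustrative assumptions, not part of this tool.

```python
import json

# Assumption: only these three roles are expected in the dataset.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_line(line: str) -> list[str]:
    """Return the problems found in one JSONL line (empty list means valid)."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    messages = obj.get("messages") if isinstance(obj, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' array"]
    problems = []
    for i, turn in enumerate(messages):
        if not isinstance(turn, dict):
            problems.append(f"turn {i}: not an object")
            continue
        if turn.get("role") not in ALLOWED_ROLES:
            problems.append(f"turn {i}: unexpected role {turn.get('role')!r}")
        content = turn.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"turn {i}: empty or non-string content")
    # Assumption: each example should end with the reply the model must learn.
    if isinstance(messages[-1], dict) and messages[-1].get("role") != "assistant":
        problems.append("last turn should come from the assistant")
    return problems


def validate_file(path: str) -> dict[int, list[str]]:
    """Map 1-based line numbers to their problems; blank lines are skipped."""
    report = {}
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if line.strip():
                problems = validate_line(line)
                if problems:
                    report[lineno] = problems
    return report
```

Calling validate_file("data/training.jsonl") returns an empty dict when every line passes.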

STEP 02

LoRA fine-tune in PyTorch

LoRA (Low-Rank Adaptation) freezes the base weights and trains small adapter matrices. You can fine-tune a 1–3B parameter model on a single consumer GPU.

# Using PEFT + Transformers
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B")
config = LoraConfig(r=8, lora_alpha=16,
  target_modules=["q_proj","v_proj"])
model = get_peft_model(model, config)
# ... train on your .jsonl ...
model.save_pretrained("./adapter")
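To see why the adapter is small, consider one frozen weight matrix of shape d_out x d_in. LoRA adds two matrices, A (r x d_in) and B (d_out x r), so it trains r * (d_in + d_out) parameters for that layer, and the update is scaled by lora_alpha / r. The sketch below works through that arithmetic; the 1024 x 1024 layer size is an illustrative assumption, not the shape of any particular model.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    # A is r x d_in and B is d_out x r; the frozen weight W is not counted.
    return r * (d_in + d_out)


def lora_scale(lora_alpha: int, r: int) -> float:
    """Factor applied to the low-rank update B @ A before it is added to W."""
    return lora_alpha / r


d = 1024  # hypothetical layer width, chosen only for the example
full = d * d
added = lora_params(d, d, r=8)
print(f"full layer: {full} params, LoRA adds: {added} ({added / full:.2%})")
print(f"scale with r=8, lora_alpha=16: {lora_scale(16, 8)}")
```

With r=8 the adapter for this hypothetical layer is 1/64 of the full matrix, which is why the optimizer state stays small enough for a consumer GPU.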

STEP 03

Merge & convert to GGUF

GGUF (commonly expanded as GPT-Generated Unified Format) is llama.cpp's single-file binary format: mmap-friendly, with optional quantization. Merge the LoRA adapter into the base model, then convert and quantize.

# Merge adapter into base
model = model.merge_and_unload()
model.save_pretrained("./merged")

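# Convert with llama.cpp (the converter writes f16; K-quants need llama-quantize)
$ python convert_hf_to_gguf.py ./merged \
    --outfile model-f16.gguf --outtype f16
$ llama-quantize model-f16.gguf model.gguf Q4_K_M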

q4_k_m ≈ 4-bit, balanced size/quality. Try q8_0 for higher fidelity.
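A rough way to compare quantization types is bits per weight times parameter count. The sketch below assumes a round 1.5 billion parameters and ignores metadata and any tensors kept at higher precision. The q8_0 figure follows from its block layout (32 int8 weights plus one 16-bit scale); the q4_k_m figure is an approximate assumption, since that type mixes several block formats.

```python
# Approximate bits per weight. Only f16 and q8_0 are exact here;
# the q4_k_m value is an assumed average, not a specification.
BITS_PER_WEIGHT = {
    "f16": 16.0,
    "q8_0": 8.5,    # (32 * 8 + 16) bits per block of 32 weights
    "q4_k_m": 4.85,  # assumption: varies by model and tensor mix
}


def estimate_gguf_bytes(n_params: float, outtype: str) -> float:
    """Estimate file size in bytes from parameter count alone."""
    return n_params * BITS_PER_WEIGHT[outtype] / 8


n_params = 1.5e9  # illustrative round number
for outtype in BITS_PER_WEIGHT:
    size_gb = estimate_gguf_bytes(n_params, outtype) / 1e9
    print(f"{outtype:>7}: about {size_gb:.2f} GB")
```

Actual files will differ somewhat, so treat the output as a planning estimate rather than a measurement.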

STEP 04

Serve with Ollama Modelfile

A Modelfile tells Ollama how to load your GGUF: base file, chat template, system prompt, sampling parameters.

# Modelfile (TEMPLATE must match the chat format used in training)
FROM ./model.gguf
TEMPLATE """{{ if .System }}<|system|>{{ .System }}
{{ end }}<|user|>{{ .Prompt }}
<|assistant|>"""
SYSTEM "You are a helpful assistant."
PARAMETER temperature 0.7
PARAMETER stop "<|user|>"

# Then:
$ ollama create my-slm -f Modelfile
$ ollama run my-slm
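To make the TEMPLATE above concrete, here is a small Python sketch that reproduces the string it would produce for a single turn. The function name render_prompt is hypothetical; the logic mirrors the template's if-block and literal newlines.

```python
def render_prompt(prompt: str, system: str = "") -> str:
    """Reproduce the text the Modelfile TEMPLATE above sends to the model."""
    parts = []
    if system:
        # Mirrors: {{ if .System }}<|system|>{{ .System }}\n{{ end }}
        parts.append(f"<|system|>{system}\n")
    # Mirrors: <|user|>{{ .Prompt }}\n<|assistant|>
    parts.append(f"<|user|>{prompt}\n<|assistant|>")
    return "".join(parts)


print(render_prompt("hello", system="You are a helpful assistant."))
```

The model continues the text after <|assistant|>, and the stop parameter ends generation if it starts to emit a new <|user|> turn.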

The bottleneck is your dataset. A model trained on 500 well-crafted, diverse, on-topic pairs will often outperform one trained on 50,000 noisy scraped ones. That's why this tool exists.

Workspace

Each entry takes a user message (required), an assistant message (required), and an optional system prompt. Each message shows its character count and an estimated token count. If a system prompt is set, a system role is prepended to that entry only.

The workspace tracks total entries, average user tokens, average assistant tokens, and storage size. Entries are listed in a table with #, User, Assistant, and Actions columns.
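The entry format and the workspace statistics can be sketched in a few lines of Python. This is not the tool's own code: the function names are hypothetical, and the four-characters-per-token estimate is an assumed heuristic that only approximates a real tokenizer.

```python
import json


def estimate_tokens(text: str) -> int:
    """Assumed heuristic: about 4 characters per token, rounded up."""
    return (len(text) + 3) // 4


def build_entry(user: str, assistant: str, system: str = "") -> dict:
    """Build one training example; the system turn is added only if set."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": user})
    messages.append({"role": "assistant", "content": assistant})
    return {"messages": messages}


def to_jsonl(entries: list[dict]) -> str:
    """Serialize entries as JSONL: one JSON object per line."""
    return "".join(json.dumps(e, ensure_ascii=False) + "\n" for e in entries)


def dataset_stats(entries: list[dict]) -> dict:
    """Compute the four figures shown in the workspace header."""
    def avg_tokens(role: str) -> float:
        counts = [
            estimate_tokens(m["content"])
            for e in entries
            for m in e["messages"]
            if m["role"] == role
        ]
        return sum(counts) / len(counts) if counts else 0.0

    return {
        "total_entries": len(entries),
        "avg_user_tokens": avg_tokens("user"),
        "avg_assistant_tokens": avg_tokens("assistant"),
        "storage_bytes": len(to_jsonl(entries).encode("utf-8")),
    }


entries = [build_entry("What is LoRA?", "Low-Rank Adaptation...")]
print(to_jsonl(entries), end="")
print(dataset_stats(entries))
```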

Deploy locally with Docker. Configure your training run below, then download a complete kit (Dockerfile, compose file, training script, GGUF converter, Modelfile, README) bundled with your current dataset. Unzip, docker compose up, and you'll have a fine-tuned GGUF model loaded into Ollama.

Model

Base model: smaller means faster training and less VRAM; a 0.5B model works on a 6 GB GPU.
Ollama model name
System prompt
Quantization

LoRA & Training

LoRA r
LoRA alpha
Epochs
Batch
Hardware target: CUDA requires the NVIDIA Container Toolkit on the host.

Bundle current dataset into the kit
Include Ollama service in docker-compose

Kit contents

slm-training-kit/
├── docker-compose.yml      orchestrate trainer + ollama services
├── Dockerfile              PyTorch + PEFT + llama.cpp toolchain
├── train.py                LoRA fine-tune on JSONL chat data
├── convert.sh              merge adapter → GGUF → quantize
├── run_all.sh              one-command pipeline
├── Modelfile               Ollama model definition
├── README.md               step-by-step instructions
├── .env.example            HF_TOKEN, config overrides
└── data/
    └── training.jsonl      your dataset
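After unzipping, a quick check that nothing from the tree above is missing can save a failed build. This is a minimal sketch; the function name and the idea of running it before docker compose are assumptions, and the file list is copied from the tree above.

```python
from pathlib import Path

# Paths relative to the kit root, taken from the listing above.
EXPECTED_FILES = [
    "docker-compose.yml",
    "Dockerfile",
    "train.py",
    "convert.sh",
    "run_all.sh",
    "Modelfile",
    "README.md",
    ".env.example",
    "data/training.jsonl",
]


def missing_files(kit_dir: str) -> list[str]:
    """Return the expected kit files that are absent under kit_dir."""
    root = Path(kit_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

For example, missing_files("slm-training-kit") returns an empty list when the kit is complete.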

Run it after unzipping

# 1. Build image and train (LoRA fine-tune on your JSONL)
$ docker compose run --rm trainer bash run_all.sh

# 2. Register the resulting GGUF with the ollama service
$ docker compose up -d ollama
$ docker compose exec ollama ollama create my-slm -f /models/Modelfile

# 3. Chat with your fine-tuned model
$ docker compose exec ollama ollama run my-slm

# Or call the HTTP API at http://localhost:11434
$ curl http://localhost:11434/api/generate -d '{"model":"my-slm","prompt":"hello"}'
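The same request can be made from Python with only the standard library. This sketch assumes the Ollama service is reachable on localhost:11434 and that the model was registered as my-slm; it sets "stream" to false so the reply arrives as a single JSON object with a "response" field. The function names are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Prepare a non-streaming POST to Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the service running, generate("my-slm", "hello") returns the model's reply as a string.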
