SLM Training Data Generator

JSONL · LoRA · GGUF · Ollama

/ DATA-CENTRIC AI

Build the training set
before you train the model.

A purpose-built generator for the JSONL conversation pairs used to LoRA-fine-tune a small language model, which is then converted to GGUF and served through Ollama.

01

How to train a Small Language Model for Ollama

STEP 01

Collect conversation pairs

Each training example is one JSON object on its own line; together the lines form a JSONL file. The schema follows the OpenAI / Hugging Face chat format: a messages array of turns, each with a role and content.

// One example per line
{"messages":[{"role":"user","content":"What is LoRA?"},{"role":"assistant","content":"Low-Rank Adaptation..."}]}

Aim for 100–1000+ high-quality pairs. Diversity matters more than volume.
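
A quick structural check catches most formatting mistakes before training. The sketch below is one possible validator, not part of this tool: the function name validate_jsonl and the allowed-role set are assumptions made for illustration.

```python
import json

# Roles assumed for this sketch; adjust if your chat template uses others.
ALLOWED_ROLES = {"system", "user", "assistant"}

def validate_jsonl(text):
    """Return a list of (line_number, problem) tuples; an empty list means no problems found."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"not valid JSON: {exc.msg}"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((number, "missing or empty 'messages' array"))
            continue
        for turn in messages:
            if not isinstance(turn, dict):
                problems.append((number, "turn is not an object"))
                break
            role = turn.get("role")
            content = turn.get("content")
            if not isinstance(role, str) or role not in ALLOWED_ROLES:
                problems.append((number, f"unexpected role: {role!r}"))
                break
            if not isinstance(content, str) or not content.strip():
                problems.append((number, "empty or non-string content"))
                break
    return problems

good_line = json.dumps({"messages": [
    {"role": "user", "content": "What is LoRA?"},
    {"role": "assistant", "content": "Low-Rank Adaptation..."},
]})
bad_role_line = '{"messages": [{"role": "bot", "content": "hi"}]}'
truncated_line = '{"messages": ['

for problem in validate_jsonl("\n".join([good_line, bad_role_line, truncated_line])):
    print(problem)
```

Only the second and third sample lines are reported; the first passes.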

STEP 02

LoRA fine-tune in PyTorch

LoRA (Low-Rank Adaptation) freezes the base weights and trains small adapter matrices. You can fine-tune a 1–3B parameter model on a single consumer GPU.

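# Using PEFT + Transformers
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model; its weights stay frozen
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B")
# Rank-8 adapters on the attention query and value projections
config = LoraConfig(r=8, lora_alpha=16,
  target_modules=["q_proj","v_proj"],
  task_type="CAUSAL_LM")
model = get_peft_model(model, config)
model.print_trainable_parameters()  # confirm only the adapters train
# ... train on your .jsonl ...
model.save_pretrained("./adapter")  # saves the adapter weights only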
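
To see why the adapters are small: for one weight matrix of shape d_out x d_in, a rank-r LoRA adapter trains two matrices (d_out x r and r x d_in), so r * (d_in + d_out) parameters in total. A rough sketch of that arithmetic follows; the 2048 x 2048 layer size is a hypothetical value chosen for illustration, not taken from any particular model.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters in one rank-r adapter: B is d_out x r, A is r x d_in."""
    return r * (d_in + d_out)

def full_param_count(d_in, d_out):
    """Parameters in the frozen weight matrix the adapter sits beside."""
    return d_in * d_out

# Hypothetical square projection layer; real layer sizes depend on the model.
d_in = d_out = 2048
r, lora_alpha = 8, 16

adapter = lora_param_count(d_in, d_out, r)
frozen = full_param_count(d_in, d_out)
scaling = lora_alpha / r  # the low-rank update is scaled by lora_alpha / r

print(f"{adapter:,} adapter vs {frozen:,} frozen parameters ({adapter / frozen:.2%})")
print(f"update scaling factor: {scaling}")
```

Under these assumed sizes the adapter is well under 1% of the layer it adapts, which is what makes single-GPU fine-tuning practical.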

STEP 03

Merge & convert to GGUF

GGUF (GPT-Generated Unified Format) is llama.cpp's single-file binary model format: mmap-friendly, with built-in support for quantized weights. Merge the LoRA adapter into the base model, convert to GGUF, then quantize.

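# Merge adapter into base
model = model.merge_and_unload()
model.save_pretrained("./merged")
# The converter also expects the tokenizer files in ./merged
# (e.g. save the base model's tokenizer there with save_pretrained)

# Convert with llama.cpp (the converter writes f32/f16/bf16/q8_0, not K-quants)
$ python convert_hf_to_gguf.py ./merged \
    --outfile model-f16.gguf --outtype f16

# Quantize to 4-bit with llama.cpp's llama-quantize
$ llama-quantize model-f16.gguf model.gguf Q4_K_M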

q4_k_m ≈ 4-bit, balanced size/quality. Try q8_0 for higher fidelity.
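
For a sense of the size trade-off, file size is roughly parameter count times bits per weight, divided by 8. The sketch below is a back-of-the-envelope estimate only: the 1.5-billion parameter count is hypothetical, the q4_k_m figure is an approximate average, and real files also carry metadata and keep some tensors at higher precision.

```python
# Approximate bits stored per weight; per-block scales add overhead beyond the nominal width.
BITS_PER_WEIGHT = {
    "f16": 16.0,
    "q8_0": 8.5,     # 32 int8 weights plus one fp16 scale per block
    "q4_k_m": 4.85,  # assumed approximate average; varies by model
}

def estimate_gguf_bytes(n_params, outtype):
    """Rough weight-only size estimate in bytes."""
    return n_params * BITS_PER_WEIGHT[outtype] / 8

n_params = 1_500_000_000  # hypothetical parameter count
for outtype in BITS_PER_WEIGHT:
    size_gb = estimate_gguf_bytes(n_params, outtype) / 1e9
    print(f"{outtype:>7}: ~{size_gb:.2f} GB")
```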

STEP 04

Serve with Ollama Modelfile

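A Modelfile tells Ollama how to load your GGUF: base file, chat template, system prompt, sampling parameters. The template must match the chat format the model was fine-tuned on; the <|system|>, <|user|>, and <|assistant|> tags here are generic placeholders, and Qwen2.5, for example, uses <|im_start|> and <|im_end|> markers instead.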

# Modelfile
FROM ./model.gguf
TEMPLATE """{{ if .System }}<|system|>{{ .System }}
{{ end }}<|user|>{{ .Prompt }}
<|assistant|>"""
SYSTEM "You are a helpful assistant."
PARAMETER temperature 0.7
PARAMETER stop "<|user|>"

# Then:
$ ollama create my-slm -f Modelfile
$ ollama run my-slm
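
Once the model is created, it can also be queried programmatically through Ollama's local HTTP API. A minimal sketch, assuming Ollama is running on its default port 11434 and the model is named my-slm as above; the helper names build_chat_request and ask are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local address

def build_chat_request(model, prompt, url=OLLAMA_URL):
    """Build a non-streaming chat request carrying a single user turn."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON response instead of a stream of chunks
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(model, prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(model, prompt)) as response:
        body = json.load(response)
    return body["message"]["content"]

# Usage (requires a running Ollama server with the model created):
# print(ask("my-slm", "What is LoRA?"))
```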

The bottleneck is your dataset. A model trained on 500 well-crafted, diverse, on-topic pairs can outperform one trained on 50,000 noisy scraped ones. That's why this tool exists.

02

Workspace

Total entries: 0
Avg user tokens: 0
Avg asst. tokens: 0
Storage size: 0 B

Optional system prompt

If set, a system-role message is prepended to this entry only.
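
As an illustration of that behaviour, the sketch below builds one JSONL line with an optional system turn placed first. It is a hypothetical helper (build_entry is not this tool's actual code), written under the assumption that the entry follows the messages schema from Step 01.

```python
import json

def build_entry(user, assistant, system=None):
    """Return one JSONL line; a system turn is prepended only when a prompt is given."""
    messages = []
    if system and system.strip():
        messages.append({"role": "system", "content": system.strip()})
    messages.append({"role": "user", "content": user})
    messages.append({"role": "assistant", "content": assistant})
    # json.dumps escapes embedded newlines, so the result is always a single line
    return json.dumps({"messages": messages}, ensure_ascii=False)

print(build_entry("What is LoRA?", "Low-Rank Adaptation..."))
print(build_entry("What is LoRA?", "Low-Rank Adaptation...",
                  system="You are a helpful assistant."))
```

Entries without a system prompt keep the two-turn shape, so the two kinds can be mixed freely in one file.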