Little Language Machine

[Banner image]

Learning together how to build and fine-tune language models from scratch


Keywords: Generative AI, language models, self-supervised learning


Little Language Machine is a browser-based learning environment where learners can train and experiment with a small language model themselves. Instead of treating AI as a black box, the tool makes the inner workings of language models visible by showing how building them combines data, machine learning, and human design choices.

[Image: llm_graphic_en.png]

By using the tool to build their own language models (real transformer models like the ones behind ChatGPT, but much smaller), learners discover that a language model does not understand language. Instead, it generates text by predicting the most likely next character or word based on the preceding text. Learners can explore how training data, model size, and randomness affect the generated output, and observe the model improve step by step, from random character sequences to increasingly coherent text.
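The prediction idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: a character-level bigram model that counts which character tends to follow which, then samples the next character, with a `temperature` parameter controlling how much randomness is allowed.

```python
import random
from collections import defaultdict

def train_bigram(text):
    """Count how often each character follows another in the training text."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts

def sample_next(counts, prev, temperature=1.0):
    """Pick a likely next character; low temperature favors the most frequent one."""
    options = counts.get(prev)
    if not options:
        return random.choice(list(counts))  # unseen context: fall back to any key
    chars = list(options)
    weights = [options[c] ** (1.0 / temperature) for c in chars]
    return random.choices(chars, weights=weights)[0]

def generate(counts, start, length=40, temperature=1.0):
    """Generate text one character at a time from the last character so far."""
    out = start
    for _ in range(length):
        out += sample_next(counts, out[-1], temperature)
    return out

model = train_bigram("hello little language machine, hello model")
print(generate(model, "h", length=20))
```

Even this toy version shows the core behavior: the output mimics statistical patterns in the training text without any understanding, and raising the temperature makes the choices more random while lowering it makes them more predictable.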

Little Language Machine supports AI literacy by helping learners understand:

- The basic steps in the language model training workflow
- How data and data quality shape model behavior
- Why language models can produce incorrect or misleading output
- How AI systems reflect human choices made during their design and training
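The point about data shaping behavior can be made concrete with a small experiment. The sketch below (illustrative only, not the tool's code) computes the probability of each character that follows a given character: the same question asked of two different training texts yields different predictions.

```python
from collections import Counter

def next_char_distribution(text, prev):
    """Probability of each character that follows `prev` in the training text."""
    follows = Counter(b for a, b in zip(text, text[1:]) if a == prev)
    total = sum(follows.values())
    return {c: n / total for c, n in follows.items()} if total else {}

# Two tiny hypothetical "datasets" with different styles.
formal = "the model predicts the next character. the model learns."
casual = "lol the model is so cool lol so fun"

# The same character leads to different predictions depending on the data.
print(next_char_distribution(formal, "t"))
print(next_char_distribution(casual, "o"))
```

Because the model only reflects its data, gaps or biases in the training text show up directly in its predictions, which is one way to see where incorrect or misleading output comes from.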

The tool runs entirely in the browser, locally on the user’s device, and does not send data to external servers.

Key concepts: Language model, model size, training data, model training, context, hallucination and bias, probability and randomness, human design choices in AI systems


Open beta available since 2025; public release estimated for autumn 2026.