Build your first AI model — without the overwhelm.
You don't need a PhD, a supercomputer, or years of experience. With an open-source model like Mistral and a free tool called Ollama, you can have an AI running on your own computer today. This guide walks you through it, one calm step at a time.
First, what even is a model?
Before touching anything, let's clear up the words. Two minutes here saves hours later.
The "brain"
A large file of learned patterns. Mistral is one of these, trained on huge amounts of text. On its own it just sits there as data, waiting to be run.
The "engine"
Software that loads the model and lets you talk to it. Ollama is the friendliest one for beginners — download, click, done.
The "face"
How a person interacts — a terminal, a chat box, a web page. You add this last, once the brain and engine are working.
"Making a model" means three very different things.
Most beginners mix these up. Here's what's realistic to start with — and what to save for later.
Using a model as-is
Take Mistral exactly as it comes and put it to work. This is where you start, and it's genuinely powerful on its own. → the Build It tab.
Fine-tuning
Teaching an existing model your own style or data. Very doable as a second project using a free online GPU. → the Train It tab.
Training from scratch
Building a model from nothing. This costs millions and needs warehouses of hardware — not a beginner project, and that's fine.
When people say "we built an AI," they almost always mean A — wrapping a ready-made model in something useful. That's a brilliant first project. Do that first, then graduate to B when you're curious.
What you'll need on your machine.
Nothing exotic. Most laptops from the last few years can do this.
Windows 10 or 11
Windows 10 needs version 1903 or newer. Windows 11 is fine as-is. Mac and Linux work too.
8 GB minimum
8 GB runs Mistral 7B; 16 GB is comfier for doing other things at the same time.
~10 GB
The model file is about 4 GB; leave headroom. An SSD loads models noticeably faster than an old hard drive.
A graphics card (GPU) is optional. Without one, Mistral still runs on your CPU — just slower, generating a few words per second. That's completely fine for learning and testing. If responses feel sluggish, that's why, and it's not a mistake on your part.
Let's get Mistral running on your computer.
Follow these in order. Each step is small. Do them once yourself before you teach anyone else — it makes everything click.
Install Ollama
Go to ollama.com/download and grab the Windows installer. Run it like any normal program — double-click, click through, done. Tip: right-click and choose "Run as administrator" to avoid path issues.
Ollama is the "engine" from the Start Here tab. It runs quietly in the background and handles all the hard parts of loading a model for you.
Open your command line
Press the Windows key, type PowerShell, and hit Enter. A dark window opens. This is where you'll type a couple of commands — don't worry, it's just two of them. (If you installed as administrator, open a fresh PowerShell window now so it picks up the change.)
Download and run Mistral
Type this one line and press Enter. The first time, it downloads the model (about 4 GB, so give it a few minutes). After that it's instant.
PS> ollama run mistral pulling manifest... success — talk to your model below >>> Hello! Who are you?
When you see >>>, the model is alive and waiting. Type a question and press Enter. That's it — you're running your own AI.
Play, then take notes
Ask it things. Ask it to write, explain, brainstorm. Notice what it's good and bad at. Keep PowerShell open and jot down what surprises you — these notes become the heart of anything you teach or build next.
Type /bye and press Enter, or just close the window. To come back later, open PowerShell and run ollama run mistral again — no re-download needed.
Add a simple web face (optional)
Once the model works in the terminal, you can put a real chat box in front of it using a free Python tool like Gradio or Streamlit. A few lines of code turns your local model into a proper little app you can show people.
# install once PS> pip install gradio ollama # ~10 lines later, a web chat box opens in your browser PS> python app.py Running on http://127.0.0.1:7860
This is the natural next milestone after you're comfortable chatting in the terminal.
Will your computer handle this?
Short answer: almost certainly yes, to start. Here's the honest breakdown so you know where you stand and what (if anything) is worth upgrading later.
If your computer has 16 GB of RAM, you can run Mistral 7B and even do free fine-tuning in the cloud. You probably already have enough to begin. Don't buy anything until you've hit a real wall.
One idea makes sense of all the hardware talk: a model runs fast when it fits entirely in fast memory, and slow when it doesn't. A graphics card's memory (VRAM) is the fastest. Your system RAM is the fallback. When a model is too big for what's available, it "spills over" and slows down a lot — often 10 times slower. That's the whole game. Everything below is just detail on that one rule.
System RAM, tier by tier
This is your computer's main memory — the number most laptops advertise. Here's what each level lets you do.
8 GB — the bare entry
Runs a 7B model like Mistral, but it'll be tight and you won't want much else open. Fine for a first taste; you'll feel the squeeze quickly.
16 GB — the comfortable start
The realistic sweet spot for beginners. Runs 7–8B models smoothly and lets you keep a browser and notes open at the same time. If you're buying nothing, aim to at least have this.
32 GB — breathing room
Comfortably handles bigger models and heavier multitasking. A great target if you're buying a machine you want to grow into without overspending.
64 GB — serious headroom
For large models or running several things at once. More than a first project needs — don't pay for this unless you know you'll use it.
The graphics card (GPU)
Optional for running, the big speed-up for training. This is what turns "a few words per second" into "instant."
Runs on CPU
Totally fine for learning. Mistral still works, just slower — a few words per second. Most laptops are here, and that's okay to start.
The sweet spot
An NVIDIA card in this range (e.g. an RTX 4060-class) runs 7–8B models fast and is the most practical target if you choose to buy.
Room to grow
Runs larger models and makes a genuinely capable fine-tuning machine. More than you need on day one, but future-proof.
On Windows, an NVIDIA graphics card has the smoothest software support. On Mac, Apple Silicon (M-series) chips share memory between the system and graphics, so a 32–64 GB Mac can punch above its weight. Both are fully supported by Ollama — it's a preference, not a right-or-wrong.
You can do the entire learning journey — running Mistral and fine-tuning on Google Colab's free GPU — without buying anything. Start on what you own. Buy hardware only after you've confirmed you're hooked and hit a real limit. That's the order that saves money.
Teaching the model your data.
Once running Mistral feels easy, this is the exciting next step: shaping it toward a voice, a topic, or a task you care about.
First, the honest truth about words. "Training from scratch" — building a brain from nothing — is not what you'll do, and you shouldn't want to; it costs millions. What you'll actually do is fine-tuning: taking the smart model that already exists and nudging it with your own examples so it leans in a direction you choose. Think of it as coaching a talented employee, not raising a child from birth.
Instead of rewriting the whole 7-billion-parameter brain (which needs monstrous hardware), a technique called LoRA trains a tiny set of "adapter" layers on top — like sticky notes on a textbook rather than rewriting the book. This is what makes fine-tuning possible on free, everyday hardware.
The beginner's path, in five honest steps
Decide what you're teaching
Be specific and small. "Answer questions about our school's rules in a friendly tone" beats "be smarter." A narrow goal is far easier to reach and to test.
Build a small dataset
Fine-tuning learns from examples, usually pairs of "here's an input, here's the ideal response." Even 50–200 good examples can teach a style or a topic. You and your teammate can write these in a simple file. Quality beats quantity every time.
{"instruction": "When does the library close?", "output": "The library closes at 9pm on weekdays!"} {"instruction": "Can I bring food inside?", "output": "Drinks with lids are fine, but please no hot food."}
Borrow a free GPU
You don't need to buy hardware. Google Colab gives you a free graphics card in your browser. This is genuinely the part that makes the whole thing accessible to beginners — no purchase, no setup, just a web page.
Use a ready-made notebook
You don't write the training code from scratch. A free tool called Unsloth publishes beginner notebooks where you essentially drop in your dataset and click "Run All." It's built to be fast and to fit inside Colab's free tier. Search for "Unsloth Mistral Colab notebook" to find the current one.
Your first run might take 30–60 minutes and may hit a few errors — that's normal and part of learning. Change one thing at a time, re-run, repeat. This is where you'll learn the most.
Bring it home to Ollama
Here's the satisfying part: the notebook can export your fine-tuned model in a format (called GGUF) that Ollama understands. You copy it back to your computer, register it with Ollama, and now ollama run launches your custom model — the same simple workflow from the Build It tab, but it's yours.
Train in the cloud (free GPU) → export → run locally with the exact same commands you already learned. Nothing you learned in Build It goes to waste.
Before full fine-tuning, try a system prompt — a few sentences telling the model how to behave ("You are a friendly library assistant. Keep answers short."). It's free, instant, and often gets you 80% of the way. Reach for fine-tuning only when prompting isn't enough.
What could you actually make with this?
A model on its own is a curiosity. Pointed at a real problem, it becomes a project. Here are starter ideas, sorted by how approachable they are for a first team.
A themed chatbot
A chat box with a personality you design via system prompt — a study buddy, a recipe helper, a polite customer-service demo. The classic, satisfying first build.
A writing assistant
Paste in rough notes, get back a tidy email, summary, or social post. Great because you can judge quality instantly and it's genuinely useful day to day.
"Chat with your documents"
Feed in a PDF or notes and ask questions about them. This pattern is called RAG — the model reads your files before answering. Hugely practical for studying or a small business.
A language practice partner
A patient bot to practice a new language with — it corrects you gently and never gets tired. Add a system prompt that sets the language level.
An auto-sorter / tagger
Feed it messages or reviews and have it label them (happy/unhappy, topic, urgency). A gentle intro to using AI for data instead of chat.
A quiz generator
Give it a topic or your class notes; it writes practice questions and checks answers. A fine-tuning showcase if you train it on your own subject.
A text adventure game
The model narrates a story that responds to player choices. Playful, shareable, and a great way to learn how to keep "memory" of what happened.
A mini help desk
Fine-tune on a club, school, or small business's FAQs so it answers in the right voice. This is the natural payoff of the Train It tab.
Pick the smallest idea that you'd personally find useful or fun. Real motivation beats an impressive-sounding project you abandon. You can always grow it once the first version works.
The jargon, demystified.
Every scary word on this site, in one place, explained like you're a smart friend — not a computer science exam.
- LLM
- "Large Language Model." The kind of AI that reads and writes text. Mistral is one. The "large" refers to how much it learned, not its file size.
- Mistral
- A family of free, open-source LLMs made by a French company. "Open" means anyone can download and run it — that's why it's perfect for learning.
- Ollama
- The free app that downloads and runs models on your computer with one command. The "engine" that powers everything here.
- Parameters (e.g. "7B")
- The model's adjustable knobs, learned during training. "7B" = 7 billion of them. More usually means smarter but heavier to run.
- Prompt
- What you type to the model. A "system prompt" is a hidden instruction that sets its behavior before the conversation starts.
- Token
- A chunk of text the model reads and writes — roughly ¾ of a word. "Tokens per second" is how speed is measured.
- Fine-tuning
- Adjusting an existing model with your own examples so it leans toward your style or topic. Coaching, not rebuilding.
- LoRA / QLoRA
- A clever shortcut that fine-tunes a small "adapter" instead of the whole model — so it fits on free hardware. QLoRA is the memory-saving version.
- GPU
- A graphics card. Great at the math AI needs, so it makes models much faster. Optional for running, very helpful for training.
- RAM / VRAM
- Your computer's working memory (RAM) and your graphics card's memory (VRAM). Models need enough of it to fit while running.
- Quantization
- Shrinking a model by storing its numbers less precisely. Slightly less sharp, but far smaller and faster — how a 7B model fits on a laptop.
- RAG
- "Retrieval-Augmented Generation." The model looks things up in your documents before answering, so it can talk about your specific stuff.
- GGUF
- A model file format Ollama can run. When you fine-tune in the cloud, you export to GGUF to bring the result home.
- Google Colab
- A free website that gives you a borrowed GPU inside your browser. Where beginners do fine-tuning without buying hardware.
- Gradio / Streamlit
- Free Python tools that turn a few lines of code into a working web page with buttons and chat boxes. How you give your model a "face."
- Inference
- The fancy word for "the model actually running and producing an answer." Training is learning; inference is doing.
Common snags, and how to fix them.
Everyone hits these. They're not failures — they're rites of passage. Tap a question to open it.
"ollama is not recognized" in PowerShell
The model is painfully slow
The download keeps failing or stalling
"Out of memory" errors
Colab disconnected in the middle of training
My fine-tuned model didn't really change
How do my teammate and I work on this together?
Is any of this going to cost money?
The "we did it" checklist.
Keep this open and tick each box with your teammate.
- Both of us installed Ollama on our computers
- We opened PowerShell and ran ollama run mistral
- We saw the >>> prompt and chatted with the model
- We asked it 5+ different things and noted what it did well
- We picked one idea from the App Ideas tab to aim for
- (Bonus) We tried writing a system prompt to change its personality
- (Stretch) We read through the Train It tab together
You're closer than you think.
The hardest part of any project is the first command. Open PowerShell, type one line, and you've already started. Everything after that is just curiosity.
Get Ollama → start now