Imagine a camera that can identify every car on a highway, detect a tumour in an X-ray in milliseconds, or recognise a defective part on a factory line — all without a human looking at each frame. That is Computer Vision. And with Ultralytics YOLO11, you can build systems like this in a few lines of Python.
This guide takes you from zero — installing Python — all the way to training your own YOLO model on a custom dataset. Every code block here is verified and runs. No skipping steps, no assuming you already know things.
- What Is Computer Vision?
- CV Frameworks: Quick Overview
- Why Ultralytics YOLO?
- Complete Environment Setup (Python → venv → install → notebook)
- The 3 Core CV Tasks at a Glance
- Task 1: Object Detection
- Task 2: Image Classification
- Task 3: Instance Segmentation
- How to Get & Prepare Datasets
- Which YOLO Model to Choose?
- Exporting Your Trained Model
- Frequently Asked Questions
What Is Computer Vision?
Computer Vision (CV) is the field of AI that enables machines to interpret and understand visual information — images, videos, and live camera streams. Just as humans use their eyes and brain together to understand a scene, CV uses cameras and deep learning models to do the same, often faster and with greater consistency than a human expert.
Some things CV can do right now, in production:
- Detect objects in video at 100+ frames per second (YOLO11n on GPU)
- Identify plant diseases in drone footage over entire farms
- Count people in a crowd from CCTV footage
- Read license plates on a moving vehicle
- Detect cancer in radiology scans with accuracy matching specialist doctors
- Guide a robot arm to pick and place objects it has never seen before
Computer Vision Frameworks: Quick Overview
Before diving into Ultralytics, here is a quick map of the landscape so you know what exists and when to use each:
🔥 PyTorch (Research / Advanced)
Meta's deep learning framework. Most research code is published in PyTorch. Maximum flexibility, but you write your own training loop. Ultralytics itself is built on PyTorch.
🟠 TensorFlow / Keras (Production / Beginners)
Google's framework. Keras (now part of TensorFlow) offers a simpler API. Great for mobile and edge deployment. Larger corporate ecosystem.
👁️ OpenCV (Image Processing)
The classic computer vision library. Does not do deep learning training itself, but handles reading images, video streams, drawing boxes, colour conversion, and preprocessing. Used alongside YOLO.
⚡ Ultralytics YOLO (Best for Beginners + Production)
Built on PyTorch. Provides detection, classification, segmentation, pose, and tracking in one package — with a 5-line training API. One of the most widely used CV frameworks for practical applications.
Why Ultralytics YOLO?
YOLO stands for You Only Look Once. Unlike older detection systems that scanned an image multiple times, YOLO processes the entire image in a single forward pass through the network — making it fast enough to run in real time.
Ultralytics develops and maintains YOLO and has built it into a complete, beginner-friendly platform, trusted by Duolingo, Shell, Siemens, Renault, Philips, Intel and thousands of other companies. YOLO11 is their latest generation: faster, more accurate, and supporting 5 vision tasks from a single install.
Complete Environment Setup
We will set up a clean, professional Python environment for this project. Follow every step exactly — no skipping.
Install Python (if not already installed)
Open a terminal (Command Prompt on Windows, Terminal on Mac/Linux) and check if Python is installed:
```bash
python --version
# or try:
python3 --version
```
If you see something like Python 3.10.12 or higher — you are good. If you get an error, install Python 3.10 or newer from python.org.
⚠️ Windows users: During installation, tick the checkbox "Add Python to PATH" before clicking Install. This is the most common beginner mistake — if you miss it, Python commands won't work in the terminal.
Create Your Project Folder
A clean folder for your Computer Vision work. Run these commands in your terminal:
```bash
# Create the project directory
mkdir learning_computer_vision

# Enter the folder
cd learning_computer_vision
```
Create a Virtual Environment
A virtual environment (venv) is an isolated Python installation for this project. It keeps your project's packages separate from other Python projects — a professional best practice.
```bash
# Create the virtual environment (a folder named 'venv' will appear)
python -m venv venv
```
Note: if python failed in Step 1 but python3 worked, run python3 -m venv venv instead. On some systems, python still points to Python 2. Always use the version that returned 3.10+ in Step 1.
Activate the Virtual Environment
Activating "switches" your terminal into the isolated environment. You must do this every time you open a new terminal for this project.
```bash
# Windows
venv\Scripts\activate

# Mac / Linux
source venv/bin/activate
```
After activating, your terminal prompt will show (venv) at the start — that is how you know it is active.
Install Ultralytics and Jupyter
With the venv active, install the required packages:
```bash
# Install Ultralytics (includes PyTorch, OpenCV, and all dependencies)
pip install ultralytics

# Install Jupyter for interactive notebooks
pip install jupyter
```
Verify the installation worked:
```bash
python -c "import ultralytics; ultralytics.checks()"
```
You should see a short report listing the Ultralytics version, your Python version, and the available hardware (CPU or GPU), ending with a setup-complete checkmark.
Create Your Jupyter Notebook
Launch Jupyter in your project folder:
```bash
jupyter notebook
```
Your browser will open at http://localhost:8888. Click New → Python 3 (ipykernel) to create a new notebook. Rename it to cv_yolo_demo.ipynb.
Prefer VS Code? Open your project folder with code . and use the built-in Jupyter extension. It is the same experience without the browser. Install the "Jupyter" extension from the VS Code marketplace.
The 3 Core CV Tasks at a Glance
Ultralytics YOLO11 supports 5 vision tasks. Here we focus on the 3 most important ones for beginners:
Object Detection
Draws a bounding box around each object and labels it. Tells you what is in the image and where it is.
Use for: surveillance, counting, locating objects
Image Classification
Assigns a single label to the whole image. Tells you what is in the image — no location information.
Use for: quality control (pass/fail), medical categories
Instance Segmentation
Draws a pixel-level mask around each object — exact shape, not just a box. The most detailed output.
Use for: medical imaging, autonomous driving, fashion
A simple rule of thumb: if a bounding box is enough, use detection. If you only need one label per image, use classification. If you need the exact shape of each object, use segmentation.
Task 1: Object Detection
What it does
Object detection finds all instances of known objects in an image and draws a rectangular bounding box around each one, with a label and confidence score. For example, given a photo of a street, it might find: car (0.97), person (0.89), traffic light (0.82).
Running detection with a pretrained model (5 lines of code)
In your Jupyter notebook, create a new cell and type:
```python
# Cell 1: Verify setup
import ultralytics
ultralytics.checks()
# Expected: "Setup complete ✅"
```
```python
# Cell 2: Object Detection
from ultralytics import YOLO

# Load YOLO11 nano — pretrained on COCO (80 object classes)
# First run downloads the model (~6 MB) automatically
model = YOLO("yolo11n.pt")

# Run detection on a sample image
results = model("https://ultralytics.com/images/bus.jpg")

# Print what was detected
for r in results:
    for box in r.boxes:
        cls_name = model.names[int(box.cls[0])]
        conf = float(box.conf[0])
        print(f"Detected: {cls_name} ({conf:.0%} confidence)")
```
On the sample bus image you should see a handful of detections, typically one bus and several people (verified on Ultralytics 8.3.241).
```python
# Save image with boxes drawn on it
results[0].save("detection_result.jpg")
print("Saved! Open detection_result.jpg to see the boxes.")

# OR display it directly in the notebook
results[0].show()
```
Understanding the result
The results[0] object contains everything about the detection:
- results[0].boxes — list of all detected bounding boxes
- box.cls — class ID (integer). Use model.names[int(box.cls[0])] to get the name
- box.conf — confidence score (0.0 to 1.0); 0.94 means 94% sure
- box.xyxy — box coordinates as [x1, y1, x2, y2] in pixels
- results[0].orig_shape — original image size (height, width)
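To make these attributes concrete, here is a framework-independent sketch of typical post-processing: filtering raw detections by a confidence threshold and resolving class IDs to names. The IDs, scores, and box coordinates below are made-up sample values, not real model output.

```python
# Post-processing sketch, independent of YOLO: keep only confident
# detections and map class IDs to readable names.
# All values below are made-up samples, not real model output.
names = {0: "person", 5: "bus"}  # a subset of the 80 COCO class names

detections = [
    (5, 0.94, (12, 230, 800, 730)),   # (class_id, confidence, (x1, y1, x2, y2))
    (0, 0.89, (48, 400, 240, 900)),
    (0, 0.27, (670, 380, 810, 880)),  # low confidence, likely a false positive
]

def filter_detections(dets, threshold=0.5):
    """Keep detections at or above the threshold, with readable names."""
    return [(names[c], conf, box) for c, conf, box in dets if conf >= threshold]

for name, conf, box in filter_detections(detections):
    print(f"{name}: {conf:.0%} at {box}")
```

This is the same loop pattern as Cell 2 above, just decoupled from the model so you can see the logic on its own.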
Training YOLO detection on your own dataset
Using a pretrained model on your own data is called fine-tuning. You take the model that already knows about 80 classes from COCO, and teach it your specific classes — like "phone" and "laptop" on a desk, or "crack" and "healthy" on a wall.
Step A — Dataset folder structure
YOLO expects a fixed layout: an images/ folder and a parallel labels/ folder, each with train/ and val/ subfolders. Every image has a matching label file with the same name — images/train/cat001.jpg → labels/train/cat001.txt. YOLO finds the label by replacing images/ with labels/ in the path.
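As a sketch, this layout can also be created programmatically with Python's pathlib; the folder name my_dataset matches the data.yaml example in Step C.

```python
from pathlib import Path

# Create the folder layout YOLO expects for a detection dataset:
# parallel images/ and labels/ trees, each with train/ and val/ splits,
# plus a data.yaml at the dataset root (filled in as shown in Step C).
root = Path("my_dataset")
for kind in ("images", "labels"):
    for split in ("train", "val"):
        (root / kind / split).mkdir(parents=True, exist_ok=True)

(root / "data.yaml").touch()
print(f"Created {sum(1 for _ in root.rglob('*'))} entries")  # 7 entries
```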
Step B — Label file format
Each .txt label file has one line per object, in the format class_id center_x center_y width height, with all coordinates normalised to the 0–1 range:
- class_id — integer starting from 0 (0=cat, 1=dog, etc.)
- center_x, center_y — centre of the box (0.5 = middle of image)
- width, height — size of the box (1.0 = full image width/height)
- If an image has no objects, the label file should exist but be empty
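To see the arithmetic, here is a small sketch that parses one label line and converts the normalised, centre-based box back to pixel corners. The label values and image size are illustrative.

```python
# Sketch: parse one YOLO label line and convert its normalised,
# centre-based box to pixel corner coordinates.
# The label values and image size below are illustrative.
label_line = "0 0.50 0.40 0.30 0.20"   # class_id cx cy w h (all 0-1)
img_w, img_h = 640, 480                # image size in pixels

parts = label_line.split()
cls_id = int(parts[0])
cx, cy, w, h = map(float, parts[1:])

# centre/size (normalised) -> corners (pixels)
x1 = (cx - w / 2) * img_w
y1 = (cy - h / 2) * img_h
x2 = (cx + w / 2) * img_w
y2 = (cy + h / 2) * img_h

print(cls_id, (x1, y1, x2, y2))  # roughly (224, 144, 416, 240)
```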
Tip: annotation tools like Roboflow generate the labels/ folder and the YAML file automatically. This saves hours of work.
Step C — The data.yaml file
Place this file inside your dataset folder. It tells YOLO where the data is and what classes exist:
```yaml
# my_dataset/data.yaml
# Where your dataset lives (absolute or relative path)
path: my_dataset

# Paths to images (relative to 'path')
train: images/train
val: images/val
test: images/test  # optional

# Number of classes
nc: 2

# Class names — index 0 must match label id 0
names:
  0: cat
  1: dog
```
Step D — Train the model
```python
# Training cell
from ultralytics import YOLO

# Start from a pretrained YOLO11 nano model
model = YOLO("yolo11n.pt")

# Train on your custom dataset
results = model.train(
    data="my_dataset/data.yaml",
    epochs=50,      # number of training passes over the data
    imgsz=640,      # image size (resize all images to 640×640)
    batch=16,       # images processed at once (lower if RAM is limited)
    device="0",     # use "0" for GPU, "cpu" for CPU-only
    project="runs/train",
    name="my_experiment"
)
```
After training, your best model weights are saved at runs/train/my_experiment/weights/best.pt. Load and use it like this:
```python
from ultralytics import YOLO

best_model = YOLO("runs/train/my_experiment/weights/best.pt")
results = best_model("path/to/new_image.jpg")
results[0].show()
```
Task 2: Image Classification
What it does
Classification assigns a single label to the entire image. There are no boxes — the model simply answers "what is this image of?" with a probability for each class. Use it when you don't need to know where something is, only what it is.
Real examples: grading fruit quality (fresh vs rotten), classifying X-rays as normal or pneumonia, distinguishing between plant species.
```python
# Image Classification
from ultralytics import YOLO

# Load classification model (pretrained on ImageNet — 1000 classes)
model_cls = YOLO("yolo11n-cls.pt")

# Classify an image
results = model_cls("https://ultralytics.com/images/bus.jpg")

# Print top-5 predictions
for r in results:
    top5_idx = r.probs.top5        # list of top 5 class indices
    top5_conf = r.probs.top5conf   # their confidence scores
    print("Top 5 predictions:")
    for idx, conf in zip(top5_idx, top5_conf):
        print(f"  {r.names[idx]}: {float(conf):.1%}")
```
Classification dataset structure (different from detection!)
Classification uses a folder-per-class structure — the folder name is the label. No label files needed.
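As an illustrative sketch (the my_clf_dataset name matches the training example below; the class and file names are made up):

```
my_clf_dataset/
├── train/
│   ├── cat/          ← every image in here is labelled "cat"
│   │   ├── img001.jpg
│   │   └── img002.jpg
│   └── dog/
│       └── img001.jpg
└── val/
    ├── cat/
    └── dog/
```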
Training a classification model
```python
# Classification training
from ultralytics import YOLO

model = YOLO("yolo11n-cls.pt")  # pretrained on ImageNet

results = model.train(
    data="my_clf_dataset/",  # path to folder with train/ and val/ subfolders
    epochs=30,
    imgsz=224,               # classification typically uses smaller images
    batch=32
)
```
Task 3: Instance Segmentation
What it does
Segmentation goes further than detection: instead of a rectangular box, it draws a pixel-perfect mask around each instance of an object. If you have two overlapping cars, you get two separate masks — one for each car. This is called instance segmentation (as opposed to semantic segmentation which doesn't distinguish instances).
This requires more detailed annotations and more compute, but gives much richer output — critical for robotics, medical imaging, and precise manufacturing inspection.
```python
# Segmentation with a pretrained model
from ultralytics import YOLO

# YOLO11 nano segmentation model
model_seg = YOLO("yolo11n-seg.pt")

results = model_seg("https://ultralytics.com/images/bus.jpg")

for r in results:
    print(f"Objects with masks: {len(r.masks.data) if r.masks else 0}")
    for box in r.boxes:
        name = model_seg.names[int(box.cls[0])]
        conf = float(box.conf[0])
        print(f"  {name}: {conf:.0%}")

# Save with masks drawn
results[0].save("segmentation_result.jpg")
```
Segmentation dataset structure
Same folder structure as detection, but the label format is different — instead of 4 box coordinates, each line lists the class id followed by a polygon of normalised x y points traced around the object.
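For illustration, a single segmentation label line might look like this (values made up; each x y pair is one polygon vertex, normalised to 0–1):

```
0 0.42 0.20 0.61 0.23 0.66 0.49 0.40 0.51
```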
Real training example: Car Parts Segmentation
Here is a real training run from the official Ultralytics notebook on the Car Parts Segmentation dataset (23 classes, 3,156 training images). During training, the terminal prints per-epoch loss values and validation metrics; the key numbers to watch are:
• box_loss / seg_loss / cls_loss — Loss values. Lower = better. Watch these decrease across epochs.
• mAP50 — Mean Average Precision at 50% IoU overlap. Think of it as the model's overall accuracy score. 0.676 = 67.6% — good for 10 epochs on a complex dataset.
• mAP50-95 — Stricter accuracy measure (averaged over multiple IoU thresholds). Always lower than mAP50.
• Mask mAP50 — Same as mAP50 but evaluating the quality of the predicted masks (not just boxes).
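IoU (Intersection over Union), the overlap measure behind these metrics, is simple enough to compute by hand. Here is a minimal sketch for corner-format boxes:

```python
# Sketch: IoU (Intersection over Union), the overlap measure behind
# mAP50. Boxes are (x1, y1, x2, y2) corner coordinates in pixels.
def iou(a, b):
    # Overlapping rectangle (empty if the boxes do not intersect)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    # Union = both areas minus the double-counted intersection
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # 1.0 (identical boxes)
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.3333... (partial overlap)
```

At mAP50, a prediction counts as correct when its IoU with a ground-truth box is at least 0.5.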
Training code for segmentation
```python
# Segmentation training
from ultralytics import YOLO

# Start from the pretrained YOLO11 nano segmentation model
model = YOLO("yolo11n-seg.pt")

results = model.train(
    data="my_dataset/data.yaml",  # same YAML format as detection
    epochs=50,
    imgsz=640,
    batch=16,
    project="runs/segment",
    name="carparts_experiment"
)

# Run inference with the best trained model
best = YOLO("runs/segment/carparts_experiment/weights/best.pt")
results = best("test_car.jpg", save=True)
```
How to Get and Prepare Datasets
The biggest challenge for beginners is always: "Where do I get images and how do I label them?" Here are your best options:
🥇 Roboflow Universe
100,000+ public datasets, all downloadable in YOLO format. Also has a free annotation tool for your own images.
universe.roboflow.com →

🥈 Kaggle Datasets
Thousands of image datasets across every domain. Many are already in YOLO format or can be converted easily.
kaggle.com/datasets →

🥉 Ultralytics Datasets
Curated datasets officially maintained by Ultralytics — COCO, ImageNet, VOC, and specialty datasets. Download in one command.
docs.ultralytics.com/datasets →

📸 Your Own Camera
For custom applications, collect your own images with a phone. Even 200–300 annotated images can produce a useful model when fine-tuning from pretrained weights.
Annotate with Roboflow →

Minimum dataset sizes (rule of thumb)
- Fine-tuning from pretrained: as few as 50–100 images per class can work
- Good results: 500–1,000 images per class
- State-of-the-art custom model: 5,000+ images per class
- Always use an 80/20 train/validation split
- Vary lighting, angles, distances in your training images — diversity matters more than quantity
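A short script is a simple way to produce the 80/20 split. This sketch assumes a flat folder of .jpg files; the source and destination folder names are illustrative, so adapt the paths to your project.

```python
import random
import shutil
from pathlib import Path

def split_dataset(src="all_images", dst="my_dataset/images", ratio=0.8, seed=42):
    """Copy a flat folder of .jpg files into dst/train and dst/val."""
    files = sorted(Path(src).glob("*.jpg"))
    random.Random(seed).shuffle(files)   # seeded shuffle for reproducibility
    cut = int(len(files) * ratio)        # first 80% -> train, the rest -> val
    for split, subset in (("train", files[:cut]), ("val", files[cut:])):
        out = Path(dst) / split
        out.mkdir(parents=True, exist_ok=True)
        for f in subset:
            shutil.copy(f, out / f.name)
    return cut, len(files) - cut

# Example: split_dataset("all_images", "my_dataset/images")
```

For detection datasets, run the same split over the matching label .txt files so each image keeps its label in the parallel folder.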
Which YOLO11 Model Size to Choose?
YOLO11 comes in 5 sizes — nano (n), small (s), medium (m), large (l), and extra-large (x). Each is a trade-off between speed and accuracy. Here is a practical guide:
| Model | Params | Speed | mAP50-95 | Best for |
|---|---|---|---|---|
| yolo11n nano | 2.6M | ⚡⚡⚡⚡⚡ Fastest | 39.5 | Raspberry Pi, real-time on CPU, mobile apps, edge devices |
| yolo11s small | 9.4M | ⚡⚡⚡⚡ | 47.0 | Low-power devices needing better accuracy, Jetson Nano |
| yolo11m medium | 20.1M | ⚡⚡⚡ | 51.5 | ✅ Best starting point. Good GPU, production use, balanced |
| yolo11l large | 25.3M | ⚡⚡ | 53.4 | High-accuracy requirements, RTX 3080+ GPU available |
| yolo11x extra-large | 56.9M | ⚡ | 54.7 | Maximum accuracy, research, powerful GPU (A100/H100) required |
• Running on a laptop CPU with no GPU → yolo11n
• Running on Google Colab (free T4 GPU) → yolo11s or yolo11m
• Training your first custom model → yolo11n (fast feedback)
• Deploying in a real product → yolo11m (best trade-off)
• Accuracy is the only thing that matters → yolo11x
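The quick picks above can be summarised as a small helper. The weight file names are real YOLO11 checkpoints; the decision rules simply encode this section's rules of thumb.

```python
# Rule-of-thumb model picker, encoding this section's quick picks.
def choose_model(has_gpu: bool, first_experiment: bool = False,
                 max_accuracy: bool = False) -> str:
    if max_accuracy:
        return "yolo11x.pt"   # maximum accuracy, needs a powerful GPU
    if first_experiment or not has_gpu:
        return "yolo11n.pt"   # fast feedback, real-time even on CPU
    return "yolo11m.pt"       # best speed/accuracy trade-off in production

print(choose_model(has_gpu=False))                    # yolo11n.pt
print(choose_model(has_gpu=True))                     # yolo11m.pt
print(choose_model(has_gpu=True, max_accuracy=True))  # yolo11x.pt
```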
Exporting Your Trained Model
Once trained, you can export your model to 17+ formats for deployment. The most important ones:
```python
# Export to deployment formats
from ultralytics import YOLO

model = YOLO("runs/train/my_experiment/weights/best.pt")

# Export to ONNX — works everywhere, up to 3x faster on CPU
model.export(format="onnx")

# Export to TensorRT — up to 5x faster on NVIDIA GPU
model.export(format="engine")

# Export to TFLite — for Android/embedded devices
model.export(format="tflite")
```
| Format | Argument | Best for | Speed boost |
|---|---|---|---|
| PyTorch (.pt) | default | Development, fine-tuning | — |
| ONNX (.onnx) | onnx | Universal deployment, CPU speedup | ~3x CPU |
| TensorRT (.engine) | engine | NVIDIA GPU production | ~5x GPU |
| CoreML (.mlpackage) | coreml | Apple iOS / macOS apps | Native |
| TFLite (.tflite) | tflite | Android, embedded Linux | Optimised |
| OpenVINO | openvino | Intel CPU / VPU devices | ~3x Intel |
What to Learn Next
- Ultralytics official documentation — full API reference, all arguments
- Pose Estimation — detect human body keypoints (joints)
- Oriented Bounding Boxes (OBB) — rotated boxes for aerial imagery
- Ultralytics official notebooks — dozens of real working examples on Colab
- Roboflow Universe — find and annotate datasets for your project idea
- Google Colab — free GPU for training (no local GPU needed)
Frequently Asked Questions
What is the difference between classification, detection, and segmentation?
Classification: one label for the whole image (no location). Detection: bounding boxes around each object + label. Segmentation: pixel-perfect outline (mask) around each object. Choose based on what your application actually needs — classification is simplest to set up, segmentation is most powerful but needs more data and compute.
Which YOLO11 model size should a beginner start with?
Always start with yolo11n (nano) — it has 2.6M parameters and trains in minutes, so you can verify your dataset and setup quickly. Once everything works, upgrade to yolo11m (medium) for better accuracy. Only use large/xlarge if you have a powerful GPU and accuracy is critical.
Do I need a GPU?
For inference (using a trained model), no — CPU is fine for real-time detection with the nano model. For training, a GPU is strongly recommended. Use Google Colab for free T4 GPU access. Training yolo11n on a small dataset takes 10–30 minutes on a free Colab GPU.
Where can I find datasets for my project?
Roboflow Universe is the best starting point — 100,000+ datasets downloadable directly in YOLO format. Kaggle also has thousands of CV datasets. For custom work, collect your own images and annotate with Roboflow's free annotation tool, which exports directly in YOLO format.
What is mAP50 and what is a good score?
mAP50 (Mean Average Precision at 50% IoU) is the primary accuracy metric for object detection and segmentation. It measures how well your model finds objects and how accurately it boxes them. A score of 0.67 means 67% — good for a medium-sized dataset. Values above 0.8 are considered excellent. It increases as training progresses; if it plateaus, you may need more data or more epochs.
How many images do I need to train a custom model?
When fine-tuning from a pretrained model (which is always recommended), 50–100 images per class can produce a working model. For reliable production use, aim for 500+ images per class. Diversity (different angles, lighting, backgrounds) matters more than raw quantity. A dataset of 200 diverse images outperforms 1,000 near-identical ones.