🎯 In this lesson, you will:
- Understand how machines learn to “see” using labelled image data
- Learn how to accurately annotate and label images
- Explore real-world applications of computer vision, including a live hand detection demo
📂 Resources
🖍️ Starter Activity: Teach the Machine What You See
Tool: makesense.ai
(Download image pack in advance — animals, objects, or hand gestures)
🎯 Goal: Learn how machines understand images by manually labelling them.
Instructions:
- Go to makesense.ai → click “Get Started”
- Upload your sample images
- Use the box tool to draw around objects and label them (e.g. dog, tree, hand)
- Compare results in groups — did everyone label the same way?
💬 Discussion:
- What was easy or tricky about labelling?
- Why do you think this is such a crucial step for AI?
- What happens if labels are messy, inconsistent, or incorrect?
🧠 Key Concepts:
- Labelling: Adding human-readable tags to image content
- Annotating: Drawing boxes (or polygons) to mark areas of interest in an image
🔎 Without high-quality labelled data, a machine learning model can’t learn to recognise patterns accurately!
Accurately labelled… and adorable
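Behind the scenes, each box you draw is saved as structured data. Below is a minimal sketch of what one labelled image might look like once exported; the field names are illustrative, and real export formats (such as the COCO JSON that tools like makesense.ai can produce) differ in detail.

```python
# One labelled image, roughly as an annotation tool might export it.
# Field names are illustrative; real formats (e.g. COCO JSON) differ.
annotation = {
    "image": "photo_001.jpg",
    "labels": [
        {"class": "dog",  "box": {"x": 34,  "y": 50, "width": 120, "height": 90}},
        {"class": "tree", "box": {"x": 200, "y": 10, "width": 80,  "height": 160}},
    ],
}

for label in annotation["labels"]:
    box = label["box"]
    print(f"{label['class']}: top-left ({box['x']}, {box['y']}), "
          f"{box['width']} x {box['height']} px")
```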
👀 Theory: What Is Computer Vision?
Computer vision is a field of machine learning where computers are trained to “see” and interpret visual information — like recognising shapes, objects, and people.
📌 This is often powered by a special kind of model called a convolutional neural network (CNN), designed to extract patterns from pixels.
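To make that concrete, here is a minimal CNN sketch in Keras. The 64×64 input size and the two-class output are assumptions for illustration, not tied to any specific lesson tool.

```python
from tensorflow import keras
from tensorflow.keras import layers

# A tiny CNN: convolution layers scan the image for local patterns
# (edges, corners, textures); pooling shrinks the feature maps;
# a final dense layer turns the patterns into a class prediction.
model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),           # a 64x64 colour image (assumed size)
    layers.Conv2D(16, 3, activation="relu"),  # learn 16 small pattern detectors
    layers.MaxPooling2D(),                    # keep only the strongest responses
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(2, activation="softmax"),    # e.g. "hand" vs "not hand"
])
model.summary()
```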
🔍 Examples of computer vision in the real world:
- Real-time hand or body gesture tracking
- Identifying animals in wildlife camera footage
- Diagnosing diseases from X-rays or MRIs
- Self-driving car navigation
- Monitoring deforestation via satellite images
⚡ Case Study 1: Teachable Plug
The Teachable Plug lets you train everyday devices in your home to respond to your own movements, like turning on a radio with a dance move. It uses computer vision and machine learning to link actions to behaviours, all without writing any code.
👓 Case Study 2: Seeing AI
Seeing AI is a free app that uses AI to describe the world for people who are blind or have low vision. It helps with everyday tasks like reading text, identifying products, and describing photos.
Seeing AI - Talking Camera for the Blind
💭 Discussion: Why Is Image Detection Useful?
Some examples include:
- Cancer screening
- Self-driving cars
- Wildlife monitoring
- Satellite collision avoidance
- Anti-missile systems
- Augmented reality
🧠 Theory: Generalisation
Generalisation is how well a model performs on new, unseen data, such as recognising hands from different people, not just those it was trained on. A good model can adapt and work well beyond its training examples.
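One standard way to measure generalisation is to hold back some labelled data for testing. Here is a minimal sketch using scikit-learn's built-in digit images; the classifier choice is arbitrary, picked only to illustrate the train/test split.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hold back a quarter of the labelled images: the model never sees them
# during training, so the test score estimates generalisation.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

model = KNeighborsClassifier().fit(X_train, y_train)
print(f"accuracy on training images: {model.score(X_train, y_train):.2f}")
print(f"accuracy on unseen images:   {model.score(X_test, y_test):.2f}")
```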
🧪 Activity: Real-Time Hand Detection
Tools:
Task: Try out both tools, working in pairs, small groups, or as a class. (A code sketch after the questions below shows one way such a demo can be built.)
💬 Questions to explore:
- How many hands can be detected at once?
- How accurate is the model?
- Does it work differently depending on lighting or background?
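For those curious how real-time hand detection can work under the hood, here is a minimal sketch using OpenCV and Google's MediaPipe Hands model. It assumes a webcam at index 0 and is one way to build such a demo, not necessarily how the tools above are implemented.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)  # assumes a webcam at index 0
with mp_hands.Hands(max_num_hands=2, min_detection_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB images; OpenCV captures in BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand in results.multi_hand_landmarks:
                mp_draw.draw_landmarks(frame, hand, mp_hands.HAND_CONNECTIONS)
        cv2.imshow("Hand detection (press q to quit)", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```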
🔧 Extension: Polygon Annotation
Optionally, use the polygon tool in makesense.ai to trace more complex shapes (e.g. leaves, hands), and compare its accuracy with box labelling. The sketch below shows one way to quantify the difference.
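One way to compare the two styles is to measure how much background a bounding box includes that a tight polygon excludes. A minimal sketch using the shoelace formula, with hypothetical coordinates for a traced leaf:

```python
def polygon_area(points):
    """Shoelace formula: area of a simple polygon from its (x, y) vertices."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical outline of a leaf traced with the polygon tool
leaf = [(2, 1), (5, 0), (8, 3), (6, 7), (3, 6)]
xs, ys = zip(*leaf)
box_area = (max(xs) - min(xs)) * (max(ys) - min(ys))

print(f"polygon area: {polygon_area(leaf):.1f}")  # 27.0
print(f"bounding box area: {box_area:.1f}")       # 42.0
# The difference is background that a box label wrongly includes.
```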
🪑 Philosopher’s Armchair
Take a moment to imagine and discuss:
- What if machines could see and understand the world as humans do?
- What might be the benefits or dangers of AI that “sees” everything?
- How would society change if everyone could easily train AI to recognise things around them?
- What ethical questions does computer vision raise?
⌚ Just a Minute!
Discuss in pairs/small groups:
- What surprised you about how machines “see”?
- Why is annotation so important for AI?
- Which AI vision application interests you the most, and why?