Free online course: Convolutional Neural Networks for Computer Vision: Detection, Recognition

Duration of the online course: 6 hours and 1 minute

Free course on CNNs for computer vision: learn convolutions, modern architectures, transfer learning, object detection, face recognition, and style transfer.

In this free course, learn about

  • Foundations of Convolutional Neural Networks
  • Classic CNN Architectures and Practical Techniques
  • Object Localization and Detection Methods
  • Face Recognition and Neural Style Transfer

Course Description

Convolutional Neural Networks for Computer Vision: Detection, Recognition is a free online course in Technology and Programming, focused on Artificial Intelligence and Machine Learning. It is designed to help you understand how CNNs power modern computer vision systems, from core image operations to practical model design choices used in real-world applications.

You will build intuition for fundamental convolution concepts, including edge detection, padding, strided convolutions, and working with multi-channel volumes. The course explains how a convolutional network layer operates, how pooling helps with representation, and why convolutions are especially effective for visual data. Along the way, you will connect these building blocks to end-to-end CNN examples to see how the pieces fit together.
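
As a rough illustration of how padding and stride interact, the sketch below computes the spatial output size of a convolution using the standard formula floor((N + 2P - F) / S) + 1. The function names `conv_output_size` and `same_padding` are hypothetical helpers chosen for this example, not part of the course material.

```python
def conv_output_size(n: int, f: int, p: int = 0, s: int = 1) -> int:
    """Output side length for an n x n input and an f x f filter.

    p is the padding added to each border, s is the stride.
    Integer floor division discards positions where the filter
    would hang over the edge of the padded input.
    """
    return (n + 2 * p - f) // s + 1


def same_padding(f: int) -> int:
    """Padding that keeps the output size equal to the input size.

    Assumes an odd filter size f and stride 1.
    """
    return (f - 1) // 2


if __name__ == "__main__":
    # "Valid" convolution: no padding, the output shrinks.
    print(conv_output_size(6, 3))
    # "Same" convolution: padding chosen so the size is preserved.
    print(conv_output_size(6, 3, p=same_padding(3)))
    # Strided convolution: stride 2 roughly halves the size.
    print(conv_output_size(7, 3, s=2))
```

With no padding the output shrinks by F - 1 per dimension, which is why deep stacks of valid convolutions reduce spatial size quickly unless padding is used.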

The course then shifts toward influential architectures and proven engineering patterns in deep vision. You will explore classic networks as well as modern approaches like residual connections, network-in-network ideas, and inception-style designs. Practical strategies such as using open-source implementations, transfer learning, and data augmentation are introduced to help you train stronger models with less data and effort, while keeping an eye on broader trends shaping the current state of computer vision.

For detection tasks, you will learn the key ideas behind object localization and landmark detection, then move into object detection workflows. Topics include sliding-window style convolutional implementations, Intersection over Union, non-max suppression, anchor boxes, and the intuition behind popular methods such as YOLO and region proposal approaches.
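
Intersection over Union and non-max suppression can be sketched in a few lines. The example below is a minimal illustration, assuming boxes are given as `(x1, y1, x2, y2)` corner tuples and using an illustrative default threshold of 0.5; the function names are hypothetical, and production detectors typically use vectorized library routines instead.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes kept after greedy non-max suppression."""
    # Visit boxes from highest to lowest confidence score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop remaining boxes that overlap the kept box too strongly.
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    # The second box overlaps the first heavily and is suppressed.
    print(non_max_suppression(boxes, scores))
```

The greedy loop keeps the most confident box, discards near-duplicates of it, and repeats, so each object ends up with a single detection.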

The final portion introduces face recognition concepts, including one-shot learning, Siamese networks, triplet loss, and face verification. You will also be introduced to neural style transfer, breaking down content and style cost functions and considering generalizations beyond standard 2D images. By the end, you will have a strong conceptual map of CNN-based vision systems and the tools to confidently approach recognition and detection problems.
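
To make the triplet loss idea concrete, here is a minimal sketch using the common formulation max(d(A, P) - d(A, N) + alpha, 0), where d is the squared Euclidean distance between encodings. The encodings are plain Python lists standing in for network outputs, and the default margin of 0.2 is an illustrative assumption, not a value taken from the course.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length encodings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Triplet loss for a single (anchor, positive, negative) triple.

    The loss is zero once the negative is farther from the anchor
    than the positive by at least the margin alpha.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(d_pos - d_neg + alpha, 0.0)


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    # Easy triple: the negative is far away, so the loss is zero.
    print(triplet_loss(anchor, [0.0, 1.0], [3.0, 0.0]))
    # Hard triple: positive and negative are equally far, so the
    # margin alone contributes to the loss.
    print(triplet_loss(anchor, [1.0, 0.0], [1.0, 0.0]))
```

The margin prevents the trivial solution in which every encoding collapses to the same point, since that would leave a loss of alpha on every triple.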

Course content

  • Video class: C4W1L01 Computer Vision 05m
  • Exercise: Why are convolutional neural networks preferred over fully connected networks for large images?
  • Video class: C4W1L02 Edge Detection Examples 11m
  • Video class: C4W1L03 More Edge Detection 07m
  • Exercise: What does a negative output from a vertical edge detector typically indicate?
  • Video class: C4W1L04 Padding 09m
  • Exercise: In a “same” convolution with an odd-sized filter, what padding value P keeps the output size equal to the input size (stride 1)?
  • Video class: C4W1L05 Strided Convolutions 09m
  • Exercise: What is the output spatial size for an N×N input convolved with an F×F filter using padding P and stride S?
  • Video class: C4W1L06 Convolutions Over Volumes 10m
  • Exercise: When convolving an RGB image with multiple filters, what determines the number of channels (depth) in the output volume?
  • Video class: C4W1L07 One Layer of a Convolutional Net 16m
  • Exercise: How many parameters are in a convolutional layer with 10 filters of size 3×3×3 (including biases)?
  • Video class: C4W1L08 Simple Convolutional Network Example 08m
  • Exercise: After flattening a 7×7×40 activation volume, how many units are in the resulting vector fed to logistic regression or softmax?
  • Video class: C4W1L09 Pooling Layers 10m
  • Exercise: In max pooling, which statement best describes how output channels relate to input channels?
  • Video class: C4W1L10 CNN Example 11m
  • Video class: C4W1L11 Why Convolutions 09m
  • Video class: C4W2L01 Why look at case studies? 03m
  • Exercise: Why are case studies of classic CNN architectures useful when building a model for a new computer vision task?
  • Video class: C4W2L02 Classic Network 18m
  • Video class: C4W2L03 Resnets 07m
  • Exercise: In a residual block with a skip connection, how is the activation computed two layers later (a^{[l+2]})?
  • Video class: C4W2L04 Why ResNets Work 09m
  • Exercise: Why do residual (skip) connections help very deep networks train effectively?
  • Video class: C4W2L05 Network In Network 06m
  • Video class: C4W2L06 Inception Network Motivation 10m
  • Exercise: In an Inception module, why is a 1×1 convolution often placed before a 5×5 convolution?
  • Video class: C4W2L07 Inception Network 08m
  • Exercise: In an Inception module, why is a 1×1 convolution often applied after the pooling branch before concatenation?
  • Video class: C4W2L08 Using Open Source Implementation 04m
  • Exercise: Why is using an open-source implementation (e.g., from GitHub) often recommended when applying a published CNN architecture like ResNet?
  • Video class: C4W2L09 Transfer Learning 08m
  • Video class: C4W2L10 Data Augmentation 09m
  • Exercise: What is a key reason data augmentation often improves computer vision model performance?
  • Video class: C4W2L11 State of Computer Vision 12m
  • Exercise: Why do computer vision models for object detection often have more complex architectures than image classification models?
  • Video class: C4W3L01 Object Localization 11m
  • Exercise: In classification with localization, what extra outputs are added to a standard image classifier to localize the object?
  • Video class: C4W3L02 Landmark Detection 05m
  • Video class: C4W3L03 Object Detection 05m
  • Video class: C4W3L04 Convolutional Implementation Sliding Windows 11m
  • Video class: C4W3L06 Intersection Over Union 04m
  • Video class: C4W3L07 Nonmax Suppression 08m
  • Exercise: What is the main purpose of non-max suppression in object detection?
  • Video class: C4W3L08 Anchor Boxes 09m
  • Video class: C4W3L09 YOLO Algorithm 07m
  • Video class: C4W3L10 Region Proposals 06m
  • Exercise: What is the key idea behind R-CNN compared with sliding windows for object detection?
  • Video class: C4W4L01 What is face recognition 04m
  • Video class: C4W4L02 One Shot Learning 04m
  • Video class: C4W4L03 Siamese Network 04m
  • Exercise: In a Siamese network for face recognition, what should training encourage about the distance between encodings f(Xi) and f(Xj)?
  • Video class: C4W4L04 Triplet loss 15m
  • Exercise: In triplet loss, what is the role of the margin (alpha)?
  • Video class: C4W4L05 Face Verification 06m
  • Video class: C4W4L06 What is neural style transfer? 02m
  • Video class: C4W4L07 What are deep CNs learning? 08m
  • Exercise: When visualizing a CNN by finding image patches that maximally activate a hidden unit, what do early (layer 1) units typically respond to?
  • Video class: C4W4L08 Cost Function 04m
  • Exercise: In neural style transfer, what is the optimization step used to generate the final image?
  • Video class: C4W4L09 Content Cost Function 03m
  • Exercise: In neural style transfer, how is the content cost J_content(C, G) commonly defined using a chosen layer L?
  • Video class: C4W4L10 Style Cost Function 17m
  • Exercise: In neural style transfer, what does the style matrix (Gram matrix) at layer L measure?
  • Video class: C4W4L11 1D and 3D Generalizations 09m
  • Exercise: In a 1D convolution example, what is the output length when a 14-dimensional input is convolved with a 5-dimensional filter (valid convolution, stride 1)?
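
Several of the exercises above come down to short arithmetic. The sketch below shows one way to check that kind of calculation; the helper names are hypothetical, and the formulas are the standard ones for parameter counts, flattening, and valid 1D convolutions.

```python
def conv_layer_params(filter_h, filter_w, in_channels, num_filters):
    """Trainable parameters in a conv layer, one bias per filter."""
    weights_per_filter = filter_h * filter_w * in_channels
    return (weights_per_filter + 1) * num_filters


def flattened_units(height, width, channels):
    """Length of the vector obtained by flattening an activation volume."""
    return height * width * channels


def valid_conv1d_length(n, f, s=1):
    """Output length of a 1D valid convolution (no padding)."""
    return (n - f) // s + 1


if __name__ == "__main__":
    print(conv_layer_params(3, 3, 3, 10))  # 10 filters of size 3x3x3
    print(flattened_units(7, 7, 40))       # 7x7x40 activation volume
    print(valid_conv1d_length(14, 5))      # 14-dim input, 5-dim filter
```

Note that the parameter count depends only on the filter shape and the number of filters, not on the size of the input image.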

This free course includes:

6 hours and 1 minute of online video course

Digital certificate of course completion (Free)

Exercises to train your knowledge

100% free, from content to certificate
