Free online course: Convolutional Neural Networks for Computer Vision: Detection, Recognition

Duration of the online course: 6 hours and 1 minute

Build computer vision skills with a free CNN course: learn detection and recognition techniques, from convolution basics to YOLO and face recognition, with a certificate option.

In this free course, learn about

  • Why CNNs beat fully connected nets on images: local connectivity, parameter sharing, translation invariance
  • Edge detection with conv filters; interpreting sign/magnitude (e.g., negative output indicates opposite edge direction)
  • Convolution sizing math: padding for “same”, and output size formula with filter F, stride S, padding P
  • Strided convolutions and their effect on downsampling and receptive field growth
  • Convolutions over volumes: multi-channel (RGB) filters and how #filters sets output depth
  • Computing conv layer parameter counts ((filter height × width × input channels + 1 bias) × #filters) and flattened vector sizes
  • Pooling layers (max/avg): channel behavior and spatial reduction while keeping depth unchanged
  • Case studies of classic CNNs and why architectures like ResNet/Inception generalize to new tasks
  • Residual networks: skip-connection activations and why they ease optimization in very deep nets
  • Inception modules: 1×1 conv for bottlenecking and post-pooling channel reduction before concat
  • Using open-source implementations; transfer learning and data augmentation to improve performance
  • Object localization/detection: bounding box outputs, sliding windows via conv, IoU, NMS, anchors, YOLO, R-CNN
  • Face recognition: one-shot learning with Siamese nets; triplet loss margin; verification via embedding distances
  • Neural style transfer: CNN feature visualization; content/style (Gram) costs and optimizing pixels by gradient descent
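
The sizing and parameter-count rules in the list above reduce to a few lines of arithmetic. The following is a minimal sketch, not course material: the function names `conv_output_size`, `same_padding`, and `conv_param_count` are made up for this illustration, and square inputs and filters are assumed.

```python
def conv_output_size(n, f, p=0, s=1):
    """Spatial size after convolving an n x n input with an f x f filter.

    Uses floor((n + 2p - f) / s) + 1 with padding p and stride s.
    """
    return (n + 2 * p - f) // s + 1


def same_padding(f):
    """Padding that keeps output size equal to input size (stride 1, odd f)."""
    if f % 2 == 0:
        raise ValueError("this definition of 'same' padding needs an odd filter size")
    return (f - 1) // 2


def conv_param_count(f_h, f_w, c_in, n_filters):
    """Weights in one filter volume, plus one bias, times the number of filters."""
    return (f_h * f_w * c_in + 1) * n_filters


print(conv_output_size(6, 3))                      # 4: "valid" convolution
print(conv_output_size(6, 3, p=same_padding(3)))   # 6: "same" convolution
print(conv_output_size(7, 3, s=2))                 # 3: stride 2 downsamples
print(conv_param_count(3, 3, 3, 10))               # 280
print(7 * 7 * 40)                                  # 1960 units after flattening
```

The number of filters appears only in the parameter count and in the output depth; it does not change the spatial size.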

Course Description

Want to teach machines to see and understand images? This free online course dives into Convolutional Neural Networks (CNNs), the foundation behind many modern computer vision systems used for object detection, image recognition, face verification, and even neural style transfer. You will build intuition for why convolutions scale so well to large images, how filters capture visual patterns such as edges and textures, and how design choices like padding, stride, pooling, and volumetric convolutions shape what a network can learn.
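
To make the idea of a filter capturing an edge concrete, here is a small sketch in plain Python. The helper `conv2d_valid` and the toy 6×6 images are assumptions made for this illustration. As is usual in deep learning, the "convolution" is implemented as cross-correlation (the filter is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide kernel over image with no padding and stride 1 (cross-correlation)."""
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out = []
    for i in range(n_rows - k_rows + 1):
        row = []
        for j in range(n_cols - k_cols + 1):
            total = 0
            for di in range(k_rows):
                for dj in range(k_cols):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


# 3x3 vertical edge detector: positive weights on the left, negative on the right.
vertical_edge = [[1, 0, -1] for _ in range(3)]

# Toy 6x6 images: one bright on the left half, one bright on the right half.
bright_left = [[10, 10, 10, 0, 0, 0] for _ in range(6)]
bright_right = [[0, 0, 0, 10, 10, 10] for _ in range(6)]

print(conv2d_valid(bright_left, vertical_edge)[0])   # [0, 30, 30, 0]
print(conv2d_valid(bright_right, vertical_edge)[0])  # [0, -30, -30, 0]
```

The magnitude marks where the edge is; the sign flips when the transition runs dark-to-bright instead of bright-to-dark.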

As you progress, you will connect the core building blocks to practical, battle-tested architectures. You will explore why classic network designs matter, what made very deep networks finally trainable, and how ideas like skip connections, bottleneck layers, and multi-branch modules help models become both stronger and more efficient. Along the way, the course emphasizes the kind of reasoning used in real projects: choosing proven baselines, adapting open-source implementations responsibly, and applying transfer learning to reach high performance even when data or training time is limited. You will also see how data augmentation can improve robustness by exposing a model to the variety it will face in the real world.
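
One way to see why skip connections make very deep networks easier to train is to write out the residual computation, a[l+2] = g(z[l+2] + a[l]), with ReLU as g. The sketch below uses plain Python lists; `relu` and `residual_output` are names assumed for this illustration, and the inputs are toy values.

```python
def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]


def residual_output(z_next, a_skip):
    """a[l+2] = relu(z[l+2] + a[l]); both vectors must have the same length."""
    if len(z_next) != len(a_skip):
        raise ValueError("skip connection needs matching dimensions")
    return relu([z + a for z, a in zip(z_next, a_skip)])


# a[l] is itself a ReLU output, so it is non-negative.
a_l = [0.5, 0.0, 2.0]

# If the two intermediate layers contribute nothing (z[l+2] = 0),
# the block simply passes a[l] through unchanged.
print(residual_output([0.0, 0.0, 0.0], a_l))  # [0.5, 0.0, 2.0]
```

Because the identity mapping is this easy to represent, adding a residual block is unlikely to hurt and can only help if the extra layers learn something useful.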

From there, the focus shifts from recognizing a whole image to pinpointing what is inside it. You will learn how localization augments classification with additional outputs, how detection systems evaluate predictions using IoU, and why post-processing steps like non-max suppression are essential for turning dense candidate predictions into clean final boxes. Modern detection approaches are covered conceptually, helping you understand the key tradeoffs between sliding-window style computation, region proposal pipelines, and single-shot methods such as YOLO.
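
IoU and non-max suppression are easiest to follow in code. The following is a minimal sketch under stated assumptions: boxes are `(x1, y1, x2, y2)` corner tuples, the function names `iou` and `non_max_suppression` are chosen for this illustration, and the 0.5 threshold is just a common default. Production detectors typically run NMS per class with vectorized operations.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed; the third box does not overlap either and survives.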

The final part broadens your toolkit with techniques for identity and similarity learning, including one-shot learning for face recognition using Siamese networks and triplet loss. You will also look inside CNN representations to understand what different layers tend to respond to, then apply that insight to neural style transfer, where optimization balances content and style statistics to generate new images. By the end, you will be able to reason clearly about CNN design choices, evaluation signals, and the components that power detection and recognition systems. These skills transfer directly to many AI and machine learning roles.
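
The triplet loss mentioned above can be written as max(‖f(A) − f(P)‖² − ‖f(A) − f(N)‖² + α, 0), where A, P, and N are the anchor, positive, and negative examples and α is the margin. Here is a minimal sketch in plain Python; the function names, the toy 2-D "encodings", and the threshold value are assumptions for illustration, whereas real encodings come from a trained Siamese network.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two encodings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, alpha=1.0):
    """Zero once the negative is farther than the positive by at least alpha."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(d_pos - d_neg + alpha, 0.0)


def same_person(enc_a, enc_b, threshold=0.5):
    """Verification: accept if the encodings are close enough (hypothetical threshold)."""
    return squared_distance(enc_a, enc_b) <= threshold


anchor, positive = [0.0, 0.0], [1.0, 0.0]
easy_negative = [2.0, 0.0]   # already far from the anchor
hard_negative = [1.0, 0.5]   # almost as close as the positive

print(triplet_loss(anchor, positive, easy_negative))  # 0.0
print(triplet_loss(anchor, positive, hard_negative))  # 0.75
```

The margin is what rules out the trivial solution of mapping every image to the same encoding: with all distances zero, the loss would still be α.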

Course content

  • Video class: C4W1L01 Computer Vision 05m
  • Exercise: Why are convolutional neural networks preferred over fully connected networks for large images?
  • Video class: C4W1L02 Edge Detection Examples 11m
  • Video class: C4W1L03 More Edge Detection 07m
  • Exercise: What does a negative output from a vertical edge detector typically indicate?
  • Video class: C4W1L04 Padding 09m
  • Exercise: In a “same” convolution with an odd-sized filter, what padding value P keeps the output size equal to the input size (stride 1)?
  • Video class: C4W1L05 Strided Convolutions 09m
  • Exercise: What is the output spatial size for an N×N input convolved with an F×F filter using padding P and stride S?
  • Video class: C4W1L06 Convolutions Over Volumes 10m
  • Exercise: When convolving an RGB image with multiple filters, what determines the number of channels (depth) in the output volume?
  • Video class: C4W1L07 One Layer of a Convolutional Net 16m
  • Exercise: How many parameters are in a convolutional layer with 10 filters of size 3×3×3 (including biases)?
  • Video class: C4W1L08 Simple Convolutional Network Example 08m
  • Exercise: After flattening a 7×7×40 activation volume, how many units are in the resulting vector fed to logistic regression or softmax?
  • Video class: C4W1L09 Pooling Layers 10m
  • Exercise: In max pooling, which statement best describes how output channels relate to input channels?
  • Video class: C4W1L10 CNN Example 11m
  • Video class: C4W1L11 Why Convolutions 09m
  • Video class: C4W2L01 Why look at case studies? 03m
  • Exercise: Why are case studies of classic CNN architectures useful when building a model for a new computer vision task?
  • Video class: C4W2L02 Classic Network 18m
  • Video class: C4W2L03 Resnets 07m
  • Exercise: In a residual block with a skip connection, how is the activation computed two layers later (a^{[l+2]})?
  • Video class: C4W2L04 Why ResNets Work 09m
  • Exercise: Why do residual (skip) connections help very deep networks train effectively?
  • Video class: C4W2L05 Network In Network 06m
  • Video class: C4W2L06 Inception Network Motivation 10m
  • Exercise: In an Inception module, why is a 1×1 convolution often placed before a 5×5 convolution?
  • Video class: C4W2L07 Inception Network 08m
  • Exercise: In an Inception module, why is a 1×1 convolution often applied after the pooling branch before concatenation?
  • Video class: C4W2L08 Using Open Source Implementation 04m
  • Exercise: Why is using an open-source implementation (e.g., from GitHub) often recommended when applying a published CNN architecture like ResNet?
  • Video class: C4W2L09 Transfer Learning 08m
  • Video class: C4W2L10 Data Augmentation 09m
  • Exercise: What is a key reason data augmentation often improves computer vision model performance?
  • Video class: C4W2L11 State of Computer Vision 12m
  • Exercise: Why do computer vision models for object detection often have more complex architectures than image classification models?
  • Video class: C4W3L01 Object Localization 11m
  • Exercise: In classification with localization, what extra outputs are added to a standard image classifier to localize the object?
  • Video class: C4W3L02 Landmark Detection 05m
  • Video class: C4W3L03 Object Detection 05m
  • Video class: C4W3L04 Convolutional Implementation Sliding Windows 11m
  • Video class: C4W3L06 Intersection Over Union 04m
  • Video class: C4W3L07 Nonmax Suppression 08m
  • Exercise: What is the main purpose of non-max suppression in object detection?
  • Video class: C4W3L08 Anchor Boxes 09m
  • Video class: C4W3L09 YOLO Algorithm 07m
  • Video class: C4W3L10 Region Proposals 06m
  • Exercise: What is the key idea behind R-CNN compared with sliding windows for object detection?
  • Video class: C4W4L01 What is face recognition 04m
  • Video class: C4W4L02 One Shot Learning 04m
  • Video class: C4W4L03 Siamese Network 04m
  • Exercise: In a Siamese network for face recognition, what should training encourage about the distance between encodings f(Xi) and f(Xj)?
  • Video class: C4W4L04 Triplet loss 15m
  • Exercise: In triplet loss, what is the role of the margin (alpha)?
  • Video class: C4W4L05 Face Verification 06m
  • Video class: C4W4L06 What is neural style transfer? 02m
  • Video class: C4W4L07 What are deep CNs learning? 08m
  • Exercise: When visualizing a CNN by finding image patches that maximally activate a hidden unit, what do early (layer 1) units typically respond to?
  • Video class: C4W4L08 Cost Function 04m
  • Exercise: In neural style transfer, what is the optimization step used to generate the final image?
  • Video class: C4W4L09 Content Cost Function 03m
  • Exercise: In neural style transfer, how is the content cost J_content(C, G) commonly defined using a chosen layer L?
  • Video class: C4W4L10 Style Cost Function 17m
  • Exercise: In neural style transfer, what does the style matrix (Gram matrix) at layer L measure?
  • Video class: C4W4L11 1D and 3D Generalizations 09m
  • Exercise: In a 1D convolution example, what is the output length when a 14-dimensional input is convolved with a 5-dimensional filter (valid convolution, stride 1)?
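
The 1D case in the last exercise follows the same sizing rule as the 2D case. The sketch below assumes a valid convolution with stride 1 and uses a helper name, `conv1d_valid`, invented for this illustration; the input values are arbitrary.

```python
def conv1d_valid(signal, kernel):
    """1D cross-correlation with no padding and stride 1."""
    n, f = len(signal), len(kernel)
    return [
        sum(signal[i + k] * kernel[k] for k in range(f))
        for i in range(n - f + 1)
    ]


signal = list(range(14))   # 14-dimensional input: 0, 1, ..., 13
kernel = [1, 1, 1, 1, 1]   # 5-dimensional filter

output = conv1d_valid(signal, kernel)
print(len(output))   # 10, i.e. 14 - 5 + 1
print(output[0])     # 10, i.e. 0 + 1 + 2 + 3 + 4
```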

This free course includes:

6 hours and 1 minute of online video

Digital certificate of course completion (Free)

Exercises to train your knowledge

100% free, from content to certificate
