Free online course: Modern Computer Vision and Deep Learning for Images - CNNs, RNNs, 3D Vision, Detection

Duration of the online course: 31 hours and 33 minutes

Build real-world computer vision skills with this free online course—train CNNs/RNNs, detect objects, and explore 3D vision with a practical deep learning focus.

In this free course, learn about

Modern CV pipeline: metric vs semantic info; how deep learning fits classic vision stages
Neurons/perceptrons/MLPs: XOR limits, perceptron updates, regression and loss functions
Training deep nets: non-convexity, gradient descent, backprop gradients, activations (sigmoid issues)
Optimization & regularization: Adagrad intuition, weight init (Xavier), dropout, preprocessing
CNN fundamentals: convolutions, channels/feature maps, properties, where parameters live (AlexNet FC)
CNN architectures: AlexNet, Inception; 1x1 conv for dimension reduction and efficiency
Sequence models: RNN temporal dependence, LSTM, encoder-decoder for tasks like image captioning
Low-level vision: filtering (spatial/frequency), edge detection (Canny NMS), Hough line detection
Feature detection/description: Harris corners, sub-pixel refinement, blobs (scale-normalized LoG), SIFT/SURF
Single-view geometry: affine transforms, camera intrinsics, homographies, panorama assumptions/conditions
Two-view stereo & epipolar geometry: F matrix mapping point-to-epipolar line; depth from disparity
Structure from Motion: essential matrix decomposition, scale ambiguity, factorization centering, bundle adjustment
Dense 3D & deep stereo: plane sweep depth selection; unsupervised training via photometric consistency
Mid-level vision: optical flow (Lucas–Kanade constraints), segmentation via k-means, GMM, mean shift, FCNs

Course Description

Computer vision has moved from handcrafted features to end-to-end learning systems that can recognize, segment, track, and reconstruct the world from pixels. This free online course guides you through that modern shift, connecting core image fundamentals with the deep learning techniques that power today’s products—from mobile cameras and autonomous systems to medical imaging and industrial inspection. You will build an intuitive understanding of what visual information means, how it is represented, and how models learn to extract it reliably under real-world variability.

You will start by grounding deep learning in the essential building blocks: neurons, multilayer networks, losses, gradients, and backpropagation. Along the way, you will learn why training can be challenging, how optimization choices influence convergence, and how regularization techniques help models generalize beyond the training set. Preprocessing and activation/initialization decisions are treated as practical levers that affect stability and performance, helping you reason about why a network works, not just how to run it.
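As a small illustration of these building blocks, the sketch below trains a single perceptron with the classic mistake-driven update. It is not taken from the course materials: the function name `train_perceptron`, the {-1, +1} label encoding, and the epoch limit are assumptions made for this example. It shows the update converging on AND, which is linearly separable, but never reaching a mistake-free pass on XOR.

```python
import numpy as np

def train_perceptron(X, y, epochs=20):
    """Perceptron rule with labels in {-1, +1}; the bias is folded into the weights."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a constant 1 for the bias
    w = np.zeros(Xb.shape[1])
    mistakes = 0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:  # misclassified, or exactly on the boundary
                w += yi * xi        # shift the boundary toward the correct side
                mistakes += 1
        if mistakes == 0:           # a full pass with no errors: converged
            break
    return w, mistakes

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([-1, -1, -1, 1])  # AND: linearly separable
y_xor = np.array([-1, 1, 1, -1])   # XOR: no single line separates the classes

w_and, and_mistakes = train_perceptron(X, y_and)
w_xor, xor_mistakes = train_perceptron(X, y_xor)
print("AND mistakes in final pass:", and_mistakes)
print("XOR mistakes in final pass:", xor_mistakes)
```

Because no single linear boundary classifies all four XOR points correctly, every pass over the XOR data contains at least one mistake, which is the motivation for multilayer networks.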

From there, you will connect classic and modern vision. You will revisit low-level methods such as filtering, edges, corners, blobs, and descriptors, and see how these ideas relate to learned representations. You will also study CNNs in depth, understand what convolutional channels capture, and explore influential architectural concepts that shaped performance breakthroughs. Sequence modeling is introduced through RNNs and LSTMs, including encoder–decoder ideas used in tasks that connect images and language.

The course expands to geometric vision and 3D perception, covering the intuition behind homographies, camera intrinsics, two-view stereo, epipolar geometry, and structure from motion, culminating in concepts like bundle adjustment and dense reconstruction. Finally, you will bridge to mid-level and high-level tasks such as optical flow, segmentation, and object detection, linking deep networks to the practical goals of locating, classifying, and delineating objects in complex scenes.
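For the two-view stereo topic, a minimal sketch of depth from disparity may help. It assumes a rectified (parallel) camera pair and the standard relation z = f · B / d, with focal length f in pixels, baseline B in meters, and disparity d in pixels. The function name `depth_from_disparity` and the numeric values are hypothetical, chosen only for illustration.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth z = f * B / d for a rectified stereo pair; zero disparity maps to infinity."""
    disparity = np.asarray(disparity_px, dtype=float)
    depth = np.full(disparity.shape, np.inf)  # points at infinity have zero disparity
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Hypothetical camera: 800 px focal length, 0.25 m baseline
depths = depth_from_disparity([50.0, 25.0, 0.0], focal_px=800.0, baseline_m=0.25)
print(depths)
```

Halving the disparity doubles the estimated depth, so distant points are more sensitive to small disparity errors than nearby ones.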

To reinforce learning, you will encounter frequent conceptual checks that sharpen your reasoning about model capacity, training dynamics, and the assumptions behind vision pipelines. By the end, you will be able to choose appropriate approaches for common vision problems, explain key trade-offs, and follow the technical language used in modern computer vision and machine learning work.

Course content

Video class: #1 Course Introduction | Part 1 | Modern Computer Vision 18m
Exercise: Which option best describes the difference between metric and semantic information extracted from images?
Video class: #2 Course Introduction | Part 2 | Modern Computer Vision 28m
Video class: #3 Introduction to Deep Learning | Part 1 | Modern Computer Vision 15m
Exercise: In a modern vision pipeline, which statement best describes how early CNN layers relate to traditional low-level vision?
Video class: #4 Introduction to Deep Learning | Part 2 | Modern Computer Vision 19m
Video class: #5 Introduction to Deep Learning | Part 3 | Modern Computer Vision 13m
Exercise: What key factor enabled the major performance leap in image classification around 2012, alongside the introduction of AlexNet?

Video class: #6 Introduction to Neuron | Part 1 | Modern Computer Vision 11m
Video class: #7 Introduction to Neuron | Part 2 | Modern Computer Vision 26m
Exercise: Why can a single perceptron (single linear decision boundary) not model the XOR function?
Video class: #8 Introduction to Neuron | Part 3 | Modern Computer Vision 15m
Video class: #9 Multilayer Perceptron | Modern Computer Vision 24m
Exercise: In the perceptron update rule, what change is applied when a positive example is misclassified (i.e., wᵀx < 0)?
Video class: #10 Regression 16m
Video class: #11 Training a Neural Network | Modern Computer Vision 12m
Exercise: Why can the loss surface of a deep network be non-convex even if all activations are linear?
Video class: #12 Gradient Descent | Modern Computer Vision 28m
Video class: #13 Activation Function | Modern Computer Vision 26m
Exercise: Why is the standard sigmoid activation often avoided in early hidden layers of deep networks?
Video class: #14 Backpropagation in MLP | Part 1 | Modern Computer Vision 27m
Video class: #15 Backpropagation in MLP | Part 2 | Modern Computer Vision 22m
Exercise: In backpropagation for a network with loss L = (1/2)∑(ŷᵢ − yᵢ)², what is the gradient of the loss with respect to a bias term Bᵢᴸ (bias feeding into Zᵢᴸ⁺¹)?
Video class: #16 Optimization 26m
Video class: #17 Optimization 27m
Exercise: In adaptive gradient methods like Adagrad, what is the main purpose of dividing the learning rate by a term involving accumulated past squared gradients (e.g., √r_t)?
Video class: #18 Regularization | Modern Computer Vision 25m
Video class: #19 Dropout | Modern Computer Vision 17m
Video class: #20 Pre Processing | Modern Computer Vision 09m

Video class: #21 Convolutional Neural Networks | Part 1 | Modern Computer Vision 14m
Exercise: In Xavier initialization, from which range are weights typically drawn (uniformly) to help keep activation variance stable across layers?
Video class: #22 Convolutional Neural Networks | Part 2 | Modern Computer Vision 17m
Video class: #23 Convolutional Neural Networks | Part 3 | Modern Computer Vision 15m
Exercise: In a CNN used for digit classification, what does each output channel after a convolution typically represent?
Video class: #24 CNN Properties | Modern Computer Vision 30m
Video class: #25 Alexnet | Modern Computer Vision 14m
Exercise: In AlexNet, where do the majority of learnable parameters (unknown weights) reside?
Video class: #26 CNN Architectures | Part 1 | Modern Computer Vision 15m
Video class: #27 CNN Architectures | Part 2 | Modern Computer Vision 22m
Exercise: In an Inception module, what is the main purpose of using a 1×1 convolution before larger filters like 3×3 and 5×5?
Video class: #28 CNN Architectures | Part 3 | Modern Computer Vision 13m

Video class: #29 Introduction to RNN | Part 1 | Modern Computer Vision 27m
Exercise: Which key property of an RNN enables it to model temporal dependence in sequences?
Video class: #30 Introduction to RNN | Part 2 | Modern Computer Vision 19m
Video class: #31 Encoder | Decoder | Models in RNN | Modern Computer Vision 27m
Exercise: In an encoder–decoder setup for image captioning, which pairing of models is most appropriate for the encoder and decoder?
Video class: #32 LSTM | Modern Computer Vision 21m

Video class: #33 Low Level Vision | Part 1 | Modern Computer Vision 14m
Exercise: Why are local features like corners considered useful in low-level vision?
Video class: #34 Low Level Vision | Part 2 | Modern Computer Vision 22m
Video class: #35 Low Level Vision | Part 3 | Modern Computer Vision 09m
Video class: #36 Spatial Domain Filtering | Modern Computer Vision 26m
Video class: #37 Frequency Domain Filtering | Modern Computer Vision 23m
Exercise: In frequency-domain filtering, what operation is typically performed to apply a filter mask to an image?
Video class: #38 Edge Detection | Part 1 | Modern Computer Vision 23m
Video class: #39 Edge Detection | Part 2 | Modern Computer Vision 26m
Exercise: In the Canny edge detector, what is the main purpose of non-maxima suppression (NMS)?
Video class: #40 DeepNets for Edge Detection | Modern Computer Vision 21m
Video class: #41 Line Detection | Modern Computer Vision 27m
Exercise: Why does the Hough transform for line detection prefer the normal form over the slope-intercept form?
Video class: #42 Feature Detectors | Modern Computer Vision 26m
Video class: #43 Harris Corner Detector | Part 1 | Modern Computer Vision 23m
Exercise: In the Harris corner detector intuition, how does the appearance of a small patch change when the patch is centered on a corner and shifted slightly?
Video class: #44 Harris Corner Detector | Part 2 | Modern Computer Vision 19m
Video class: #45 Harris Corner Detector | Part 3 | Modern Computer Vision 21m
Exercise: In sub-pixel corner refinement, what condition is used at the true corner location (the maximum of the corner response) to solve for (Δx, Δy) using a Taylor expansion?
Video class: #46 Blob Detection | Part 1 | Modern Computer Vision 17m
Video class: #47 Blob Detection | Part 2 | Modern Computer Vision 26m
Exercise: Why is the Laplacian of Gaussian (LoG) often multiplied by σ² to form a scale-normalized LoG?
Video class: #48 Blob Detection | Part 3 | Modern Computer Vision 08m
Video class: #49 SIFT | Part 1 | Modern Computer Vision 22m
Video class: #50 SIFT | Part 2 | Modern Computer Vision 23m
Video class: #51 Feature Descriptors | Part 1 | Modern Computer Vision 20m
Exercise: How does the SIFT descriptor become a 128-dimensional vector?
Video class: #52 Feature Descriptors | Part 2 | Modern Computer Vision 25m
Video class: #53 SURF | Part 1 | Modern Computer Vision 22m
Exercise: In SURF-style keypoint detection, which quantity is used as the strength measure for non-maxima suppression across a 3×3×3 neighborhood?
Video class: #54 SURF | Part 2 | Modern Computer Vision 16m

Video class: #55 Single View Geometry | Part 1 | Modern Computer Vision 21m
Exercise: Why does panorama stitching typically assume a planar (or sufficiently far) scene?
Video class: #56 Single View Geometry | Part 2 | Modern Computer Vision 30m
Video class: #57 2D Geometric Transformations | Part 1 | Modern Computer Vision 23m
Exercise: Which statement best describes a general affine transformation in 2D?
Video class: #58 2D Geometric Transformations | Part 2 | Modern Computer Vision 29m

Video class: #59 Camera Intrinsics 13m
Exercise: Under what condition can a single homography be used to stitch images of a 3D scene into a panorama?
Video class: #60 Camera Intrinsics 36m
Video class: #61 Two View Stereo | Part 1 | Modern Computer Vision 13m
Exercise: In a stereo vision pipeline for estimating depth, which pair of steps is essential to recover a 3D point from its two image observations?
Video class: #62 Two View Stereo | Part 2 | Modern Computer Vision 20m
Video class: #63 Two View Stereo | Part 3 | Modern Computer Vision 12m
Exercise: In epipolar geometry, what does the fundamental matrix mainly provide for a point in the left image?
Video class: #64 Algebraic Representation of Epipolar Geometry | Part 1 | Modern Computer Vision 25m
Video class: #65 Algebraic Representation of Epipolar Geometry | Part 2 | Modern Computer Vision 26m
Exercise: In epipolar geometry, what does the fundamental matrix F do to a point \(\tilde{x}\) in the left image?
Video class: #66 Fundamental Matrix Computation | Part 1 | Modern Computer Vision 29m
Video class: #67 Fundamental Matrix Computation | Part 2 | Modern Computer Vision 18m
Exercise: In a rectified (parallel) stereo camera setup, how is depth z related to disparity delta?

Video class: #68 Structure from Motion | Part 1 | Modern Computer Vision 09m
Video class: #69 Structure from Motion | Part 2 | Modern Computer Vision 30m
Exercise: When decomposing an essential matrix, how is the correct (R, T) solution selected from the four candidates?
Video class: #70 Structure from Motion | Part 3 | Modern Computer Vision 14m
Video class: #71 Batch Processing in SFM | Modern Computer Vision 32m
Exercise: In structure from motion, what is the inherent ambiguity in reconstructed 3D structure for calibrated cameras (known intrinsics)?
Video class: #72 Multi View SFM | Modern Computer Vision 20m
Video class: #73 Factorization Methods in SFM | Modern Computer Vision 15m
Exercise: In the factorization method for structure from motion, what is the main purpose of forming the mean-centered measurement matrix \(\tilde{W} = W\left(I - \frac{1}{Q}\mathbf{1}\mathbf{1}^T\right)\)?
Video class: #74 Bundle Adjustment | Modern Computer Vision 21m
Video class: #75 Dense 3D Reconstruction | Modern Computer Vision 13m
Exercise: In plane sweep stereo for dense 3D reconstruction, how is the depth for a pixel typically selected?
Video class: #76 Some Results in Stereo 14m
Video class: #77 Deepnets for Stereo 21m
Exercise: In unsupervised stereo depth estimation, what key idea can be used to train a network without ground-truth disparity?
Video class: #78 Deepnets for Stereo 16m

Video class: #79 Mid Level Vision | Part 1 | Modern Computer Vision 25m
Video class: #80 Mid Level Vision | Part 2 | Modern Computer Vision 16m
Video class: #81 Lucas Kanade Method for OF | Modern Computer Vision 09m
Exercise: In the Lucas–Kanade optical flow method, what extra constraint is added beyond brightness constancy to make the problem better-posed?
Video class: #82 Handling Large Motion in Optical Flow | Modern Computer Vision 07m
Video class: #83 Image Segmentation | Modern Computer Vision 23m
Exercise: In k-means clustering used for image segmentation, how is a data point assigned to a cluster?
Video class: #84 GMM for Clustering | Modern Computer Vision 19m
Video class: #85 Deepnets for Segmentation 29m
Exercise: In mean shift segmentation, what does the algorithm iteratively do in the feature space?
Video class: #86 Deepnets for Segmentation 17m
Video class: #87 Deepnets for Segmentation 17m
Exercise: In upsampling for segmentation, what is the key idea behind a transposed convolution (often used in deconvolution networks)?

Video class: #88 Deepnets for Object Detection | Part 1 | Modern Computer Vision 36m
Video class: #89 Deepnets for Object Detection | Part 2 | Modern Computer Vision 28m
Exercise: In Faster R-CNN, what is the primary role of the Region Proposal Network (RPN)?
Video class: #90 Vision 22m

This free course includes:

31 hours and 33 minutes of online video course

Digital certificate of course completion (Free)

Exercises to train your knowledge

100% free, from content to certificate
