Free online course
Deep Neural Network Optimization: Hyperparameter Tuning, Regularization, and Training Tricks
Duration of the online course: 4 hours and 44 minutes
Build faster, more accurate models with this free deep learning course on tuning, regularization, and training tricks—learn to stabilize training and boost results.
In this free course, learn about:
How to split data into train/dev/test and size dev/test in big data settings
Diagnose bias vs variance from train/dev errors; apply the basic ML improvement recipe
Regularization: L2 updates, why it reduces overfitting, dropout, early stopping methods
Input normalization and using train-set stats to normalize dev/test consistently
Vanishing/exploding gradients; ReLU-friendly weight initialization (He) to mitigate them
Mini-batch GD mechanics: epochs, noisy cost curves, and practical batch training behavior
Exponentially weighted averages, bias correction, and momentum update equations
RMSProp and Adam: adaptive learning rates using EWAs of gradients/squared gradients
Learning rate decay to improve convergence during mini-batch optimization
Hyperparameter tuning: random search vs grid, log-scale sampling, panda vs caviar strategy
Batch normalization: gamma/beta roles, where applied, why it speeds training, test-time stats
Softmax regression: outputs, cross-entropy gradient for final layer, optimization challenges
TensorFlow basics: placeholders and feed_dict usage in a training loop
Course Description
Getting a deep neural network to train reliably is often harder than designing the architecture. Small choices like how you split data, set learning rates, initialize weights, or apply regularization can be the difference between a model that generalizes well and one that overfits or fails to converge. This free online course helps you develop the practical intuition and methods needed to optimize neural networks in real-world machine learning work.
You will learn how to diagnose training issues through the lens of bias and variance, and how to set up train, dev, and test sets so your evaluation truly matches your deployment goals. From there, the course builds the skills to reduce overfitting with techniques such as L2 regularization, dropout, and early stopping, while keeping model performance strong and training time reasonable. You will also explore why input normalization matters, and how thoughtful preprocessing can make optimization smoother and more predictable.
As networks get deeper, training instability can show up as vanishing or exploding gradients. The course explains how these problems arise and how to mitigate them using principled weight initialization strategies and gradient checking to verify correctness. You will then move into modern optimization approaches that make large-scale training practical, including mini-batch gradient descent, momentum, RMSProp, Adam, and learning rate decay, with an emphasis on understanding what each method is doing and when it is most useful.
Hyperparameter tuning is treated as a disciplined process rather than guesswork. You will practice choosing effective search strategies, sampling on appropriate scales, and making tradeoffs based on your project constraints. Finally, you will see how batch normalization and softmax-based classifiers fit into efficient training pipelines, and how frameworks like TensorFlow structure training loops in practice. By the end, you will be able to troubleshoot training behavior, select optimization and regularization techniques with confidence, and deliver models that learn faster and generalize better.
Course content
Video class: Train/Dev/Test Sets (C2W1L01) 12m
Exercise: In the Big Data era, why can the dev and test sets be much smaller percentages of the total dataset (e.g., ~1% each)?
Video class: Bias/Variance (C2W1L02) 08m
Exercise: You train a cat classifier and get 1% training error and 11% dev error (assume near-zero Bayes error and same distribution). What does this most likely indicate?
Video class: Basic Recipe for Machine Learning (C2W1L03) 06m
Exercise: Using the basic recipe for improving a neural network, what is the most appropriate next step if the model has high variance?
Video class: Regularization (C2W1L04) 09m
Exercise: In L2 regularization for a neural network, how does the gradient descent update for a weight matrix change?
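A minimal pure-Python sketch of the update this exercise refers to, shown for a single scalar weight (the values of alpha, lam, and m are illustrative, not from the course):

```python
# Illustrative values for the L2-regularized gradient descent update.
alpha = 0.1   # learning rate
lam = 0.7     # L2 regularization strength (lambda)
m = 100       # number of training examples

def l2_update(w, dw):
    # W := W - alpha * (dW + (lambda / m) * W)
    # Equivalently: w first shrinks by a factor (1 - alpha * lam / m)
    # ("weight decay"), then takes the usual gradient step.
    return w - alpha * (dw + (lam / m) * w)

w_new = l2_update(2.0, 0.5)
```

The only change from plain gradient descent is the extra (lambda/m)·W term added to the gradient, which is what pulls weights toward zero.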
Video class: Why Regularization Reduces Overfitting (C2W1L05) 07m
Exercise: Why can increasing L2 regularization (large λ) reduce overfitting in a deep neural network?
Video class: Dropout Regularization (C2W1L06) 09m
Exercise: In inverted dropout, why are activations divided by the keep_prob during training?
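A small sketch of inverted dropout for one layer's activations, using Python lists rather than the course's vectorized notation (function name and seed are illustrative):

```python
import random

def inverted_dropout(activations, keep_prob, seed=0):
    # Keep each unit with probability keep_prob, zero it otherwise,
    # then divide the survivors by keep_prob so the expected value of
    # the layer's output is unchanged between training and test time.
    rng = random.Random(seed)
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]
```

Dividing by keep_prob during training is what lets you skip any rescaling at test time.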
Video class: Understanding Dropout (C2W1L07) 07m
Exercise: Why does dropout tend to reduce overfitting in a neural network?
Video class: Other Regularization Methods (C2W1L08) 08m
Exercise: What is the main idea behind early stopping as a way to reduce overfitting?
Video class: Normalizing Inputs (C2W1L09) 05m
Exercise: When normalizing inputs for training, what should be used to normalize the test set?
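A sketch of the answer in code, for a single scalar feature: the mean and variance are fit on the training set only and then reused, unchanged, on dev and test data (function names are illustrative):

```python
def fit_normalizer(train_xs):
    # Compute mean and variance on the TRAINING set only.
    mu = sum(train_xs) / len(train_xs)
    var = sum((x - mu) ** 2 for x in train_xs) / len(train_xs)
    return mu, var

def normalize(xs, mu, var, eps=1e-8):
    # Apply the same mu/var to train, dev, and test data alike, so all
    # splits go through an identical transformation.
    return [(x - mu) / (var + eps) ** 0.5 for x in xs]

mu, var = fit_normalizer([2.0, 4.0, 6.0])
test_scaled = normalize([5.0], mu, var)   # train-set stats, not test stats
```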
Video class: Vanishing/Exploding Gradients (C2W1L10) 06m
Exercise: In a very deep network with linear activations and zero biases, what happens if each weight matrix is slightly larger than the identity (e.g., 1.5·I)?
Video class: Weight Initialization in a Deep Network (C2W1L11) 06m
Exercise: Which weight initialization scaling is commonly recommended when using ReLU activations to help reduce vanishing/exploding gradients?
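A pure-Python sketch of He initialization, the scaling commonly paired with ReLU: weights are drawn from a Gaussian with standard deviation sqrt(2 / n_in), where n_in is the layer's fan-in (the layer sizes below are made up):

```python
import random

def he_init(n_in, n_out, seed=0):
    # He initialization: variance 2 / n_in keeps activation magnitudes
    # roughly stable through ReLU layers, mitigating vanishing and
    # exploding gradients in deep networks.
    rng = random.Random(seed)
    std = (2.0 / n_in) ** 0.5
    return [[rng.gauss(0.0, std) for _ in range(n_out)] for _ in range(n_in)]

W = he_init(512, 256)   # illustrative layer sizes
```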
Video class: Numerical Approximations of Gradients (C2W1L12) 06m
Exercise: Which numerical formula is preferred for gradient checking because it gives a more accurate gradient approximation?
Video class: Gradient Checking (C2W1L13) 06m
Exercise: In gradient checking, how is the numerical gradient for a single parameter computed?
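The two exercises above can be answered with a few lines of code. This sketch uses the two-sided (centered) difference, whose error shrinks as O(eps²) versus O(eps) for the one-sided formula, which is why it is preferred for gradient checking:

```python
def numerical_grad(f, theta, eps=1e-7):
    # Centered difference for a single scalar parameter:
    # (f(theta + eps) - f(theta - eps)) / (2 * eps)
    return (f(theta + eps) - f(theta - eps)) / (2.0 * eps)

g = numerical_grad(lambda t: t ** 2, 3.0)   # analytic gradient is 2t = 6
```

In a full check, this estimate is compared against the backprop gradient for every parameter, and a large relative difference flags a bug.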
Video class: Gradient Checking Implementation Notes (C2W1L14) 05m
Exercise: When using gradient checking, what is a recommended practice regarding dropout?
Video class: Mini-Batch Gradient Descent (C2W2L01) 11m
Exercise: In mini-batch gradient descent, what does one epoch correspond to when the training set is split into 5000 mini-batches?
Video class: Understanding Mini-Batch Gradient Descent (C2W2L02) 11m
Exercise: Why can the cost curve look noisy during mini-batch gradient descent?
Video class: Exponentially Weighted Averages (C2W2L03) 05m
Exercise: In an exponentially weighted average, what is the approximate number of days being averaged when β = 0.9?
Video class: Understanding Exponentially Weighted Averages (C2W2L04) 09m
Exercise: In exponentially weighted averages used in optimization (e.g., momentum), what is the rule-of-thumb for the effective number of recent steps being averaged when using parameter β?
Video class: Bias Correction of Exponentially Weighted Averages (C2W2L05) 04m
Exercise: What is the purpose of bias correction when computing exponentially weighted averages?
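The three exercises above fit in one short sketch: an exponentially weighted average started at v₀ = 0, with bias correction applied at each step (the input series is illustrative):

```python
def ewa_with_correction(values, beta=0.9):
    # v_t = beta * v_{t-1} + (1 - beta) * x_t, started at v_0 = 0.
    # Early values of v are biased toward zero; dividing by
    # (1 - beta**t) removes that startup bias. The effective averaging
    # window is roughly 1 / (1 - beta) steps: about 10 for beta = 0.9.
    v, corrected = 0.0, []
    for t, x in enumerate(values, start=1):
        v = beta * v + (1 - beta) * x
        corrected.append(v / (1 - beta ** t))
    return corrected

out = ewa_with_correction([5.0, 5.0, 5.0])
```

On a constant series, the corrected average recovers the constant from the very first step, while the uncorrected v would start near zero.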
Video class: Gradient Descent With Momentum (C2W2L06) 09m
Exercise: In gradient descent with momentum, which update correctly describes how the exponentially weighted average of gradients is computed for the weights?
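A scalar sketch of the momentum update the exercise asks about (hyperparameter values are the common defaults, not prescribed by the course):

```python
def momentum_step(w, dw, v, alpha=0.01, beta=0.9):
    # v_dW = beta * v_dW + (1 - beta) * dW
    # W    = W - alpha * v_dW
    # The update follows a smoothed average of recent gradients, which
    # damps oscillations and speeds progress along consistent directions.
    v = beta * v + (1 - beta) * dw
    return w - alpha * v, v

w, v = 1.0, 0.0
w, v = momentum_step(w, dw=0.5, v=v)
```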
Video class: RMSProp (C2W2L07) 07m
Exercise: In RMSProp, why are parameter updates divided by the square root of an exponentially weighted average of squared gradients?
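A scalar sketch of RMSProp illustrating the exercise's point (default hyperparameters are illustrative):

```python
def rmsprop_step(w, dw, s, alpha=0.01, beta2=0.999, eps=1e-8):
    # s_dW tracks an exponentially weighted average of dW**2.
    # Dividing the update by sqrt(s_dW) shrinks steps along directions
    # with large, oscillating gradients and enlarges them along
    # directions with consistently small gradients.
    s = beta2 * s + (1 - beta2) * dw * dw
    return w - alpha * dw / (s ** 0.5 + eps), s

w2, s2 = rmsprop_step(1.0, 0.5, 0.0)
```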
Video class: Adam Optimization Algorithm (C2W2L08) 07m
Exercise: What best describes how the Adam optimization algorithm works?
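Adam combines a momentum-style average of gradients with an RMSProp-style average of squared gradients, each bias-corrected. A scalar sketch (defaults follow common practice):

```python
def adam_step(w, dw, v, s, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    v = beta1 * v + (1 - beta1) * dw           # momentum term
    s = beta2 * s + (1 - beta2) * dw * dw      # RMSProp term
    v_hat = v / (1 - beta1 ** t)               # bias correction
    s_hat = s / (1 - beta2 ** t)
    return w - alpha * v_hat / (s_hat ** 0.5 + eps), v, s

w, v, s = 1.0, 0.0, 0.0
w, v, s = adam_step(w, dw=0.5, v=v, s=s, t=1)
```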
Video class: Learning Rate Decay (C2W2L09) 06m
Exercise: What is the main purpose of using learning rate decay during mini-batch gradient descent?
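One common decay schedule as a sketch (the schedule form matches the course's 1/(1 + decay·epoch) rule; the numbers are illustrative):

```python
def decayed_lr(alpha0, decay_rate, epoch):
    # alpha = alpha0 / (1 + decay_rate * epoch): large steps early for
    # fast progress, smaller steps later so mini-batch noise doesn't
    # keep the parameters bouncing around the minimum.
    return alpha0 / (1.0 + decay_rate * epoch)

lrs = [decayed_lr(0.2, 1.0, e) for e in range(4)]
```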
Video class: Tuning Process (C2W3L01) 07m
Exercise: Why is random sampling often preferred over grid search for hyperparameter tuning?
Video class: Using an Appropriate Scale (C2W3L02) 08m
Exercise: When tuning the learning rate α over a wide range (e.g., 0.0001 to 1), what is the recommended way to sample values?
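A sketch of log-scale sampling for the learning rate (the range and function name are illustrative):

```python
import math
import random

def sample_learning_rate(rng, low=1e-4, high=1.0):
    # Sample the EXPONENT uniformly, not the value itself, so each
    # decade (1e-4 to 1e-3, 1e-3 to 1e-2, ...) receives equal
    # probability mass. Uniform sampling of the raw value would spend
    # ~90% of trials in the top decade alone.
    r = rng.uniform(math.log10(low), math.log10(high))
    return 10.0 ** r

rng = random.Random(0)
samples = [sample_learning_rate(rng) for _ in range(5)]
```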
Video class: Hyperparameter Tuning in Practice (C2W3L03) 06m
Exercise: How should you decide between the “panda” approach and the “caviar” approach for hyperparameter search?
Video class: Normalizing Activations in a Network (C2W3L04) 08m
Exercise: In batch normalization, what is the main role of the learnable parameters γ (gamma) and β (beta) after normalizing z?
Video class: Fitting Batch Norm Into Neural Networks (C2W3L05) 12m
Exercise: In a deep network using batch normalization, where is batch normalization applied within a layer’s computations?
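The two exercises above can be sketched together: batch norm is applied to the pre-activations z (before the nonlinearity), and the learnable γ/β restore whatever scale and shift the layer actually needs after normalization. A pure-Python sketch for one unit across a mini-batch:

```python
def batchnorm(zs, gamma, beta, eps=1e-8):
    # Normalize z across the mini-batch to zero mean, unit variance,
    # then apply z_tilde = gamma * z_norm + beta. With gamma = sqrt(var)
    # and beta = mu, the layer could even undo the normalization, so no
    # expressive power is lost.
    mu = sum(zs) / len(zs)
    var = sum((z - mu) ** 2 for z in zs) / len(zs)
    return [gamma * (z - mu) / (var + eps) ** 0.5 + beta for z in zs]

out = batchnorm([1.0, 2.0, 3.0], gamma=1.0, beta=0.0)
```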
Video class: Why Does Batch Norm Work? (C2W3L06) 11m
Exercise: Which best describes a key reason batch normalization helps speed up training in deep networks?
Video class: Batch Norm At Test Time (C2W3L07) 05m
Exercise: In batch normalization, what is commonly used at test time to normalize a single example when mini-batch statistics aren’t available?
Video class: Softmax Regression (C2W3L08) 11m
Exercise: In a neural network using a softmax output layer with C classes, what does the softmax function produce?
Video class: Training Softmax Classifier (C2W3L09) 10m
Exercise: In softmax classification with cross-entropy loss, what is the gradient for the last layer pre-activation (dZ^L)?
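A sketch covering both softmax exercises: the output is a probability distribution over the C classes (non-negative, summing to 1), and with cross-entropy loss the last-layer gradient simplifies to dZ^[L] = A^[L] − Y. The max-subtraction trick is a standard numerical-stability detail, not specific to the course:

```python
import math

def softmax(zs):
    # Exponentiate and normalize; subtracting max(zs) first avoids
    # overflow without changing the result.
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def dz_last_layer(a, y):
    # Softmax + cross-entropy: dZ^[L] = A^[L] - Y, where y is one-hot.
    return [ai - yi for ai, yi in zip(a, y)]

a = softmax([2.0, 1.0, 0.1])
dz = dz_last_layer(a, [1.0, 0.0, 0.0])
```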
Video class: The Problem of Local Optima (C2W3L10) 05m
Exercise: In high-dimensional neural network optimization, which issue is typically more problematic than getting stuck in bad local optima?
Video class: TensorFlow (C2W3L11) 16m
Exercise: In a typical TensorFlow training loop, what is the main purpose of using a placeholder with a feed_dict?