Free Ebook cover Machine Learning and Deep Learning with Python

Machine Learning and Deep Learning with Python

4.75

(4)

112 pages

Convolutional Neural Networks (CNNs)

Capítulo 83

Estimated reading time: 5 minutes

+ Exercise
Audio Icon

Listen in audio

0:00 / 0:00

22. Convolutional Neural Networks (CNNs)

Convolutional Neural Networks, better known by the English acronym CNN (Convolutional Neural Networks), are a special class of deep neural networks that have proven to be extremely effective in the field of image processing and computer vision. Inspired by the organization of the human visual cortex, CNNs are capable of capturing the spatial hierarchy of features in visual data, making them suitable for tasks such as image recognition, object detection, and semantic segmentation.

CNN architecture

The architecture of a typical CNN is composed of several layers that can be grouped into three main types: convolutional layers, pooling layers and fully connected layers.

Convolutional Layers

Convolutional layers are the core of CNNs. They apply a set of filters (also known as kernels) to the input to produce feature maps. Each filter is designed to detect a specific type of feature, such as edges, textures, or more complex patterns. Filters are applied through a mathematical operation called convolution, which involves elementary multiplication and the sum of the image's pixel values ​​with the filter's values, sliding the filter over the entire area of ​​the image.

Pooling Layers

After convolution, a pooling (or subsampling) layer usually follows, which reduces the spatial dimensionality of the feature maps. This helps make the network more efficient and less prone to overfitting, as well as making detected features more robust to position variations. The most common pooling is max pooling, which selects the maximum value of a group of pixels within a sliding window.

Fully Connected Layers

After several convolutional and pooling layers, the network usually includes one or more fully connected layers. These layers are similar to the layers of a traditional neural network, where each neuron is connected to all the neurons in the previous layer. The purpose of these layers is to combine the features learned by the previous layers to perform the final task, such as image classification.

Continue in our app.

You can listen to the audiobook with the screen off, receive a free certificate for this course, and also have access to 5,000 other free online courses.

Or continue reading below...
Download App

Download the app

Learning and Training a CNN

Training a CNN involves adjusting the weights of the filters and neurons in the fully connected layers. This is done through an optimization algorithm such as stochastic gradient descent (SGD) along with a suitable loss function such as cross-entropy for classification tasks. During training, the network learns to extract features that are increasingly abstract and relevant to the task at hand, as information flows from the input layers to the deeper layers.

Innovations and Improvements in CNNs

Over the years, many innovations have been proposed to improve the performance of CNNs. Some of the most notable include:

  • Activation Functions: Functions like ReLU (Rectified Linear Unit) have been preferred over sigmoids and tanh for speeding up training and mitigating the problem of gradient fading.
  • Weight Initialization: Techniques such as He and Glorot initialization help start training with weights on an appropriate scale, promoting faster convergence.
  • Batch Normalization: This technique normalizes the inputs to each layer in order to stabilize learning and allow for higher learning rates.
  • Dropout: It consists of randomly deactivating neurons during training to avoid overfitting and force the network to learn more robust representations.
  • Advanced Architectures: Models such as AlexNet, VGG, ResNet, Inception and DenseNet have brought significant improvements in terms of depth, efficiency and generalization ability.

Applications of CNNs

CNNs have a wide range of applications, including but not limited to:

  • Image recognition and classification
  • Object detection and tracking
  • Semantic and instance segmentation
  • Video analysis and action recognition
  • Medical imaging diagnosis
  • Autonomous vehicles
  • Natural language processing (when adapted as 1D CNNs)

Challenges and Future Considerations

Despite their success, CNNs still face challenges such as the need for large amounts of labeled data for training, model interpretability, and computational efficiency. Research continues to be conducted to address these challenges, including semi-supervised learning methods, transfer learning, explainable neural networks, and hardware optimization for time inference.the real.

In summary, CNNs represent one of the greatest achievements in the field of machine learning and continue to drive advances in a variety of applied domains. With the continued evolution of technology and ongoing research, we can expect CNNs to play an even more significant role in solving complex data perception and analysis problems.

Now answer the exercise about the content:

Which of the following statements about Convolutional Neural Networks (CNNs) is true?

You are right! Congratulations, now go to the next page

You missed! Try again.

The correct statement is CNNs use a mathematical operation called convolution in their convolutional layers to detect features such as edges and textures. Convolutional layers, which are essential to the architecture of CNNs, apply filters to detect various features in visual data. This process involves elementary multiplication and the sum of pixel values, identifying features like edges and textures effectively for image processing tasks.

Next chapter

Transfer Learning and Fine-tuning

Arrow Right Icon
Download the app to earn free Certification and listen to the courses in the background, even with the screen off.