How to train a neural network - Introduction to CNN (PyTorch basics)



Photo by Paul Skorupskas on Unsplash [Image [0]]


How to train your neural network
This blog post will take you through the different types of CNN operations in PyTorch and show how to use torch.nn to implement 1D and 2D convolutions.


What is CNN?

A Convolutional Neural Network (CNN) is a kind of neural network used mainly in image-processing applications. CNNs are also applied to sequential data such as audio, time series and NLP. Convolution is one of the main building blocks of a CNN. The term convolution refers to the mathematical combination of two functions to produce a third function; it merges two sets of information.
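To make this "combination of two functions" concrete, here is a minimal sketch (not from the original article; the numbers are made up) that merges two short sequences with a discrete convolution in NumPy:

import numpy as np

signal = np.array([1, 2, 3, 4, 5])     # first set of information
kernel = np.array([0.25, 0.5, 0.25])   # second set of information: a small smoothing filter

# "valid" keeps only positions where the kernel fully overlaps the signal
merged = np.convolve(signal, kernel, mode="valid")
print(merged)  # [2. 3. 4.] - each output value blends three neighbouring inputs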

We will not discuss much theory here; there are many great materials available online.

Types of CNN operations

CNNs are mainly used for applications around image, audio, video, text and time-series modeling. There are 3 types of convolution operations.

1D convolution with 1D input

A filter with a one-dimensional input slides along a single dimension to produce an output. The images below are from this Stackoverflow answer.


1D Convolution for 1D Input [Image [1]]

1D convolution with 2D input


1D Convolution for 2D Input [Image [2]]

2D Convolution for 2D Input [Image [3]]

See the Stackoverflow answer linked above for more information about the different types of CNN operations.

A few key terms

I will explain these terms using pictures of 2D convolutions with 2D inputs, because I could not find relevant visualizations for 1D convolutions.

Convolution operation

To calculate the output size after a convolution operation, we can use the following formula.


Convolution Output Formula [Image [4]]
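The image above shows the standard convolution output-size formula: output = (W - K + 2P) / S + 1, where W is the input size, K the kernel size, P the padding and S the stride. A small helper (my own sketch, not part of the original code) that evaluates it:

def conv_output_size(w, k, p=0, s=1):
    # W = input size, K = kernel size, P = padding, S = stride (the division is floored)
    return (w - k + 2 * p) // s + 1

print(conv_output_size(w=10, k=3))  # 8, matching the Conv1d examples further below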

The kernel/filter slides over the input signal as shown in the figure below. You can see the filter (green square) sliding over our input (blue square), and the sum of the element-wise products goes into the feature map (red square).


Convolution Operation [Image [5]]
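To trace one slide of the filter by hand, here is a small sketch (my own toy numbers, not from the original post) using torch.nn.functional.conv2d; the comment works out the first window:

import torch
import torch.nn.functional as F

inp = torch.tensor([[1., 2., 3.],
                    [4., 5., 6.],
                    [7., 8., 9.]]).reshape(1, 1, 3, 3)   # [batch, channels, H, W]
filt = torch.tensor([[1., 0.],
                     [0., 1.]]).reshape(1, 1, 2, 2)      # [out_ch, in_ch, kH, kW]

out = F.conv2d(inp, filt)
print(out)
# tensor([[[[ 6.,  8.],
#           [12., 14.]]]])
# top-left value: 1*1 + 2*0 + 4*0 + 5*1 = 6, the element-wise products summed into the feature map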

Filter/kernel

We use a filter to perform a convolution over the input image. The output of the convolution is called a feature map.


Filter [Image [6]]

In CNN terminology, the 3×3 matrix is called a "filter", "kernel" or "feature detector", and the matrix formed by sliding the filter over the image and computing the dot product is called the "convolved feature", "activation map" or "feature map". It is important to note that the filter acts as a feature detector on the original input image.

More filters = more feature maps = more features. A filter is nothing but a matrix of numbers. The following are different types of filters:


Different types of filters [Image [7]]
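As a hedged illustration that a filter really is just a matrix of numbers, the sketch below (my own example, with two arbitrarily chosen 3×3 kernels: an identity kernel and a common edge-detection kernel) applies both to the same toy input; each filter yields its own feature map:

import torch
import torch.nn.functional as F

img = torch.arange(25, dtype=torch.float).reshape(1, 1, 5, 5)  # toy 5x5 single-channel "image"

identity = torch.tensor([[0., 0., 0.],
                         [0., 1., 0.],
                         [0., 0., 0.]])
edge = torch.tensor([[-1., -1., -1.],
                     [-1.,  8., -1.],
                     [-1., -1., -1.]])

# stack the two kernels into a weight of shape [out_channels=2, in_channels=1, 3, 3]
weight = torch.stack([identity, edge]).unsqueeze(1)

feature_maps = F.conv2d(img, weight)
print(feature_maps.shape)  # torch.Size([1, 2, 3, 3]) - one feature map per filter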

Stride

Stride specifies how much the convolution filter moves at each step.


Stride of 1 [Image [8]]

If we want less overlap, we can use a larger stride. Because potential locations are skipped, this also makes the resulting feature map smaller. The figure below demonstrates a stride of 2; note that the feature map becomes smaller.


Stride of 2 [Image [9]]
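To see the shrinking effect in code (a sketch with a made-up 7×7 input, not taken from the figures above), the same 3×3 filter produces a 5×5 feature map with stride 1 and a 3×3 feature map with stride 2:

import torch
import torch.nn as nn

x = torch.randn(1, 1, 7, 7)  # hypothetical 7x7 single-channel input

conv_s1 = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, stride=1)
conv_s2 = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, stride=2)

print(conv_s1(x).shape)  # torch.Size([1, 1, 5, 5]) - overlapping positions
print(conv_s2(x).shape)  # torch.Size([1, 1, 3, 3]) - positions are skipped, so the map is smaller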

Padding

With padding, we retain more information from the borders and preserve the size of the image.


Padding [Image [10]]

We saw that the size of the feature map is smaller than the input, because the convolution filter needs to be contained within the input. If we want to maintain the same dimensionality, we can use padding to surround the input with zeros.
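For instance (a sketch with made-up sizes, not part of the original code), a kernel of size 3 with padding = 1 keeps a length-10 signal at length 10:

import torch
import torch.nn as nn

x = torch.randn(1, 1, 10)

no_pad   = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, stride=1, padding=0)
with_pad = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, stride=1, padding=1)

print(no_pad(x).shape)    # torch.Size([1, 1, 8])  - border positions are lost
print(with_pad(x).shape)  # torch.Size([1, 1, 10]) - zeros around the border preserve the size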

Pooling

We apply pooling to reduce the dimensionality.


Max Pooling [Image [11]]
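The original post does not include pooling code, so here is a minimal sketch of max pooling with nn.MaxPool2d: a 2×2 window with stride 2 halves each spatial dimension by keeping only the largest value in every block.

import torch
import torch.nn as nn

feature_map = torch.randn(1, 1, 4, 4)        # hypothetical feature map
pool = nn.MaxPool2d(kernel_size=2, stride=2)

print(pool(feature_map).shape)  # torch.Size([1, 1, 2, 2]) - each 2x2 block reduced to its maximum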

Import libraries

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader


Input data


input_1d is a one-dimensional floating-point tensor. input_2d is a two-dimensional floating point tensor. input_2d_img is a 3-dimensional floating point tensor representing an image.

input_1d = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=torch.float)

input_2d = torch.tensor([[1, 2, 3, 4, 5],
                         [6, 7, 8, 9, 10]], dtype=torch.float)

input_2d_img = torch.tensor([[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]],
                             [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]],
                             [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]], dtype=torch.float)

OUTPUT

Input 1D:
input_1d.shape: torch.Size([10])
input_1d: tensor([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])

====================================================================

Input 2D:
input_2d.shape: torch.Size([2, 5])
input_2d: tensor([[ 1.,  2.,  3.,  4.,  5.],
                  [ 6.,  7.,  8.,  9., 10.]])

====================================================================

input_2d_img:
input_2d_img.shape: torch.Size([3, 3, 10])
input_2d_img: tensor([[[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
                       [ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
                       [ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]],
                      [[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
                       [ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
                       [ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]],
                      [[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
                       [ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
                       [ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]]])


1D convolution

nn.Conv1d() applies a 1D convolution over the input. nn.Conv1d() expects the input to be of shape [batch_size, input_channels, signal_length].

You can find the complete parameter list in the official PyTorch documentation. The required parameters are in_channels, out_channels and kernel_size.

Conv1d - input 1d


Conv1d-Input1d Example [Image [12]]


The input is a one-dimensional signal consisting of 10 numbers. We convert it to a tensor of size [1, 1, 10].

input_1d = input_1d.unsqueeze(0).unsqueeze(0)
input_1d.shape

OUTPUT

torch.Size([1, 1, 10])


CNN output, where out_channels = 1, kernel_size = 3, and stride = 1.

cnn1d_1 = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, stride=1)

print("cnn1d_1: \n")
print(cnn1d_1(input_1d).shape, "\n")
print(cnn1d_1(input_1d))

OUTPUT

cnn1d_1:
torch.Size([1, 1, 8])
tensor([[[-1.2353, -1.4051, -1.5749, -1.7447, -1.9145, -2.0843, -2.2541, -2.4239]]], grad_fn=)


CNN Output, where out_channels = 1, kernel_size = 3 and stride = 2.

cnn1d_2 = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, stride=2)

print("cnn1d_2: \n")
print(cnn1d_2(input_1d).shape, "\n")
print(cnn1d_2(input_1d))

OUTPUT

cnn1d_2:
torch.Size([1, 1, 4])
tensor([[[0.5107, 0.3528, 0.1948, 0.0368]]], grad_fn=)


CNN output, where out_channels = 1, kernel_size = 2 and stride = 1.

cnn1d_3 = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=2, stride=1)

print("cnn1d_3: \n")
print(cnn1d_3(input_1d).shape, "\n")
print(cnn1d_3(input_1d))

OUTPUT

cnn1d_3:
torch.Size([1, 1, 9])
tensor([[[0.0978, 0.2221, 0.3465, 0.4708, 0.5952, 0.7196, 0.8439, 0.9683, 1.0926]]], grad_fn=)


CNN output, where out_channels = 5, kernel_size = 3 and stride = 1.

cnn1d_4 = nn.Conv1d(in_channels=1, out_channels=5, kernel_size=3, stride=1)

print("cnn1d_4: \n")
print(cnn1d_4(input_1d).shape, "\n")
print(cnn1d_4(input_1d))

OUTPUT

cnn1d_4:
torch.Size([1, 5, 8])
tensor([[[-1.8410e+00, -2.8884e+00, -3.9358e+00, -4.9832e+00, -6.0307e+00, -7.0781e+00, -8.1255e+00, -9.1729e+00],
         [-4.6073e-02, -3.4436e-02, -2.2799e-02, -1.1162e-02,  4.7439e-04,  1.2111e-02,  2.3748e-02,  3.5385e-02],
         [-1.5541e+00, -1.8505e+00, -2.1469e+00, -2.4433e+00, -2.7397e+00, -3.0361e+00, -3.3325e+00, -3.6289e+00],
         [ 6.6593e-01,  1.2362e+00,  1.8066e+00,  2.3769e+00,  2.9472e+00,  3.5175e+00,  4.0878e+00,  4.6581e+00],
         [ 2.0414e-01,  6.0421e-01,  1.0043e+00,  1.4044e+00,  1.8044e+00,  2.2045e+00,  2.6046e+00,  3.0046e+00]]], grad_fn=)


Conv1d - input 2d

To apply 1D convolution to a 2D input signal, we can perform the following operations. First, we define an input tensor of size [1, 2, 5], where batch_size = 1, input_channels = 2 and signal_length = 5.

input_2d = input_2d.unsqueeze(0)
input_2d.shape

OUTPUT

torch.Size([1, 2, 5])


CNN output, where in_channels = 2, out_channels = 1, kernel_size = 3, stride = 1.

cnn1d_5 = nn.Conv1d(in_channels=2, out_channels=1, kernel_size=3, stride=1)

print("cnn1d_5: \n")
print(cnn1d_5(input_2d).shape, "\n")
print(cnn1d_5(input_2d))

OUTPUT

cnn1d_5:
torch.Size([1, 1, 3])
tensor([[[-6.6836, -7.6893, -8.6950]]], grad_fn=)


CNN output, where in_channels = 2, out_channels = 1, kernel_size = 3, stride = 2.

cnn1d_6 = nn.Conv1d(in_channels=2, out_channels=1, kernel_size=3, stride=2)

print("cnn1d_6: \n")
print(cnn1d_6(input_2d).shape, "\n")
print(cnn1d_6(input_2d))

OUTPUT

cnn1d_6:
torch.Size([1, 1, 2])
tensor([[[-3.4744, -3.7142]]], grad_fn=)


CNN output, where in_channels = 2, out_channels = 1, kernel_size = 2, stride = 1.

cnn1d_7 = nn.Conv1d(in_channels=2, out_channels=1, kernel_size=2, stride=1)

print("cnn1d_7: \n")
print(cnn1d_7(input_2d).shape, "\n")
print(cnn1d_7(input_2d))

OUTPUT

cnn1d_7:
torch.Size([1, 1, 4])
tensor([[[0.5619, 0.6910, 0.8201, 0.9492]]], grad_fn=)


CNN output, where in_channels = 2, out_channels = 5, kernel_size = 3, stride = 1.

cnn1d_8 = nn.Conv1d(in_channels=2, out_channels=5, kernel_size=3, stride=1)

print("cnn1d_8: \n")
print(cnn1d_8(input_2d).shape, "\n")
print(cnn1d_8(input_2d))

OUTPUT

cnn1d_8:
torch.Size([1, 5, 3])
tensor([[[ 1.5024,  2.4199,  3.3373],
         [ 0.2980, -0.0873, -0.4727],
         [ 1.5443,  1.7086,  1.8729],
         [ 2.6177,  3.2974,  3.9772],
         [-2.5145, -2.2906, -2.0668]]], grad_fn=)


2D convolution

nn.Conv2d() applies 2D convolution on the input. nn.Conv2d() expects the input shape to be [batch_size, input_channels, input_height, input_width].

You can find the complete parameter list in the official PyTorch documentation. The required parameters are in_channels, out_channels and kernel_size.

Conv2d - input 2d


Convolution with 3 channels [Image [13] credits]

To apply 2D convolution to a 2D input signal (such as an image), we can do the following. First, we define an input tensor of size [1, 3, 3, 10], where batch_size = 1, input_channels = 3, input_height = 3 and input_width = 10.

input_2d_img = input_2d_img.unsqueeze(0)
input_2d_img.shape

OUTPUT

torch.Size([1, 3, 3, 10])


CNN output, where in_channels = 3, out_channels = 1, kernel_size = 3, stride = 1.

cnn2d_1 = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3, stride=1)

print("cnn2d_1: \n")
print(cnn2d_1(input_2d_img).shape, "\n")
print(cnn2d_1(input_2d_img))

OUTPUT

cnn2d_1:
torch.Size([1, 1, 1, 8])
tensor([[[[-1.0716, -1.5742, -2.0768, -2.5793, -3.0819, -3.5844, -4.0870, -4.5896]]]], grad_fn=)


CNN output, where in_channels = 3, out_channels = 1, kernel_size = 3, stride = 2.

cnn2d_2 = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3, stride=2)

print("cnn2d_2: \n")
print(cnn2d_2(input_2d_img).shape, "\n")
print(cnn2d_2(input_2d_img))

OUTPUT

cnn2d_2:
torch.Size([1, 1, 1, 4])
tensor([[[[-0.7407, -1.2801, -1.8195, -2.3590]]]], grad_fn=)

CNN output, where in_channels = 3, out_channels = 1, kernel_size = 2, stride = 1.

cnn2d_3 = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=2, stride=1)

print("cnn2d_3: \n")
print(cnn2d_3(input_2d_img).shape, "\n")
print(cnn2d_3(input_2d_img))

OUTPUT

cnn2d_3:
torch.Size([1, 1, 2, 9])
tensor([[[[-0.8046, -1.5066, -2.2086, -2.9107, -3.6127, -4.3147, -5.0167, -5.7188, -6.4208],
          [-0.8046, -1.5066, -2.2086, -2.9107, -3.6127, -4.3147, -5.0167, -5.7188, -6.4208]]]], grad_fn=)


CNN output, where in_channels = 3, out_channels = 5, kernel_size = 3, stride = 1.

cnn2d_4 = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3, stride=1)

print("cnn2d_4: \n")
print(cnn2d_4(input_2d_img).shape, "\n")
print(cnn2d_4(input_2d_img))

OUTPUT

cnn2d_4:
torch.Size([1, 5, 1, 8])
tensor([[[[-2.0868e+00, -2.7669e+00, -3.4470e+00, -4.1271e+00, -4.8072e+00, -5.4873e+00, -6.1673e+00, -6.8474e+00]],
         [[-4.5052e-01, -5.5917e-01, -6.6783e-01, -7.7648e-01, -8.8514e-01, -9.9380e-01, -1.1025e+00, -1.2111e+00]],
         [[ 6.6228e-01,  8.3826e-01,  1.0142e+00,  1.1902e+00,  1.3662e+00,  1.5422e+00,  1.7181e+00,  1.8941e+00]],
         [[-5.4425e-01, -1.2149e+00, -1.8855e+00, -2.5561e+00, -3.2267e+00, -3.8973e+00, -4.5679e+00, -5.2385e+00]],
         [[ 2.0564e-01,  1.6357e-01,  1.2150e-01,  7.9434e-02,  3.7365e-02, -4.7036e-03, -4.6773e-02, -8.8842e-02]]]], grad_fn=)


Thank you for reading. Suggestions and constructive criticism are welcome. :) You can find me on LinkedIn. You can view the complete code here. Check out the GitHub repository here and star it if you like it.


(This article is translated from Akshaj Verma's article "[Pytorch Basics] How to train your Neural Net — Intro to CNN", reference: https://towardsdatascience.com/pytorch-basics-how-to-train-your-neural-net-intro-to-cnn-26a14c2ea29)
