1
00:00:01,060 --> 00:00:04,780
In this lecture, we will understand the concept of a filter.

2
00:00:06,740 --> 00:00:14,610
We have been saying till now that a cell in the convolutional layer gets information from a set of

3
00:00:14,610 --> 00:00:22,010
pixels or a set of cells in the previous layer. For example, that red cell in the convolutional layer

4
00:00:22,010 --> 00:00:27,290
is getting information from these nine cells in this rectangle.

5
00:00:29,550 --> 00:00:30,810
But what does this mean?

6
00:00:32,190 --> 00:00:35,580
How is it getting the information from all these pixels?

7
00:00:36,960 --> 00:00:39,080
We have 25 pixels here.

8
00:00:40,440 --> 00:00:48,630
And our cell here can have only one value, which should be the representative value for these 25 pixels.

9
00:00:50,340 --> 00:00:56,700
So we need to find a way to convert these 25 values of pixels into one value.

10
00:00:58,550 --> 00:01:07,320
This is done by using a filter. A filter is a matrix of the same dimensions as our window of the receptive field.

11
00:01:08,730 --> 00:01:11,580
So if the window is five cross five,

12
00:01:12,570 --> 00:01:14,950
the filter is also of dimension five

13
00:01:14,990 --> 00:01:15,460
cross five.

14
00:01:17,050 --> 00:01:21,790
If it is of three cross three, the filter will also be of three cross three dimensions.

15
00:01:24,870 --> 00:01:33,690
Now, we have a window of five into five pixels containing pixel values, and we have a five into five

16
00:01:34,560 --> 00:01:36,870
matrix containing some values.

17
00:01:39,000 --> 00:01:43,860
We multiply each pixel value with the corresponding filter value.

18
00:01:45,280 --> 00:01:47,500
And add all of these products up.

19
00:01:49,540 --> 00:01:51,370
So the pixel value here

20
00:01:53,500 --> 00:02:00,760
will be multiplied with 0.4, the next pixel value will be multiplied with 0.3,

21
00:02:01,120 --> 00:02:01,750
and so on.

22
00:02:02,770 --> 00:02:05,850
And all these products will be added up.

23
00:02:07,610 --> 00:02:09,620
This will give us one number.

24
00:02:10,370 --> 00:02:15,230
And this number will represent information in these 25 pixels.
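To make the multiply-and-add step concrete, here is a minimal sketch in plain Python. The function name `filter_response` and all the numbers are illustrative assumptions; only the first two filter values (0.4 and 0.3) are taken from the lecture, and the rest are set to zero so the result is easy to check by hand.

```python
def filter_response(window, filt):
    """Multiply each pixel by the matching filter value and add up the products."""
    total = 0.0
    for pixel_row, filter_row in zip(window, filt):
        for pixel, weight in zip(pixel_row, filter_row):
            total += pixel * weight
    return total


# Hypothetical 5 x 5 window of pixel values (not the values on the slide).
window = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 0, 1],
    [2, 0, 4, 1, 0],
    [1, 3, 0, 2, 1],
    [0, 1, 1, 0, 2],
]

# Hypothetical 5 x 5 filter. Only 0.4 and 0.3 come from the lecture;
# the centre weight 0.5 and the zeros are made up for this example.
filt = [
    [0.4, 0.3, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

# 1 * 0.4 + 2 * 0.3 + 4 * 0.5, with every other product equal to zero.
value = filter_response(window, filt)
print(value)
```

The 25 pixel values go in and a single number comes out, which is the value stored in the one cell of the convolutional layer.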

25
00:02:18,190 --> 00:02:23,310
Now the question comes: how do we decide the values in this filter?

26
00:02:25,290 --> 00:02:27,360
The answer to this is very pleasing.

27
00:02:28,290 --> 00:02:30,330
We do not have to decide these values.

28
00:02:31,600 --> 00:02:34,370
Our network will learn these values also.

29
00:02:35,620 --> 00:02:39,610
So when we are training our model, these values will be learned automatically.
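As a rough illustration of what "learning the filter values" can mean, the sketch below recovers a 3 x 3 filter by plain gradient descent on a squared error. This is a simplified stand-in and an assumption on my part: a real network learns its filters through backpropagation on the final task loss, not by fitting the outputs of a known filter. The names `true_filter`, `patches`, and `weights` are hypothetical.

```python
from itertools import product

# Flattened 3 x 3 "target" filter the sketch tries to recover (a vertical filter).
true_filter = [0, 1, 0,
               0, 1, 0,
               0, 1, 0]

# Every possible 3 x 3 window of zero-one pixels, flattened to 9 values.
patches = [list(p) for p in product([0, 1], repeat=9)]

# The output the target filter gives for each window (sum of products).
targets = [sum(x * w for x, w in zip(patch, true_filter)) for patch in patches]

weights = [0.0] * 9      # start with no knowledge of the filter
learning_rate = 0.2

for step in range(200):
    grad = [0.0] * 9
    for patch, target in zip(patches, targets):
        error = sum(x * w for x, w in zip(patch, weights)) - target
        for i, x in enumerate(patch):
            grad[i] += 2 * error * x          # derivative of error squared
    # Move each weight a small step against the average gradient.
    weights = [w - learning_rate * g / len(patches) for w, g in zip(weights, grad)]

learned = [round(w, 3) for w in weights]
print(learned)
```

Nobody typed the filter values in by hand here. They were adjusted step by step until the outputs matched, which is the same idea, in miniature, as training a model.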

30
00:02:43,330 --> 00:02:50,290
Now, to demonstrate how filters work and how they are able to extract certain features,

31
00:02:51,900 --> 00:02:54,990
I have taken a five into five input image

32
00:02:56,510 --> 00:03:01,670
with zero-one type pixel values and a three by three filter.

33
00:03:05,690 --> 00:03:06,680
Look at this filter.

34
00:03:07,810 --> 00:03:10,600
This filter looks like a cross.

35
00:03:11,810 --> 00:03:16,610
That is, the diagonal values are one and the others are zero.

36
00:03:18,600 --> 00:03:24,280
If we use this filter with a stride of one, we get this output.

37
00:03:27,570 --> 00:03:30,720
The diagram below shows you how we get this output:

38
00:03:33,020 --> 00:03:40,040
how the pixel values are multiplied and their product values are added up to get the first value,

39
00:03:40,610 --> 00:03:44,020
then the next value, and then the next, and so on.
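The sliding procedure can be sketched as follows. The function name `apply_filter` is hypothetical, and the pixel values are an assumption: the transcript does not record the numbers on the slide, so this uses a commonly shown 5 x 5 zero-one image together with a cross-shaped 3 x 3 filter (ones on both diagonals).

```python
def apply_filter(image, filt, stride=1):
    """Slide a square filter over the image and return the feature map."""
    k = len(filt)
    n_rows = (len(image) - k) // stride + 1
    n_cols = (len(image[0]) - k) // stride + 1
    feature_map = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            top, left = r * stride, c * stride
            total = 0
            # Sum of products between the current window and the filter.
            for i in range(k):
                for j in range(k):
                    total += image[top + i][left + j] * filt[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Assumed 5 x 5 input image with zero-one pixel values.
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]

# Cross-shaped filter: diagonal values are one, the others are zero.
cross_filter = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = apply_filter(image, cross_filter, stride=1)
for row in feature_map:
    print(row)
# [4, 3, 4]
# [2, 4, 3]
# [2, 3, 4]
```

With a stride of one, the three by three window fits in three positions across and three positions down, so the five by five image gives a three by three output.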

40
00:03:49,370 --> 00:03:54,800
This final output, which we get after applying the filter, is called a feature map.

41
00:03:56,630 --> 00:04:05,110
A feature map, because each filter highlights some feature of the input image. The images on the

42
00:04:05,110 --> 00:04:10,780
right demonstrate how particular features are highlighted by filters.

43
00:04:12,870 --> 00:04:21,330
For example, if we use a vertical filter, that is, the middle column of this matrix is one, one, one,

44
00:04:22,470 --> 00:04:24,990
and the side columns are zero, zero, zero,

45
00:04:27,100 --> 00:04:31,150
this type of filter transforms the image to this image.

46
00:04:33,350 --> 00:04:40,580
Notice that vertical white lines are enhanced and the rest of the image is blurred.

47
00:04:42,810 --> 00:04:45,870
Similarly, if we use the horizontal filter,

48
00:04:47,130 --> 00:04:50,230
that is, the middle row will be one, one, one,

49
00:04:52,800 --> 00:04:56,310
and the top and bottom rows will consist of zeros.

50
00:04:57,870 --> 00:05:00,060
If we use such a horizontal filter,

51
00:05:01,180 --> 00:05:02,260
we get this image.

52
00:05:03,610 --> 00:05:09,460
You can notice that horizontal white lines are highlighted and the rest is blurred.
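A small sketch of the two filters, assuming a made-up 5 x 5 image that contains a single vertical white line (the photographs on the slide are not available in the transcript). The helper `apply_filter` is a hypothetical name for the usual sum-of-products sliding window.

```python
def apply_filter(image, filt):
    """Slide a square filter over the image with a stride of one."""
    k = len(filt)
    out = []
    for top in range(len(image) - k + 1):
        row = []
        for left in range(len(image[0]) - k + 1):
            row.append(sum(image[top + i][left + j] * filt[i][j]
                           for i in range(k) for j in range(k)))
        out.append(row)
    return out


# Middle column is one, one, one; side columns are zero.
vertical_filter = [[0, 1, 0],
                   [0, 1, 0],
                   [0, 1, 0]]

# Middle row is one, one, one; top and bottom rows are zero.
horizontal_filter = [[0, 0, 0],
                     [1, 1, 1],
                     [0, 0, 0]]

# Hypothetical image: one vertical white line in the middle column.
image = [[0, 0, 1, 0, 0] for _ in range(5)]

vertical_map = apply_filter(image, vertical_filter)
horizontal_map = apply_filter(image, horizontal_filter)

print(vertical_map)    # [[0, 3, 0], [0, 3, 0], [0, 3, 0]]
print(horizontal_map)  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

The vertical filter gives a strong response (3) exactly where the line is and nothing elsewhere, while the horizontal filter spreads a weak response (1) over the whole map. Turning the line on its side swaps the roles of the two filters.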

53
00:05:11,980 --> 00:05:13,450
This is what a filter does.

54
00:05:14,760 --> 00:05:20,910
A filter is a set of values which transforms the window by doing a sum of products.

55
00:05:22,520 --> 00:05:29,180
What we get after applying a filter is called a feature map. Each feature map has some particular

56
00:05:29,180 --> 00:05:30,400
feature highlighted.

57
00:05:33,320 --> 00:05:37,770
So what we will do is we will use many types of filters,

58
00:05:38,810 --> 00:05:44,600
so that each filter creates a different feature map containing different features.

59
00:05:46,340 --> 00:05:51,560
This means our convolutional layer is going to be a bundle of feature maps.

60
00:05:52,870 --> 00:05:56,470
And each feature map has some particular highlighted feature.

61
00:05:57,910 --> 00:06:01,870
An important thing to notice here is what happens in the next layer.

62
00:06:03,390 --> 00:06:04,650
So this cell

63
00:06:05,640 --> 00:06:09,000
in the first feature map of convolutional layer two,

64
00:06:10,260 --> 00:06:11,250
what does this see?

65
00:06:12,310 --> 00:06:17,280
Is it only this rectangle on the first feature map of the previous layer,

66
00:06:18,250 --> 00:06:22,500
or this rectangle on all feature maps in the previous layer?

67
00:06:24,410 --> 00:06:32,210
The answer is that each cell in convolutional layer two will be getting information from all the feature

68
00:06:32,210 --> 00:06:34,070
maps in the previous layer.

69
00:06:35,610 --> 00:06:43,050
Because only then can these cells combine the different features to find more high level features.
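Here is a minimal sketch of how one cell in convolutional layer two could combine all the feature maps of the previous layer. It assumes the common arrangement in which the filter has one 2D slice per incoming feature map and all the products are added into a single number. The names `cell_value` and `filter_slices`, and the values, are illustrative.

```python
def cell_value(feature_maps, filter_slices, top, left):
    """Value of one layer-two cell: sum of products over ALL feature maps."""
    total = 0
    for fmap, fslice in zip(feature_maps, filter_slices):
        k = len(fslice)
        for i in range(k):
            for j in range(k):
                total += fmap[top + i][left + j] * fslice[i][j]
    return total


# Two hypothetical 3 x 3 feature maps from convolutional layer one.
vertical_map = [[0, 3, 0],
                [0, 3, 0],
                [0, 3, 0]]
horizontal_map = [[1, 1, 1],
                  [1, 1, 1],
                  [1, 1, 1]]

# One filter for layer two, with a separate 3 x 3 slice per feature map.
filter_slices = [
    [[0, 1, 0], [0, 1, 0], [0, 1, 0]],   # applied to vertical_map
    [[0, 0, 0], [1, 1, 1], [0, 0, 0]],   # applied to horizontal_map
]

# 9 from the first map plus 3 from the second map.
value = cell_value([vertical_map, horizontal_map], filter_slices, top=0, left=0)
print(value)  # 12
```

Because the cell sees the same rectangle on every feature map, it can respond to combinations of features, which is how the higher-level features are built.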

70
00:06:46,050 --> 00:06:47,870
Let's summarize again for clarity.

71
00:06:49,280 --> 00:06:55,130
We apply a filter on the previous layer's data to extract features.

72
00:06:58,100 --> 00:07:01,730
The output after applying a filter is called a feature map.

73
00:07:03,490 --> 00:07:08,710
We apply many different types of filters to extract many different types of features.

74
00:07:09,920 --> 00:07:12,710
This gives us a bundle of feature maps.

75
00:07:14,660 --> 00:07:18,480
The first bundle of feature maps is called convolutional layer one.

76
00:07:21,860 --> 00:07:26,210
Convolutional layer two works on these extracted features

77
00:07:27,530 --> 00:07:30,110
to extract even higher-level features.

78
00:07:33,670 --> 00:07:36,670
Next, we are going to discuss the input layer.

79
00:07:38,250 --> 00:07:41,690
The input layer also has multiple layers of information.

80
00:07:42,790 --> 00:07:44,390
These layers are called channels.

81
00:07:44,950 --> 00:07:47,290
We will talk about channels in the next video.