1
00:00:01,590 --> 00:00:07,590
By now, you'd have noticed that every architecture has, in general, two parts.

2
00:00:09,300 --> 00:00:13,140
The first part of the architecture is a convolutional base.

3
00:00:14,610 --> 00:00:17,820
This includes convolutional layers and pooling layers.

4
00:00:20,650 --> 00:00:24,310
It could be one convolutional layer and one pooling layer.

5
00:00:24,460 --> 00:00:25,690
It could be tens of them.

6
00:00:25,810 --> 00:00:26,960
It could be hundreds of them.

7
00:00:28,120 --> 00:00:34,390
However, if you look at the architectures, most of them have a convolutional base as the first part.

8
00:00:34,490 --> 00:00:41,800
Basically, the output of that convolutional base goes into a fully connected neural network.

9
00:00:43,480 --> 00:00:46,720
So the job of the convolutional base is very generic.

10
00:00:47,430 --> 00:00:52,180
It is to find out and highlight certain features in the input images.

11
00:00:53,860 --> 00:01:02,920
For example, if you are inputting cats and dogs data, the job of the convolutional base would be to highlight

12
00:01:03,070 --> 00:01:04,440
eyes, ears,

13
00:01:04,760 --> 00:01:07,670
whiskers, claws, etc.

14
00:01:08,980 --> 00:01:16,180
So all of these individual features of the image are highlighted by the convolutional base.

15
00:01:17,890 --> 00:01:25,510
The job of the fully connected neural network is to use these identified features to classify the image,

16
00:01:25,630 --> 00:01:27,910
whether it is a dog or whether it is a cat.

17
00:01:30,460 --> 00:01:40,090
So if you have a neural network which is already trained in identifying certain features and then classifying

18
00:01:40,090 --> 00:01:47,920
those images, and now you have a new problem in which you are also trying to find the same features,

19
00:01:48,630 --> 00:01:51,310
maybe you are trying to do a different classification.
20
00:01:51,580 --> 00:01:58,240
But if the input images have the same features, in that case, you can use the same convolutional

21
00:01:58,240 --> 00:02:00,870
base of pre-trained models.

22
00:02:03,040 --> 00:02:12,550
For example, in 2014, the ImageNet Large Scale Visual Recognition Challenge had one million images of different

23
00:02:12,610 --> 00:02:19,000
animals, and there were 1,000 different classes to which these images belonged.

24
00:02:21,710 --> 00:02:30,920
The convolutional bases of the winning networks were identifying features of the different animals, and the classifiers

25
00:02:30,990 --> 00:02:36,720
at the end were only using those features to classify which animal it is

26
00:02:37,260 --> 00:02:38,880
and what the breed of that animal is.

27
00:02:43,020 --> 00:02:51,850
So now, if we are only learning a model to classify cats and dogs, this is a similar kind of input

28
00:02:51,870 --> 00:02:54,600
image to what that particular challenge had.

29
00:02:55,590 --> 00:03:03,300
So a model that was trained on the 2014 data can be used in our problem also.

30
00:03:06,120 --> 00:03:11,070
So this is the concept of transfer learning, or feature extraction.

31
00:03:11,940 --> 00:03:19,560
We are going to use some part of a pre-trained model, mostly the convolutional base, because the convolutional

32
00:03:19,560 --> 00:03:21,270
base is more generic.

33
00:03:21,690 --> 00:03:29,220
It is only finding features, and we will put a new classifier on top of the convolutional base.

34
00:03:29,790 --> 00:03:38,790
That classifier will be trained by our system, and it will be trained to classify and identify our

35
00:03:39,540 --> 00:03:42,390
images into the classes that we have.

36
00:03:44,690 --> 00:03:46,820
So this convolutional base will remain the same.

37
00:03:47,330 --> 00:03:51,620
We will have a new classifier on top of it to classify our images.
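The idea described so far, keeping a pre-trained convolutional base frozen and putting a fresh classifier on top of it, can be sketched in a few lines of Keras. This is only an illustrative sketch, assuming TensorFlow/Keras is installed: VGG16 is just one of several pre-trained bases in `keras.applications`, and the input size and layer sizes here are arbitrary choices, not values from the lecture.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Download only the convolutional base; include_top=False drops the
# original 1000-class ImageNet classifier on top of it.
conv_base = VGG16(weights="imagenet",
                  include_top=False,
                  input_shape=(150, 150, 3))

# Freeze the base so its proven feature-extraction weights are not
# changed while we train our new classifier.
conv_base.trainable = False

# Stack a fresh fully connected classifier on top for our own task.
model = models.Sequential([
    conv_base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # cat vs. dog
])

model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

Training this model (with `model.fit`) updates only the new classifier layers; the frozen base simply extracts features, exactly as described above.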
38
00:03:53,150 --> 00:04:00,950
The advantage of doing this is that it saves a lot of time, because we do not have to train this part

39
00:04:00,950 --> 00:04:01,700
of the network.

40
00:04:03,230 --> 00:04:06,320
Another good thing is that these are proven models.

41
00:04:06,800 --> 00:04:09,710
They are among the best at finding features.

42
00:04:10,580 --> 00:04:18,320
So when we take their convolutional base, we can be assured that the features extracted from the images

43
00:04:18,740 --> 00:04:19,610
will be among the best.

44
00:04:21,670 --> 00:04:25,330
Also, these models are trained on huge datasets.

45
00:04:26,140 --> 00:04:29,740
They had input data of millions of images.

46
00:04:31,000 --> 00:04:37,120
So even if you have a small dataset, from which feature extraction would have been difficult,

47
00:04:38,200 --> 00:04:44,710
these models are already trained to extract features from large amounts of data.

48
00:04:46,540 --> 00:04:49,030
And the best thing is that they are very easy to use.

49
00:04:49,570 --> 00:04:51,690
They are part of the Keras library.

50
00:04:52,240 --> 00:04:59,350
It only takes a few lines of code to download all the weights of all the neurons in the convolutional

51
00:04:59,350 --> 00:04:59,710
base.

52
00:04:59,950 --> 00:05:01,720
And those can be used straight away.

53
00:05:02,890 --> 00:05:11,110
So in this project, we will see how, using pre-trained models, we can achieve a higher level of accuracy

54
00:05:11,320 --> 00:05:14,770
even if we have a small amount of data to train the model.
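One way the "few lines of code" point plays out in practice is to download the convolutional base and run your small dataset through it once, producing feature maps on which a small classifier can then be trained cheaply. The sketch below assumes TensorFlow/Keras and NumPy are installed; the random array is a hypothetical stand-in for real, preprocessed cat/dog images.

```python
import numpy as np
from tensorflow.keras.applications import VGG16

# A few lines of code download all the pre-trained weights of the
# convolutional base (fetched automatically on first use).
conv_base = VGG16(weights="imagenet", include_top=False,
                  input_shape=(150, 150, 3))

# Stand-in batch of 4 "images" -- replace with your real data.
images = np.random.rand(4, 150, 150, 3).astype("float32")

# Extract features once; training a small classifier on these feature
# maps is how a small dataset can still reach high accuracy.
features = conv_base.predict(images)
print(features.shape)  # (4, 4, 4, 512) for VGG16 at this input size
```

After this step, the `features` array plays the role of the "identified features" from the lecture: only a small fully connected classifier still needs training.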