Now we're going to start with a complete end-to-end project. In this project, we will try to classify colored images of cats and dogs.

We'll take our dataset from Kaggle. Kaggle is a website where a lot of data science competitions are held. There was a competition held in 2013 in which thousands of images of cats and dogs were given, and a model was to be built to classify those images into cats and dogs. The best accuracy achieved in that competition was nearly ninety-eight percent. We are going to use a subset of that data and try to build our own model, and we will try to achieve over 90 percent accuracy with it.

Here are some of the details of this project. This is a binary classification problem, unlike Fashion MNIST, in which there were 10 categories to be predicted. Here we have only two: either the image is of a cat or it is of a dog. So there are only two classes; that is why it is a binary classification problem.

Then, this is a dataset of colored images. That is, we will have three channels, R, G and B, instead of only one channel as we had in the Fashion MNIST dataset.

Then, we do not have one standard dimension for all these images. As you saw in the previous project, we were using 28-by-28-pixel images, but here our dataset does not have one standard dimension. So when we are feeding the data to our model, we will have to convert the images to one standard dimension. That is one additional step.

Then, we are using a Kaggle dataset. If you are interested, you can go to the Kaggle website and see this cats-versus-dogs competition. You can also see the leaderboard there, how much accuracy people have achieved, and you can compare your model with other people's models.

And the last point is that we are going to use a subset of the total data. The total data had over 25,000 images; in our model we are going to use only 4,000 images: two thousand to train, one thousand for the validation dataset, and one thousand for testing. So using only this small part of the data, we are still going to achieve accuracies that are comparable to the other models built by people in the competition.
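Before we look at the data layout, here is a minimal sketch of that resizing step, assuming the Keras utilities used elsewhere in this course; the 150-by-150 target size and the file path are illustrative choices, not values from the lecture.

```python
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Hypothetical path to one image from the downloaded dataset;
# target_size resizes every image to one standard dimension.
img = load_img('train/cats/cat.1.jpg', target_size=(150, 150))

# As a NumPy array the image has shape (150, 150, 3):
# height, width, and the three color channels R, G and B.
x = img_to_array(img)
print(x.shape)  # (150, 150, 3)
```

Loading every image through the same target_size is what gives the model one fixed input shape despite the varying dimensions of the originals.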
So here is how we have structured the data. The zip file that you will download from the link we have provided has 4,000 images, and those images are structured in the following format. The first folder will have three folders inside of it; these three folders will be titled train, valid, and test. The train folder will further have two folders, and these folders will be cats and dogs. So class A is cats and class B is dogs; in this folder we will have a thousand images of cats, and in this folder we will have a thousand images of dogs. Similarly, in the validation dataset there will be two folders, one containing 500 images of cats and the other containing 500 images of dogs. The testing dataset will have a thousand images. So in total there are four thousand images: two thousand will be used for training the model, one thousand will be used for the validation dataset, and the last one thousand images will be used for testing the accuracy on previously unseen data.

So the process we are going to follow while building this project is this. First, we will be creating a CNN model with four convolution layers. So it will have four different convolutional layers, each paired with a pooling layer. This model will be able to achieve accuracy in the range of 70 to 75 percent; I'm talking about validation accuracy here. So this model will be able to achieve somewhere between 70 and 75 percent.

Then, because we have a small dataset, we can improve the performance of our model by doing data augmentation. Data augmentation is the process of creating artificial images using the small dataset that you have. So in the second step we will augment our data and then train our model again. For example, if you have this image of a cat, you can create a new image by zooming in on a small part of the image, or you can create a new image by rotating the image of the cat. And there are many more transformations that you can do to this image to create a similar image of a cat using an existing image. So using one image, you will be able to create multiple images just by transforming that image a little bit. Transformations include linear transformations, rotations, zooming in, zooming out, etc. So after you do this and you run the model again, you'll be able to achieve an accuracy of around 80 percent.

Lastly, we'll use one of the architectures that we have discussed previously, and we will try to implement those pre-trained architectures to classify this cats-versus-dogs dataset. Using such a pre-trained architecture, we will be able to achieve an accuracy of over 90 percent.
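As a concrete sketch of the first two steps, here is how the folder structure above and the augmentation transformations might be wired together in Keras. This is a sketch under assumptions, not the exact code from the lecture: the 150-by-150 size, batch size, augmentation ranges, and layer widths are all illustrative.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation: rotations, zooming, flips -- artificial images created
# from the small dataset, as described above. Ranges are illustrative.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=40,
    zoom_range=0.2,
    horizontal_flip=True,
)
valid_datagen = ImageDataGenerator(rescale=1.0 / 255)  # no augmentation here

# flow_from_directory reads the two classes from the cats/ and dogs/
# subfolders of train/ and valid/ described above.
train_gen = train_datagen.flow_from_directory(
    'train', target_size=(150, 150), batch_size=20, class_mode='binary')
valid_gen = valid_datagen.flow_from_directory(
    'valid', target_size=(150, 150), batch_size=20, class_mode='binary')

# Four convolutional layers, each paired with a pooling layer,
# ending in a single sigmoid unit because this is binary classification.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(train_gen, epochs=30, validation_data=valid_gen)
```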
So after this project, you will have an understanding of how to import images, how to run binary or multi-class classification using a CNN, and how to use pre-trained architectures to solve a problem of your own.
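To round this off, here is a minimal sketch of the third step, the pre-trained approach. The lecture does not name the architecture, so VGG16 is an assumption here, chosen as one of the architectures discussed previously; the dense layer size is also illustrative.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Convolutional base pre-trained on ImageNet. VGG16 is an assumed
# choice; any of the previously discussed architectures would do.
conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(150, 150, 3))
conv_base.trainable = False  # freeze the pre-trained weights

# A small classifier on top of the frozen base, again ending in a
# single sigmoid unit for the binary cat-versus-dog decision.
model = models.Sequential([
    conv_base,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
```

Freezing the base means only the small classifier on top is trained on our 2,000 training images, which is what lets such a small dataset reach accuracy over 90 percent.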