Now we're going to start with a complete end-to-end project. In this project, we will try to classify colored images of cats and dogs.

We'll take our dataset from Kaggle. Kaggle is a website where a lot of data science competitions are held. There was a competition held in 2013 in which thousands of images of cats and dogs were given, and a model was to be built to classify those images into cats and dogs. The best accuracy achieved in that competition was nearly ninety-eight percent. We are going to use a subset of that data and try to build our own model, and we will try to achieve over 90 percent accuracy with it.

Here are some of the details of this project. This is a binary classification problem, unlike Fashion MNIST, in which there were 10 categories to be predicted. Here we have only two: either the image is of a cat or it is of a dog. So there are only two classes; that is why it is a binary classification problem.

Then, this is a dataset of colored images. That is, we will have three channels, R, G and B, instead of only one channel as we had in the Fashion MNIST dataset.

Then, we do not have one standard dimension for all these images. As you saw in the previous project, we were using 28-by-28-pixel images, but here our dataset does not have one standard dimension. So when we are feeding the data to our model, we will have to convert the images to one standard dimension. That is one additional step.

Then, we are using a Kaggle dataset. If you are interested, you can go to the Kaggle website and see this cats-versus-dogs competition. You can also see the leaderboard there, how much accuracy people have achieved, and you can compare your model with other people's models.

And the last point is that we are going to use a subset of the total data. The total data had over 25,000 images; in our model we are going to use only 4,000 images: two thousand to train, one thousand for the validation dataset, and one thousand for testing. So using only this small part of the data, we are still going to achieve accuracies that are comparable to the other models built by people in the competition.
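Before we look at the data layout, here is a minimal sketch of that resizing step, assuming the Keras utilities used elsewhere in this course; the 150-by-150 target size and the file path are illustrative choices, not values from the lecture.

```python
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Hypothetical path to one image from the downloaded dataset;
# target_size resizes every image to one standard dimension.
img = load_img('train/cats/cat.1.jpg', target_size=(150, 150))

# As a NumPy array the image has shape (150, 150, 3):
# height, width, and the three color channels R, G and B.
x = img_to_array(img)
print(x.shape)  # (150, 150, 3)
```

Loading every image through the same target_size is what gives the model one fixed input shape despite the varying dimensions of the originals.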
So here is how we have structured the data. The zip file that you will download from the link we have provided has 4,000 images, and those images are structured in the following format. The first folder will have three folders inside of it; these three folders will be titled train, valid, and test. The train folder will further have two folders, and these folders will be cats and dogs. So class A is cats and class B is dogs; in this folder we will have a thousand images of cats, and in this folder we will have a thousand images of dogs. Similarly, in the validation dataset there will be two folders, one containing 500 images of cats and the other containing 500 images of dogs. The testing dataset will have a thousand images. So in total there are four thousand images: two thousand will be used for training the model, one thousand will be used for the validation dataset, and the last one thousand images will be used for testing the accuracy on previously unseen data.

So the process we are going to follow while building this project is this. First, we will be creating a CNN model with four convolution layers. So it will have four different convolutional layers, each paired with a pooling layer. This model will be able to achieve accuracy in the range of 70 to 75 percent; I'm talking about validation accuracy here. So this model will be able to achieve somewhere between 70 and 75 percent.

Then, because we have a small dataset, we can improve the performance of our model by doing data augmentation. Data augmentation is the process of creating artificial images using the small dataset that you have. So in the second step we will augment our data and then train our model again. For example, if you have this image of a cat, you can create a new image by zooming in on a small part of the image, or you can create a new image by rotating the image of the cat. And there are many more transformations that you can do to this image to create a similar image of a cat using an existing image. So using one image, you will be able to create multiple images just by transforming that image a little bit. Transformations include linear transformations, rotations, zooming in, zooming out, etc. So after you do this and you run the model again, you'll be able to achieve an accuracy of around 80 percent.

Lastly, we'll use one of the architectures that we have discussed previously, and we will try to implement those pre-trained architectures to classify this cats-versus-dogs dataset. Using such a pre-trained architecture, we will be able to achieve an accuracy of over 90 percent.
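As a concrete sketch of the first two steps, here is how the folder structure above and the augmentation transformations might be wired together in Keras. This is a sketch under assumptions, not the exact code from the lecture: the 150-by-150 size, batch size, augmentation ranges, and layer widths are all illustrative.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation: rotations, zooming, flips -- artificial images created
# from the small dataset, as described above. Ranges are illustrative.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=40,
    zoom_range=0.2,
    horizontal_flip=True,
)
valid_datagen = ImageDataGenerator(rescale=1.0 / 255)  # no augmentation here

# flow_from_directory reads the two classes from the cats/ and dogs/
# subfolders of train/ and valid/ described above.
train_gen = train_datagen.flow_from_directory(
    'train', target_size=(150, 150), batch_size=20, class_mode='binary')
valid_gen = valid_datagen.flow_from_directory(
    'valid', target_size=(150, 150), batch_size=20, class_mode='binary')

# Four convolutional layers, each paired with a pooling layer,
# ending in a single sigmoid unit because this is binary classification.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(train_gen, epochs=30, validation_data=valid_gen)
```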
So after this project, you will have an understanding of how to import images, how to run binary or multi-class classification using a CNN, and how to use pre-trained architectures to solve a problem of your own.
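To round this off, here is a minimal sketch of the third step, the pre-trained approach. The lecture does not name the architecture, so VGG16 is an assumption here, chosen as one of the architectures discussed previously; the dense layer size is also illustrative.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Convolutional base pre-trained on ImageNet. VGG16 is an assumed
# choice; any of the previously discussed architectures would do.
conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(150, 150, 3))
conv_base.trainable = False  # freeze the pre-trained weights

# A small classifier on top of the frozen base, again ending in a
# single sigmoid unit for the binary cat-versus-dog decision.
model = models.Sequential([
    conv_base,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
```

Freezing the base means only the small classifier on top is trained on our 2,000 training images, which is what lets such a small dataset reach accuracy over 90 percent.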