1 00:00:00,690 --> 00:00:02,980 Let's start by importing our dataset. 2 00:00:05,040 --> 00:00:12,060 As we've discussed earlier, we are going to use the Fashion MNIST dataset to classify images of 3 00:00:12,060 --> 00:00:21,730 fashion objects such as trousers, coats, boots, etc. Fashion MNIST is a very popular dataset. 4 00:00:23,310 --> 00:00:29,470 It is relatively small and is used to verify whether an algorithm works as expected or not. 5 00:00:31,500 --> 00:00:40,130 Fashion MNIST consists of a training set of 60,000 examples and a test set of 10,000 examples. 6 00:00:41,720 --> 00:00:46,700 Each example is a 28 by 28 grayscale image. 7 00:00:47,330 --> 00:00:54,980 That is, it has dimensions of 28 pixels by 28 pixels, and it is a black and white 8 00:00:55,070 --> 00:00:55,490 image. 9 00:00:58,950 --> 00:01:04,090 With each image, there is an associated label from one of 10 classes. 10 00:01:06,340 --> 00:01:08,840 I'll show you the images after we import the dataset. 11 00:01:10,300 --> 00:01:13,510 Let's run this line of code to import the dataset. 12 00:01:20,310 --> 00:01:23,550 Note that there are multiple ways to import data. Here, 13 00:01:23,840 --> 00:01:27,130 we are using the inbuilt dataset that comes with the 14 00:01:27,160 --> 00:01:28,080 Keras library. 15 00:01:29,100 --> 00:01:33,450 So if the Keras package is not installed, this line will not work. 16 00:01:34,470 --> 00:01:41,200 If it is installed, you will get the dataset imported into this variable, which is 17 00:01:41,230 --> 00:01:42,750 fashion_mnist. 18 00:01:46,790 --> 00:01:51,890 You can see it on the right, in the Environment variables window. 19 00:01:54,280 --> 00:01:56,340 Next, view this dataset by clicking on it. 20 00:02:00,640 --> 00:02:05,140 Here you can see that this dataset has two parts, train and test.
21 00:02:06,580 --> 00:02:11,110 This means that it is already divided into two parts, training and testing. 22 00:02:12,730 --> 00:02:14,470 We do not need to do this separately. 23 00:02:15,310 --> 00:02:22,330 However, if you want to learn how to separate any dataset into train and test, which is not in this 24 00:02:22,330 --> 00:02:25,610 format, please see the opening section of this course. 25 00:02:26,640 --> 00:02:29,320 There you will find a lecture titled Test-Train Split. 26 00:02:30,610 --> 00:02:33,080 With that, you will be able to split any dataset. 27 00:02:34,470 --> 00:02:36,020 In this case, 28 00:02:36,190 --> 00:02:37,810 our dataset is already split. 29 00:02:39,430 --> 00:02:45,910 Let's go into the train set. It has two parts, x and y. 30 00:02:46,900 --> 00:02:52,060 x is the set of predictor variables and y is the list of output values, 31 00:02:52,720 --> 00:02:55,110 that is, the class of the fashion object. 32 00:02:58,940 --> 00:03:08,890 You can see the structure of x and y here as well. x is a set of 60,000 images, which are 28 pixels by 28 33 00:03:08,900 --> 00:03:09,300 pixels. 34 00:03:10,760 --> 00:03:16,800 So for each pixel of each image, we have a value between 0 and 255. 35 00:03:18,080 --> 00:03:21,530 If the value is 0, that pixel is black. 36 00:03:22,460 --> 00:03:25,780 If it is 255, that pixel is white. 37 00:03:27,440 --> 00:03:33,170 So each individual pixel's data for all the 60,000 images is stored 38 00:03:33,260 --> 00:03:34,430 in this x variable. 39 00:03:37,580 --> 00:03:46,440 Similarly, y has the class values of the 60,000 images. For example, the first image has the class 40 00:03:46,790 --> 00:03:50,090 9. What this 9 represents 41 00:03:50,390 --> 00:03:57,920 we'll look at in some time. Similar to the train set, we have the test data. The only difference is that in the training set 42 00:03:57,950 --> 00:04:00,810 we have 60,000 images' data, and in the test set 43 00:04:01,130 --> 00:04:02,660 we have 10,000.
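The layout described above (a list with train and test parts, each holding x and y) can be inspected with a few base R calls. The sketch below is an illustration, not the code from the video. It assumes the real object would come from `keras::dataset_fashion_mnist()`; to stay runnable without Keras installed, it builds a tiny stand-in with the same layout. `make_standin` and `describe_split` are hypothetical helper names.

```r
# Stand-in with the same layout as the Keras Fashion MNIST object:
# a list with $train and $test, each holding $x (images) and $y (labels).
# With Keras installed, the real object would instead come from:
#   fashion_mnist <- keras::dataset_fashion_mnist()
make_standin <- function(n_train = 6, n_test = 2) {
  make_split <- function(n) {
    list(
      # n images of 28 x 28 pixels, integer intensities in 0..255
      x = array(sample(0:255, n * 28 * 28, replace = TRUE), dim = c(n, 28, 28)),
      # one class code (0..9) per image
      y = sample(0:9, n, replace = TRUE)
    )
  }
  list(train = make_split(n_train), test = make_split(n_test))
}

# Summarise one split: array dimensions, label count, pixel value range.
describe_split <- function(split) {
  list(
    dim_x   = dim(split$x),
    n_y     = length(split$y),
    range_x = range(split$x)
  )
}

set.seed(1)
fashion_mnist <- make_standin()
names(fashion_mnist)             # "train" "test"
train_info <- describe_split(fashion_mnist$train)
test_info <- describe_split(fashion_mnist$test)
train_info$dim_x                 # 6 28 28 for the stand-in
```

On the real dataset, the first dimension would be 60,000 for the train part and 10,000 for the test part, as stated in the lecture.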
44 00:04:04,250 --> 00:04:06,800 We will use this train data to train the model. 45 00:04:07,790 --> 00:04:14,230 And later on, we will predict the y values for the test set using the x values of the test set. 46 00:04:15,920 --> 00:04:22,550 Then we will compare the actual y values in the test set with the predicted y values from our model 47 00:04:23,150 --> 00:04:25,400 to find out the accuracy of our model. 48 00:04:26,960 --> 00:04:28,150 Now let's go back to our code. 49 00:04:30,300 --> 00:04:35,120 We'll be assigning the x and y train values to separate variables. 50 00:04:36,500 --> 00:04:42,470 To do that, this line of code is the standard way in which we assign a value to a variable. 51 00:04:43,820 --> 00:04:46,740 You can run this line also and it will give you the same result. 52 00:04:47,000 --> 00:04:54,740 It will assign the x value of the training set of the fashion_mnist variable to the train_images 53 00:04:54,800 --> 00:04:58,190 variable. However, Keras 54 00:04:58,250 --> 00:05:00,590 allows us to do that in a different way. 55 00:05:01,460 --> 00:05:08,180 In this format, you can assign the two variables train_images and train_labels at the same time. 56 00:05:09,690 --> 00:05:16,760 So if you run this line of code, it will assign the x values of train to train_images and the y 57 00:05:16,760 --> 00:05:18,980 values of train to train_labels. 58 00:05:20,690 --> 00:05:21,840 Let's run this line of code. 59 00:05:21,970 --> 00:05:31,470 Now you can see that we have a train_images variable and a train_labels variable. train_images has the 60 00:05:31,490 --> 00:05:33,950 x part and train_labels has the y part. 61 00:05:35,990 --> 00:05:38,420 The same goes for test_images and test_labels. 62 00:05:39,020 --> 00:05:39,380 Let's run 63 00:05:39,410 --> 00:05:40,130 this code also. 64 00:05:42,470 --> 00:05:49,300 And we have two more variables here. We have already seen the structure of the training data and test 65 00:05:49,310 --> 00:05:49,670 data.
66 00:05:50,360 --> 00:05:56,990 If you still want to check the structure of these new variables, you can run these two lines 67 00:05:56,990 --> 00:05:57,460 of code. 68 00:05:58,900 --> 00:05:59,350 dim, 69 00:05:59,800 --> 00:06:07,130 with the variable name in brackets, gives you the dimensions of the variable. This variable has three 70 00:06:07,130 --> 00:06:07,760 dimensions. 71 00:06:07,910 --> 00:06:15,210 The first is the 60,000 different images, and then 28 by 28 72 00:06:15,450 --> 00:06:22,270 for all the individual pixels. If you run the str command, which gives you the structure, there will 73 00:06:22,280 --> 00:06:26,720 be some additional information: that it has integer type values, 74 00:06:28,670 --> 00:06:31,120 and that the initial few values are 0, 0, 0, 0. 75 00:06:33,050 --> 00:06:40,210 So both of these are used for the same thing: to understand the structure of the variable that 76 00:06:40,220 --> 00:06:40,580 we have. 77 00:06:42,260 --> 00:06:47,100 Now let me show you the images so that you get a feel for what kind of data we have here. 78 00:06:49,160 --> 00:06:54,530 We can store the information of one image in a variable called fobject. 79 00:06:56,240 --> 00:07:05,150 So when I run this line of code, it will assign the information of the fifth image, all the pixels, 80 00:07:05,750 --> 00:07:06,680 to this object, 81 00:07:07,220 --> 00:07:08,150 which is fobject. 82 00:07:09,490 --> 00:07:10,310 Let's run this. 83 00:07:12,230 --> 00:07:16,430 You can see that fobject is a 28 by 28 two-dimensional array 84 00:07:17,300 --> 00:07:20,750 containing all the pixel data of this fifth image. 85 00:07:22,700 --> 00:07:30,870 Now, if you want to plot this image, you can run this line of code, which calls the 86 00:07:30,980 --> 00:07:31,790 plot function.
87 00:07:31,880 --> 00:07:39,560 Here we specify that this variable is to be plotted as a raster image. A raster image is basically a 88 00:07:39,580 --> 00:07:40,700 pixelated image. 89 00:07:41,510 --> 00:07:43,010 So let's run this line of code. 90 00:07:43,520 --> 00:07:45,800 Here you can see the image on the right. 91 00:07:46,760 --> 00:07:50,150 It's a small 28 by 28 pixel image, 92 00:07:50,450 --> 00:07:55,370 so the image quality is not good, but you can make out the object. 93 00:07:56,210 --> 00:08:00,320 It probably looks like a top. If you want to check what it is, 94 00:08:00,530 --> 00:08:05,050 we need to see the image label, which is stored in the train_labels 95 00:08:05,160 --> 00:08:05,510 variable. 96 00:08:09,070 --> 00:08:14,570 In the train_labels variable, we saw that the values are in a coded format; that is, they are written 97 00:08:14,570 --> 00:08:16,340 as numbers from 0 to 9. 98 00:08:18,290 --> 00:08:23,260 So to get the actual name of the class, we first create a class names 99 00:08:23,390 --> 00:08:25,430 array. This 100 00:08:25,490 --> 00:08:30,900 array contains the list of names in the order in which we have coded these names. 101 00:08:31,670 --> 00:08:33,420 So 0 stands for T-shirt. 102 00:08:35,570 --> 00:08:42,020 So if you see 9, 9 stands for ankle boot, and 2 stands for pullover. 103 00:08:42,540 --> 00:08:43,550 It starts with 0. 104 00:08:44,120 --> 00:08:45,170 This is the second element. 105 00:08:45,680 --> 00:08:46,790 This is the ninth element. 106 00:08:48,560 --> 00:08:57,050 Once we have created this array, we can find out the name of this object. We take the label of the fifth image 107 00:08:58,060 --> 00:09:00,680 in the train_labels variable. 108 00:09:02,750 --> 00:09:10,360 So it is the label of the fifth image plus one, because the coding started with 0. We just want the 109 00:09:10,370 --> 00:09:14,810 element at position label plus one from this array.
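The lookup described above can be sketched as follows. R vectors are indexed from 1 while the labels run from 0 to 9, hence the `+ 1`. The class names are the ten standard Fashion MNIST classes in label order. `train_labels` here is a short stand-in (only the first label, 9, and the fifth label, 0, are taken from the video; the rest are illustrative), and `label_name` is a hypothetical helper, not a Keras function.

```r
# The ten Fashion MNIST class names, in the order of their codes 0..9.
class_names <- c("T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                 "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot")

# Stand-in for the first few training labels (illustrative values).
train_labels <- c(9, 0, 0, 3, 0)

# Map a 0-based class code to its name; R indexing is 1-based, so add 1.
label_name <- function(labels, i) class_names[labels[i] + 1]

label_name(train_labels, 5)   # "T-shirt/top"
label_name(train_labels, 1)   # "Ankle boot"
```

With the real data loaded, the image itself could be drawn with something like `plot(as.raster(train_images[5, , ], max = 255))`, where `max = 255` tells `as.raster` the upper end of the pixel scale.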
110 00:09:16,760 --> 00:09:24,110 So let's first create this array and now find out the name of this fifth image. 111 00:09:25,520 --> 00:09:29,090 You can see that the fifth image is a T-shirt/top. 112 00:09:32,170 --> 00:09:34,720 You can check this again 113 00:09:34,910 --> 00:09:35,770 with another image. 114 00:09:36,370 --> 00:09:43,760 So let's try it out for the 19th image. Run this command with 19 and plot it. 115 00:09:43,870 --> 00:09:47,950 I mean, this looks like a sandal. 116 00:09:50,100 --> 00:09:51,220 Now if we take the 117 00:09:55,180 --> 00:10:00,860 label of this image and look up its class name, it comes out to be Sandal. 118 00:10:03,610 --> 00:10:04,660 So this is our data. 119 00:10:06,220 --> 00:10:13,240 We have created four variables. train_images contains all the predictor variables. 120 00:10:13,990 --> 00:10:18,610 train_labels contains the output variable. Using these two variables, 121 00:10:18,640 --> 00:10:20,260 we will be training our model. 122 00:10:21,770 --> 00:10:26,700 Then we will be using that model to predict on the test images, 123 00:10:27,250 --> 00:10:29,890 and we will compare the predictions with the test labels. 124 00:10:32,980 --> 00:10:38,740 The last thing I'm going to discuss in this video is normalization of data. 125 00:10:40,450 --> 00:10:46,780 When we have heterogeneous data, the learning model takes a lot of time to converge. To handle this problem, 126 00:10:47,110 --> 00:10:50,920 we normalize the data. To normalize data, 127 00:10:51,310 --> 00:10:53,770 the usual general formula is this: 128 00:10:54,430 --> 00:11:01,630 we subtract the mean of that variable from the value and divide by the standard deviation. 129 00:11:02,920 --> 00:11:04,330 So this is the general formula. 130 00:11:05,260 --> 00:11:14,020 But our training data is not that heterogeneous: every value is a pixel value between 131 00:11:14,020 --> 00:11:15,310 0 and 255.
132 00:11:17,410 --> 00:11:22,760 So we can just divide all the pixel values by 255. 133 00:11:23,800 --> 00:11:26,410 This will result in values between 0 and 1. 134 00:11:27,520 --> 00:11:30,870 And we can input these values into our model. 135 00:11:32,230 --> 00:11:36,760 So normalization is required when we have different types of variables in our dataset. 136 00:11:37,630 --> 00:11:38,590 If that is the case, 137 00:11:38,890 --> 00:11:43,930 use this formula to normalize. Here, since our data is already very homogeneous, 138 00:11:44,470 --> 00:11:49,660 we can just divide the pixel values by the highest value to get a 139 00:11:50,080 --> 00:11:51,370 simple normalized value. 140 00:11:53,530 --> 00:11:59,130 Now, using these train and test values, we will be creating a model in the next video.
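Both normalization approaches mentioned above can be sketched in base R. This is an illustration under assumptions: `pixels` is a small stand-in for the image array, `heights_cm` is made-up example data, and `zscore` and `scale_pixels` are hypothetical helper names, not functions from Keras.

```r
# General formula: subtract the mean and divide by the standard deviation.
zscore <- function(v) (v - mean(v)) / sd(v)

# Pixel data: every value already lies in 0..255, so divide by the maximum.
scale_pixels <- function(x) x / 255

# Stand-in for two 28 x 28 images with integer intensities.
set.seed(42)
pixels <- array(sample(0:255, 2 * 28 * 28, replace = TRUE), dim = c(2, 28, 28))

scaled <- scale_pixels(pixels)
range(scaled)                 # both ends lie within 0 and 1

# A made-up variable on a different scale, normalized with the general formula.
heights_cm <- c(150, 160, 170, 180, 190)
z <- zscore(heights_cm)
mean(z)                       # approximately 0
sd(z)                         # approximately 1
```

Applied to the variables in this lecture, the same division would be `train_images <- train_images / 255` and `test_images <- test_images / 255`. Both sets need the same scaling, so that the model sees test inputs on the scale it was trained on.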