1 00:00:00,630 --> 00:00:06,750 In the last lecture, we created the structure for our Multilayer Perceptron model. 2 00:00:08,190 --> 00:00:12,930 Now, before training this model, we need to set up the learning process. 3 00:00:15,300 --> 00:00:18,930 And to do that, we will use the compile method. 4 00:00:20,810 --> 00:00:22,860 We will first give the loss function. 5 00:00:23,310 --> 00:00:25,290 Then we will give the optimizer. 6 00:00:26,130 --> 00:00:34,530 And then the metrics we want to calculate to judge the performance of the model we are training. 7 00:00:34,590 --> 00:00:38,270 The loss function is sparse categorical crossentropy. 8 00:00:39,870 --> 00:00:47,640 We are using this because our Y data is available in the form of labels. 9 00:00:47,760 --> 00:00:55,830 We have specific labels for ten different items, and that's why we are using what we 10 00:00:55,830 --> 00:00:56,910 call sparse categorical crossentropy. 11 00:00:58,570 --> 00:01:06,600 If instead we had probabilities for each class in the Y variable, then we would have to use categorical 12 00:01:06,660 --> 00:01:07,580 crossentropy. 13 00:01:08,670 --> 00:01:13,620 But since we have labels, we are using sparse categorical crossentropy. 14 00:01:16,090 --> 00:01:22,620 And suppose we had binary labels such as yes/no or true/false. 15 00:01:23,110 --> 00:01:27,040 In that case, we would have to use binary crossentropy. 16 00:01:29,300 --> 00:01:35,300 You can get details of all these loss functions in the official Keras documentation. 17 00:01:36,530 --> 00:01:39,150 I have provided the link to that documentation. 18 00:01:39,740 --> 00:01:47,960 So if you open it, you will get details of all the parameters that this compile method can take. 19 00:01:49,370 --> 00:01:56,840 You can look at all the other optimizers, loss functions, and metrics in the following documentation.
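The compile step described here can be sketched as follows. This is a minimal illustration, assuming a tf.keras Sequential model like the one built in the previous lecture (the layer sizes here are placeholders, not necessarily the lecture's exact architecture):

```python
import tensorflow as tf

# A small Multilayer Perceptron for 28x28 images with 10 label classes.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Choosing the loss, as explained above:
#   integer labels (0-9)      -> "sparse_categorical_crossentropy"
#   one-hot class probabilities -> "categorical_crossentropy"
#   binary yes/no labels       -> "binary_crossentropy"
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])
```

The three parameters passed to compile are exactly the ones listed in the lecture: the loss function, the optimizer, and the metrics.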
20 00:02:00,250 --> 00:02:09,040 Then for the optimizer, we are using SGD. SGD simply stands for stochastic gradient descent. 21 00:02:10,960 --> 00:02:20,020 In other words, we are just telling Keras to perform the backpropagation algorithm. And for metrics, we are 22 00:02:20,020 --> 00:02:24,790 using accuracy, since we are building a classifier. 23 00:02:25,300 --> 00:02:26,740 We have to use accuracy. 24 00:02:27,010 --> 00:02:33,670 If you are building a regression model, you can use mean squared error and so on. 25 00:02:35,720 --> 00:02:41,090 So basically, we have to provide this information before fitting our training data. 26 00:02:42,320 --> 00:02:46,130 So inside this compile command, you are giving three parameters. 27 00:02:46,940 --> 00:02:47,630 Compiling. 28 00:02:51,920 --> 00:03:00,770 Now that we have compiled our model, the next step is to fit the X_train and y_train data to this model. 29 00:03:02,770 --> 00:03:09,020 For fitting the model, we are calling the dot fit method. 30 00:03:09,350 --> 00:03:14,590 And then we are providing X_train, y_train, and the number of epochs. 31 00:03:15,410 --> 00:03:22,770 I hope you remember what epochs are; we have discussed them in our theory lectures. By default, the epochs value 32 00:03:22,850 --> 00:03:23,840 is set to one. 33 00:03:25,550 --> 00:03:30,080 So if you don't mention epochs, by default the value is one. 34 00:03:32,660 --> 00:03:35,180 And then, since we have validation data as well, 35 00:03:35,540 --> 00:03:43,690 we are providing the X_valid and y_valid datasets that we have created in our previous lectures. 36 00:03:44,540 --> 00:03:49,700 We are storing this in another object, which we are calling model_history. 37 00:03:51,200 --> 00:03:53,210 So let's run this. 38 00:04:02,560 --> 00:04:10,350 You can see that at each epoch during the training, Keras displays the number of instances processed 39 00:04:10,720 --> 00:04:11,380 so far.
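The fit call described above can be sketched like this. To keep the sketch self-contained it uses small random stand-in arrays; in the lecture, X_train/y_train and X_valid/y_valid come from the dataset loaded in earlier lectures, and the model is the one already compiled:

```python
import numpy as np
import tensorflow as tf

# Random stand-in data, for illustration only.
X_train = np.random.rand(100, 28, 28)
y_train = np.random.randint(0, 10, size=100)
X_valid = np.random.rand(20, 28, 28)
y_valid = np.random.randint(0, 10, size=20)

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd", metrics=["accuracy"])

# epochs defaults to 1 if you don't mention it; here we set it explicitly.
# The returned History object is stored in model_history.
model_history = model.fit(X_train, y_train, epochs=3,
                          validation_data=(X_valid, y_valid))
```

During the run, Keras prints a progress bar per epoch with the number of batches processed so far, along with loss, accuracy, val_loss, and val_accuracy.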
40 00:04:15,730 --> 00:04:21,730 You can see there is a progress bar, and we are getting information for each epoch. 41 00:04:23,420 --> 00:04:32,710 And then we are also getting the loss, accuracy, validation loss, and validation accuracy during 42 00:04:32,800 --> 00:04:33,790 each epoch. 43 00:04:38,870 --> 00:04:42,890 So it will take some time depending on your system configuration. 44 00:04:44,540 --> 00:04:46,670 So I'm just fast-forwarding this. 45 00:04:59,130 --> 00:05:00,760 Now the training is complete. 46 00:05:01,300 --> 00:05:08,560 You can see that the loss on our training set is zero point zero eight and the accuracy is zero point nine 47 00:05:08,560 --> 00:05:17,920 seven, while for our validation set the loss is zero point three nine and the accuracy is zero point eight eight. 48 00:05:19,570 --> 00:05:23,200 Now just compare this with the first epoch. 49 00:05:23,200 --> 00:05:29,700 The accuracy on the validation set during our first epoch was around zero point eight. 50 00:05:33,820 --> 00:05:41,290 Now you can see that the validation accuracy is oscillating more than the training accuracy. 51 00:05:42,670 --> 00:05:47,740 So during the first epoch, the training accuracy score was lower. 52 00:05:48,790 --> 00:05:54,730 And after the last epoch, the accuracy score is zero point nine seven. 53 00:05:55,960 --> 00:06:01,400 So in each epoch, the training accuracy is increasing gradually. 54 00:06:04,430 --> 00:06:06,000 So now we have trained our model. 55 00:06:08,210 --> 00:06:12,770 There are a few more parameters that are available with the fit method. 56 00:06:15,080 --> 00:06:18,050 One important parameter is class weights. 57 00:06:19,460 --> 00:06:24,110 This is for when you have some uneven distribution of the classes in your Y variable. 58 00:06:25,010 --> 00:06:33,200 So suppose out of our 60,000 records, 50,000 were shirts, and the other 59 00:06:33,200 --> 00:06:37,820 nine categories are spread across the remaining 10,000 records.
60 00:06:38,570 --> 00:06:49,880 Then we would have to use class weights to give larger weight to underrepresented classes and lower weights 61 00:06:50,060 --> 00:06:51,530 to overrepresented 62 00:06:51,560 --> 00:07:00,470 classes. Since in our dataset the categories are uniformly spread and there is no uneven distribution 63 00:07:00,680 --> 00:07:01,700 of categories, 64 00:07:01,880 --> 00:07:04,440 that's why we are not using class weights. 65 00:07:06,140 --> 00:07:13,550 But if in your example there is under-representation of some specific classes, then you would have 66 00:07:13,550 --> 00:07:22,750 to use class weights. After fitting the model, you can call different attributes of our model 67 00:07:22,760 --> 00:07:24,020 history object. 68 00:07:24,980 --> 00:07:27,840 So you can call dot params. 69 00:07:28,550 --> 00:07:33,490 This will give you information about all the parameters that we have used in training this model. 70 00:07:36,770 --> 00:07:45,570 We have another attribute, dot epoch, that will give you details of each epoch, and the most important 71 00:07:45,600 --> 00:07:47,460 attribute is history. 72 00:07:47,820 --> 00:07:50,970 So if you write the object name 73 00:07:51,390 --> 00:07:52,980 and then write dot history, 74 00:07:54,990 --> 00:08:02,030 this will give you all the loss, accuracy, validation loss, and validation accuracy values in the form of a 75 00:08:02,040 --> 00:08:02,700 dictionary. 76 00:08:04,380 --> 00:08:09,960 So this is the loss value on our training set for the thirty epochs. 77 00:08:10,260 --> 00:08:15,930 Then we have the accuracy value on our training set for the thirty epochs. 78 00:08:17,460 --> 00:08:20,770 Then we have the validation loss for the thirty epochs. 79 00:08:21,330 --> 00:08:26,940 And lastly, the validation accuracy for the thirty epochs. 80 00:08:28,850 --> 00:08:33,380 So all the information which you were getting while training your model,
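A small sketch of class weights and of the History attributes mentioned above. The tiny random dataset, the three-class model, and the specific weight values are all illustrative assumptions, not the lecture's data; the pattern (a dict mapping class index to weight, passed to fit) is the tf.keras API:

```python
import numpy as np
import tensorflow as tf

# Illustrative imbalanced-style setup: 60 records, 3 classes.
X_train = np.random.rand(60, 4)
y_train = np.random.randint(0, 3, size=60)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation="softmax", input_shape=(4,)),
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd", metrics=["accuracy"])

# Hypothetical weights: down-weight an overrepresented class 0,
# keep the underrepresented classes at full weight.
weights = {0: 0.2, 1: 1.0, 2: 1.0}
history = model.fit(X_train, y_train, epochs=2,
                    class_weight=weights, verbose=0)

# The History object exposes the attributes described in the lecture:
print(history.params)           # parameters used during training
print(history.epoch)            # list of epoch indices, e.g. [0, 1]
print(history.history.keys())   # dict of loss/accuracy per epoch
```

Since the lecture's dataset has its categories uniformly spread, the class_weight argument is simply omitted there.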
81 00:08:33,560 --> 00:08:38,170 you can also access by using the history attribute. 82 00:08:39,760 --> 00:08:49,970 You can also plot this information to visualize how our accuracy scores are changing with each epoch. 83 00:08:51,050 --> 00:08:55,820 So here I am just plotting model_history dot history, 84 00:08:55,910 --> 00:08:57,690 the information that we have here. 85 00:08:58,910 --> 00:09:02,360 And then we want the grid lines in our plot. 86 00:09:03,080 --> 00:09:09,770 And then we want our Y-axis to be within zero and one. When you plot this, 87 00:09:09,830 --> 00:09:14,240 you will get a graph of this kind. On top, 88 00:09:14,990 --> 00:09:18,170 we have an orange line of training accuracy. 89 00:09:18,440 --> 00:09:21,590 Then we have a red line of validation accuracy. 90 00:09:22,160 --> 00:09:29,450 Then we have a green line of validation loss and then a blue line of training loss. 91 00:09:31,120 --> 00:09:39,470 You can see that with each epoch the training accuracy and the validation accuracy are increasing, and 92 00:09:39,770 --> 00:09:41,360 the loss is decreasing. 93 00:09:42,560 --> 00:09:51,140 You can also tell that the model has not converged yet, as the validation accuracy is still going up and 94 00:09:51,140 --> 00:09:53,660 the validation loss is still going down. 95 00:09:55,640 --> 00:10:00,380 So for our next try, we should train it for some more epochs. 96 00:10:02,690 --> 00:10:04,880 And if you call the fit method again, 97 00:10:06,410 --> 00:10:10,040 Keras will continue to train this model from where you left off. 98 00:10:11,120 --> 00:10:15,160 So that's why, if you just run this cell again, 99 00:10:15,950 --> 00:10:21,740 Keras will train this model for 30 more epochs, and it will continue from here. 100 00:10:23,240 --> 00:10:28,260 So try running it for 30 more epochs. In the next video, 101 00:10:28,670 --> 00:10:33,110 we will learn how to predict values using this model.
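The plotting step above (plot the history dict, turn on the grid, clamp the Y-axis to [0, 1]) can be sketched with pandas and matplotlib. The numbers in the stand-in dict below are made up for illustration; in the lecture, the dict comes from model_history.history:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render off-screen so this runs without a display
import matplotlib.pyplot as plt

# Stand-in for model_history.history (real values come from training).
hist = {"loss":         [0.70, 0.50, 0.40],
        "accuracy":     [0.75, 0.85, 0.90],
        "val_loss":     [0.80, 0.60, 0.50],
        "val_accuracy": [0.70, 0.80, 0.85]}

# One line per metric: training/validation accuracy and loss.
pd.DataFrame(hist).plot()
plt.grid(True)             # grid lines in our plot
plt.gca().set_ylim(0, 1)   # keep the Y-axis within 0 and 1
plt.savefig("learning_curves.png")
```

With real training data, this produces the graph described above: accuracy curves rising, loss curves falling with each epoch.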
102 00:10:33,710 --> 00:10:34,130 Thank you.