1 00:00:00,290 --> 00:00:07,140 In all our previous sessions, we have done lots of analysis, lots of cleaning, lots of techniques 2 00:00:07,350 --> 00:00:12,970 for feature importance, feature encoding, and much, much more. So this session, 3 00:00:13,260 --> 00:00:16,470 this is the session that you have all been waiting for. 4 00:00:16,720 --> 00:00:23,340 In this session, we are going to apply a machine learning algorithm to our data because, to a great 5 00:00:23,340 --> 00:00:28,900 extent, our data is now ready for a machine learning algorithm. 6 00:00:29,220 --> 00:00:31,100 So let's open your assignment 7 00:00:31,110 --> 00:00:32,010 and see what you have to do. 8 00:00:32,370 --> 00:00:37,200 The very first statement is that you have to apply an algorithm. After that, 9 00:00:37,200 --> 00:00:39,840 you have to cross-validate your model as well. 10 00:00:40,380 --> 00:00:47,070 So for this, the very first thing you have to do is split your data into training 11 00:00:47,070 --> 00:00:52,100 and testing sets, because once you have training data, you can train your model, 12 00:00:52,440 --> 00:00:57,480 and once you have test data, you can check what the accuracy of your model is. 13 00:00:57,900 --> 00:01:04,170 So for this, I'm going to say: from my sklearn module, I have something called 14 00:01:04,170 --> 00:01:05,160 model_selection, 15 00:01:05,400 --> 00:01:11,430 and from this I have to import something known as train_test_split. Just execute 16 00:01:11,430 --> 00:01:11,520 it. 17 00:01:11,550 --> 00:01:15,240 Now you simply have to call this function. 18 00:01:15,240 --> 00:01:21,540 And if I press Shift+Tab here, you will see all these parameters: 19 00:01:21,540 --> 00:01:28,380 test_size, train_size, shuffle, stratify, and all these different things.
20 00:01:28,380 --> 00:01:34,920 And what will it return? It will return X_train, X_test, y_train, and y_test. 21 00:01:35,430 --> 00:01:42,780 So here I have to pass my X and y, and after that I have to pass my test_size. 22 00:01:42,780 --> 00:01:47,890 So I'm going to say test_size is, let's say, 0.25. 23 00:01:47,910 --> 00:01:53,190 It means 25 percent of your data will be treated as test data and the rest will be your 24 00:01:53,550 --> 00:01:54,220 training data. 25 00:01:54,230 --> 00:01:57,930 So that is all for these parameters. 26 00:01:57,930 --> 00:02:05,820 So I'm going to execute this cell, and let me pass one more parameter, which is random 27 00:02:05,820 --> 00:02:12,290 underscore state. I'm going to say random_state and just set it to 28 00:02:12,300 --> 00:02:14,380 zero, and execute this. 29 00:02:14,520 --> 00:02:21,190 Now we simply have to apply our machine learning algorithm to our data. 30 00:02:21,570 --> 00:02:28,200 So let's say we are going to apply logistic regression, because it is a classification 31 00:02:28,200 --> 00:02:28,940 algorithm. 32 00:02:29,250 --> 00:02:33,080 So here we have to apply logistic regression to our data. 33 00:02:33,090 --> 00:02:37,200 And if you are not that familiar with what logistic regression is, 34 00:02:37,410 --> 00:02:42,540 I have explained the mathematical intuition behind logistic regression. 35 00:02:42,870 --> 00:02:50,100 Just check out my previous videos and you will find the mathematics behind it and how 36 00:02:50,100 --> 00:02:53,940 logistic regression works in machine learning. 37 00:02:54,150 --> 00:02:57,900 So first, I have to import something from scikit-learn.
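The split described above can be sketched roughly as follows. This is only an illustration: the course's own X and y are not available here, so a synthetic dataset from make_classification stands in for them (an assumption, not the booking data used in the video). The test_size and random_state values are the ones mentioned in the session.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the course's X (features) and y (0/1 target).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Hold out 25% of the rows as test data; the remaining 75% is training data.
# random_state=0 makes the shuffle, and therefore the split, repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

print(X_train.shape, X_test.shape)
```

With 1000 rows, this leaves 750 rows for training and 250 for testing.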
38 00:02:57,910 --> 00:03:01,440 From sklearn.linear_model, 39 00:03:01,440 --> 00:03:04,550 I have to import my LogisticRegression. 40 00:03:04,560 --> 00:03:07,140 So just execute the cell. 41 00:03:07,210 --> 00:03:10,430 Now we have to initialize this logistic regression. 42 00:03:10,440 --> 00:03:17,700 So I'm going to say LogisticRegression and just initialize it. Let's say you have some custom parameters 43 00:03:17,700 --> 00:03:17,910 here. 44 00:03:17,910 --> 00:03:21,520 You can see you have dozens of parameters here. 45 00:03:21,810 --> 00:03:28,080 So let me store its object as, let's say, logreg, and execute the cell. 46 00:03:28,120 --> 00:03:33,120 Now, using this object, I'm going to call fit, and what do I have 47 00:03:33,120 --> 00:03:43,140 to fit? I have to fit my training data, which is X_train and my y_train, 48 00:03:43,140 --> 00:03:45,330 because I have to train the model. 49 00:03:45,330 --> 00:03:45,870 That's it. 50 00:03:46,080 --> 00:03:47,850 And just execute the cell. 51 00:03:47,850 --> 00:03:53,040 It will take a while, and it will return my amazing logistic regression model. 52 00:03:53,040 --> 00:03:55,940 You will see all these custom parameters here: 53 00:03:55,950 --> 00:03:58,870 what the parameter for class 54 00:03:58,890 --> 00:04:01,850 weight is, what all your different parameters are, 55 00:04:02,100 --> 00:04:06,800 what your regularization technique is, all these things. 56 00:04:07,170 --> 00:04:11,750 Now you simply have to do prediction on your testing data. 57 00:04:11,760 --> 00:04:16,860 So I'm going to say logreg.predict. What do I have to predict?
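A minimal sketch of the import, initialization, and fit steps just described. The name logreg is a best guess at the variable name spoken in the video, and the data is again a synthetic stand-in (an assumption), so the numbers will not match the session.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the course's X and y.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Initialize the classifier with its default parameters.
logreg = LogisticRegression()

# Train on the training portion only; the test rows stay unseen.
logreg.fit(X_train, y_train)

# Inspect the parameters the model was built with
# (class_weight, the regularization strength C, and so on).
print(logreg.get_params())
```

Recent scikit-learn versions print only non-default parameters when the fitted model is displayed, so get_params() is used here to list all of them.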
58 00:04:16,860 --> 00:04:22,380 I have to predict on my X_test. After doing the prediction, let's say I store my predictions 59 00:04:22,380 --> 00:04:25,710 as, I'm going to say, y_pred. 60 00:04:25,950 --> 00:04:27,180 Just execute it. 61 00:04:27,180 --> 00:04:33,270 And if I print this and execute this cell, all of this gets executed, and you will 62 00:04:33,270 --> 00:04:37,650 see all the predictions in the form of an array here. Wherever it is one, 63 00:04:37,890 --> 00:04:43,560 it means your booking is going to be canceled, and wherever it is zero, it means your booking is not going to 64 00:04:43,560 --> 00:04:43,920 be canceled. 65 00:04:44,010 --> 00:04:48,270 That's why this is a classification use case. Now, 66 00:04:48,270 --> 00:04:55,490 we need the confusion matrix of my logistic regression model. 67 00:04:55,830 --> 00:04:59,940 So for this, I'm going to say from sklearn, I am 68 00:05:00,140 --> 00:05:07,190 importing something which is my metrics, and from metrics I have to import something 69 00:05:07,190 --> 00:05:08,760 which is my confusion_matrix. 70 00:05:09,050 --> 00:05:15,470 Now, to this confusion_matrix, I have to pass my first array, which is my 71 00:05:15,830 --> 00:05:17,150 y_test. 72 00:05:17,480 --> 00:05:22,280 The second is my y_pred. Just execute it. 73 00:05:22,280 --> 00:05:24,220 And this is your amazing matrix. 74 00:05:24,230 --> 00:05:30,740 You will see these are the correct predictions, and these are also correct predictions, 75 00:05:30,740 --> 00:05:34,100 and these are your incorrect predictions. 76 00:05:34,340 --> 00:05:35,420 So let's see. 77 00:05:35,420 --> 00:05:41,120 I just need to see what the accuracy of my model is. So for this, 78 00:05:41,120 --> 00:05:41,900 here is what I'm going to do.
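The prediction and confusion matrix steps above might look like this. It is a sketch on synthetic stand-in data (an assumption), so the 0 and 1 labels here are generic classes rather than "not canceled" and "canceled" bookings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the course's X and y.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
logreg = LogisticRegression().fit(X_train, y_train)

# Predict a 0/1 label for every row of the held-out test set.
y_pred = logreg.predict(X_test)
print(y_pred[:10])

# True labels first, predictions second.
# Rows are true classes, columns are predicted classes:
# the diagonal holds correct predictions, the off-diagonal cells hold errors.
cm = confusion_matrix(y_test, y_pred)
print(cm)
```

The argument order matters: swapping y_test and y_pred transposes the matrix.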
79 00:05:41,900 --> 00:05:51,320 I'm going to say from sklearn.metrics, I have to import something known as 80 00:05:51,320 --> 00:05:52,400 accuracy_score. 81 00:05:52,760 --> 00:05:56,090 Now I'm going to say accuracy_score, 82 00:05:56,360 --> 00:05:58,790 and to this accuracy_score I have to pass 83 00:05:58,820 --> 00:06:00,470 y_test. After that, 84 00:06:00,470 --> 00:06:04,510 I have to pass y_pred, and just execute the cell. 85 00:06:04,520 --> 00:06:11,380 You will see your logistic regression model has somewhere close to 72 percent accuracy. 86 00:06:11,420 --> 00:06:17,230 So if you are thinking this is the exact accuracy of my logistic regression model, 87 00:06:17,570 --> 00:06:19,420 then no, you are wrong. 88 00:06:19,430 --> 00:06:23,720 This is not the true accuracy of the logistic regression model. 89 00:06:23,720 --> 00:06:31,100 It means you have to come up with the right accuracy. So how can you get that exact accuracy? 90 00:06:31,130 --> 00:06:34,910 In such scenarios, you have to cross-validate your model. 91 00:06:35,240 --> 00:06:36,470 So let me show you a thing. 92 00:06:36,470 --> 00:06:42,050 Let's say I change this random_state and set its value to, let's say, 93 00:06:42,050 --> 00:06:42,560 42. 94 00:06:42,980 --> 00:06:50,240 So I'm going to execute again, execute again, and execute again. Let me show you a thing. 95 00:06:50,810 --> 00:06:57,980 And again I'm going to execute it, and it all gets executed, and let me show you what the accuracy is 96 00:06:57,980 --> 00:07:01,130 after playing with my random_state parameter. 97 00:07:01,340 --> 00:07:03,940 And again, I'm going to execute the cell. 98 00:07:03,960 --> 00:07:10,630 Now you will see your accuracy jumps from 71 to 75. 99 00:07:11,000 --> 00:07:12,730 So you will be wondering about this.
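The experiment above, scoring one hold-out split and then changing random_state, can be sketched as below. The helper holdout_accuracy is a hypothetical name introduced only for this illustration, and the data is a synthetic stand-in (an assumption), so the printed values will differ from the 71 and 75 percent seen in the session.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the course's X and y.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)


def holdout_accuracy(seed):
    """Hypothetical helper: split with the given seed, fit, and score once."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    logreg = LogisticRegression().fit(X_train, y_train)
    y_pred = logreg.predict(X_test)
    # Fraction of test rows whose predicted label equals the true label.
    return accuracy_score(y_test, y_pred)


# The same model and data can score differently depending on which rows
# happen to land in the test set.
for seed in (0, 42):
    print(seed, holdout_accuracy(seed))
```

Each seed gives a repeatable score, but different seeds can give different scores, which is why a single hold-out accuracy is better reported as a range than as one fixed number.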
100 00:07:12,770 --> 00:07:14,370 What exactly is going on here? 101 00:07:14,720 --> 00:07:17,720 So that's the scenario where you have to cross-validate your model. 102 00:07:18,140 --> 00:07:25,040 So whenever you get your accuracy without cross-validating your model, you always have to say: I have 103 00:07:25,040 --> 00:07:27,830 achieved accuracy within this range. 104 00:07:28,820 --> 00:07:35,270 It means that to get the exact accuracy, you have to cross-validate your model. For this, 105 00:07:35,270 --> 00:07:41,480 I'm going to say from sklearn, I have something known as model selection. 106 00:07:41,870 --> 00:07:45,110 So I'm going to say model_selection, 107 00:07:45,110 --> 00:07:52,250 and from this I have to import something known as cross_val_score. Just execute the 108 00:07:52,250 --> 00:07:52,750 cell. 109 00:07:52,790 --> 00:07:58,070 Now, first I have to call it, and just press Shift+Tab. 110 00:07:58,070 --> 00:07:58,580 You will see 111 00:07:59,150 --> 00:08:00,700 what your estimator is, 112 00:08:00,920 --> 00:08:01,680 what X is, 113 00:08:01,730 --> 00:08:05,340 what y is, and what your cv parameter is. 114 00:08:05,360 --> 00:08:12,050 So if you don't know what cross-validation is, just check out my previous videos. 115 00:08:12,050 --> 00:08:14,010 I have explained each and everything 116 00:08:14,010 --> 00:08:14,580 over there. 117 00:08:14,960 --> 00:08:18,630 So here, if you press Shift+Tab, you will get all these parameters. 118 00:08:18,650 --> 00:08:21,820 Now you have to set all of these. My estimator 119 00:08:21,930 --> 00:08:30,250 is logreg, my X is X, my y is y, and let's say I set my cross-validation, 120 00:08:30,280 --> 00:08:31,700 cv, equal to ten.
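A rough sketch of the cross-validation call being set up here, with the estimator, X, y, and cv=10 as described. The data is a synthetic stand-in (an assumption), and averaging the fold scores with mean() is shown as one common way to summarize them. The result is an estimate that is more stable than a single split rather than a literally exact accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assumption: synthetic stand-in for the course's X and y.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

logreg = LogisticRegression()

# cv=10 splits the full data into 10 folds; the model is refit 10 times,
# each time scored on the one fold it did not see during training.
# For a classifier, the default score is accuracy.
scores = cross_val_score(logreg, X, y, cv=10)
print(scores)

# One summary number: the average accuracy across the 10 folds.
print(scores.mean())
```

Note that cross_val_score takes the whole X and y rather than the earlier train and test pieces, because it does its own splitting internally.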
121 00:08:32,000 --> 00:08:37,840 And whatever it returns to me, let's say I'm just going to execute the cell. 122 00:08:37,880 --> 00:08:46,070 Now you will see here, with respect to X, y, and the cv parameter, you have these scores. 123 00:08:46,290 --> 00:08:47,840 Let's store these somewhere. 124 00:08:47,840 --> 00:08:52,640 Let's say I'm going to store it in score, and let me execute it again. 125 00:08:52,940 --> 00:09:01,340 Now, to get your exact score, you can just call mean on it. That simple. 126 00:09:01,340 --> 00:09:03,330 You just have to execute the cell. 127 00:09:03,350 --> 00:09:10,240 And now you will see it has an accuracy of somewhere close to 70 percent. 128 00:09:10,250 --> 00:09:14,240 It means about 70 percent of your predictions are going to be 129 00:09:14,240 --> 00:09:14,880 correct. 130 00:09:14,900 --> 00:09:17,780 So that's all about this session. In this session, 131 00:09:17,780 --> 00:09:24,580 you saw how I implemented my machine learning algorithm on the data, then how I 132 00:09:24,640 --> 00:09:32,240 cross-validated my model as well, because this is the most important aspect of the machine learning approach. 133 00:09:32,720 --> 00:09:36,020 So I hope you loved this session and that it will be very helpful for you. 134 00:09:36,440 --> 00:09:37,080 Thank you. 135 00:09:37,130 --> 00:09:38,070 Have a nice day. 136 00:09:38,090 --> 00:09:38,960 Keep learning. 137 00:09:38,960 --> 00:09:39,830 Keep growing. 138 00:09:40,040 --> 00:09:40,820 Keep practicing.