1 00:00:00,120 --> 00:00:06,840 Hello all, before diving deep into this session, let's have a quick recap of what we have done 2 00:00:07,140 --> 00:00:09,220 in this section since the very first session. 3 00:00:09,240 --> 00:00:14,820 We have collected our data, then we have prepared our data for the analysis as well as for the modelling 4 00:00:14,820 --> 00:00:18,700 purposes, because in practice you will never get clean data. 5 00:00:18,780 --> 00:00:21,940 You will always get complex, raw data. 6 00:00:21,960 --> 00:00:28,050 You always have to ready your data, whether it is for machine learning, deep learning, or any kind 7 00:00:28,050 --> 00:00:30,160 of project that you are working on. 8 00:00:30,510 --> 00:00:34,080 So we have gone through a lot of preparation. 9 00:00:34,320 --> 00:00:42,350 And after that, in the last session, we had this amazing map, with which we analyzed 10 00:00:42,370 --> 00:00:44,010 the home country of the guests. 11 00:00:44,310 --> 00:00:50,220 After that, we analyzed the distribution of each and every room. That's what 12 00:00:50,220 --> 00:00:51,780 we have done in the previous session. 13 00:00:52,290 --> 00:00:56,700 So in this session, we have this amazing problem statement. 14 00:00:57,120 --> 00:01:04,870 How does the price per night vary, or you can say, how does the price per night vary over the year? 15 00:01:04,900 --> 00:01:11,510 You have to consider this problem statement for both the resort hotel as well as the city hotel. 16 00:01:11,550 --> 00:01:13,100 It means you need to do it for both. 17 00:01:13,320 --> 00:01:15,540 So first, I'm just going to say data, 18 00:01:15,540 --> 00:01:19,290 the hotel column, is equal equal to, 19 00:01:19,290 --> 00:01:24,720 I will say, Resort Hotel, because the very first is my resort hotel. 20 00:01:24,720 --> 00:01:28,560 So I can say this is exactly my first condition.
21 00:01:28,830 --> 00:01:33,900 And once I have this first condition, then I have to add my second condition as well. 22 00:01:33,900 --> 00:01:35,340 So my second condition is nothing 23 00:01:35,340 --> 00:01:40,020 but data, is_canceled, equal 24 00:01:40,110 --> 00:01:44,130 equal to zero, because I just need my valid bookings. 25 00:01:44,340 --> 00:01:51,110 Once I have this filter, I just have to pass this filter to my data frame so that I will have my filtered 26 00:01:51,140 --> 00:01:52,590 data frame. I'm 27 00:01:52,590 --> 00:01:54,720 just going to copy this from here. 28 00:01:54,720 --> 00:02:00,950 And again, I just have to paste it. This time I'm going to say here I have my City Hotel. 29 00:02:01,200 --> 00:02:07,680 So the very first data frame, let's say, I'm going to call data_resort. 30 00:02:08,100 --> 00:02:12,900 Once I have done all these things, the second data frame, I'm going to say, is exactly my 31 00:02:12,900 --> 00:02:16,020 data_city, and we just have to execute it. 32 00:02:16,320 --> 00:02:20,890 So these are exactly the data frames that you have to consider. 33 00:02:20,910 --> 00:02:25,430 Let me show you a quick preview of what this data frame looks like. 34 00:02:25,440 --> 00:02:28,260 Let's say I'm going to check a preview of the data. 35 00:02:28,260 --> 00:02:34,760 And as a result, you will see these are all your entries with respect to your resort hotel. 36 00:02:35,010 --> 00:02:37,020 So your problem statement is nothing 37 00:02:37,020 --> 00:02:41,450 but how does the price per night vary over the year. 38 00:02:42,300 --> 00:02:51,780 So let me check how exactly your price varies over the months, because this is exactly similar to your 39 00:02:51,780 --> 00:02:53,460 problem statement. So for 40 00:02:53,460 --> 00:02:54,630 this, what do we have to do? 41 00:02:54,630 --> 00:02:59,340 You will see here you have an arrival_date_month column.
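The filtering step described above can be sketched roughly as follows. This is only an illustration on a tiny made-up frame: the column names `hotel`, `is_canceled`, `arrival_date_month` and `adr` (the price per night) and the labels `Resort Hotel` and `City Hotel` are assumptions based on the commonly used public hotel-bookings dataset, not something the recording confirms.

```python
import pandas as pd

# Tiny stand-in for the bookings data; column names and values are assumptions.
data = pd.DataFrame({
    "hotel": ["Resort Hotel", "Resort Hotel", "City Hotel", "City Hotel"],
    "is_canceled": [0, 1, 0, 0],
    "arrival_date_month": ["July", "July", "July", "August"],
    "adr": [150.0, 120.0, 110.0, 115.0],
})

# Two conditions per hotel: the hotel type, and "not cancelled" (valid bookings).
resort_filter = (data["hotel"] == "Resort Hotel") & (data["is_canceled"] == 0)
city_filter = (data["hotel"] == "City Hotel") & (data["is_canceled"] == 0)

# Passing each boolean filter to the data frame gives the filtered frames.
data_resort = data[resort_filter]
data_city = data[city_filter]

print(data_resort.head())
```

Each comparison needs its own parentheses, because `&` binds more tightly than `==` in Python.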
42 00:02:59,610 --> 00:03:05,130 So you have to consider this column, and you will see here you have lots of values in this 43 00:03:05,160 --> 00:03:06,330 arrival_date_month. 44 00:03:06,720 --> 00:03:10,290 It means you have to group by this feature. 45 00:03:10,290 --> 00:03:13,880 You have to group your data frame on the basis of this feature. 46 00:03:14,250 --> 00:03:23,610 So I'm just going to say data_resort, and I have to access basically my 47 00:03:23,610 --> 00:03:26,280 arrival_date_month. Or what you can do, 48 00:03:26,280 --> 00:03:28,220 you have an alternative for this one. 49 00:03:28,500 --> 00:03:33,630 So just call groupby, and in this groupby just pass this feature. 50 00:03:33,780 --> 00:03:35,260 Both are similar. Here 51 00:03:35,310 --> 00:03:37,860 I'm going to say, after that, I have to access my price. 52 00:03:38,040 --> 00:03:43,590 And on this price, if I call mean over there and execute it, you will see 53 00:03:43,590 --> 00:03:49,800 with respect to April, you have these stats; with respect to August and all these months, you have 54 00:03:49,800 --> 00:03:50,720 these stats. 55 00:03:51,120 --> 00:03:54,350 Now what you need: you have to convert this into a data frame. 56 00:03:54,360 --> 00:03:57,670 I'm just going to say reset_index. 57 00:03:57,730 --> 00:03:58,380 That's it. 58 00:03:58,620 --> 00:04:00,610 This will give me my amazing data frame. 59 00:04:00,930 --> 00:04:02,870 Now, let me store it somewhere. 60 00:04:02,890 --> 00:04:07,470 I'm just going to say resort_hotel, let's say. 61 00:04:07,740 --> 00:04:12,960 And after that, what we have to do: let's say I'm going to print it as well, which is exactly this 62 00:04:12,960 --> 00:04:14,340 one. Just execute it. 63 00:04:14,700 --> 00:04:20,730 This is the data frame that you need with respect to your resort hotel.
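As a small sketch of the grouping step just described, assuming the price column is called `adr` (the recording only says "price") and using invented numbers:

```python
import pandas as pd

# Made-up resort bookings; "adr" as the price-per-night column is an assumption.
data_resort = pd.DataFrame({
    "arrival_date_month": ["August", "April", "August", "April"],
    "adr": [200.0, 80.0, 180.0, 100.0],
})

# Group by month, take the mean price, then turn the result back into a data frame.
resort_hotel = (
    data_resort.groupby("arrival_date_month")["adr"]
    .mean()
    .reset_index()
)

print(resort_hotel)
```

Note that `groupby` sorts the group keys, so the months come out alphabetically (April before August), not in calendar order.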
64 00:04:21,060 --> 00:04:26,990 You have to follow a similar approach for your city hotel as well. 65 00:04:27,000 --> 00:04:35,070 So let me just copy all this and let me just paste it. Here I 66 00:04:35,070 --> 00:04:38,630 have to follow a similar approach for my data_city. 67 00:04:38,760 --> 00:04:42,600 So just remove data_resort and use data_city. 68 00:04:42,960 --> 00:04:45,550 And this time I'm going to say this is nothing 69 00:04:45,580 --> 00:04:52,560 but my city_hotel, and I have to print my city_hotel as well. 70 00:04:52,890 --> 00:04:56,280 So this is exactly my city hotel. 71 00:04:56,520 --> 00:04:59,910 You will see, with respect to the city hotel, you have 72 00:04:59,950 --> 00:05:00,750 these stats. 73 00:05:01,210 --> 00:05:08,910 Now, what you have to do: you have to merge both data frames on the basis of this 74 00:05:08,920 --> 00:05:14,810 arrival_date_month, because in both data frames you have this common column name. 75 00:05:14,830 --> 00:05:19,600 So for this purpose, you just have to call an inbuilt function of pandas. 76 00:05:19,600 --> 00:05:22,000 The merge function will be very handy here. 77 00:05:22,510 --> 00:05:29,980 So this time I'm just going to say resort_hotel.merge, and you can check all the 78 00:05:29,980 --> 00:05:33,400 custom parameters of this function. 79 00:05:33,760 --> 00:05:35,790 So you will see here you have right. 80 00:05:35,800 --> 00:05:38,990 It means what exactly is your right data frame. 81 00:05:39,010 --> 00:05:45,580 So my right data frame is exactly my city_hotel, and this resort_hotel is exactly my left data frame. 82 00:05:46,030 --> 00:05:52,150 And here I have one parameter, which is on; it means on what basis you have to merge. Here 83 00:05:52,180 --> 00:05:54,580 I'm going to say I have to merge on this column.
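The merge described here might look like the sketch below. The two small frames stand in for the per-month averages; their values and the `adr` column name are assumptions made for illustration.

```python
import pandas as pd

# Stand-ins for the per-month average prices of each hotel (values are invented).
resort_hotel = pd.DataFrame({"arrival_date_month": ["April", "August"], "adr": [90.0, 190.0]})
city_hotel = pd.DataFrame({"arrival_date_month": ["April", "August"], "adr": [105.0, 115.0]})

# resort_hotel is the left frame, city_hotel the right one; join on the shared month column.
final = resort_hotel.merge(city_hotel, on="arrival_date_month")

print(final)
```

Because both frames have a column named `adr`, pandas appends its default suffixes, so the merged frame carries `adr_x` (left, resort) and `adr_y` (right, city). That is why renaming the columns afterwards is useful.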
84 00:05:54,940 --> 00:05:58,480 If I execute it, it will return these amazing stats. 85 00:05:58,870 --> 00:06:03,450 Now, let's say I have to rename my columns, so let me first store this somewhere. 86 00:06:03,460 --> 00:06:06,580 I'm going to say this is exactly my final. 87 00:06:06,820 --> 00:06:13,620 And after storing it, I'm just going to say final.columns is equal 88 00:06:13,630 --> 00:06:20,230 to, let's say: the very first feature I'm going to call month; the second feature I'm going 89 00:06:20,230 --> 00:06:24,580 to call, let's say, price_for_resort. 90 00:06:25,210 --> 00:06:30,490 The third feature, I'm going to say, is nothing but the price for the city hotel. 91 00:06:30,500 --> 00:06:34,420 I'm going to say price_for_city_hotel. 92 00:06:34,420 --> 00:06:34,960 That's it. 93 00:06:35,320 --> 00:06:38,470 And what I have to do: I just have to print this data. 94 00:06:38,770 --> 00:06:41,950 I'm going to say just print this final. That's it. 95 00:06:41,950 --> 00:06:43,120 Just execute it. 96 00:06:43,420 --> 00:06:45,760 This is your amazing data frame. 97 00:06:46,540 --> 00:06:51,270 Still, you have to do one more preprocessing step on this data frame. 98 00:06:51,590 --> 00:06:54,310 I hope you have all spotted it. 99 00:06:54,460 --> 00:07:01,660 Yes, it is exactly this month column, because you will see in this month column you don't have a proper 100 00:07:01,660 --> 00:07:02,280 hierarchy. 101 00:07:02,290 --> 00:07:09,690 And if you use your visualization functions directly on this final data frame, 102 00:07:09,850 --> 00:07:15,150 you will get improper conclusions, because this month column is not in the proper order. 103 00:07:15,190 --> 00:07:17,780 You will see April, August, December. 104 00:07:17,800 --> 00:07:19,240 It makes no sense at all. 105 00:07:19,390 --> 00:07:22,090 It means you have to sort this month column.
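To make the renaming step and the ordering problem concrete, here is a minimal sketch. The starting column names (`adr_x`, `adr_y`) and all the numbers are assumptions; the new names follow what is said in the session, written with underscores.

```python
import pandas as pd

# Stand-in for the merged frame; months arrive alphabetically, as after a groupby.
final = pd.DataFrame({
    "arrival_date_month": ["April", "August", "December", "February"],
    "adr_x": [75.0, 180.0, 68.0, 54.0],
    "adr_y": [111.0, 118.0, 88.0, 86.0],
})

# Assigning a list to .columns renames every column positionally.
final.columns = ["month", "price_for_resort", "price_for_city_hotel"]

# The month column is in alphabetical order, not calendar order.
months = final["month"].tolist()
print(months)
print(months == sorted(months))
```

A line chart drawn from this frame would connect April to August to December, which is why the month column has to be sorted into calendar order first.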
106 00:07:22,330 --> 00:07:28,570 So there are two ways. Either you can use your own logic, your own programming skills, 107 00:07:28,570 --> 00:07:34,050 but seriously, that gets very complex if you write your own logic. 108 00:07:34,330 --> 00:07:40,220 Or, yes, Python comes to the rescue: there is some ready-made functionality, so you can just 109 00:07:40,220 --> 00:07:41,770 use those Python functions, 110 00:07:41,770 --> 00:07:45,080 and Python will do that task for you. 111 00:07:45,370 --> 00:07:48,760 So for this, you have to import a module, and you have to install it first. 112 00:07:48,760 --> 00:07:51,350 If you haven't installed it, you have to install it as well. 113 00:07:51,610 --> 00:07:59,590 So if you haven't installed it, you can install it using pip install, which is exactly my 114 00:07:59,830 --> 00:08:04,170 sort-dataframeby-monthorweek. So 115 00:08:04,180 --> 00:08:11,510 this is exactly the module that will come into play, and it has a dependency package as well. 116 00:08:11,890 --> 00:08:19,030 So if you haven't installed the dependency, you also have to install it, which is exactly my pip install 117 00:08:19,030 --> 00:08:23,350 sorted-months-weekdays. 118 00:08:23,470 --> 00:08:25,170 So you also have to install it. 119 00:08:25,450 --> 00:08:28,270 But first you have to install this one; after that, 120 00:08:28,270 --> 00:08:32,920 you have to install this one, because this is exactly the dependency package of this one. 121 00:08:33,340 --> 00:08:38,620 So if you haven't installed it, you can simply install it using this. After all this, I have 122 00:08:38,620 --> 00:08:39,920 to simply import it. 123 00:08:40,080 --> 00:08:48,760 I'm going to say import sort_dataframeby_monthorweek, 124 00:08:48,760 --> 00:08:53,680 and I'm going to create its alias as, let's say, sd. Just execute it.
125 00:08:53,680 --> 00:09:02,080 And now what we have to do: we have to define a function, sort_data, and whatever parameters 126 00:09:02,080 --> 00:09:04,770 this function will receive, we will define later. 127 00:09:05,140 --> 00:09:12,040 And now, using this sd, I have an inbuilt function, which is exactly my 128 00:09:12,050 --> 00:09:20,560 Sort_Dataframeby_Month. Just press Shift+Tab and you will get all the documentation: first, what exactly is 129 00:09:20,560 --> 00:09:22,600 your data frame, and second, 130 00:09:22,600 --> 00:09:23,860 what exactly is your column name. 131 00:09:24,370 --> 00:09:30,060 So here I'm going to say it is nothing but, let's say, df, and whatever column name it receives, 132 00:09:30,070 --> 00:09:35,470 I'm just going to receive it from my own function, which is exactly sort_data. 133 00:09:35,740 --> 00:09:40,440 So I'm going to say this is nothing but my df, and this is nothing but my column 134 00:09:40,440 --> 00:09:40,990 name. 135 00:09:40,990 --> 00:09:45,580 And here I have to say column name. That's it. I have to simply return it. 136 00:09:45,790 --> 00:09:49,770 So I'm going to say just return all this. Execute it. 137 00:09:49,780 --> 00:09:57,160 Now what you have to do: you have to call this function, and this time you have to use your final data 138 00:09:57,160 --> 00:09:57,520 frame. 139 00:09:57,850 --> 00:09:59,860 And this time what I'm going to do, I'm just 140 00:09:59,920 --> 00:10:07,510 going to say my column name is month. Just execute it, and it will take a while and return this amazing 141 00:10:07,930 --> 00:10:12,930 data frame for you, where you will see this month column is sorted right now. 142 00:10:12,970 --> 00:10:15,070 Next, I have to store it somewhere, let's say. 143 00:10:15,280 --> 00:10:19,420 I would say this is my final data frame that I have to consider. 144 00:10:19,450 --> 00:10:21,540 So let me just print it again.
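For readers who prefer not to install an extra module, the same calendar-order sort can be sketched with pandas alone. This is a swapped-in alternative, not the module used in the session: the function keeps the name `sort_data` only for continuity, and the `MONTHS` list, the helper column `_rank` and the sample values are all assumptions made for illustration.

```python
import pandas as pd

# Calendar order used as the sort key (English month names assumed).
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

def sort_data(df, colname):
    """Return a copy of df with rows ordered by calendar month in `colname`."""
    rank = df[colname].map({month: i for i, month in enumerate(MONTHS)})
    return (
        df.assign(_rank=rank)                    # temporary numeric month position
          .sort_values("_rank", kind="mergesort")
          .drop(columns="_rank")
          .reset_index(drop=True)
    )

# Made-up frame with months in alphabetical order.
final = pd.DataFrame({
    "month": ["April", "August", "December", "February", "January"],
    "price_for_resort": [75.0, 180.0, 68.0, 54.0, 48.0],
})

final = sort_data(final, "month")
print(final)
```

Mapping each month name to its position gives a numeric key, so the sort follows the calendar instead of the alphabet while each price stays on its own row.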
145 00:10:21,550 --> 00:10:25,710 And this is exactly the data frame that you need. Now, 146 00:10:25,960 --> 00:10:31,680 what you need is some visuals, so that you can draw conclusions from this data. 147 00:10:31,930 --> 00:10:38,560 So whenever you have this type of data, whenever someone asks what exactly the trend is or how 148 00:10:38,560 --> 00:10:43,720 something is varying, the best approach is to just go for a line chart. 149 00:10:44,110 --> 00:10:51,070 So I'm going to say px.line, which is exactly an inbuilt function inside my Plotly Express. 150 00:10:51,070 --> 00:10:53,530 Just press Shift+Tab. 151 00:10:56,500 --> 00:11:03,190 And these are all your custom parameters: data_frame, what you want on the x-axis, what you want on the y-axis, 152 00:11:03,370 --> 00:11:05,500 and all these different kinds of things. 153 00:11:05,890 --> 00:11:13,000 So here I'm going to say my data frame is nothing but my final. On the x-axis, 154 00:11:13,000 --> 00:11:19,610 I just have to pass my month, and on the y-axis, I need some columns. 155 00:11:19,900 --> 00:11:22,540 So let me say final dot 156 00:11:22,900 --> 00:11:30,000 columns over there, and you will get both the columns that you need on the y-axis. 157 00:11:30,010 --> 00:11:36,730 I'm just going to copy these and paste them here. After that, 158 00:11:36,770 --> 00:11:39,300 what you have to do: you have to assign a title as well. 159 00:11:39,520 --> 00:11:44,410 So I'm going to say the title is nothing but, let's say, room price per night. 160 00:11:44,620 --> 00:11:53,790 Or you can assign your own title as well. Room price per night over the months. 161 00:11:53,800 --> 00:11:54,280 That's it. 162 00:11:54,850 --> 00:11:58,360 Now, what I have to do: I just have to execute the cell. 163 00:11:58,870 --> 00:12:03,330 It will take a while, but it will be amazing to watch. There
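A rough sketch of the line chart call follows, assuming the plotting library is Plotly Express (imported as `px`), which matches the `data_frame`, `x`, `y` and `title` parameters mentioned above. The helper names `line_chart_kwargs` and `plot_prices` and the numbers are hypothetical; Plotly is imported inside the plotting function so the rest of the snippet runs even where it is not installed.

```python
import pandas as pd

# Made-up, already month-sorted frame with the renamed columns.
final = pd.DataFrame({
    "month": ["January", "February", "March"],
    "price_for_resort": [48.0, 54.0, 57.0],
    "price_for_city_hotel": [82.0, 86.0, 90.0],
})

def line_chart_kwargs(df):
    """Collect the chart arguments: month on x, every other column on y."""
    return {
        "data_frame": df,
        "x": "month",
        "y": [col for col in df.columns if col != "month"],
        "title": "Room price per night over the months",
    }

def plot_prices(df):
    """Build the line chart; assumes the plotly package is installed."""
    import plotly.express as px
    return px.line(**line_chart_kwargs(df))

kwargs = line_chart_kwargs(final)
print(kwargs["x"], kwargs["y"])
```

In a notebook, `plot_prices(final).show()` would then draw one line per price column against the months.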
164 00:12:03,340 --> 00:12:10,480 you will see, with respect to this one, you have this trend, whereas with respect to your resort 165 00:12:10,480 --> 00:12:17,350 hotel, you have this one. Let's say I have to conclude from this line chart, so I can clearly say 166 00:12:17,350 --> 00:12:17,460 it: 167 00:12:17,470 --> 00:12:23,950 yes, it clearly shows that prices in the resort hotel are much higher during the 168 00:12:23,950 --> 00:12:30,490 summer, and you will see over here that the prices for the city hotel vary less. 169 00:12:30,490 --> 00:12:37,210 And it is most expensive during spring and autumn. 170 00:12:37,540 --> 00:12:42,150 So that's the type of conclusion you can draw from your data by visualizing it. 171 00:12:42,430 --> 00:12:44,110 I hope you loved this session very much. 172 00:12:44,470 --> 00:12:45,130 Thank you. 173 00:12:45,200 --> 00:12:47,140 Have a nice day. Keep learning, 174 00:12:47,140 --> 00:12:49,030 keep growing, keep practicing.