1 00:00:00,110 --> 00:00:08,390 Hello all, before we go into the session, let's have a quick recap of what we have done in this project. 2 00:00:08,680 --> 00:00:14,580 So in the very first session, we have prepared our data so that, again, we will analyze our data 3 00:00:14,580 --> 00:00:17,580 and find some trends, find some meaningful insights. 4 00:00:17,820 --> 00:00:24,990 And again, we will build a machine learning model that can predict whether a particular booking is 5 00:00:24,990 --> 00:00:26,280 going to be canceled or not. 6 00:00:26,430 --> 00:00:28,170 So we have to build such a model. 7 00:00:28,440 --> 00:00:32,280 But before building a model, you have to understand your data. 8 00:00:32,280 --> 00:00:37,740 And the best way to understand the data is by performing analysis on your data. 9 00:00:38,070 --> 00:00:44,040 So here in the previous session, we have performed lots of preparation on our data by just creating 10 00:00:44,040 --> 00:00:44,790 this filter. 11 00:00:44,940 --> 00:00:48,540 So in this session, what we have to do, we have to analyze our data. Very first, 12 00:00:48,540 --> 00:00:49,910 we have to understand our data. 13 00:00:50,190 --> 00:00:55,680 So the very first problem statement is: where do the guests come from? 14 00:00:55,890 --> 00:01:00,860 In the very first problem statement, you have to understand your guests. 15 00:01:00,870 --> 00:01:05,380 You have to understand from which location your users are coming. 16 00:01:05,850 --> 00:01:08,850 So this is exactly the very first problem statement. 17 00:01:09,120 --> 00:01:16,230 So let me say, whenever you are going to perform your spatial analysis, so what exactly is spatial 18 00:01:16,230 --> 00:01:16,920 analysis? 19 00:01:17,310 --> 00:01:23,760 So spatial analysis is all about visualizing the data on some map so that you 20 00:01:23,760 --> 00:01:28,800 will get a clear-cut idea of which location your guests are coming from. 
21 00:01:29,130 --> 00:01:31,670 So let me perform spatial analysis over here. 22 00:01:32,010 --> 00:01:37,230 But before performing it, I need some data so that I can easily perform 23 00:01:37,230 --> 00:01:38,850 that spatial analysis. 24 00:01:38,850 --> 00:01:41,180 So the very first thing I'm going to do is here. 25 00:01:41,190 --> 00:01:44,460 Here I'm going to say data of, here 26 00:01:44,460 --> 00:01:47,040 I have a feature which is is_canceled. 27 00:01:47,280 --> 00:01:54,390 And here I'm going to say it is equal equal to zero because I just need that data that are my valid 28 00:01:54,570 --> 00:01:55,080 bookings. 29 00:01:55,080 --> 00:01:58,250 And wherever this is_canceled is equal to zero, 30 00:01:58,260 --> 00:02:00,000 this is exactly a valid booking. 31 00:02:00,300 --> 00:02:06,660 So I have to just pass this filter in so that I will get my filtered data frame. 32 00:02:07,020 --> 00:02:11,550 And on this filtered data frame, I have to access my country. 33 00:02:11,550 --> 00:02:13,350 So I'm just going to access my country. 34 00:02:13,590 --> 00:02:21,540 And on this, if I'm going to call value_counts over here, now you will get some idea 35 00:02:21,750 --> 00:02:24,960 from which country your users are coming. 36 00:02:24,960 --> 00:02:28,830 You will see Portugal and some European countries. 37 00:02:29,010 --> 00:02:30,810 But let's say I just need a clear-cut picture. 38 00:02:30,810 --> 00:02:34,050 I just need to visualize this data on some map. 39 00:02:34,260 --> 00:02:38,280 So let me very first convert all this stuff into a data frame. 40 00:02:38,310 --> 00:02:43,350 I'm just going to reset the index and execute it. 41 00:02:43,350 --> 00:02:46,240 And this is the amazing data frame that you have to consider. 42 00:02:46,530 --> 00:02:48,260 So let me name this data frame. 
43 00:02:48,300 --> 00:02:54,750 Let's say country_wise_data, and just execute it. 44 00:02:54,750 --> 00:03:01,410 And if I'm going to print it, you can just print it using this country_wise_data. And let's 45 00:03:01,410 --> 00:03:04,290 say I have to modify my column names. 46 00:03:04,290 --> 00:03:11,700 So I'm going to say this dot columns, let me assign my own column names. The very first column name 47 00:03:11,700 --> 00:03:13,890 is exactly my country. 48 00:03:13,890 --> 00:03:18,930 And the second column name is exactly my number of guests. 49 00:03:19,110 --> 00:03:25,250 And once I have all this stuff, what I have to do is simply print my data and execute it. 50 00:03:25,530 --> 00:03:28,680 Now you will figure out this is exactly the data set, 51 00:03:28,950 --> 00:03:35,940 this is exactly the data frame that you have to consider, especially for this problem statement: 52 00:03:35,940 --> 00:03:37,330 where do the guests come from? 53 00:03:37,560 --> 00:03:42,640 So now what you need, you need some external modules so that you can visualize. 54 00:03:42,660 --> 00:03:47,280 So basically you need a library, which is exactly my folium. 55 00:03:47,290 --> 00:03:53,560 So if you haven't installed it earlier, you guys can simply install it using pip install folium. 56 00:03:53,700 --> 00:03:55,260 So just execute this 57 00:03:55,260 --> 00:03:58,920 command and your library gets successfully installed. 58 00:03:59,100 --> 00:04:05,850 Now what I have to do, so from this folium library, I have something to use, so I'm just going to say 59 00:04:05,850 --> 00:04:12,900 folium.plugins, just press Tab over here and you will figure out this is exactly the module. 60 00:04:12,930 --> 00:04:16,110 So from this I have to import something known as HeatMap. 61 00:04:16,410 --> 00:04:17,850 So just execute it. 
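The data preparation described so far (keeping only valid bookings, counting guests per country, resetting the index and renaming the columns) can be sketched roughly as follows. This is a minimal sketch, not the session's exact notebook: the sample rows are made up, and the column label "No of guests" is an assumed spelling of the "number of guests" column named in the session.

```python
import pandas as pd

# Made-up sample standing in for the bookings data frame (`data` in the session).
data = pd.DataFrame({
    "country": ["PRT", "PRT", "PRT", "GBR", "GBR", "FRA", "PRT", "ESP"],
    "is_canceled": [0, 0, 0, 0, 0, 0, 1, 1],
})

# Keep only valid (not canceled) bookings, then count guests per country.
valid_bookings = data[data["is_canceled"] == 0]
country_wise_data = valid_bookings["country"].value_counts().reset_index()

# Assign our own column names; "No of guests" is an assumed label.
country_wise_data.columns = ["country", "No of guests"]
print(country_wise_data)
```

Assigning `.columns` directly sidesteps the fact that different pandas versions give the two columns different default names after `reset_index()`.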
62 00:04:17,860 --> 00:04:26,010 Now, using this folium, if I am going to call Map over here, which is exactly an inbuilt function 63 00:04:26,010 --> 00:04:30,650 inside this folium, now you will figure out this is my global map. 64 00:04:30,660 --> 00:04:36,450 So on top of this base map, you can think of it as a base map, on top of this base map 65 00:04:36,750 --> 00:04:44,400 I have to visualize my heat map, or I can visualize any visual depending upon what type of visual this 66 00:04:44,400 --> 00:04:45,500 data needs. 67 00:04:45,900 --> 00:04:48,570 So let's say this is exactly my base map. 68 00:04:48,570 --> 00:04:53,550 So I'm going to say this is exactly my base map and let me just execute it. 69 00:04:53,940 --> 00:04:58,530 Now, what you have to do, you have to import very basic stuff. 70 00:04:58,530 --> 00:04:59,670 You have to import your Plotly. 71 00:05:00,470 --> 00:05:06,830 So if you haven't installed Plotly, just install it using pip install plotly, and if you guys 72 00:05:06,830 --> 00:05:14,300 don't know what exactly Plotly is, so Plotly is an advanced-level data visualization library that is extensively 73 00:05:14,300 --> 00:05:16,850 used for deployment-level results. 74 00:05:17,360 --> 00:05:26,570 So let's say I'm just going to say import plotly.express and I have to create its alias as, let's 75 00:05:26,570 --> 00:05:29,360 say, px, and just execute it. 76 00:05:29,360 --> 00:05:33,980 Now, using this px, I have something known as a choropleth map. 77 00:05:34,280 --> 00:05:40,070 And if you press Shift+Tab, you will figure out all your custom parameters. 78 00:05:40,070 --> 00:05:42,980 So whenever you have to perform this spatial analysis, 79 00:05:43,580 --> 00:05:46,790 there are two main functions that are extensively used. 80 00:05:47,090 --> 00:05:50,030 The first one is exactly a heat map. 81 00:05:50,230 --> 00:05:55,060 Either you can consider this heat map or you can go ahead with this choropleth map. 
82 00:05:55,070 --> 00:06:01,370 There's some minor difference between both, and you will figure out there are some custom parameters: 83 00:06:01,380 --> 00:06:07,310 what is the data frame, latitude and longitude, all these different parameters. Here 84 00:06:07,310 --> 00:06:12,790 I have to say my very first parameter is country_wise_data. After it, 85 00:06:12,800 --> 00:06:18,950 what I have to do, I have a parameter which is exactly my locations, which is exactly this one. 86 00:06:19,220 --> 00:06:22,960 It means I have to plot my country on that map. 87 00:06:23,150 --> 00:06:26,210 So for this I'm going to say country_wise_data. 88 00:06:26,210 --> 00:06:32,200 And in this I have a feature, which is exactly my, so I'm going to say this is exactly my country. 89 00:06:32,510 --> 00:06:40,730 So after having all this stuff over here, I have again a parameter over here, which is exactly my 90 00:06:40,730 --> 00:06:45,480 color parameter, because to my choropleth map, I have to assign some color. 91 00:06:45,710 --> 00:06:49,820 So here I am going to say I have to assign color on the basis of number of guests. 92 00:06:50,070 --> 00:06:51,950 It means the more the number of guests, 93 00:06:51,950 --> 00:06:54,680 the higher the density of the color 94 00:06:54,680 --> 00:07:02,300 will be. So here I'm going to say it is nothing but my number of, you can press Tab as well, which is exactly 95 00:07:02,300 --> 00:07:04,760 my number of guests. 96 00:07:04,940 --> 00:07:07,910 And still you have dozens of parameters. 97 00:07:07,910 --> 00:07:11,240 You can play with whatever parameter you want. 98 00:07:11,420 --> 00:07:16,130 So here I have one more parameter, which is exactly my hover_name. 99 00:07:16,800 --> 00:07:22,770 Whenever you are going to hover your mouse on your visual, what exactly do you want to reflect? 
100 00:07:22,790 --> 00:07:28,450 So for this, I'm just going to say hover_name, which is exactly this one. 101 00:07:28,460 --> 00:07:33,200 So I'm just going to say I just need to reflect my country names. For this, 102 00:07:33,200 --> 00:07:37,730 I'm just going to say this is nothing but my country. 103 00:07:37,910 --> 00:07:43,010 After having all this stuff, let's say I have to assign my title. For this, 104 00:07:43,010 --> 00:07:47,940 I'm just going to say title is nothing but, let's say, home country of guests. 105 00:07:48,170 --> 00:07:51,120 So home country of guests. 106 00:07:51,350 --> 00:07:56,380 So this is exactly the title of my choropleth map. So after it, 107 00:07:56,420 --> 00:07:59,540 what we have to do, let's say I'm just going to store it somewhere. 108 00:07:59,690 --> 00:08:07,100 So I'm just going to say this is exactly my map_guest. Once having all this stuff, on this 109 00:08:07,100 --> 00:08:08,270 map_guest 110 00:08:08,270 --> 00:08:10,790 I have to just call my show function. 111 00:08:10,790 --> 00:08:11,360 That's it. 112 00:08:11,570 --> 00:08:18,650 And if I'm going to execute this, now you will see over here this amazing choropleth map, and 113 00:08:18,650 --> 00:08:21,430 this is exactly your color bar over here. 114 00:08:21,800 --> 00:08:25,160 So just observe over here: 115 00:08:25,160 --> 00:08:32,690 There are almost 80 to 85 percent of countries that have between zero and five thousand guests. 116 00:08:33,290 --> 00:08:39,740 But yeah, there are a few countries, let's say, which are exactly Portugal and some European countries, 117 00:08:39,950 --> 00:08:42,350 where we have the maximum number of guests. 118 00:08:42,350 --> 00:08:45,500 So that's the type of inference you can fetch from your data. 
119 00:08:45,770 --> 00:08:53,390 So let's go ahead with our next problem statement, in which I have to analyze how much do guests pay for a room 120 00:08:53,390 --> 00:08:54,230 per night. 121 00:08:54,500 --> 00:09:01,520 So for this, let me call head on my data frame to get a preview of how exactly it looks. You will 122 00:09:01,520 --> 00:09:04,970 see this is exactly the data set that you have over here. 123 00:09:05,270 --> 00:09:07,730 But very first you need valid bookings. 124 00:09:07,730 --> 00:09:10,940 So I'm just going to say data of is_canceled 125 00:09:11,120 --> 00:09:13,940 equal equal to zero. 126 00:09:13,940 --> 00:09:18,060 I have to just pass this in my data frame. Here, 127 00:09:18,080 --> 00:09:22,700 I'm going to say this is nothing but, let's say, my data2. 128 00:09:22,700 --> 00:09:30,860 So I have to just execute it, and now you will figure out in your problem statement you have to analyze 129 00:09:30,860 --> 00:09:34,310 how much do guests pay per night. 130 00:09:34,490 --> 00:09:40,940 It means you just need the price distribution of each of the room types. 131 00:09:41,180 --> 00:09:47,210 So for this, whenever you need a distribution, either you can go ahead with your distribution function 132 00:09:47,210 --> 00:09:54,010 of seaborn, that is a dist plot, or you can consider some fancy stuff like a box plot. 133 00:09:54,020 --> 00:09:59,630 So I'm just going to say sns dot, here I have a very handy function, which is exactly 134 00:09:59,910 --> 00:10:06,860 my boxplot. So I'm just going to say it is exactly my boxplot, and if you press Shift+Tab here, 135 00:10:06,900 --> 00:10:08,800 you will get all the documentation. 136 00:10:08,820 --> 00:10:09,840 What is your x? 137 00:10:10,170 --> 00:10:10,680 What is your 138 00:10:10,710 --> 00:10:11,160 y? 139 00:10:11,160 --> 00:10:11,550 What is your 140 00:10:11,580 --> 00:10:13,170 hue parameter? Hue means 
141 00:10:13,170 --> 00:10:16,670 On which basis you have to split your box plot. 142 00:10:16,980 --> 00:10:21,210 So basically on this x-axis, what do I need 143 00:10:21,210 --> 00:10:21,900 exactly? 144 00:10:21,930 --> 00:10:29,940 So let me say very first on this data2, if I'm going to call columns over here, you will see you 145 00:10:29,940 --> 00:10:32,610 have all your column names over here. 146 00:10:32,880 --> 00:10:40,350 So on this x-axis, you exactly need your reserved_room_type, which is exactly this one. 147 00:10:40,350 --> 00:10:44,820 So I'm just going to copy from here and let me just paste it. 148 00:10:45,000 --> 00:10:48,930 And this is exactly on your x-axis. After it, 149 00:10:48,930 --> 00:10:55,260 on the y-axis, you have to assign your price because you need a distribution with respect to price. 150 00:10:55,500 --> 00:10:58,710 So on the y-axis, I'm going to say it is nothing but my adr. 151 00:10:59,010 --> 00:11:02,370 And after it I have something which is exactly my hue. 152 00:11:02,760 --> 00:11:08,780 So here I'm going to say I just need the distribution for both the resort as well as the city hotel. 153 00:11:08,820 --> 00:11:11,870 So here I'm going to say hue is equal to hotel. After it, 154 00:11:11,880 --> 00:11:13,950 I have to mention my data frame as well. 155 00:11:13,960 --> 00:11:20,610 So I'm going to say data is nothing but basically my data2, because data2 is the current data frame on which 156 00:11:20,610 --> 00:11:21,610 we are working. 157 00:11:22,080 --> 00:11:29,640 So let's say I have to customize this box plot as well, so I'm going to say plt.figure, I have to assign 158 00:11:29,640 --> 00:11:30,480 my own figure size. 159 00:11:30,790 --> 00:11:37,490 So figsize, let's say 12 comma 8, and let me assign some title as well. 160 00:11:37,500 --> 00:11:42,600 So I'm going to say plt.title, and what exactly is my title? 161 00:11:42,650 --> 00:11:42,990 Nothing. 
162 00:11:42,990 --> 00:11:52,380 But let's say price of room types per night and per person. 163 00:11:52,410 --> 00:11:59,820 Or you can say only per person. Once having this title, you have to assign some x label, some y label as 164 00:11:59,820 --> 00:12:00,000 well. 165 00:12:00,390 --> 00:12:07,150 So here I am going to say my x label is nothing but, let's say, my room type. So here 166 00:12:07,180 --> 00:12:11,430 I'm going to say xlabel is nothing but my room type. 167 00:12:11,610 --> 00:12:15,530 After assigning your x label, you have to assign your y label as well. 168 00:12:15,810 --> 00:12:20,670 So my y label is nothing but, let's say, price in euros. 169 00:12:20,670 --> 00:12:27,840 So I'm going to say price in euro. After having all this, what you have to do, you have to simply show 170 00:12:27,840 --> 00:12:30,200 your matplotlib figure with plt.show. 171 00:12:30,330 --> 00:12:37,230 And here you have one more function, which is exactly plt.legend, because you have to show the legend with 172 00:12:37,230 --> 00:12:40,830 respect to the resort as well as with respect to your city hotel. 173 00:12:40,860 --> 00:12:46,980 So if I'm going to execute that, you will see this amazing visual with respect to each of your room 174 00:12:46,980 --> 00:12:47,760 types over here. 175 00:12:48,060 --> 00:12:55,650 You will see this blue distribution is exactly the resort hotel and this orange one is with 176 00:12:55,650 --> 00:12:56,850 respect to the city hotel. 177 00:12:57,120 --> 00:13:07,490 And you will see the middle line is exactly your median, whereas this one is exactly the hundredth percentile 178 00:13:07,560 --> 00:13:08,090 data. 179 00:13:08,130 --> 00:13:12,810 And this one is exactly the twenty-fifth percentile data. 180 00:13:12,810 --> 00:13:21,440 And this one is exactly the seventy-fifth percentile data, and this lowest one is exactly the zeroth percentile data. 181 00:13:21,630 --> 00:13:24,110 So you can easily conclude from this. 
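The box plot built up above can be sketched as follows. The prices are made up, and the `hotel` column name passed to `hue` is an assumption (the session only says the plot is split by resort and city hotel); `reserved_room_type` and `adr` are the column names mentioned in the session.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up prices; in the session data2 holds the valid (not canceled) bookings.
data2 = pd.DataFrame({
    "hotel": ["Resort Hotel", "City Hotel"] * 6,            # assumed column name
    "reserved_room_type": ["A", "A", "D", "D", "G", "G"] * 2,
    "adr": [60, 90, 80, 120, 150, 190, 70, 100, 95, 130, 170, 210],
})

# Set the figure size first, then draw one box per room type and hotel.
plt.figure(figsize=(12, 8))
ax = sns.boxplot(x="reserved_room_type", y="adr", hue="hotel", data=data2)
plt.title("Price of room types per night and person")
plt.xlabel("Room type")
plt.ylabel("Price in Euro")
# seaborn adds the hue legend itself; call plt.show() in a notebook to display.

# The middle line of each box is the median of that group.
medians = data2.groupby(["reserved_room_type", "hotel"])["adr"].median()
print(medians)
```

One point worth keeping in mind when reading the plot: with seaborn's default settings the whiskers reach only to the furthest points within 1.5 times the interquartile range, and anything beyond is drawn as an individual outlier point, so the whisker ends coincide with the minimum and maximum only when a group has no outliers.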
182 00:13:24,120 --> 00:13:33,540 Yeah, the best distribution with respect to the city hotel is almost with my G type, whereas if I 183 00:13:33,540 --> 00:13:40,890 talk about my resort hotel, the best distribution of price is almost 10 to with 184 00:13:40,890 --> 00:13:43,290 respect to this at room type. 185 00:13:43,290 --> 00:13:48,960 And whatever data points you see over here, these are exactly your high values. 186 00:13:49,110 --> 00:13:52,090 Or you can say these are exactly the outliers. 187 00:13:52,140 --> 00:13:55,390 So you can definitely conclude as much as you can. 188 00:13:55,410 --> 00:13:56,480 So that's all about this. 189 00:13:56,500 --> 00:13:58,380 I hope you love the session very much. 190 00:13:58,530 --> 00:14:06,150 And if you have doubts, just post your doubts in your Q&A section or via your personal mode of communication 191 00:14:06,150 --> 00:14:06,930 with me as well, 192 00:14:07,020 --> 00:14:08,050 whatever you want. 193 00:14:08,190 --> 00:14:09,020 So thank you. 194 00:14:09,030 --> 00:14:10,000 Have a nice day. 195 00:14:10,020 --> 00:14:10,920 Keep learning. 196 00:14:10,920 --> 00:14:11,820 Keep growing. 197 00:14:11,820 --> 00:14:12,690 Keep practicing.