1 00:00:00,090 --> 00:00:06,240 Hello, welcome to the interactive session of this hotel booking project, in which we are going to build 2 00:00:06,240 --> 00:00:13,950 a machine learning model that can predict whether a particular booking made by a 3 00:00:13,950 --> 00:00:16,950 user is going to be canceled or not. 4 00:00:17,340 --> 00:00:25,590 So this is exactly the feature that we have to predict, considering all these dozens of features 5 00:00:25,590 --> 00:00:26,130 over here. 6 00:00:26,160 --> 00:00:30,530 You can see what a huge chunk of data we have over here. 7 00:00:30,780 --> 00:00:32,250 Here we have lots of features: 8 00:00:32,250 --> 00:00:33,610 what exactly the hotel is, 9 00:00:33,870 --> 00:00:34,590 what the lead time is, 10 00:00:35,020 --> 00:00:36,810 what the arrival date month is, 11 00:00:36,810 --> 00:00:37,530 the arrival date, 12 00:00:38,370 --> 00:00:40,030 what these days and weekends are. 13 00:00:40,050 --> 00:00:41,580 We have tons of features. 14 00:00:41,670 --> 00:00:48,960 And considering all these features, we have to build a machine learning model that can predict 15 00:00:49,050 --> 00:00:52,680 whether a particular booking is going to be canceled or not. 16 00:00:52,680 --> 00:00:58,720 But before building such a model, you have to understand your data and what your data is all about. 17 00:00:58,740 --> 00:01:05,910 So the best way to understand your data is to perform lots of analysis on it, fetching some 18 00:01:05,940 --> 00:01:09,020 amazing insights from this huge chunk of data. 19 00:01:09,030 --> 00:01:16,590 And once we understand the data and what it is all about, once we understand 20 00:01:16,590 --> 00:01:23,640 our data to a greater extent, then we are going to build a model that can predict what exactly is 21 00:01:23,640 --> 00:01:24,210 going on with
22 00:01:24,210 --> 00:01:30,360 respect to this feature, by doing lots of data preprocessing, lots of feature encoding techniques, 23 00:01:30,510 --> 00:01:36,960 changing techniques, dealing with missing values, and lots of machine learning algorithms that we are going 24 00:01:36,960 --> 00:01:38,590 to apply to our data. 25 00:01:38,610 --> 00:01:42,090 So that is exactly the whole agenda behind this project. 26 00:01:42,120 --> 00:01:47,190 So I hope you now know why we are going to do this project, and I hope you understand the importance 27 00:01:47,190 --> 00:01:48,190 of this project as well. 28 00:01:48,330 --> 00:01:53,190 So stay with me and have fun in all the upcoming sessions.