1 00:00:00,133 --> 00:00:02,400 Hello and welcome to this R tutorial. 2 00:00:02,400 --> 00:00:04,600 So we already built two regression models. 3 00:00:04,600 --> 00:00:08,033 They were simple linear regression and multiple linear regression. 4 00:00:08,433 --> 00:00:12,300 And today in this tutorial we will start with polynomial regression. 5 00:00:13,066 --> 00:00:16,233 So as usual, let's prepare the workspace by setting the right folder 6 00:00:16,233 --> 00:00:17,633 as the working directory. 7 00:00:17,633 --> 00:00:19,966 So right now I'm going to my Machine Learning A-Z folder. 8 00:00:19,966 --> 00:00:23,366 Part 2, Regression, then the section Polynomial Regression. 9 00:00:23,600 --> 00:00:24,966 And here is the right folder. 10 00:00:24,966 --> 00:00:27,733 Make sure that you have the Position_Salaries.csv file. 11 00:00:27,733 --> 00:00:30,066 And you can click on this More button here. 12 00:00:30,066 --> 00:00:32,366 And then Set As Working Directory. 13 00:00:32,366 --> 00:00:36,800 And now let's start with the usual first step of making a machine learning model. 14 00:00:37,333 --> 00:00:40,633 So this first step is of course the data pre-processing step. 15 00:00:40,900 --> 00:00:42,266 So to be efficient I'm 16 00:00:42,266 --> 00:00:45,900 going to go to my data pre-processing template that we made in Part 1. 17 00:00:46,266 --> 00:00:48,366 I'm going to select everything in the template, 18 00:00:49,433 --> 00:00:51,033 copy, and 19 00:00:51,033 --> 00:00:54,033 paste it into my polynomial regression model. 20 00:00:55,000 --> 00:00:56,866 This way. Okay. 21 00:00:56,866 --> 00:00:58,833 And now we just need to change a few things. 22 00:00:58,833 --> 00:01:01,866 So the first thing we need to change is the name of the data set. 23 00:01:02,266 --> 00:01:03,466 So it's not Data here. 24 00:01:03,466 --> 00:01:06,466 Data was the name of the data set in the pre-processing part.
25 00:01:06,466 --> 00:01:08,066 But now in this regression part 26 00:01:08,066 --> 00:01:11,100 the name of the data set is position underscore salaries. 27 00:01:11,400 --> 00:01:14,400 So here we'll write position 28 00:01:14,666 --> 00:01:17,000 underscore salaries. 29 00:01:17,000 --> 00:01:19,200 And now let's select this line 30 00:01:19,200 --> 00:01:22,200 and execute it to have a look at our data set. 31 00:01:22,800 --> 00:01:23,166 Okay. 32 00:01:23,166 --> 00:01:26,166 And now let's see what our machine learning mission will be. 33 00:01:26,800 --> 00:01:30,100 So we are a human resources team working for a big company. 34 00:01:30,400 --> 00:01:33,400 And we are about to hire a new employee in this company. 35 00:01:33,433 --> 00:01:36,466 So this new employee seems to be a great fit for the job. 36 00:01:36,633 --> 00:01:40,333 And we are about to make an offer to this potential new employee. 37 00:01:40,933 --> 00:01:42,566 And now it's time to negotiate. 38 00:01:42,566 --> 00:01:43,500 Negotiate on 39 00:01:43,500 --> 00:01:47,400 what is going to be the future salary of this new employee in the company. 40 00:01:48,100 --> 00:01:50,033 And so, at the beginning of the negotiation, 41 00:01:50,033 --> 00:01:54,033 this new employee says that he has 20-plus years of experience 42 00:01:54,333 --> 00:01:58,400 and eventually earned a 160K annual salary at his previous company. 43 00:01:58,766 --> 00:02:02,433 So this employee is asking for more than 160K. 44 00:02:03,066 --> 00:02:06,166 However, there is someone in the HR team who is kind of a control 45 00:02:06,166 --> 00:02:09,166 freak and has always fantasized about being a detective. 46 00:02:09,333 --> 00:02:13,000 So he suddenly decides to call the previous employer to check that info. 47 00:02:13,033 --> 00:02:16,000 You know, that info about the previous 160K 48 00:02:16,000 --> 00:02:19,000 annual salary of this potential new employee.
49 00:02:19,366 --> 00:02:22,300 But unfortunately, all the info that this person 50 00:02:22,300 --> 00:02:25,300 manages to get is this info here: 51 00:02:25,366 --> 00:02:28,833 the simple table of salaries for these ten different positions 52 00:02:29,066 --> 00:02:30,600 in the previous company. 53 00:02:30,600 --> 00:02:35,000 So this member of the HR team runs a simple analysis in Excel or Google 54 00:02:35,000 --> 00:02:38,833 Sheets, and actually observes that there is a nonlinear relationship 55 00:02:39,000 --> 00:02:42,900 between these position levels and their associated salaries. 56 00:02:43,866 --> 00:02:45,533 Moreover, 57 00:02:45,533 --> 00:02:48,666 this HR guy managed to get another very relevant piece of info. 58 00:02:49,300 --> 00:02:53,433 This other relevant info is that this new employee who is about to be hired 59 00:02:53,700 --> 00:02:57,366 has been a region manager for two years now in the previous company. 60 00:02:57,866 --> 00:03:00,600 And it usually takes four years on average to jump 61 00:03:00,600 --> 00:03:03,600 from being a region manager to a partner. 62 00:03:03,600 --> 00:03:06,833 So this new employee about to be hired 63 00:03:06,833 --> 00:03:10,133 was kind of halfway between level six and level seven. 64 00:03:10,133 --> 00:03:13,133 And therefore we can say he was level 6.5. 65 00:03:13,500 --> 00:03:14,566 So now this HR 66 00:03:14,566 --> 00:03:17,566 guy is getting all excited because he's telling the team 67 00:03:17,566 --> 00:03:20,700 that he can build a bluffing detector using regression models 68 00:03:20,933 --> 00:03:23,933 to predict if this new employee is bluffing about his salary. 69 00:03:24,400 --> 00:03:27,066 So at the beginning, the team finds it a little weird, 70 00:03:27,066 --> 00:03:30,066 but is kind of curious to see what's going to happen. 71 00:03:30,133 --> 00:03:31,866 So here is the mission.
72 00:03:31,866 --> 00:03:35,133 This new employee who is about to be hired is claiming 73 00:03:35,133 --> 00:03:38,833 that his annual salary was 160K in the previous company. 74 00:03:39,366 --> 00:03:43,433 Let's use polynomial regression to build a bluffing detector to predict 75 00:03:43,433 --> 00:03:44,766 whether it's truth or bluff.
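To make the mission concrete, here is a minimal sketch in R of what such a bluffing detector could look like. It is only an illustration under stated assumptions: the salary figures are made-up placeholders generated from a cubic curve, not the contents of the real salaries table, and the degree-3 polynomial and the variable names (dataset, poly_reg, claimed_salary) are hypothetical choices rather than the ones used in the course.

```r
# Hypothetical stand-in for the salaries table: ten position levels with
# placeholder salaries generated from a cubic curve. These numbers are
# invented for illustration and are NOT the real file contents.
dataset <- data.frame(Level = 1:10)
dataset$Salary <- 40000 + 1000 * dataset$Level^3

# Polynomial regression is a linear regression on powers of Level.
# I() makes ^ mean arithmetic power inside the model formula.
# The degree (3) is an assumption chosen to match the placeholder data.
poly_reg <- lm(Salary ~ Level + I(Level^2) + I(Level^3), data = dataset)

# Predict the salary for the candidate's position level, 6.5
# (halfway between level six and level seven).
predicted_salary <- predict(poly_reg, newdata = data.frame(Level = 6.5))

# Compare the prediction with the salary the candidate claims.
claimed_salary <- 160000
gap <- claimed_salary - unname(predicted_salary)

print(predicted_salary)
print(gap)
```

Because the salaries here are placeholders, the printed gap says nothing about whether the candidate in the story is bluffing. With the real table loaded instead of the placeholder data frame, the same comparison between the predicted and the claimed salary is what would answer the truth-or-bluff question.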