1 00:00:00,533 --> 00:00:02,633 Hello and welcome to the Machine Learning A to Z course. 2 00:00:02,633 --> 00:00:05,300 Super excited to have you back here on board. 3 00:00:05,300 --> 00:00:07,233 You might be wondering what's up with the timer. 4 00:00:07,233 --> 00:00:11,466 Well, this tutorial is all about getting excited about machine learning, 5 00:00:11,766 --> 00:00:14,500 and I've set myself the challenge of showing you 6 00:00:14,500 --> 00:00:17,600 the power of machine learning in under five minutes. 7 00:00:18,000 --> 00:00:19,966 So let's dive straight into it. 8 00:00:19,966 --> 00:00:22,933 You are a data scientist working for a car company, 9 00:00:22,933 --> 00:00:24,333 and you've been given this dataset 10 00:00:24,333 --> 00:00:27,966 with ages and estimated salaries of potential customers. 11 00:00:28,300 --> 00:00:33,600 Your task is to predict which of these customers are more likely 12 00:00:33,733 --> 00:00:39,633 to purchase a car, based on a campaign that the sales division will be running. 13 00:00:40,033 --> 00:00:43,566 The good news is that the sales division also gave you this dataset, 14 00:00:43,766 --> 00:00:47,833 which is data from a previous campaign, a very similar campaign 15 00:00:47,833 --> 00:00:51,900 that ran in the past, which also has ages and estimated salaries of customers. 16 00:00:52,166 --> 00:00:56,700 But it has an additional column which says whether that customer purchased the car 17 00:00:57,000 --> 00:01:00,566 that was advertised to them or whether they didn't purchase a car. 18 00:01:00,866 --> 00:01:04,500 So the previous campaign dataset is the one we're going to use to build a model, 19 00:01:04,500 --> 00:01:07,666 and then we're going to apply that model to the new customer dataset. 20 00:01:07,666 --> 00:01:10,200 And we're going to be using a logistic regression model. 21 00:01:10,200 --> 00:01:13,200 Excited? So am I. Let's dive straight into it. 22 00:01:13,500 --> 00:01:14,700 We're going to be using Python.
23 00:01:14,700 --> 00:01:18,233 And we're going to be working in Google Colab, which is a very popular tool. 24 00:01:18,600 --> 00:01:20,866 As you can see, we have some code pre-written here. 25 00:01:20,866 --> 00:01:24,533 And we're going to walk through this code and run it step by step. 26 00:01:24,533 --> 00:01:27,533 And you'll see the model getting built, which is very exciting. 27 00:01:27,700 --> 00:01:30,700 Now let's take the first step and load the datasets. 28 00:01:30,700 --> 00:01:31,533 Here are the datasets. 29 00:01:31,533 --> 00:01:32,566 We're going to click over here. 30 00:01:32,566 --> 00:01:34,700 And I'm going to load them up. 31 00:01:34,700 --> 00:01:39,033 All right, so the datasets are loaded and we can proceed to our code now. 32 00:01:40,366 --> 00:01:41,266 So there we go. 33 00:01:41,266 --> 00:01:44,166 The first thing we're going to do is import the libraries. 34 00:01:44,166 --> 00:01:47,833 So we're going to open this up and run these libraries. 35 00:01:47,833 --> 00:01:50,833 And this will allow us to work with the data. 36 00:01:51,000 --> 00:01:54,800 Next we're going to import the dataset into our code so we can work with it 37 00:01:54,800 --> 00:01:55,066 there. 38 00:01:55,066 --> 00:01:56,100 As you can see, there's 39 00:01:56,100 --> 00:01:59,400 the age, the estimated salary, and whether or not a person purchased. 40 00:01:59,400 --> 00:02:01,366 So that's the previous campaign. 41 00:02:01,366 --> 00:02:04,533 Then we're going to plot the dataset and see what it's all about. 42 00:02:04,900 --> 00:02:07,966 Here we can see the blue dots represent people who purchased. 43 00:02:07,966 --> 00:02:12,600 They're usually at a higher age or at a higher salary. 44 00:02:12,933 --> 00:02:15,400 And people who didn't purchase are the red dots. 45 00:02:15,400 --> 00:02:17,566 And of course there's a mix over there as well. 46 00:02:17,566 --> 00:02:19,066 Now we're going to apply feature scaling.
47 00:02:19,066 --> 00:02:23,000 We won't go into detail on what this is all about; it's just a very important step, 48 00:02:23,000 --> 00:02:26,166 and you'll learn more about it inside the course for sure. 49 00:02:27,033 --> 00:02:30,000 Next, we're going to train the logistic regression model. 50 00:02:30,000 --> 00:02:31,033 And there we go. 51 00:02:31,033 --> 00:02:33,133 The logistic regression model is trained. 52 00:02:33,133 --> 00:02:36,100 And logistic regression is just one of the models, a very popular one. 53 00:02:36,100 --> 00:02:39,233 But there are definitely other ones, which again you will learn inside the course. 54 00:02:39,766 --> 00:02:42,733 We're going to visualize the model results. 55 00:02:42,733 --> 00:02:45,766 And this will allow us to see what is going on 56 00:02:45,766 --> 00:02:48,766 in our data and how the model is being applied to it. 57 00:02:48,900 --> 00:02:49,800 So there we go. 58 00:02:49,800 --> 00:02:53,400 There's our logistic regression model applied to the data. 59 00:02:53,400 --> 00:02:56,766 As you can see, it's saying that anything above this blue line 60 00:02:56,800 --> 00:02:59,700 means that the person will purchase or has purchased a car. 61 00:02:59,700 --> 00:03:02,700 Anything below the line means that they won't. 62 00:03:02,800 --> 00:03:06,300 And of course there are some mismatches with the blue and red dots. 63 00:03:06,300 --> 00:03:08,800 And that's okay, because no model is perfect. 64 00:03:08,800 --> 00:03:12,433 Those are just errors, which are totally normal for models. 65 00:03:12,433 --> 00:03:15,500 And in this course you will learn how to pick the right models 66 00:03:15,500 --> 00:03:19,200 for the right applications and to minimize those errors. 67 00:03:19,600 --> 00:03:22,600 Next we're going to import the new project data. 68 00:03:22,733 --> 00:03:24,166 And there it is. 69 00:03:24,166 --> 00:03:27,033 That's the new data we need to make predictions for.
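To make the feature scaling and training steps a bit more concrete, here is a minimal from-scratch sketch in plain Python. It is not the code from the Colab notebook, which the transcript does not show. The synthetic customer data, the function names (`standardize`, `train`, `predict_proba`), the learning rate, and the number of epochs are all assumptions made up for illustration, and simple gradient descent stands in for whatever solver the notebook's library uses.

```python
import math
import random


def standardize(rows):
    """Feature scaling: rescale every column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
    return scaled, means, stds


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(row, weights, bias):
    """Estimated probability that a (scaled) customer row ends in a purchase."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


def train(X, y, learning_rate=0.5, epochs=500):
    """Fit logistic regression with batch gradient descent on the log loss."""
    weights, bias = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for row, label in zip(X, y):
            error = predict_proba(row, weights, bias) - label
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Synthetic stand-in for the previous campaign: (age, estimated salary) rows.
# The labelling rule below is invented; older, higher-paid customers buy more often.
rng = random.Random(0)
customers = [[rng.uniform(18, 60), rng.uniform(15_000, 150_000)] for _ in range(200)]
purchased = [
    int((age - 18) / 42 + (salary - 15_000) / 135_000 + rng.gauss(0, 0.1) > 1.0)
    for age, salary in customers
]

X, means, stds = standardize(customers)
weights, bias = train(X, purchased)

# The straight line in the plot corresponds to weights[0]*x1 + weights[1]*x2 + bias == 0,
# i.e. the points where the predicted purchase probability is exactly 0.5.
predictions = [int(predict_proba(row, weights, bias) >= 0.5) for row in X]
accuracy = sum(p == t for p, t in zip(predictions, purchased)) / len(purchased)
print(f"weights={weights}, bias={bias:.3f}, training accuracy={accuracy:.2f}")
```

Scaling matters here because age and salary live on very different numeric ranges; without it, the salary column would dominate the gradient steps. The noise term in the synthetic labels is what produces the occasional mismatched dot on either side of the line.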
70 00:03:27,033 --> 00:03:30,266 We're going to apply our model to make those predictions. 71 00:03:30,533 --> 00:03:36,400 And we will then visualize the predictions to see the results. 72 00:03:36,400 --> 00:03:37,133 So there we go. 73 00:03:37,133 --> 00:03:39,200 Those are our results over here. 74 00:03:39,200 --> 00:03:44,566 And as we can see, the model is saying that for anything above this blue line, 75 00:03:44,900 --> 00:03:47,566 those people are likely 76 00:03:47,566 --> 00:03:50,733 to purchase the car; those below the blue line are not likely to purchase. 77 00:03:51,000 --> 00:03:56,100 So our marketing department can simply target the people in the blue area 78 00:03:56,366 --> 00:04:00,533 and therefore save on costs and optimize their efforts, 79 00:04:00,533 --> 00:04:03,533 focus their efforts, and get the best return on investment. 80 00:04:03,633 --> 00:04:04,200 So there we go. 81 00:04:04,200 --> 00:04:07,033 That's how easy it is to apply machine learning models. 82 00:04:07,033 --> 00:04:09,500 As you can see, this took us under five minutes. 83 00:04:09,500 --> 00:04:12,000 And of course this was a simplified approach. 84 00:04:12,000 --> 00:04:13,200 We didn't do many of the steps, 85 00:04:13,200 --> 00:04:16,433 like splitting the data into a training set and a test set. 86 00:04:16,566 --> 00:04:20,900 We didn't build a confusion matrix or calculate accuracy ratios, 87 00:04:21,266 --> 00:04:25,466 and there are lots of other nuances, all of which you will learn inside the course. 88 00:04:25,633 --> 00:04:28,600 But in a nutshell, this is how machine learning works. 89 00:04:28,600 --> 00:04:32,100 And this is the power of machine learning, 90 00:04:32,100 --> 00:04:35,433 which you will soon be able to apply in your career. 91 00:04:35,866 --> 00:04:38,833 I hope you're excited and I can't wait to see you inside 92 00:04:38,833 --> 00:04:42,533 the course, where we will learn a lot and have lots of fun along the way.
93 00:04:42,866 --> 00:04:44,900 And until next time, enjoy machine learning!
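For the steps the video says it skipped (splitting the data into a training set and a test set, building a confusion matrix, and calculating accuracy), here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the tiny hand-made customer table, the helper names, and especially `stand_in_model`, a fixed age threshold that only plays the role of a trained classifier so the evaluation code has something to score.

```python
import random


def train_test_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle the row indices, then hold out a fraction of them as the test set."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [rows[i] for i in train_idx],
        [rows[i] for i in test_idx],
        [labels[i] for i in train_idx],
        [labels[i] for i in test_idx],
    )


def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]]: rows are actual classes, columns are predicted."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


def accuracy(matrix):
    """Share of predictions on the diagonal, i.e. the correct ones."""
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical (age, estimated salary) rows and purchase labels, made up for this sketch.
customers = [[22, 25_000], [35, 60_000], [47, 30_000], [52, 110_000],
             [28, 90_000], [41, 72_000], [58, 45_000], [19, 18_000]]
purchased = [0, 0, 1, 1, 1, 1, 1, 0]

X_train, X_test, y_train, y_test = train_test_split(customers, purchased)


def stand_in_model(row):
    """Placeholder for a trained classifier: predicts a purchase from age 40 upward."""
    return int(row[0] >= 40)


# Score the model only on rows it was not fitted on.
y_pred = [stand_in_model(row) for row in X_test]
cm = confusion_matrix(y_test, y_pred)
print("confusion matrix:", cm, "accuracy:", accuracy(cm))
```

The point of holding out a test set is that accuracy measured on the training rows tends to flatter the model. The confusion matrix then shows which kind of mistake it makes: customers wrongly targeted (false positives) versus likely buyers missed (false negatives).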