1 00:00:00,390 --> 00:00:04,930 Hello and welcome back to our second structured data project. 2 00:00:05,070 --> 00:00:10,490 Now this one's really exciting, right, because we're going to be predicting the sale price of bulldozers. 3 00:00:10,530 --> 00:00:14,880 Who doesn't love bulldozers? I remember when I was a kid I used to have all these tractors, like tanker trucks, 4 00:00:15,240 --> 00:00:17,310 and I used to play with them all the time. 5 00:00:17,400 --> 00:00:18,940 And this one's a little bit different. 6 00:00:19,020 --> 00:00:24,900 Of course it's a regression problem, because we're trying to predict a number, a.k.a. the sale price of 7 00:00:24,900 --> 00:00:25,720 bulldozers. 8 00:00:25,830 --> 00:00:30,230 But what we're gonna do is we're going to take data, and in this case the data, 9 00:00:30,330 --> 00:00:32,080 see this little clock emoji? 10 00:00:32,160 --> 00:00:34,410 It has a time component to it. 11 00:00:34,440 --> 00:00:40,110 So that means in our previous project there was no time series, it was just a big static data set, whereas 12 00:00:40,110 --> 00:00:46,560 in this one there's a sale date component, which means that a bulldozer has been sold on a certain date. 13 00:00:46,590 --> 00:00:54,000 And so what we'll be doing is using sales from the past to try and predict sales in the future. 14 00:00:54,260 --> 00:00:59,910 And that's our problem definition, right: we'll take our data, which is full of bulldozer sales, and we're 15 00:00:59,910 --> 00:01:01,830 trying to predict the sale price. 16 00:01:01,830 --> 00:01:06,620 And I mean, you might be looking at this emoji going, that's a tractor, Daniel, that's not a bulldozer, 17 00:01:06,660 --> 00:01:10,770 and you know, there's no bulldozer emoji yet, sadly. 18 00:01:10,890 --> 00:01:12,900 That may be an update coming from Apple, but who knows. 19 00:01:12,900 --> 00:01:14,600 Fingers crossed. 20 00:01:14,790 --> 00:01:18,610 Of course we're going to use our six-step machine learning framework. 
21 00:01:18,690 --> 00:01:24,090 We'll start by defining our problem, which is of course seeing if we can predict the sale price of bulldozers. 22 00:01:24,450 --> 00:01:29,580 We'll have a look at some data, we'll find an evaluation metric which is suitable for our problem. 23 00:01:29,670 --> 00:01:32,010 We'll then have a look at the features of the data. 24 00:01:32,100 --> 00:01:36,240 We'll of course try some machine learning modelling, because after all that's our goal: to build a 25 00:01:36,240 --> 00:01:42,030 machine learning model that helps us to predict the sale price of bulldozers. And throughout the 26 00:01:42,210 --> 00:01:49,350 entire session we'll be doing experiments. Now, we've seen what tools we can use, and we're going to be 27 00:01:49,350 --> 00:01:53,860 using the exact same tool set that we used in the previous structured data project. 28 00:01:53,940 --> 00:01:58,800 So we'll have pandas, Matplotlib and NumPy for data analysis, we'll be working within the Jupyter 29 00:01:58,800 --> 00:02:04,710 notebook using our conda environment, and we'll use scikit-learn for any machine learning modelling. 30 00:02:06,510 --> 00:02:12,030 Now, because this is a new project, we'll approach it as if it's a new project. 31 00:02:12,240 --> 00:02:14,760 We've got our steps to take in a new project here. 32 00:02:14,910 --> 00:02:19,990 But we've been through these: we've got our computer, we've downloaded and installed Miniconda. 33 00:02:20,010 --> 00:02:22,250 This is where we're at now, we're starting a new project. 34 00:02:22,740 --> 00:02:29,460 So we'll create a new project folder. And now of course, because we're using the exact same tools as what 35 00:02:29,460 --> 00:02:35,160 we did in the previous structured data project, you could complete this entire project, 36 00:02:35,160 --> 00:02:39,480 the bulldozer sale price prediction, in that previous folder. 
37 00:02:39,570 --> 00:02:45,300 However, for completeness we're going to create a new project folder, and we'll download our data from Kaggle. 38 00:02:45,330 --> 00:02:46,700 We'll get to that in a second. 39 00:02:46,950 --> 00:02:52,740 We'll create an environment which holds all of the tools that we'll be using, we'll use our Jupyter notebook 40 00:02:52,770 --> 00:02:59,100 as our workspace to perform data analysis and manipulation using our NumPy, pandas and Matplotlib 41 00:02:59,100 --> 00:03:04,100 tools, and then machine learning using scikit-learn. 42 00:03:04,160 --> 00:03:09,560 Now of course, if you do get stuck, and don't worry, you probably will, I get stuck all the time, 43 00:03:09,560 --> 00:03:14,960 the steps that you can take are to follow along with the code as best you can and try it for yourself. 44 00:03:14,960 --> 00:03:20,660 Remember our motto: if in doubt, run the code. Then if you want to find out more about what a function 45 00:03:20,660 --> 00:03:21,380 is doing, 46 00:03:21,440 --> 00:03:26,330 remember to press Shift and Tab to read the docstring, which will give you some information on 47 00:03:26,570 --> 00:03:28,050 what a function is doing. 48 00:03:28,190 --> 00:03:33,170 If you're still stuck then of course don't be afraid to search for it, right? You'll come across resources 49 00:03:33,170 --> 00:03:40,910 like Stack Overflow or the documentation for scikit-learn or pandas. And then if you search for it and you find 50 00:03:40,910 --> 00:03:46,470 some information, maybe you copy it from there, from the documentation, into your Jupyter notebook. 51 00:03:46,700 --> 00:03:53,600 Then of course try again, remember, if in doubt, run the code. And if you're still stuck, never be afraid 52 00:03:53,600 --> 00:03:54,830 to ask a question. 53 00:03:54,920 --> 00:03:59,630 If you're sitting on something and you're thinking, haha, I wish I knew the answer to this, and I wonder why I can't 54 00:03:59,630 --> 00:04:01,320 find it anywhere, 
55 00:04:01,430 --> 00:04:04,190 just ask, right? If you're running into an issue like that, 56 00:04:04,340 --> 00:04:09,620 more than likely someone else is running into a similar issue, so don't be afraid to ask. Pose the question: 57 00:04:09,630 --> 00:04:14,330 you can go to Stack Overflow or Discord or anywhere else you can ask a question. 58 00:04:14,330 --> 00:04:17,320 Never be afraid to ask a question. 59 00:04:17,340 --> 00:04:18,100 All right. 60 00:04:18,230 --> 00:04:20,360 With that being said, let's not wait any longer. 61 00:04:20,360 --> 00:04:23,860 Let's get our project folder set up and get into this second project.