0 1 00:00:00,470 --> 00:00:00,870 Now, 1 2 00:00:00,870 --> 00:00:08,080 the thing is, all the prices that are in our target were collected in the 1970s. 2 3 00:00:08,160 --> 00:00:15,060 Now, I wasn't around back then, but I am told the 70s was a glorious time to be alive. 3 4 00:00:15,060 --> 00:00:23,460 The thing I know for sure, however, is that one dollar in 1970 was worth much, much more than one dollar 4 5 00:00:23,570 --> 00:00:25,110 today. 5 6 00:00:25,410 --> 00:00:34,170 In fact, inflation is a horrific thing for anybody with savings and to buy a house today you need many, 6 7 00:00:34,170 --> 00:00:36,120 many more dollars than you did 7 8 00:00:36,120 --> 00:00:44,940 in the 1970s. So given that we want to get a dollar price estimate that's a bit closer to today's values 8 9 00:00:45,320 --> 00:00:47,760 than 1970s values, 9 10 00:00:47,760 --> 00:00:54,990 let's make an adjustment to the estimates that our model is spitting out. 10 11 00:00:54,990 --> 00:01:02,610 The only question is - how do we get a more realistic price out of our little valuation tool? 11 12 00:01:02,610 --> 00:01:08,120 For starters, let's take a look at what the median price in our dataset is 12 13 00:01:08,250 --> 00:01:18,500 as it stands currently. And we can see this if I pull up numpy's median function, so "np.median()" and 13 14 00:01:18,500 --> 00:01:24,500 I supply the target values in the "boston_dataset". 14 15 00:01:24,830 --> 00:01:32,630 So "boston_dataset.target" inside these parentheses for the median function will calculate 15 16 00:01:32,810 --> 00:01:44,660 the median value of a property in our dataset. And what we see is that in 1970 the median value was 16 17 00:01:44,660 --> 00:01:49,190 21200 dollars. 17 18 00:01:49,190 --> 00:01:53,130 This is where we were at in 1970 in Boston. 18 19 00:01:53,660 --> 00:01:55,240 But what about today? 19 20 00:01:55,700 --> 00:02:02,510 If I Google "Zillow Boston home values", I am brought to this page here. 
20 21 00:02:03,020 --> 00:02:12,680 So this is on "zillow.com" which actually gives us a median estimate for home prices in Boston as 21 22 00:02:12,680 --> 00:02:14,150 of today. 22 23 00:02:14,540 --> 00:02:23,270 And what we see here is that today the median home value is approximately 583000 23 24 00:02:23,270 --> 00:02:27,710 dollars, more than half a million. 24 25 00:02:27,980 --> 00:02:35,830 And that's a combination of inflation as well as increases in the real price of homes. 25 26 00:02:35,940 --> 00:02:46,130 Now, as a challenge, can you write the Python code that converts the estimated price from our model which 26 27 00:02:46,130 --> 00:02:52,280 is in 1970s log prices to today's dollar values? 27 28 00:02:52,430 --> 00:02:55,610 You're gonna have to convert the estimate itself, 28 29 00:02:55,610 --> 00:02:59,310 the upper bound and the lower bound to today's prices. 29 30 00:02:59,510 --> 00:03:05,170 And once you've converted it, round the values to the nearest thousand dollars. 30 31 00:03:05,210 --> 00:03:05,600 I'll 31 32 00:03:05,620 --> 00:03:13,610 give you a few seconds to pause the video and then I'll show you my approach to solving this problem. 32 33 00:03:13,680 --> 00:03:14,670 Ready? 33 34 00:03:14,670 --> 00:03:15,920 Here it goes. 34 35 00:03:15,930 --> 00:03:23,330 So the first thing I'm going to do is I'm going to say "ZILLOW_MEDIAN_ 35 36 00:03:23,340 --> 00:03:29,280 PRICE = 583.3". 36 37 00:03:29,280 --> 00:03:37,530 This is the price in thousands that I saw on Zillow website for the median price in Boston at the time 37 38 00:03:37,530 --> 00:03:38,670 of recording. 38 39 00:03:38,970 --> 00:03:44,100 The next thing I'm going to do is I'm going to calculate some sort of scale factor, the number I'm going to have to 39 40 00:03:44,100 --> 00:03:51,770 multiply my 21.2 by in order to get a realistic value for today. 
40 41 00:03:51,780 --> 00:03:56,910 So this is gonna be "SCALE_FACTOR" which is equal to, 41 42 00:03:57,230 --> 00:04:01,970 well it's going to be equal to 583/21.2. 42 43 00:04:01,980 --> 00:04:02,680 Right? 43 44 00:04:02,700 --> 00:04:06,000 So "ZILLOW_MEDIAN_ 44 45 00:04:08,430 --> 00:04:24,350 PRICE/np.median(boston_dataset.target)". 45 46 00:04:25,340 --> 00:04:34,360 Let's take a look at what the scale factor is equal to, so "scale_factor", Shift+Enter, we now see 46 47 00:04:34,480 --> 00:04:40,650 that the dollar prices should be multiplied by 27.5. 47 48 00:04:40,860 --> 00:04:41,130 Okay. 48 49 00:04:41,160 --> 00:04:43,170 So that was step one. 49 50 00:04:43,210 --> 00:04:44,560 Now let me call my function. 50 51 00:04:44,560 --> 00:04:48,690 So my function will give me a log estimate, 51 52 00:04:49,030 --> 00:04:53,860 it will give me an upper price and a lower price 52 53 00:04:54,160 --> 00:04:58,550 and it will give me some confidence interval. 53 54 00:04:58,720 --> 00:05:04,870 Those of course are gonna get their values from our function, "get_log_estimate" 54 55 00:05:05,770 --> 00:05:15,460 and then I'm going to calculate, I don't know, a price estimate for a mansion, 9 rooms and maybe "students 55 56 00:05:15,460 --> 00:05:26,160 _per_classroom" is gonna be equal to, say 15, "next_to_river = 56 57 00:05:26,290 --> 00:05:34,900 False" and "high_confidence = " is gonna be equal to, 57 58 00:05:34,900 --> 00:05:36,970 I don't know, False, let's vary it up a bit. 58 59 00:05:37,470 --> 00:05:37,950 Okay. 59 60 00:05:37,960 --> 00:05:43,930 So now I've got my scale factor calculated and I've got values that I'm going to calculate from my estimate, 60 61 00:05:44,320 --> 00:05:51,280 my upper and lower bounds and my confidence interval, let me hit Shift+Enter, make sure this works and it 61 62 00:05:51,280 --> 00:05:52,750 does, no errors. 
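The scale factor step described above can be sketched as follows. Since the Boston dataset may not be available in every environment, the snippet uses a tiny made-up array as a stand-in for "boston_dataset.target" (an assumption for illustration); its median was chosen to match the 21.2 quoted in the video, and 583.3 is the Zillow figure mentioned at the time of recording.

```python
import numpy as np

# Stand-in for boston_dataset.target: a made-up sample (in $1000s)
# whose median equals 21.2, the figure quoted in the video.
sample_target = np.array([15.0, 19.4, 21.2, 24.8, 50.0])

ZILLOW_MEDIAN_PRICE = 583.3  # today's median in $1000s, as quoted in the video
SCALE_FACTOR = ZILLOW_MEDIAN_PRICE / np.median(sample_target)

print(round(SCALE_FACTOR, 1))  # 27.5
```

With the real dataset, "sample_target" would simply be replaced by "boston_dataset.target".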
62 63 00:05:52,750 --> 00:06:05,620 Now what I'll do is I'll convert to today's dollars, so the "dollar_est" is gonna be equal to first 63 64 00:06:05,620 --> 00:06:17,260 off, reversing the log transformation so "np.e**log_estimate", then times 64 65 00:06:17,620 --> 00:06:22,600 1000, because our dollars are in thousands and then times, 65 66 00:06:22,600 --> 00:06:28,500 well you guessed it, the "scale_factor" that we've calculated above. 66 67 00:06:28,570 --> 00:06:32,750 This gives us a dollar estimate for today. 67 68 00:06:32,800 --> 00:06:37,390 Let's take a look at what this one is actually, so "dollar_est", Shift+Enter, 68 69 00:06:37,900 --> 00:06:45,370 here's our unrounded value for a mansion that is not next to the river and has a pretty small class 69 70 00:06:45,370 --> 00:06:55,120 size, 15 students. Estimate in today's dollars is approximately 826000. 70 71 00:06:55,120 --> 00:07:04,480 Now we just need to round it, right? Now I'm going to add another comment, say "Round the dollar values to nearest 71 72 00:07:07,280 --> 00:07:12,550 thousand" and I'll create another variable that's going to hold onto the rounded estimate, 72 73 00:07:12,670 --> 00:07:15,140 so I'll say "rounded_est" 73 74 00:07:15,140 --> 00:07:20,570 is gonna be equal to the return value from some sort of rounding function. 74 75 00:07:20,570 --> 00:07:32,550 I'm going to use numpy's around function, so "np.around(dollar_est, )" and 75 76 00:07:32,840 --> 00:07:35,290 I have to say I want to round to the nearest thousand. 76 77 00:07:35,570 --> 00:07:39,690 So I'm going to say -3. 77 78 00:07:39,710 --> 00:07:42,490 Let's take a look at what this output actually reads. 78 79 00:07:42,520 --> 00:07:46,180 So "rounded_est", Shift+Enter, 79 80 00:07:46,220 --> 00:07:49,280 and here we see it's 826000. 80 81 00:07:49,280 --> 00:07:52,090 So this is a rounded value. 81 82 00:07:52,250 --> 00:07:53,050 Brilliant. 
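Here is a minimal sketch of the conversion and rounding just described. The log estimate of 3.4 is a hypothetical value picked for illustration (it is not the model's output), and the scale factor is hard-coded to the 27.5 shown in the video.

```python
import numpy as np

SCALE_FACTOR = 27.5   # value shown in the video
log_estimate = 3.4    # hypothetical log price (log of $1000s), illustration only

# Undo the log transform, convert $1000s to dollars, then scale to today's prices.
dollar_est = np.e**log_estimate * 1000 * SCALE_FACTOR

# A negative decimals argument rounds to the left of the decimal point:
# -3 means "nearest thousand".
rounded_est = np.around(dollar_est, -3)

print(dollar_est)
print(rounded_est)
```

Writing "np.exp(log_estimate)" would be an equivalent way to reverse the log transformation.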
82 83 00:07:53,060 --> 00:08:00,320 Now all that's left to do is do the same thing for the upper value and the lower value. 83 84 00:08:00,620 --> 00:08:12,140 I can first off copy this line of code, paste it twice, change my variable name, so I'll say "hi" and then "low" 84 85 00:08:13,010 --> 00:08:21,890 and we're transforming "upper", and we're transforming "lower" and then we're also going to round the upper 85 86 00:08:22,370 --> 00:08:24,250 and the lower values. 86 87 00:08:24,290 --> 00:08:35,650 So it's going to be "rounded_hi" and "rounded_low" for "dollar_hi" and "dollar_low". 87 88 00:08:38,570 --> 00:08:46,670 Let me print all of this out with f-strings. So "print(f 88 89 00:08:46,670 --> 00:08:56,080 'The estimated property value is {rounded_est}.')", Shift+ 89 90 00:08:56,120 --> 00:08:57,100 Enter, 90 91 00:08:57,140 --> 00:09:05,780 that works, except I've got a typo, "estimated property value". Let's add another print statement, 91 92 00:09:05,780 --> 00:09:11,180 "print(f'At { 92 93 00:09:12,620 --> 00:09:27,160 conf}% confidence the valuation range is')"; final print statement "f'USD 93 94 00:09:27,600 --> 00:09:43,790 {rounded_low} at the lower end to USD {rounded_ 94 95 00:09:43,790 --> 00:09:48,710 hi} at the high end.')". 95 96 00:09:48,710 --> 00:09:49,320 Full stop. 96 97 00:09:50,660 --> 00:09:56,540 Okay, so that was just a lot of typing, but now we can look at our output 97 98 00:09:56,540 --> 00:10:05,060 nicely worded, using the variables that we've defined earlier inside some f-strings. 98 99 00:10:05,060 --> 00:10:09,070 Let's take another look at the output for this cell. 99 100 00:10:09,200 --> 00:10:10,100 Brilliant. 100 101 00:10:10,100 --> 00:10:13,110 That works beautifully. 101 102 00:10:13,110 --> 00:10:14,160 Now, tell you what. 102 103 00:10:14,450 --> 00:10:20,020 Let's add all of this to a function. 
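The three print statements dictated above might look like this once typed out. The numbers assigned on the first line are hypothetical placeholders for the rounded values computed earlier; only the f-string wording comes from the video.

```python
# Hypothetical values standing in for the rounded results computed earlier.
rounded_est, rounded_low, rounded_hi, conf = 826000.0, 685000.0, 997000.0, 68

# Anything inside {} is evaluated and inserted into the string.
line_1 = f'The estimated property value is {rounded_est}.'
line_2 = f'At {conf}% confidence the valuation range is'
line_3 = f'USD {rounded_low} at the lower end to USD {rounded_hi} at the high end.'

print(line_1)
print(line_2)
print(line_3)
```

The strings are stored in variables here only so they are easy to inspect; passing each f-string straight to "print()" works just as well.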
103 104 00:10:20,300 --> 00:10:26,000 Let's create another function called "get_dollar_estimate" and make this function do all the conversion 104 105 00:10:26,690 --> 00:10:31,980 as well as call this function here to actually get the estimate. 105 106 00:10:31,990 --> 00:10:38,720 So I'm going to define a function, "def get_dollar_estimate 106 107 00:10:38,720 --> 00:10:45,500 ():" and then inside the body of this function I'm going to put all the code that we've 107 108 00:10:45,500 --> 00:10:46,670 written above. 108 109 00:10:46,730 --> 00:10:53,510 So all of this up to here, will go inside of the function. 109 110 00:10:53,580 --> 00:11:03,190 So copy this, I'm going to paste it in here and then you'll see it's not indented, so I'm going to have to indent, 110 111 00:11:03,210 --> 00:11:09,960 so it's actually part of this function, and I can do that by selecting all the lines and hitting Tab 111 112 00:11:10,140 --> 00:11:11,520 on my keyboard. 112 113 00:11:11,520 --> 00:11:17,670 Now what we want is we want this function here to actually take again like the 4 arguments that we're 113 114 00:11:17,910 --> 00:11:21,790 passing through to our other function call. 114 115 00:11:22,080 --> 00:11:23,990 We can call the first one "rm", 115 116 00:11:24,360 --> 00:11:27,930 we can call the second one "ptratio", 116 117 00:11:27,930 --> 00:11:33,740 we can call the third one "chas" and set that equal to False by default, 117 118 00:11:34,170 --> 00:11:43,020 and then the last argument we can call it "large_range" and set that equal to True by default. 118 119 00:11:43,110 --> 00:11:51,360 And this means that in our nested function call right here to get log estimate we can pass through these 119 120 00:11:51,360 --> 00:11:52,770 parameters. 
120 121 00:11:52,770 --> 00:11:54,330 So this won't be number 9, 121 122 00:11:54,360 --> 00:11:59,680 this will be "rm", this won't be "students_per_classroom = 15", 122 123 00:11:59,800 --> 00:12:03,510 this will be "ptratio", 123 124 00:12:03,610 --> 00:12:12,710 this won't be "next_to_river = False", but "next_to_river = chas", and this won't be "high_confidence 124 125 00:12:12,710 --> 00:12:18,100 = False", but it'll be "high_confidence = large_range". 125 126 00:12:18,320 --> 00:12:23,270 So I hope that you can see how using these different arguments works. For these two, 126 127 00:12:23,270 --> 00:12:30,230 I've used an argument by keyword, so "next_to_river" is the keyword, "chas" is the value which we'll get from 127 128 00:12:30,230 --> 00:12:31,120 this function call. 128 129 00:12:31,970 --> 00:12:40,430 And for this argument "high_confidence" is the key word and "large_range" is the value, "rm" and "ptratio" 129 130 00:12:41,420 --> 00:12:47,610 are just the values because I'm passing these arguments by position, not by keyword. 130 131 00:12:47,630 --> 00:12:51,100 So this is argument number one, this is argument number two. 131 132 00:12:51,290 --> 00:12:58,880 If I didn't want to pass the arguments by position, I could use a keyword as well, so I could say "students_ 132 133 00:12:59,240 --> 00:13:05,870 per_classroom = ptratio" 133 134 00:13:06,620 --> 00:13:15,060 and that's because this is the keyword that we've defined in our signature for "get_log_estimate". All right, 134 135 00:13:15,130 --> 00:13:17,470 so we've got our function, 135 136 00:13:17,470 --> 00:13:20,050 we've got our way of getting the log estimates, 136 137 00:13:20,050 --> 00:13:24,710 we've got our way of converting the log estimates to today's dollars 137 138 00:13:24,850 --> 00:13:30,280 and we've got the part where we're rounding those values and then where we're showing some sort of 138 139 00:13:30,370 --> 00:13:33,400 output to whoever's calling the function. 
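To make the positional versus keyword distinction concrete, here is a rough sketch of the wrapper. The "get_log_estimate" below is a hypothetical stand-in with invented coefficients (the course version relies on a fitted regression), and its first parameter name, "nr_rooms", is an assumption. Unlike the notebook version, this sketch returns its values instead of printing them, purely so the result is easy to check.

```python
import numpy as np

SCALE_FACTOR = 27.5  # value shown in the video


def get_log_estimate(nr_rooms, students_per_classroom,
                     next_to_river=False, high_confidence=True):
    """Hypothetical stand-in; the coefficients are invented for illustration."""
    log_est = 2.0 + 0.25 * nr_rooms - 0.03 * students_per_classroom
    if next_to_river:
        log_est += 0.1
    margin, interval = (0.4, 95) if high_confidence else (0.2, 68)
    return log_est, log_est + margin, log_est - margin, interval


def get_dollar_estimate(rm, ptratio, chas=False, large_range=True):
    # rm and ptratio are passed by position, the last two by keyword.
    log_est, upper, lower, conf = get_log_estimate(
        rm, ptratio, next_to_river=chas, high_confidence=large_range)

    def to_dollars(log_value):
        # Reverse the log, convert $1000s to dollars, scale, round to nearest 1000.
        return np.around(np.e**log_value * 1000 * SCALE_FACTOR, -3)

    return to_dollars(log_est), to_dollars(lower), to_dollars(upper), conf


est, low, hi, conf = get_dollar_estimate(6, 15, chas=True)
print(est, low, hi, conf)
```

Passing "students_per_classroom=ptratio" instead of the bare "ptratio" would give the same result, because the keyword matches the parameter name in the signature.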
139 140 00:13:33,580 --> 00:13:35,890 Let's try all of this out for good measure. 140 141 00:13:36,100 --> 00:13:45,850 So "get_dollar_estimate()" for a small apartment with, say "rm = 2", and 141 142 00:13:45,940 --> 00:13:53,510 a terrible school with 200 kids, so "ptratio = 200". 142 143 00:13:53,650 --> 00:13:57,690 But on the upside this apartment will be next to the river, 143 144 00:13:57,700 --> 00:14:06,900 so "chas = True". Before I hit Shift+Enter on the cell I'm going to make sure that I have hit Shift+ 144 145 00:14:06,910 --> 00:14:15,740 Enter on this cell beforehand and now I know that my function will be recognized as I do the same here. 145 146 00:14:17,190 --> 00:14:25,790 The estimated property value is 0 and at 95 percent confidence the valuation range is between 0 146 147 00:14:26,300 --> 00:14:30,880 at the lower end and 1000 dollars at the high end. 147 148 00:14:30,890 --> 00:14:32,490 Why is that? 148 149 00:14:32,520 --> 00:14:37,650 That's because the parameters that we've supplied here are completely unrealistic, 149 150 00:14:37,650 --> 00:14:40,070 right? This number in particular, 150 151 00:14:40,250 --> 00:14:43,900 you never have 200 kids per teacher in the local school. 151 152 00:14:43,910 --> 00:14:51,670 A more realistic number is something like, say, 30, which will then give us a more realistic estimate. 152 153 00:14:51,830 --> 00:14:58,670 The same problem we might face if we have, say zero rooms, in the apartment which of course is nonsense. 153 154 00:14:59,510 --> 00:15:04,160 In this case, even for a property with zero rooms, which is impossible, 154 155 00:15:04,310 --> 00:15:11,710 we get a price estimate. Now there's two ways that you can avoid this kind of situation, 155 156 00:15:11,810 --> 00:15:12,610 right? 
156 157 00:15:12,620 --> 00:15:20,540 First off, you might have to inform people what get_dollar_estimate does and what these values actually 157 158 00:15:20,540 --> 00:15:24,430 represent before they start just calling this function. 158 159 00:15:24,530 --> 00:15:28,250 And the second thing that you can do is you can reject giving an estimate 159 160 00:15:28,250 --> 00:15:31,760 if these values are unrealistic. 160 161 00:15:31,760 --> 00:15:33,500 Let me show you how you could do that for example. 161 162 00:15:33,890 --> 00:15:44,120 So coming up here and at the very top of the function, we could have an expression like "if rm < 162 163 00:15:44,120 --> 00:15:51,160 1:", "print( 163 165 00:15:51,700 --> 00:15:55,470 'That is 164 166 00:15:55,540 --> 00:15:56,500 unrealistic. 165 167 00:15:56,500 --> 00:15:58,530 Try again.')" 166 168 00:15:58,540 --> 00:15:59,350 Yeah. 167 169 00:15:59,470 --> 00:16:05,500 And then, still within this if block as you can tell by the indentation, we return. 168 170 00:16:06,040 --> 00:16:15,250 So as soon as this return instruction is hit, none of these following lines of code will be executed. 169 171 00:16:15,250 --> 00:16:22,060 I'm going to hit Shift+Enter, refresh the cell, and then I'm going to try to run this code again, see what we 170 172 00:16:22,060 --> 00:16:31,030 get. Our function now does not provide an estimate when the inputs for the rooms are completely 171 173 00:16:31,150 --> 00:16:32,640 unreasonable. 
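The early exit described above can be sketched like this. The body after the check is a placeholder (an assumption for illustration); the point is only that nothing after "return" runs when the room count is rejected.

```python
def get_dollar_estimate(rm, ptratio, chas=False, large_range=True):
    if rm < 1:
        print('That is unrealistic. Try again.')
        return  # exits here; a bare return hands back None

    # Placeholder for the real estimation code, skipped entirely on bad input.
    return 'estimate would be computed here'


result = get_dollar_estimate(rm=0, ptratio=30)
print(result)
```

A bare "return" is a simple choice for a notebook tool; raising an exception would be a stricter alternative if the function were used by other code.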
172 174 00:16:32,710 --> 00:16:39,220 If you wanted to check two things, say we wanted to make sure that this value here is not smaller 173 175 00:16:39,220 --> 00:16:40,750 than 1 as well, right, 174 176 00:16:40,930 --> 00:16:48,310 then what we could do is we'd come up here to our if statement and then we could use a logical "or" to 175 177 00:16:48,340 --> 00:17:00,330 check two separate conditions, so if "rm < 1 or ptratio < 1", then 176 178 00:17:00,600 --> 00:17:05,640 print this line and return from the function. 177 179 00:17:05,640 --> 00:17:12,390 In other words, one of the things that you can do to make sure that your functions aren't being completely 178 180 00:17:12,390 --> 00:17:18,600 abused is to actually check the values that are being supplied when they're being called. 179 181 00:17:18,600 --> 00:17:25,800 So if I hit Shift+Enter now and I change this line to something reasonable like, say 6 rooms, and I 180 182 00:17:25,800 --> 00:17:34,530 try to change this line here from ptratio of 30 to ptratio of -60, I get the same 181 183 00:17:34,740 --> 00:17:36,100 message. 182 184 00:17:36,120 --> 00:17:44,880 Okay, so strategy 1 was checking for inputs, but better yet, let's add some sort of description about what 183 185 00:17:44,880 --> 00:17:45,990 this function does 184 186 00:17:46,080 --> 00:17:52,080 so that the person using it can pull up say the quick documentation. If I press Shift+Enter on this now, 185 187 00:17:54,190 --> 00:17:59,200 I can see the signature, I can see it says "rm, ptratio, chas=False,..", I see the default 186 188 00:17:59,200 --> 00:18:07,120 values and hitting that plus sign, I don't see anything else. What I would like to see though is something 187 189 00:18:07,120 --> 00:18:16,060 called a Docstring. The Docstring should give us a description of our function, what it does, what 188 190 00:18:16,060 --> 00:18:19,450 its inputs are and more or less like how to use it. 
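The two-condition check with a logical "or", described earlier in this passage, can be isolated in a small predicate. The helper name "inputs_are_unrealistic" is hypothetical and does not appear in the video.

```python
def inputs_are_unrealistic(rm, ptratio):
    # "or" is True as soon as either condition holds.
    return rm < 1 or ptratio < 1


print(inputs_are_unrealistic(6, 30))    # False
print(inputs_are_unrealistic(6, -60))   # True
print(inputs_are_unrealistic(0, 30))    # True
```

Inside the function itself the same expression would sit directly in the if statement, followed by the print and the return.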
189 191 00:18:19,570 --> 00:18:27,910 Let me show you how we can add this to our "get_dollar_estimate" function. At the very, very top we can 190 192 00:18:27,910 --> 00:18:35,920 put in three double quotes and as you can see there's already six double quotes here, because Jupyter 191 193 00:18:35,930 --> 00:18:40,710 notebook inserted the closing three double quotes. 192 194 00:18:40,930 --> 00:18:45,970 In other words, you have three double quotes at the beginning and you have three double quotes at the 193 195 00:18:46,060 --> 00:18:52,760 end. The first three double quotes mark the beginning of the docstring, the last three quotes mark the 194 196 00:18:52,940 --> 00:18:59,630 end of the docstring and whatever we put in between will appear in the quick documentation. Check it 195 197 00:18:59,630 --> 00:19:16,450 out. "Estimate the price of a property in Boston.", Shift+Enter and then Shift+Tab will now show 196 198 00:19:17,050 --> 00:19:23,650 this description in the quick documentation. One of the good things to do when you're writing a function 197 199 00:19:23,650 --> 00:19:28,930 that you want other people to use is to include a little bit of information about what the function 198 200 00:19:28,930 --> 00:19:32,630 does and also what the keyword arguments are. 199 201 00:19:33,580 --> 00:19:42,790 After all, if we do that on "np.around()" and press the little plus sign, we do indeed see some information 200 202 00:19:43,150 --> 00:19:48,810 on the parameters that this function takes - a, decimals and out. 201 203 00:19:48,890 --> 00:19:49,060 Yeah. 202 204 00:19:49,090 --> 00:19:56,490 So we can see a little description here and we can use this function the way it was intended to. 203 205 00:19:56,620 --> 00:19:58,950 Let's do the same thing for our function. 
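A minimal sketch of the docstring placement just described: the string in triple double quotes must be the first statement in the function body. The body itself is left out here, and the docstring is read back through the "__doc__" attribute, which holds the same text that Jupyter's Shift+Tab displays.

```python
def get_dollar_estimate(rm, ptratio, chas=False, large_range=True):
    """Estimate the price of a property in Boston."""
    # Function body omitted in this sketch.


# The docstring is stored on the function object itself.
print(get_dollar_estimate.__doc__)
```

Calling "help(get_dollar_estimate)" shows the signature and the same description outside of a notebook.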
204 206 00:19:58,960 --> 00:20:11,260 So if I come in here and say "Keyword arguments:", "rm -- number of rooms in the property" 205 207 00:20:12,550 --> 00:20:28,110 and then "ptratio -- number of students per teacher in the classroom for the school in the 206 208 00:20:28,200 --> 00:20:33,950 area", "chas -- True 207 209 00:20:34,080 --> 00:20:40,990 if the property is next to the river, False 208 210 00:20:41,650 --> 00:20:45,090 otherwise", "large_range 209 211 00:20:45,210 --> 00:20:54,450 -- True for a 95% prediction interval, 210 212 00:20:57,580 --> 00:21:03,460 False for a 68% interval." 211 213 00:21:06,350 --> 00:21:14,780 Adding this description here for our keyword arguments and hitting Shift+Enter allows us to see what 212 214 00:21:14,780 --> 00:21:23,340 these arguments are and get a little description in the quick documentation when we hit Shift+Enter 213 215 00:21:23,650 --> 00:21:25,150 and take a look. 214 216 00:21:25,200 --> 00:21:26,150 Brilliant. 215 217 00:21:26,220 --> 00:21:26,680 All right. 216 218 00:21:26,700 --> 00:21:29,760 We've almost made it to the end. 217 219 00:21:29,790 --> 00:21:35,910 The last thing I want to show you is how we can package what is in this Jupyter notebook here 218 220 00:21:36,150 --> 00:21:45,890 as a Python module. If we look at our MLProjects folder here, a Python module is a file that ends in 219 221 00:21:46,160 --> 00:21:50,000 ".py". To create one of these files, 220 222 00:21:50,090 --> 00:22:01,780 what we can do is we can upload a ".py" file to this folder or alternatively, we can go to "New" > "Text 221 223 00:22:01,780 --> 00:22:08,600 File" and create this py file in Jupyter notebook directly. 222 224 00:22:08,710 --> 00:22:17,620 I personally quite like the Atom text editor for writing Python code and editing it. 
223 225 00:22:17,910 --> 00:22:23,740 You, of course, can use other text editors as well to create these py files, but Atom is definitely 224 226 00:22:23,740 --> 00:22:30,070 a good one. To save you the trouble of installing a text editor, I'll show you quickly how you can do 225 227 00:22:30,070 --> 00:22:33,320 this in Jupyter notebook directly. 226 228 00:22:33,370 --> 00:22:39,320 First off, let's rename this file. Let's rename it as "valuation. 227 229 00:22:39,370 --> 00:22:40,910 py". 228 230 00:22:41,140 --> 00:22:46,920 Or better yet let's name it as "boston_valuation. 229 231 00:22:46,930 --> 00:22:50,210 py". Click "OK" 230 232 00:22:50,790 --> 00:22:58,970 and now we're gonna copy some of our Python code and paste it into this empty ".py" file. 231 233 00:22:58,970 --> 00:23:01,620 Let's copy the imports for example, 232 234 00:23:01,760 --> 00:23:07,420 put those here, let's copy the cell where we're gathering the data, 233 235 00:23:07,940 --> 00:23:08,630 put this here, 234 236 00:23:09,740 --> 00:23:19,370 coming down, let's copy this cell here with our indices and our property_stats variable, 235 237 00:23:19,490 --> 00:23:30,950 copy it and put it here. I don't really need this comment so I'll delete it, scroll down, add a few more lines, 236 238 00:23:33,480 --> 00:23:35,500 go back to our notebook, 237 239 00:23:35,510 --> 00:23:39,910 scroll down a little more, copy the part where 238 240 00:23:39,910 --> 00:23:52,170 we're running our regression, put it in here, come back, copy the function where we're doing our log estimate 239 241 00:23:52,170 --> 00:24:04,800 of course, add it to our ".py" file and then come down here where we've got our scale factor, 240 242 00:24:05,060 --> 00:24:13,160 copy those two lines, go into our .py file, scroll to the top in this case and I'm going to add it below 241 243 00:24:13,160 --> 00:24:15,660 our CRIME and ZN indices. 242 244 00:24:15,680 --> 00:24:17,130 So I'm going to paste it in here. 
243 245 00:24:17,300 --> 00:24:25,970 And then finally I'm going to grab all the code in this cell here, copy it and paste it at the very end 244 246 00:24:26,120 --> 00:24:29,380 of my .py file. 245 247 00:24:29,570 --> 00:24:41,430 All that's left to do now is "File" > "Save" and now in my projects folder I have a "boston_valuation.py" 246 248 00:24:41,430 --> 00:24:49,230 module that contains all my Python code, which I can now use in any of the Python notebooks 247 249 00:24:49,740 --> 00:24:51,850 in this projects folder. 248 250 00:24:51,870 --> 00:25:01,140 Check it out. In the valuation tool, I can come down and I can say "import boston_valuation 249 251 00:25:02,040 --> 00:25:10,950 as val" and I can call a function from this module, namely our "get_dollar_estimate" function with "val. 250 252 00:25:11,160 --> 00:25:12,590 get_ 251 253 00:25:12,940 --> 00:25:24,080 dollar_estimate(6,12, True)". Hitting Shift+Enter, 252 254 00:25:24,750 --> 00:25:33,150 I get a price estimate. All my arguments here are passed by position and this function call here comes 253 255 00:25:33,150 --> 00:25:34,770 from the module, 254 256 00:25:34,770 --> 00:25:35,010 right, 255 257 00:25:35,010 --> 00:25:41,110 it comes from the module that I've just imported. This one here comes from the notebook, 256 258 00:25:41,160 --> 00:25:47,760 this one here comes from our "boston_valuation" module. 257 259 00:25:47,760 --> 00:25:54,790 Now of course this is a lot more clear if I open my Multivariable Regression notebook and try to do 258 260 00:25:54,790 --> 00:26:05,590 the same thing. Under the notebook imports I can import "boston_valuation as val" and if I hit Shift+ 259 261 00:26:05,620 --> 00:26:22,550 Enter here, scroll to the very bottom, and say "val.get_dollar_estimate(8,15,False)", 260 262 00:26:23,780 --> 00:26:32,190 I will get an output from my get_dollar_estimate function from inside my .py file. 261 263 00:26:32,190 --> 00:26:33,680 Fantastic. 
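The import step above can be tried without the notebook folder by writing a tiny stand-in module to a temporary directory. Everything about the file's contents below is a placeholder (the real "boston_valuation.py" holds the code copied from the notebook); only the module name, the alias "val" and the call with three positional arguments come from the video.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a placeholder module; it echoes its inputs instead of estimating a price.
module_dir = tempfile.mkdtemp()
Path(module_dir, 'boston_valuation.py').write_text(
    'def get_dollar_estimate(rm, ptratio, chas=False, large_range=True):\n'
    '    """Placeholder: returns its inputs instead of an estimate."""\n'
    '    return rm, ptratio, chas, large_range\n'
)

# Make the folder importable, then import the module under a short alias.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import boston_valuation as val

print(val.get_dollar_estimate(6, 12, True))
```

In the course setup the "sys.path" step is not needed, because the notebook and the ".py" file sit in the same projects folder.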
262 264 00:26:33,930 --> 00:26:37,620 So we've covered quite a few things again in this lesson. 263 265 00:26:37,620 --> 00:26:44,730 It's been a review of a lot of the concepts that we've talked about so far, but we've also learned a 264 266 00:26:44,790 --> 00:26:48,290 lot of new things. On the Python front, 265 267 00:26:48,360 --> 00:26:55,170 we've learned how to create functions with optional arguments, how to call these functions with the arguments 266 268 00:26:55,680 --> 00:27:00,140 where we supply the values by position and by keyword. 267 269 00:27:00,300 --> 00:27:05,310 We've learned about how to make the quick documentation show up for functions that we ourselves have 268 270 00:27:05,310 --> 00:27:07,790 written using docstrings. 269 271 00:27:08,160 --> 00:27:14,050 We've covered how you might exit a function if you don't like the inputs that it's getting. 270 272 00:27:14,190 --> 00:27:21,420 And we've covered how to use if-else blocks to check for a condition or even two different conditions. 271 273 00:27:21,420 --> 00:27:27,030 So again, congratulations for making it all the way through this. 272 274 00:27:27,120 --> 00:27:34,680 I know the learning curve is getting steeper and I look forward to seeing you in the next modules. 273 275 00:27:34,680 --> 00:27:35,250 Take care.