All right. So in this lesson it's finally time to plot our gradient descent. Here's how we're going to do it. The first thing is we're going to add two more variables. These variables will collect all the values that are being calculated within our for loop, so that we can plot our algorithm's progress on the chart.

I'm going to call the first one "plot_vals", and this is going to be our thetas array reshaped into a one-by-two array, so one row, two columns. This is where we're going to add on all the theta values that we calculate in the for loop. The second variable is going to be called "mse_vals", and this is going to start out simply as our first mean squared error calculation from our initial guess. So it's going to be "mse(y_5, thetas[0] + thetas[1] * x_5)".

Now that we've defined our two variables and given them some starting values, I'm going to add a little comment. I'm going to say "Collect data points for scatter plot", because that's ultimately what we're going to use these for. Now, within our loop, what we need to do is append the new values to our numpy arrays, so I'm going to use our old friends concatenate and append.
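As a minimal sketch, the two collector variables described above could be set up like this. The x_5 and y_5 arrays, the initial thetas guess, and the mse() helper were defined in earlier cells of the lesson, so the values below are just placeholder stand-ins to make the snippet run on its own:

```python
import numpy as np

# Placeholder stand-ins for objects defined earlier in the lesson:
# a tiny dataset and an initial guess for the two parameters.
x_5 = np.array([0.1, 1.2, 2.4, 3.2, 4.1]).reshape(5, 1)
y_5 = np.array([1.7, 2.4, 3.5, 3.0, 6.1]).reshape(5, 1)
thetas = np.array([2.9, 2.9])  # initial guess: [theta_0, theta_1]

def mse(y, y_hat):
    # Mean squared error, averaged over the rows
    return np.average((y - y_hat) ** 2, axis=0)

# Collect data points for scatter plot
plot_vals = thetas.reshape(1, 2)                      # 1 row, 2 columns
mse_vals = mse(y_5, thetas[0] + thetas[1] * x_5)      # cost of the initial guess
```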
Let's use concatenate on plot_vals. So "plot_vals" is going to be equal to "np.concatenate()", and then we need to supply two things. The first is a tuple that's going to consist of the old value of plot_vals, meaning all the previous values that we calculated in the loop, plus our updated thetas array. So this is going to be "thetas.reshape(1, 2)". And then, outside of this tuple, we're going to have to specify "axis=0". What this line of code will do is take the existing plot_vals array, which is a 1-by-2 array, and concatenate it with the new theta values, which we've reshaped into a 1-by-2 array.

Now we need to capture our mean squared error calculations. So it's going to be mse_vals, and just to mix things up I want to use "np.append" to accomplish the very same thing, namely append values to our array. First, I have to specify where I want to append the values to, and this is going to be the mse_vals array. The second thing I have to specify is which values I want to append. The values are going to be equal to the new set of calculations.
So I'm going to call our mse function, pass in y_5, the actual y values, and then pass in the y-hat values. If you remember, for linear regression this was theta zero plus theta one times x, which is "thetas[0] + thetas[1] * x_5", and that's it. Let me press Shift+Enter and try to run this, and see if I've made any errors. No, so far so good.

Now it's time to plot the thing. In order to plot it, I'm going to use an existing cell that we've already made, copy it, paste it in, and then add the scatter plot to modify the code. Scrolling up, I'm going to take this cell right here, copy it, come down here, and then paste the cell below to add our gradient descent algorithm's progress.

I just have to add one line of code, a scatter plot, and I'm going to do that on the axes, so "ax.scatter()". Then I have to supply three things: the x values, the y values and the z values. Our x values on the chart are the theta zeros, and we've stored these in a variable called "plot_vals", in the first column.
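Before moving on to the chart, the collection loop we just walked through can be sketched end to end. The data, the gradient function, the learning rate (called "multiplier" here) and the iteration count are all assumed placeholder values, since they come from earlier cells of the lesson:

```python
import numpy as np

# Placeholder data and helpers standing in for earlier cells of the lesson
x_5 = np.array([0.1, 1.2, 2.4, 3.2, 4.1]).reshape(5, 1)
y_5 = np.array([1.7, 2.4, 3.5, 3.0, 6.1]).reshape(5, 1)

def mse(y, y_hat):
    return np.average((y - y_hat) ** 2, axis=0)

def grad(x, y, thetas):
    # Partial derivatives of the MSE with respect to theta_0 and theta_1
    n = y.size
    theta0_slope = (-2 / n) * np.sum(y - thetas[0] - thetas[1] * x)
    theta1_slope = (-2 / n) * np.sum((y - thetas[0] - thetas[1] * x) * x)
    return np.array([theta0_slope, theta1_slope])

multiplier = 0.01              # learning rate (assumed value)
thetas = np.array([2.9, 2.9])  # initial guess (assumed value)

# Collect data points for scatter plot
plot_vals = thetas.reshape(1, 2)
mse_vals = mse(y_5, thetas[0] + thetas[1] * x_5)

for i in range(1000):          # assumed iteration count
    thetas = thetas - multiplier * grad(x_5, y_5, thetas)

    # Append the new values to our numpy arrays
    plot_vals = np.concatenate((plot_vals, thetas.reshape(1, 2)), axis=0)
    mse_vals = np.append(arr=mse_vals,
                         values=mse(y_5, thetas[0] + thetas[1] * x_5))
```

After the loop, plot_vals holds one (theta_0, theta_1) row per step and mse_vals holds the matching cost, which is exactly what the 3D scatter plot needs.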
So to select the entire column, I can add a colon, a comma and then a zero, to select all the rows in column 0 within this array. Then for our theta one values, I'm going to select "plot_vals" and choose the second column, so all the values in the second column. These are all the rows: once again, a colon to select all the rows, at column index 1. And finally we need the values for the z-axis, and this was our "mse_vals" variable. That's pretty much it; that's the key information already supplied.

Now, in terms of sizing up our scatter plot, I'm going to choose 80 for the size of the dots, and in terms of color I'm going to go for, I don't know, black. Now let me hit Shift+Enter and see what this looks like.

Now, if you're looking at this chart and you don't see any changes, the problem might not actually be with our code. The problem might be with our visualization. I've chosen a black color for the dots, and we've got quite a dark chart. So let's change the color and also the transparency of our chart. I'll set the transparency of the plot surface with an alpha value, passing in an alpha argument of 0.4. Now let me hit Shift+Enter and see if we can track our gradient descent algorithm.
Ah, now this is already much clearer. Brilliant. Given that this is our crowning achievement for this lesson, I think we're going to have to style our chart in a different way. So I'm going to go with a psychedelic theme, by changing the color map from hot to rainbow in honor of the final race track in Mario Kart. Take a look at our wonderful rainbow-colored half-pipe here. Brilliant.

So we can clearly see our step size, which is initially large while we're far away from the minimum, decrease as we're getting closer and closer. The interesting thing about our gradient descent here is that we seem to be reaching the bottom of this half-pipe, but there's still some way to go. The thing is, the slope is so shallow that all the steps at this point are very, very small. So even though we progressed very quickly to this point down here, from there on we have to take many, many steps to get to the actual minimum.

Now, all of this was a very technical section. We've covered quite a lot of mathematical topics, and we've really peered under the hood of what goes on in a machine learning algorithm.
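Putting the charting steps above together, the final figure might be sketched like this. The descent history and the cost surface below are toy placeholders standing in for the lesson's real plot_vals, mse_vals and meshgrid cell; the rainbow color map, the alpha of 0.4, the dot size of 80 and the black dots match what was described:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs anywhere
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm

# Placeholder descent history: a short path of (theta_0, theta_1)
# pairs and their costs, standing in for the loop's collected arrays
plot_vals = np.array([[2.9, 2.9], [2.0, 2.0], [1.2, 1.3], [0.9, 1.1]])
mse_vals = np.array([40.0, 15.0, 4.0, 1.5])

# Toy cost surface over a grid of theta values (the lesson's real
# surface comes from evaluating mse() over theta meshgrids)
t0, t1 = np.meshgrid(np.linspace(-1, 4, 50), np.linspace(-1, 4, 50))
cost = (t0 - 0.9) ** 2 + 4 * (t1 - 1.1) ** 2

fig = plt.figure(figsize=[16, 12])
ax = fig.add_subplot(projection="3d")
ax.set_xlabel("Theta 0", fontsize=20)
ax.set_ylabel("Theta 1", fontsize=20)
ax.set_zlabel("Cost - MSE", fontsize=20)

# Semi-transparent rainbow surface, so the black dots show through
ax.plot_surface(t0, t1, cost, cmap=cm.rainbow, alpha=0.4)

# The algorithm's progress: column 0 (theta_0), column 1 (theta_1), cost
ax.scatter(plot_vals[:, 0], plot_vals[:, 1], mse_vals, s=80, color="black")
```

The alpha=0.4 on the surface is what makes the descent path visible, since fully opaque surfaces tend to hide dots that sit close to or below them.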
We've looked at various different cost functions, we've reviewed some calculus, and we've played with an optimization algorithm by turning various knobs, including the learning rate, the starting point, and finally even the cost function itself.

So where do we go from here? Where do we go next? In the next section we're going to return to something a bit more practical. We're going to look at a house price dataset for the city of Boston in the United States, and we're going to get into the business of predicting how much a house is worth based on its characteristics. The technique we're going to use is called multivariable regression.

So congratulations for making it this far. I hope you're ready for tackling the challenges coming up in the next lessons. Looking out the window, I can see that the almost never-ending London drizzle seems to have stopped, so I think I'm going to take this opportunity to pop out to the shops and restock the coffee that's slowly running out here. See you soon.