0 1 00:00:00,840 --> 00:00:08,280 All right so in this lesson what we're gonna do is we're gonna run this gradient descent without 1 2 00:00:08,360 --> 00:00:09,200 SymPy. 2 3 00:00:09,360 --> 00:00:15,000 We're going to look at making our Jupyter notebook a little bit more performant. 3 4 00:00:15,030 --> 00:00:20,250 This gives us a chance to solve this problem the old-fashioned way, the way we've done it for the previous 4 5 00:00:20,250 --> 00:00:21,270 examples 5 6 00:00:21,480 --> 00:00:28,020 and one reason to do this is that you'll notice this performance difference in your Jupyter notebook 6 7 00:00:28,280 --> 00:00:32,700 because it turns out running it without SymPy is actually much faster. 7 8 00:00:33,690 --> 00:00:42,990 But first, let's add the markdown for our partial derivatives in LaTeX notation. 8 9 00:00:43,050 --> 00:00:49,380 So I'm going to go back to this section heading here and we're going to write down the partial derivatives 9 10 00:00:49,740 --> 00:00:55,620 in mathematical notation because it will give us a good opportunity to see a few more of these LaTeX 10 11 00:00:55,770 --> 00:00:59,790 tags in action. In our print statement, on the cell below, 11 12 00:00:59,790 --> 00:01:06,110 we've got the partial derivative with respect to x and it looks like this. 12 13 00:01:06,120 --> 00:01:14,250 So the question is how to translate this slightly ugly and unreadable equation into LaTeX 13 14 00:01:14,250 --> 00:01:16,150 notation? 14 15 00:01:16,320 --> 00:01:22,500 The first thing I'm going to do is use the opening double dollar sign tags and I'm going to have a little 15 16 00:01:22,500 --> 00:01:23,120 fraction here, 16 17 00:01:23,150 --> 00:01:35,550 I'm going to say "\frac{f}{x}" 17 18 00:01:35,550 --> 00:01:40,100 and then two more dollar signs. Let's see what this looks like. 18 19 00:01:40,970 --> 00:01:48,320 So we can see that our fraction tag and the dollar signs together center our equation like this.
19 20 00:01:48,320 --> 00:01:51,110 So let's take a look at this, to get this a bit bigger, 20 21 00:01:51,110 --> 00:01:58,910 I'm going to add the two pound symbols before this thing and press Shift+Enter again so we can see it 21 22 00:01:59,930 --> 00:02:01,670 a bit larger. 22 23 00:02:01,670 --> 00:02:07,820 Now I'm going to add the mathematical notation for the partial derivatives, so I can go back into my 23 24 00:02:07,820 --> 00:02:16,750 fraction and use the backslash again and before the f, I'm going to use a tag called "partial" and have a space 24 25 00:02:16,910 --> 00:02:21,920 and you can see that the markdown actually highlights this tag differently. 25 26 00:02:21,940 --> 00:02:24,900 It highlights it in green just like it does with the fraction tag. 26 27 00:02:25,010 --> 00:02:31,270 So that way I know just from the syntax highlighting that this is a legitimate LaTeX tag. 27 28 00:02:31,560 --> 00:02:35,230 I'm going to also add the partial tag to the bottom. 28 29 00:02:35,350 --> 00:02:42,020 So I'm going to type "\partial x" and then after the curly braces I'm going to put an equal sign 29 30 00:02:42,910 --> 00:02:48,590 and I'm going to press Shift+Enter and see what it looks like. And here we go. Here we can see the partial 30 31 00:02:48,590 --> 00:02:54,120 tag translates into these little curly "∂" symbols for the equation. 31 32 00:02:54,140 --> 00:02:58,970 Okay so now we can have the top part of the fraction on the right hand side. 32 33 00:02:59,690 --> 00:03:06,810 And that was "2x" and then log; I'm going to use the natural log here, 33 34 00:03:06,860 --> 00:03:20,320 so "\ln(3)*3^{-x^2 34 35 00:03:20,320 --> 00:03:28,320 - y^2}" 35 36 00:03:28,360 --> 00:03:31,770 Let's see what this looks like. Now, 36 37 00:03:31,770 --> 00:03:37,930 one thing I don't like is this times symbol that we've got here. 
It would be much nicer to have a dot 37 38 00:03:38,230 --> 00:03:41,700 and there is a LaTeX tag for that as well 38 39 00:03:41,970 --> 00:03:47,930 and it's "\cdot". Pressing Shift+Enter, 39 40 00:03:48,230 --> 00:03:52,690 we can see that it now looks like this which is a lot nicer. 40 41 00:03:52,940 --> 00:03:55,400 So we've got our top part of a fraction now. 41 42 00:03:55,400 --> 00:03:56,630 Now, 42 43 00:03:57,050 --> 00:04:01,170 all I have to do is actually put it in a fraction. 43 44 00:04:01,310 --> 00:04:10,580 So I'm going to say "\frac", open curly braces and at the end closing curly braces. 44 45 00:04:10,930 --> 00:04:18,320 And I'm going to open another pair of curly braces at the end. This is going to be for the bottom part of our 45 46 00:04:18,320 --> 00:04:24,050 fraction. Now at the bottom we had 3 to the power of, 46 47 00:04:24,050 --> 00:04:34,910 and then it was "{-x^2-y^2}", 47 48 00:04:35,660 --> 00:04:37,350 + 1. 48 49 00:04:37,470 --> 00:04:40,960 Let's press Shift+Enter and see what this looks like. 49 50 00:04:41,060 --> 00:04:44,070 So now our equation looks like that - we've got a fraction, 50 51 00:04:44,240 --> 00:04:50,670 we've got the top part and we've got the bottom part. But this isn't quite correct, 51 52 00:04:50,680 --> 00:04:56,950 we have to add a little bit more notation since the bottom is actually squared. 52 53 00:04:57,070 --> 00:04:58,310 So how do we do that? 53 54 00:04:58,420 --> 00:05:06,910 One of the things I can do is I can put a parenthesis here and a parenthesis at the end and then have 54 55 00:05:06,910 --> 00:05:10,690 that whole thing to the power of two and pressing Shift+Enter, 55 56 00:05:10,690 --> 00:05:13,270 it would then look like this. 56 57 00:05:13,400 --> 00:05:16,900 I have the parentheses and then the two, 57 58 00:05:16,950 --> 00:05:19,200 like so. 
58 59 00:05:19,740 --> 00:05:22,860 And you know the thing is this isn't even all that terrible, 59 60 00:05:23,000 --> 00:05:30,380 but I can use LaTeX notation again to style these parentheses a little differently. 60 61 00:05:30,380 --> 00:05:40,900 So instead of having just a normal parenthesis here, I can actually have a backslash and say "\left", 61 62 00:05:40,940 --> 00:05:43,530 so this is one of the tags, 62 63 00:05:43,850 --> 00:05:50,720 and then at the end here instead of having the closing parenthesis I can actually also put 63 64 00:05:51,020 --> 00:05:57,740 "\right" and then have the closing parenthesis like so. 64 65 00:05:57,740 --> 00:06:04,960 So this code here, this markdown, now refers to the right or closing parenthesis. Then you press Shift+Enter 65 66 00:06:04,960 --> 00:06:08,740 to show what this looks like in comparison. 66 67 00:06:08,740 --> 00:06:16,030 You can see now with the "left" and "right" parentheses using the LaTeX tags, it actually looks 67 68 00:06:16,330 --> 00:06:17,620 a lot better. 68 69 00:06:17,950 --> 00:06:25,240 And just like that we're done. We have got our partial derivative with respect to x formatted very 69 70 00:06:25,240 --> 00:06:35,180 beautifully in LaTeX notation. Let's add the partial derivative with respect to y now, so this is the 70 71 00:06:35,180 --> 00:06:44,150 really easy part because all we have to do is copy this, paste it and change this x to a y, and change 71 72 00:06:44,360 --> 00:06:54,850 this x to a y, and press Shift+Enter. Now we've got both our partial derivatives displayed here. 
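For reference, here is roughly what the finished markdown cell could contain once all of the steps above are put together. This is a sketch assembled from the pieces dictated in the video (the "##" heading prefix, "\frac", "\partial", "\ln", "\cdot", "\left" and "\right"); the exact spacing inside the cell is an assumption.

```latex
## $$\frac{\partial f}{\partial x} = \frac{2x \ln(3) \cdot 3^{-x^2 - y^2}}{\left( 3^{-x^2 - y^2} + 1 \right)^2}$$
## $$\frac{\partial f}{\partial y} = \frac{2y \ln(3) \cdot 3^{-x^2 - y^2}}{\left( 3^{-x^2 - y^2} + 1 \right)^2}$$
```

The only difference between the two lines is the variable in the "\partial" term on the left and the leading factor (2x versus 2y) on the right.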
72 73 00:06:54,940 --> 00:07:01,960 Now if you're suspicious and you don't believe me that this is indeed the partial derivative for y, then 73 74 00:07:01,960 --> 00:07:09,460 you can go down here and you can refresh this cell here, just by changing this a here where we're calling 74 75 00:07:09,460 --> 00:07:16,630 the differentiation function from SymPy to b, hitting Shift+Enter and then you should be able to 75 76 00:07:16,630 --> 00:07:21,510 confirm that this indeed is the partial derivative with respect to y. 76 77 00:07:21,580 --> 00:07:27,970 So now that we've figured out the functional form for both of these partial derivatives, can you as a 77 78 00:07:27,970 --> 00:07:34,830 challenge, can you write these partial derivative functions as Python functions? 78 79 00:07:34,900 --> 00:07:37,970 I'll also give you a hint - when you're writing the Python functions, 79 80 00:07:38,010 --> 00:07:44,680 there's actually one additional requirement that you have to consider. Also for the function names use 80 81 00:07:44,680 --> 00:07:51,470 the names fpx and fpy for the names of the partial derivative functions. 81 82 00:07:51,550 --> 00:07:54,640 I'll give you a few seconds to pause the video and give this a shot. 82 83 00:07:58,400 --> 00:07:59,400 All right, you ready? 83 84 00:07:59,400 --> 00:08:02,310 Here's the solution. For the solution, 84 85 00:08:02,310 --> 00:08:05,280 I'm going to add a little Python comment, 85 86 00:08:05,280 --> 00:08:10,310 Now let's just say "Partial Derivative Functions 86 87 00:08:12,010 --> 00:08:12,790 example 4". 87 88 00:08:16,160 --> 00:08:17,720 And here they are. 
88 89 00:08:17,780 --> 00:08:29,090 It's gonna be "def fpx", which needs two inputs, an x and a y, colon and to write the partial derivative 89 90 00:08:29,090 --> 00:08:35,380 for this, I'm going to again use a little simplification and define a variable called r and that's gonna 90 91 00:08:35,390 --> 00:08:37,070 be equal to part of my expression, 91 92 00:08:37,070 --> 00:08:47,630 I'm going to have 3**(-x**2 - y**2), 92 93 00:08:47,630 --> 00:08:57,510 so then my derivative is going to be 2*x*log(3)*r. 93 94 00:08:57,650 --> 00:09:10,680 This is the top part and the bottom part of that fraction is gonna be (r+1)**2 and that's it. 94 95 00:09:10,680 --> 00:09:17,490 That's my partial derivative with respect to x and as we've already discussed this is very, very similar 95 96 00:09:17,820 --> 00:09:19,260 to the partial derivative 96 97 00:09:19,260 --> 00:09:28,530 with respect to y so I'm just going to copy this and change this to y and change this x in my return 97 98 00:09:28,530 --> 00:09:31,590 statement to y as well. 98 99 00:09:32,100 --> 00:09:39,330 Now I said there was one requirement that you have to take into account and this is the log function 99 100 00:09:39,330 --> 00:09:40,730 here. 100 101 00:09:40,770 --> 00:09:43,260 This comes from the math library, 101 102 00:09:43,260 --> 00:09:50,510 the math module and we have to import this functionality in our Python notebook for it to work. 102 103 00:09:51,090 --> 00:09:57,660 Otherwise, when it comes to running fpx or fpy to evaluate one of these functions you're actually gonna 103 104 00:09:57,660 --> 00:09:58,180 get an error. 104 105 00:09:58,200 --> 00:10:06,330 So if I had fpx(1.8, 1.0) and I hit Shift+Enter on this cell, I'm actually 105 106 00:10:06,330 --> 00:10:07,640 going to get an error. 106 107 00:10:07,770 --> 00:10:10,330 And the reason is that log is not defined. 
107 108 00:10:10,410 --> 00:10:17,380 I can't just use the log functionality like I would up here without importing the module first. 108 109 00:10:17,730 --> 00:10:23,540 So let me go back up to the very, very top and then down here, 109 110 00:10:23,550 --> 00:10:28,290 I'm going to say "from math import log", 110 111 00:10:28,920 --> 00:10:35,520 then I'm going to hit Shift+Enter on this and I can scroll back down again, try rerunning this cell 111 112 00:10:35,520 --> 00:10:43,770 here and I can see that my slope, my partial derivative with respect to x, is equal to 112 113 00:10:44,100 --> 00:10:46,410 0.037 113 114 00:10:46,410 --> 00:10:50,900 which is exactly what I am getting here. 114 115 00:10:50,910 --> 00:10:57,510 In this case I'm using SymPy to evaluate my partial derivative 115 116 00:10:57,510 --> 00:11:04,650 and in this case I've already got my partial derivative here as a function and I can just plug in the 116 117 00:11:04,650 --> 00:11:06,240 values. 117 118 00:11:06,240 --> 00:11:12,420 So one question you might ask is: well why did you use log, why is it not ln? And the answer is that 118 119 00:11:12,420 --> 00:11:20,010 if I check the documentation on this, then you see here, if I hit that little plus sign that if the base 119 120 00:11:20,100 --> 00:11:21,400 is not specified, 120 121 00:11:21,420 --> 00:11:27,920 so if there's no second argument being passed to this function it returns the natural logarithm, 121 122 00:11:27,930 --> 00:11:31,350 so ln, base e, of x. 122 123 00:11:31,470 --> 00:11:42,240 So in fact this log(3) is the natural logarithm or ln which is what we've got up here for our partial 123 124 00:11:42,240 --> 00:11:46,740 derivative function in the LaTeX notation. 
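Written out as a notebook cell, the two partial derivative functions described here might look like the sketch below. The names fpx, fpy and r and the test point (1.8, 1.0) come from the video; the docstrings and comments are additions, and the layout of the original cell is an assumption. Note that log(3) is multiplied by r, matching the LaTeX formula.

```python
from math import log  # with one argument, log() is the natural logarithm (ln)


# Partial Derivative Functions example 4
def fpx(x, y):
    """Partial derivative of f(x, y) = 1 / (3**(-x**2 - y**2) + 1) with respect to x."""
    r = 3 ** (-x**2 - y**2)  # shared sub-expression
    return 2 * x * log(3) * r / (r + 1) ** 2


def fpy(x, y):
    """Partial derivative of the same function with respect to y."""
    r = 3 ** (-x**2 - y**2)
    return 2 * y * log(3) * r / (r + 1) ** 2


print(fpx(1.8, 1.0))  # roughly 0.037, the value quoted in the video
```

Leaving out the import line reproduces the error mentioned earlier: Python raises a NameError because log is not defined.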
124 125 00:11:46,770 --> 00:11:52,050 So now we've got that out of the way we can do a bit of a horse race between these two methodologies - 125 126 00:11:52,530 --> 00:12:00,570 we can namely take this cell here and we're gonna copy it, we're going to copy this cell and what we're gonna 126 127 00:12:00,570 --> 00:12:05,090 do is we're gonna paste that cell here. 127 128 00:12:05,310 --> 00:12:09,220 I don't need this one, I'm going to delete that. 128 129 00:12:09,390 --> 00:12:19,170 So I'm going to go to "Edit" > "Delete Cells" and then I'm going to modify this cell. In particular, I'm going to modify 129 130 00:12:19,320 --> 00:12:20,670 these two parts 130 131 00:12:24,550 --> 00:12:30,630 to use my fpx and my fpy functions instead. 131 132 00:12:30,760 --> 00:12:31,400 You know what? 132 133 00:12:31,480 --> 00:12:36,490 Maybe you should give this a quick shot, see if you can figure out what code should go here in order 133 134 00:12:36,490 --> 00:12:43,300 to use fpx and fpy, the partial derivative functions instead of what we had before. 134 135 00:12:43,360 --> 00:12:49,500 I'll let you pause the video and I'll show you how to do it in a second. All good? 135 136 00:12:50,850 --> 00:12:52,130 Here's the solution. 136 137 00:12:52,260 --> 00:12:59,180 You simply call the two functions and supply the x value and the y values. 137 138 00:12:59,400 --> 00:13:03,470 So the x value was under "params 138 139 00:13:03,600 --> 00:13:09,730 [0]", and the y value was under "params[1]". 139 140 00:13:10,530 --> 00:13:11,280 And that's it. 140 141 00:13:11,280 --> 00:13:19,750 That's the gradient or partial derivative with respect to x. 
And for our gradient in the y direction, 141 142 00:13:20,110 --> 00:13:23,150 we do exactly the same thing with one difference - 142 143 00:13:23,170 --> 00:13:25,660 we're gonna use fpy, 143 144 00:13:25,840 --> 00:13:28,360 the partial derivative with respect to y 144 145 00:13:28,560 --> 00:13:35,560 and we're gonna supply the same two inputs, the first value in our array and the second value in our 145 146 00:13:35,650 --> 00:13:36,890 array. 146 147 00:13:36,890 --> 00:13:43,310 Now I'm going to run the cell with 500 iterations and see how well it performs. 147 148 00:13:43,330 --> 00:13:53,890 Ready, steady, go! And it's done. Going back up here to where we're using SymPy, I can change the 148 149 00:13:53,890 --> 00:14:02,620 maximum number of iterations to 500 and rerun the cell. Thinking, thinking, thinking, thinking, thinking, 149 150 00:14:02,680 --> 00:14:05,050 there we go. 150 151 00:14:05,050 --> 00:14:11,320 So you can see that Python is actually doing a lot of extra work in this case. Every single time this 151 152 00:14:11,320 --> 00:14:18,790 loop runs it has to differentiate our cost function and then evaluate that derivative. 152 153 00:14:18,790 --> 00:14:25,540 And these instructions actually are a little bit more resource intensive than if it already knew from 153 154 00:14:25,540 --> 00:14:29,360 the get go what the partial derivative was. 154 155 00:14:29,360 --> 00:14:36,400 And this is one of the drawbacks, and a reason why you might not want to use SymPy in your loop if you're 155 156 00:14:36,400 --> 00:14:41,190 running an optimization algorithm many, many, many times. 156 157 00:14:41,450 --> 00:14:42,980 So I hope you enjoyed that. 157 158 00:14:43,180 --> 00:14:47,650 So I really think this is an interesting exercise because it really shows some of the pros and cons 158 159 00:14:47,650 --> 00:14:51,890 of the different tools that are at our disposal. 
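To make the SymPy-free loop concrete, here is a minimal sketch of the modified cell. Only fpx, fpy, params[0], params[1] and the 500 iterations are taken from the video; the learning rate name multiplier and its value 0.1, the starting point (1.8, 1.0), and the use of a plain Python list instead of the notebook's array are all assumptions made so the example runs on its own.

```python
from math import log


def fpx(x, y):
    r = 3 ** (-x**2 - y**2)
    return 2 * x * log(3) * r / (r + 1) ** 2


def fpy(x, y):
    r = 3 ** (-x**2 - y**2)
    return 2 * y * log(3) * r / (r + 1) ** 2


multiplier = 0.1      # learning rate (assumed value)
max_iter = 500        # number of iterations used in the video
params = [1.8, 1.0]   # initial guess for x and y (assumed)

for _ in range(max_iter):
    # The derivatives are already known, so each pass only evaluates them.
    gradient_x = fpx(params[0], params[1])
    gradient_y = fpy(params[0], params[1])
    params = [params[0] - multiplier * gradient_x,
              params[1] - multiplier * gradient_y]

print('Minimum occurs at x value of:', params[0])
print('Minimum occurs at y value of:', params[1])
```

Because nothing symbolic happens inside the loop, each iteration is just a handful of floating-point operations, which is why this version finishes so much sooner than the one that calls SymPy's differentiation on every pass.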
159 160 00:14:51,970 --> 00:14:59,680 On one hand SymPy can make our life a lot easier when it comes to calculating derivatives and doing 160 161 00:14:59,680 --> 00:15:02,370 symbolic mathematics and much, much more. 161 162 00:15:02,470 --> 00:15:09,460 But on the other hand, it does show that you have to be clever when it comes to running your optimization 162 163 00:15:09,460 --> 00:15:10,270 algorithm. 163 164 00:15:10,270 --> 00:15:17,410 You do have to think a little bit about the resources that you're using in running your optimization 164 165 00:15:18,210 --> 00:15:20,940 and this kind of goes back to thinking like an engineer, 165 166 00:15:20,950 --> 00:15:28,270 thinking about the resources that are at your disposal and how you can write your code and how you can 166 167 00:15:28,270 --> 00:15:35,070 choose your algorithm to make the most use of the resources that you have. OK, 167 168 00:15:35,100 --> 00:15:36,780 so what's left to do? 168 169 00:15:36,810 --> 00:15:39,150 Well, only one thing right? 169 170 00:15:39,150 --> 00:15:46,080 Plotting the gradient descent on our wonderful 3D chart and this is what we're going to do next lesson. 170 171 00:15:46,100 --> 00:15:51,260 So I've still got that coffee pumping through my veins so I hope I'll see you there.