0 1 00:00:00,570 --> 00:00:01,290 All right. 1 2 00:00:01,300 --> 00:00:07,270 So, if you haven't fired up Jupyter notebook in a while then open your Anaconda command line 2 3 00:00:07,300 --> 00:00:17,020 if you're on Windows, or go to the terminal if you're on Mac and then type in the command "jupyter notebook" 3 4 00:00:18,190 --> 00:00:26,980 and hit Enter. Your browser should fire up and drop you off on localhost with your folders. Here, 4 5 00:00:27,280 --> 00:00:33,970 you want to navigate to the MLProjects folder that you've set up earlier and there you want to create 5 6 00:00:34,120 --> 00:00:37,420 a new Python 3 notebook. 6 7 00:00:40,120 --> 00:00:40,740 And, 7 8 00:00:40,780 --> 00:00:42,830 we're gonna click up here where it says Untitled. 8 9 00:00:42,940 --> 00:00:47,640 I'm gonna rename it to "0 3 9 10 00:00:48,060 --> 00:00:48,660 Gradient, 10 11 00:00:48,760 --> 00:00:52,950 (oh man I can't type), gradient descent." 11 12 00:00:53,120 --> 00:00:59,980 We're gonna hit "Rename" and now we're ready to go. For our first cell in our new notebook, 12 13 00:01:00,040 --> 00:01:03,550 we're not gonna be typing any Python code. 13 14 00:01:03,550 --> 00:01:09,130 Instead we're going to be making our notebook a little bit prettier, a little bit more readable by 14 15 00:01:09,160 --> 00:01:11,300 inserting a heading. 15 16 00:01:11,320 --> 00:01:19,510 So what I'm going to do is I'm going to go to "Cell" and then change the "Cell Type" from "Code" to "Markdown". 16 17 00:01:20,050 --> 00:01:28,630 So select "Markdown" here and you'll notice that the "In" here has disappeared because this cell will 17 18 00:01:28,630 --> 00:01:32,190 now no longer be evaluated as code. 18 19 00:01:32,320 --> 00:01:34,770 So we said we're gonna make a title. 19 20 00:01:34,870 --> 00:01:41,570 Our first title is gonna be "Notebook Imports and Packages". 20 21 00:01:41,570 --> 00:01:50,110 And, we hit Shift+Enter. So now we can see here this is just plain text. 
And in the cell below 21 22 00:01:50,320 --> 00:01:52,360 we're gonna be adding our import statements. 22 23 00:01:52,390 --> 00:01:59,440 So the first thing that we're gonna import is our old friend matplotlib pyplot - we're going to be doing 23 24 00:01:59,440 --> 00:02:02,120 some plotting in this notebook. 24 25 00:02:02,120 --> 00:02:13,570 So I'm going to write "import matplotlib.pyplot as plt" - I'm going to stick with the same naming convention 25 26 00:02:13,570 --> 00:02:14,430 here. 26 27 00:02:14,800 --> 00:02:20,750 And we're also going to import another library, another package, called numpy. 27 28 00:02:21,220 --> 00:02:24,900 And this is commonly referred to as "np". 28 29 00:02:24,930 --> 00:02:29,780 So a lot of people use np and we'll use that as well. 29 30 00:02:29,800 --> 00:02:38,980 So I'm going to write "import numpy as np" and we're also going to add the good ole Jupyter notebook statement 30 31 00:02:39,490 --> 00:02:41,050 "% 31 32 00:02:41,050 --> 00:02:41,450 matplotlib 32 33 00:02:41,500 --> 00:02:44,310 inline". 33 34 00:02:44,660 --> 00:02:51,280 So I'm going to be doing some plotting, so we're gonna add this matplotlib inline statement so that we can 34 35 00:02:51,310 --> 00:03:00,820 display our plots very, very nicely. I'm going to hit Shift+Enter now and before we go on I want to go back 35 36 00:03:00,880 --> 00:03:03,640 to this markdown up here. 36 37 00:03:03,640 --> 00:03:10,210 I'd really like to have a section heading, and let me show you how we can get that, because "Notebook Imports 37 38 00:03:10,240 --> 00:03:13,260 and Packages" is very, very small. 38 39 00:03:13,480 --> 00:03:23,080 If we put a hashtag (#) in front of this and hit Space then immediately the font becomes a lot bigger, bolder 39 40 00:03:23,170 --> 00:03:31,540 and blue and this is because this hashtag is telling Jupyter that now "Notebook Imports and Packages" should 40 41 00:03:31,540 --> 00:03:36,000 be considered as a heading. 
It's not actually gonna look like this when I hit Shift+Enter, it's going 41 42 00:03:36,000 --> 00:03:40,100 to look like this - it's gonna be big and bold. 42 43 00:03:40,210 --> 00:03:45,740 So this is a very nice way to create section headings in your notebook. 43 44 00:03:45,760 --> 00:03:53,530 Now if you wanted this a little smaller and one level down in the size and boldness then you can put 44 45 00:03:53,530 --> 00:04:00,690 two hashtags and you can see how the markdown adjusts really nicely and will look like this in contrast. 45 46 00:04:00,760 --> 00:04:01,910 Right. 46 47 00:04:01,930 --> 00:04:03,070 And you can even try three. 47 48 00:04:03,070 --> 00:04:03,360 Right. 48 49 00:04:03,370 --> 00:04:12,190 So there are different levels of boldness and size that you can play with for your section headings 49 50 00:04:12,190 --> 00:04:14,280 to keep your notebook organized. 50 51 00:04:14,380 --> 00:04:20,300 So I'll go with one, and this way we can find our imports very quickly. 51 52 00:04:20,440 --> 00:04:24,320 Now we're gonna dive straight into our first example. 52 53 00:04:24,340 --> 00:04:31,240 Now the thing I'm going to do though is, I'm going to take this cell here and I'm going to modify it to 53 54 00:04:31,240 --> 00:04:34,600 be a markdown cell as well. 54 55 00:04:34,600 --> 00:04:40,690 So this is my chance to share a little bit about what Jupyter notebook can do. So I'm going to say "Example 1", 55 56 00:04:41,770 --> 00:04:53,170 "Example 1" and example 1 is going to be about a very, very, very simple cost function and the cost 56 57 00:04:53,170 --> 00:04:55,810 function is going to look like this. 57 58 00:04:55,810 --> 00:05:07,280 It's going to be f(x) = x^2 + x + 1. 58 59 00:05:07,420 --> 00:05:10,040 Let's see what this markdown would actually look like. 
59 60 00:05:10,300 --> 00:05:17,650 It would look something like this, but the cool thing about markdown is that you can actually make the 60 61 00:05:17,650 --> 00:05:21,850 mathematical notation look a lot better. 61 62 00:05:22,360 --> 00:05:31,070 So you can add a dollar sign in the front and you can add a dollar sign in the back. 62 63 00:05:31,060 --> 00:05:39,730 And if I press Shift+Enter now, then you can see that indeed we have this formatting here in the markdown 63 64 00:05:39,730 --> 00:05:43,010 cell looking like an actual equation. 64 65 00:05:43,060 --> 00:05:45,490 So Example 1 is 65 66 00:05:45,590 --> 00:05:46,630 going to look like this. 66 67 00:05:46,630 --> 00:05:53,380 I can make this bigger as well so I can give it a section heading if you will. 67 68 00:05:53,380 --> 00:06:01,470 And now it's going to look like this so this is gonna be a simple cost function. 68 69 00:06:01,530 --> 00:06:02,640 All right. 69 70 00:06:02,740 --> 00:06:03,780 Here we go. 70 71 00:06:03,830 --> 00:06:04,340 Now, 71 72 00:06:04,780 --> 00:06:10,000 if this is the first time that you're seeing this then you might be confused about what these dollar 72 73 00:06:10,000 --> 00:06:11,350 symbols are, right? 73 74 00:06:11,380 --> 00:06:21,700 Well this is a markup notation from a system called LaTeX and LaTeX uses tags like these dollar 74 75 00:06:21,700 --> 00:06:29,440 signs to mark a particular section as mathematical notation. If I double up on the dollar 75 76 00:06:29,440 --> 00:06:32,820 signs, then I'm giving it a different kind of tag. 76 77 00:06:33,100 --> 00:06:36,210 And you can see that you'll have a different formatting as well. 77 78 00:06:36,210 --> 00:06:44,110 So in this case it's centered, so the single dollar sign is for inline 78 79 00:06:44,230 --> 00:06:48,860 and the other tag, the double dollar sign, is for display. 
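As a small illustration of the two delimiters just described, this is roughly what the text typed into the Markdown cell could look like for the Example 1 cost function. The first line uses single dollar signs (inline), the last line uses double dollar signs (display, centered). The sentence wrapped around the inline formula is only a placeholder, not something typed in the lesson.

```latex
The cost function is $f(x) = x^2 + x + 1$, shown inline with the text.

$$f(x) = x^2 + x + 1$$
```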
79 80 00:06:48,970 --> 00:06:54,140 So we're gonna be using LaTeX a little more in this module and 80 81 00:06:54,400 --> 00:07:00,670 you might actually see this in many, many other places as well - it's super popular for writing 81 82 00:07:00,670 --> 00:07:06,610 mathematical equations or scientific papers especially in academia. 82 83 00:07:06,670 --> 00:07:14,050 The best analogy that I can think of for how LaTeX works is that it works really similarly to XML 83 84 00:07:14,110 --> 00:07:15,270 or HTML. 84 85 00:07:16,000 --> 00:07:22,330 So if you were to go to a website like say example.com and you right click on it and you go to 85 86 00:07:22,750 --> 00:07:28,380 "View Page Source", then you'll see the HTML document. 86 87 00:07:28,510 --> 00:07:31,920 So while the website will look like this - "Example 87 88 00:07:32,170 --> 00:07:39,240 Domain" is actually surrounded by two HTML tags - title and title, 88 89 00:07:39,580 --> 00:07:46,080 so a beginning tag and an end tag. And LaTeX sort of works a bit like this as well. 89 90 00:07:46,090 --> 00:07:53,200 You've got markup, meaning these tags, that kind of gives structure to your document. 90 91 00:07:53,200 --> 00:08:01,060 So it's through these tags that the Jupyter notebook knows how to format a particular section of text 91 92 00:08:01,180 --> 00:08:07,880 in the markdown cells. Now that we've added the markup and our section heading, 92 93 00:08:07,900 --> 00:08:14,440 we can now actually, you know, write the Python code for this function and it would look like this "def 93 94 00:08:14,860 --> 00:08:25,930 f(x):", new line, "return x", double multiplication sign for the power, 94 95 00:08:25,990 --> 00:08:32,650 so "**2 + x + 1". 95 96 00:08:32,710 --> 00:08:34,380 And that's our function. 96 97 00:08:34,470 --> 00:08:35,020 Right? 97 98 00:08:35,040 --> 00:08:38,810 This is our function in Python code. 
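Written out as a code cell, the function dictated above would look something like this. The two print calls are not part of the lesson; they are just a quick check with inputs picked for illustration.

```python
def f(x):
    # Cost function from Example 1: f(x) = x^2 + x + 1
    return x**2 + x + 1


# Quick sanity check with hand-computed values (inputs chosen for illustration).
print(f(0))  # 1
print(f(2))  # 7
```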
98 99 00:08:38,820 --> 00:08:42,150 Now what we're going to do is we're going to generate the data. 99 100 00:08:42,160 --> 00:08:50,760 I'm just going to add a little comment here "Make Data" and the way we're going to do this is by using 100 101 00:08:50,760 --> 00:08:51,680 numpy. 101 102 00:08:51,690 --> 00:08:58,650 So our data is going to sit in a variable, I'm going to call it x_1 - "_1" 102 103 00:08:58,650 --> 00:09:05,850 because this is our first example, so x_1 is gonna be equal to something and it's 103 104 00:09:05,890 --> 00:09:17,990 gonna get its value from a numpy function - numpy was np and the function we're gonna call is 104 105 00:09:18,000 --> 00:09:19,160 "linspace". 105 106 00:09:19,240 --> 00:09:20,150 Yeah. 106 107 00:09:20,460 --> 00:09:33,100 "np.linspace(start=-3, stop=3)" 107 108 00:09:34,090 --> 00:09:45,700 and then, inside the parentheses, "num" is gonna equal, say, 100. I'm going to hit Shift+Enter and explain what I did just there. 108 109 00:09:45,700 --> 00:09:54,250 So the linspace function is something that comes from the numpy library - I'm going to pull up the 109 110 00:09:54,250 --> 00:10:01,120 documentation for you guys and I've got a couple of arguments that I gave it - I gave it a start value, 110 111 00:10:01,180 --> 00:10:09,940 a stop value and a value for this third one called num. Now linspace creates a sequence, a sequence 111 112 00:10:09,940 --> 00:10:15,460 of numbers that creates our data for us and it creates a sequence of numbers between the start value 112 113 00:10:15,580 --> 00:10:24,070 and the stop value and the number of samples is set by that third parameter, so that third value, the num 113 114 00:10:24,080 --> 00:10:29,050 value. 
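Put together, the data-generating cell described above would look roughly like this. The import is repeated so the snippet runs on its own, and the two print calls are an added check, not something from the lesson.

```python
import numpy as np

# Make Data
x_1 = np.linspace(start=-3, stop=3, num=100)

# Added check: 100 samples, running from the start value to the stop value.
print(x_1.shape)        # (100,)
print(x_1[0], x_1[-1])  # -3.0 3.0
```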
Back in our Python notebook we can actually take a look at what this would look like 114 115 00:10:29,450 --> 00:10:36,040 so if I put x_1 here and hit Shift+Enter then we can see what it is that we've actually got 115 116 00:10:36,040 --> 00:10:46,490 back; we've got back an array that starts at -3 and goes to 3 and has a hundred individual 116 117 00:10:46,850 --> 00:10:47,440 data points - 117 118 00:10:47,450 --> 00:10:54,480 one hundred individual values that are spaced out equally between -3 and 3. 118 119 00:10:54,560 --> 00:11:04,250 If this was, I don't know, the number 10 instead of 100 then I'd get an array with far fewer 119 120 00:11:04,280 --> 00:11:05,090 values, right? 120 121 00:11:05,090 --> 00:11:10,590 Just 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 of them. 121 122 00:11:10,640 --> 00:11:10,820 Right? 122 123 00:11:11,270 --> 00:11:16,670 So you can see this linspace function here is very, very handy for generating data. 123 124 00:11:16,890 --> 00:11:24,760 And one of the things I quite like doing is actually adding the names for the arguments 124 125 00:11:25,640 --> 00:11:31,040 when I make a call to my function because I just find it's so much more readable than having it like 125 126 00:11:31,580 --> 00:11:33,890 -3, 3, 10. 126 127 00:11:33,890 --> 00:11:34,300 Right? 127 128 00:11:34,340 --> 00:11:40,400 That's much harder to read, especially when you're coming back to it and you might 128 129 00:11:40,400 --> 00:11:42,720 not remember it so well. 129 130 00:11:42,890 --> 00:11:49,450 So x_1 is an array with exactly 100 values. 130 131 00:11:49,500 --> 00:11:52,410 We can even give it more, we can give it like say 500. 131 132 00:11:52,460 --> 00:11:53,380 Right? 132 133 00:11:53,460 --> 00:11:54,820 I'm going to hit Shift+Enter. 133 134 00:11:54,920 --> 00:12:00,720 And that's us having made our data. Now that we've got our data, 134 135 00:12:00,980 --> 00:12:02,450 let's plot it, right? 
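To make the effect of num and of the named arguments concrete, here is a small sketch. The value 7 is not from the lesson; it is picked here only because it makes the spacing come out to exactly 1. The variable names small, same and dense are likewise just for illustration.

```python
import numpy as np

# Fewer samples: 7 equally spaced values from -3 to 3 (step of exactly 1).
small = np.linspace(start=-3, stop=3, num=7)
print(small.tolist())  # [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]

# Positional call: same result, but harder to read when you come back to it.
same = np.linspace(-3, 3, 7)
print(np.array_equal(small, same))  # True

# More samples simply means a denser grid over the same range.
dense = np.linspace(start=-3, stop=3, num=500)
print(dense.size)  # 500
```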
135 136 00:12:02,690 --> 00:12:13,070 Let's graph it using our function, so we can add the plot by using our matplotlib - plt, so we can actually 136 137 00:12:13,070 --> 00:12:14,660 do this in one line. 137 138 00:12:14,660 --> 00:12:14,930 Right? 138 139 00:12:14,940 --> 00:12:23,910 We can say "plt.plot(x_1," and then what, 139 140 00:12:24,490 --> 00:12:34,840 "f(x_1))" - we're going to feed the actual data that we have in x_1 into our 140 141 00:12:35,200 --> 00:12:41,080 function here and we are going to plot f(x_1) - this is our y, right? 141 142 00:12:41,140 --> 00:12:44,180 And x_1 - this is our x. 142 143 00:12:44,470 --> 00:12:48,790 So let's call plt.show(). 143 144 00:12:49,420 --> 00:12:50,350 See what happens. 144 145 00:12:50,710 --> 00:12:51,190 And here it is. 145 146 00:12:51,190 --> 00:12:59,740 This is what our function actually looks like - it's a parabola. I'm going to actually center this a little bit as well, 146 147 00:12:59,740 --> 00:13:10,390 so I'm going to set the axes by writing "plt.xlim()", and I'm going to give it a range of 147 148 00:13:10,390 --> 00:13:11,130 -3 148 149 00:13:11,440 --> 00:13:14,230 uh, to, I don't know, 3. 149 150 00:13:14,320 --> 00:13:15,110 Right. 150 151 00:13:15,160 --> 00:13:23,680 This is what we generated our data from, right - from -3 to 3 and the y axis, 151 152 00:13:23,680 --> 00:13:33,940 we can set with "plt.ylim(0, 8)". 152 153 00:13:34,090 --> 00:13:37,980 See what we get. 153 154 00:13:38,030 --> 00:13:41,170 This is starting to look a little bit better. 154 155 00:13:41,240 --> 00:13:54,370 I'm going to add two labels as well with "plt.xlabel()" and then the string 'X', I'm going to give it a font size of 155 156 00:13:55,000 --> 00:13:57,850 16. And for the y axis, 156 157 00:13:57,850 --> 00:14:05,320 we're gonna just write f(x), so "plt.ylabel()" and then I'm going to give it the string 157 158 00:14:06,070 --> 00:14:16,270 "f(x)", font size is also equal to 16. 
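Collected into one cell, the plotting steps above could look like the sketch below. The imports, f and x_1 are repeated so it runs on its own. The order of the calls is a choice made here, and passing the limits as two separate numbers to plt.xlim and plt.ylim is assumed as one way of giving the ranges mentioned in the lesson.

```python
import matplotlib.pyplot as plt
import numpy as np

# In a Jupyter notebook, the %matplotlib inline magic would be run in an earlier cell.


def f(x):
    # Cost function from Example 1: f(x) = x^2 + x + 1
    return x**2 + x + 1


# Make Data
x_1 = np.linspace(start=-3, stop=3, num=100)

# Axis ranges and labels as given in the lesson.
plt.xlim(-3, 3)
plt.ylim(0, 8)
plt.xlabel('X', fontsize=16)
plt.ylabel('f(x)', fontsize=16)

# x_1 is the x data, f(x_1) is the y data.
plt.plot(x_1, f(x_1))
plt.show()
```

Because f is written with plain arithmetic operators, passing the whole array x_1 evaluates the function at every sample at once, which is why no loop is needed.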
158 159 00:14:16,270 --> 00:14:17,290 There we go. 159 160 00:14:19,000 --> 00:14:22,110 All right so that's the plot of our function. 160 161 00:14:22,240 --> 00:14:23,020 That's pretty good, right? 161 162 00:14:23,050 --> 00:14:25,120 So we've got a cost function - 162 163 00:14:25,120 --> 00:14:29,600 an example of one: x^2 + x + 1. 163 164 00:14:29,710 --> 00:14:36,980 And the way we've kind of broken this down is we've created a function, a Python function, right? 164 165 00:14:37,120 --> 00:14:39,550 We also called it f(x) to make it less confusing. 165 166 00:14:39,580 --> 00:14:46,480 We've created some data, and the reason we had to do this was so that we could generate a nice graph, 166 167 00:14:46,620 --> 00:14:47,320 right? 167 168 00:14:47,410 --> 00:14:56,830 So x_1 used numpy's linspace and then we've just used matplotlib again to graph our 168 169 00:14:57,040 --> 00:14:59,830 data here. In the next lesson, 169 170 00:15:00,070 --> 00:15:06,420 we're gonna set the stage for minimizing our cost. If f(x) is our cost, 170 171 00:15:06,880 --> 00:15:11,090 then the lowest cost will be at the bottom of this graph. 171 172 00:15:11,110 --> 00:15:11,410 Right? 172 173 00:15:11,410 --> 00:15:15,510 So it'll be somewhere around here. 173 174 00:15:15,580 --> 00:15:22,900 All we have to do now is find out what the lowest cost is and for what value of x our cost is 174 175 00:15:22,900 --> 00:15:23,470 the lowest.
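As a rough preview of "somewhere around here", one can simply read the lowest point off the sampled data. This is only a quick look at the 100 values we generated, using numpy's argmin; it is not the method the next lesson builds up to, and the names costs and i are introduced here just for illustration.

```python
import numpy as np


def f(x):
    # Cost function from Example 1: f(x) = x^2 + x + 1
    return x**2 + x + 1


x_1 = np.linspace(start=-3, stop=3, num=100)
costs = f(x_1)

# Index of the smallest cost among the sampled points.
i = np.argmin(costs)
print(x_1[i], costs[i])
```

The answer is only as precise as the spacing of the samples, so it lands near the bottom of the parabola rather than exactly on it.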