0 1 00:00:01,300 --> 00:00:07,750 In this lesson, we're going to apply all our Python programming knowledge to run a regression and analyze 1 2 00:00:07,750 --> 00:00:16,050 the data from a 1968 American study on how drugs affect math test scores. 2 3 00:00:16,180 --> 00:00:22,600 So we're going to be writing some Python code to visualize and plot our data and then we're going to 3 4 00:00:22,600 --> 00:00:31,810 run a linear regression to look at how drug concentration in the tissue affects test performance. 4 5 00:00:32,780 --> 00:00:37,820 But before we do that, let's chat a little bit about the dataset that we're going to be using for this 5 6 00:00:37,820 --> 00:00:39,140 exercise. 6 7 00:00:39,170 --> 00:00:44,630 I was actually genuinely surprised when I found out that research like this even exists. 7 8 00:00:44,630 --> 00:00:52,370 But then again, maybe the late 1960s were a different time in the United States. In the original research 8 9 00:00:52,370 --> 00:00:59,510 paper, three academics by the names of Wagner, Aghajanian and Bing got five male volunteers 9 10 00:00:59,780 --> 00:01:04,600 to sit math tests after being injected with drugs. 10 11 00:01:04,670 --> 00:01:11,480 Now I reckon the fact that these drugs were administered intravenously speaks way more for the commitment 11 12 00:01:11,480 --> 00:01:18,760 of these five blokes than sitting math tests voluntarily. Anyhow, the specific drug in question is 12 13 00:01:18,760 --> 00:01:21,390 called, and I hope I pronounce this right - 13 14 00:01:21,430 --> 00:01:32,220 lysergic acid diethylamide, or LSD-25, which Google tells me is a pretty potent hallucinogen. And 14 15 00:01:32,230 --> 00:01:37,960 the way that this experiment worked is that these five blokes had to give seven blood samples after 15 16 00:01:37,960 --> 00:01:40,300 being injected with LSD. 
16 17 00:01:40,810 --> 00:01:46,690 The amount of the drug that each dude was injected with was actually proportional to their body weight so that 17 18 00:01:46,720 --> 00:01:51,370 each volunteer should be affected more or less equally. 18 19 00:01:51,520 --> 00:01:59,140 The blood samples were taken from the volunteers after five minutes, half an hour, two hours, four hours 19 20 00:01:59,320 --> 00:02:02,400 and again after eight hours. 20 21 00:02:02,410 --> 00:02:06,130 Now if you remember, this is what you see in the first column of our data 21 22 00:02:06,130 --> 00:02:16,530 dataframe. The second column, labelled LSD_ppm, is the tissue concentration of the drug measured 22 23 00:02:16,620 --> 00:02:21,610 in parts per million. And after each blood sample was taken, 23 24 00:02:21,750 --> 00:02:28,050 the volunteers were sat down for a total of three minutes to solve as many arithmetic problems 24 25 00:02:28,170 --> 00:02:29,620 as possible. 25 26 00:02:29,970 --> 00:02:33,950 And this brings me to our third column in the data frame. 26 27 00:02:34,050 --> 00:02:41,520 I've labeled this Avg_Math_Test_Score and the way these numbers 27 28 00:02:41,520 --> 00:02:49,020 are calculated is that, first off, the number shown in the column is actually the average across the five 28 29 00:02:49,020 --> 00:02:52,140 guys in the experiment. 29 30 00:02:52,140 --> 00:02:58,770 The second thing to note about these numbers is that the score is expressed as a percentage value and 30 31 00:02:58,770 --> 00:03:02,010 it's not the number of questions that they got right. 31 32 00:03:02,190 --> 00:03:07,110 Instead, it's a percentage of the control value. 32 33 00:03:07,350 --> 00:03:13,830 In other words, the researchers knew how well people fared on average on their arithmetic tests and they 33 34 00:03:13,830 --> 00:03:20,910 compared the performance of the volunteers against that, against the control test scores. 
34 35 00:03:20,910 --> 00:03:27,210 Another way to think about this is that if there were 10 questions and the control group got around 35 36 00:03:27,300 --> 00:03:34,230 nine of them right, then the volunteers got about three or 32.9 percent of the questions 36 37 00:03:34,230 --> 00:03:38,800 right, 240 minutes or four hours into the experiment. 37 38 00:03:39,970 --> 00:03:47,100 Okay, so we've imported pandas in our notebook and then we called the read_csv function to store our 38 39 00:03:47,100 --> 00:03:51,480 data in a data frame called data. 39 40 00:03:51,740 --> 00:03:55,290 Now we've been messing around with our data quite a bit. 40 41 00:03:55,290 --> 00:04:02,130 So if you're curious what your data frame looks like, at the bottom of your notebook then type data and 41 42 00:04:02,130 --> 00:04:03,960 press Shift+Enter. 42 43 00:04:04,230 --> 00:04:10,620 If you just started this lesson and have not run the cells above then the Jupyter notebook will not 43 44 00:04:10,620 --> 00:04:13,520 recognize this variable. In this case, 44 45 00:04:13,560 --> 00:04:18,720 you'll get a name error which means you have to go to "Cell" and go to "Run All". 45 46 00:04:24,040 --> 00:04:28,390 So that's a good thing to check if you're coming back to the Python intro 46 47 00:04:28,390 --> 00:04:33,420 after taking a break. Here we can see what our data looks like, right? 47 48 00:04:33,490 --> 00:04:41,740 We've got three columns and we're going to extract each column and put it into a separate variable. 48 49 00:04:41,740 --> 00:04:45,570 The first column I'm going to extract is gonna be called time. 49 50 00:04:45,610 --> 00:04:53,080 And I'm gonna set it equal to data[] and then when I pass in the name, I'm going to say 50 51 00:04:53,370 --> 00:05:02,280 Time_Delay_in_Minutes, and hit Shift+Enter to see what it looks like. 51 52 00:05:02,650 --> 00:05:06,870 If I want to know what it looks like I can simply say print(time). 
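For anyone following along without the lesson's CSV file, the column lookup just described can be sketched like this. The numbers typed in below are stand-ins for the file's contents and should be treated as illustrative; only the three column names come from the lesson.

```python
import pandas as pd

# Stand-in for the read_csv call: illustrative values typed in by hand,
# not loaded from the lesson's file.
data = pd.DataFrame({
    'Time_Delay_in_Minutes': [5, 15, 30, 60, 120, 240, 480],
    'LSD_ppm': [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41],
    'Avg_Math_Test_Score': [78.93, 58.20, 67.47, 37.47, 45.65, 32.92, 29.97],
})

# Square brackets with a column name return that column as a pandas Series.
time = data['Time_Delay_in_Minutes']
print(type(time).__name__)
print(time)
```

If the name inside the brackets does not match a column exactly, pandas raises a KeyError, so printing the result is a quick way to catch typos.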
52 53 00:05:07,210 --> 00:05:15,360 And here it is - our variable called time now holds a Series with the values of the first column. 53 54 00:05:15,580 --> 00:05:18,020 Let's do the same thing with the other two columns. 54 55 00:05:18,040 --> 00:05:26,120 So I'm gonna create a variable called LSD, set it equal to data[] and the name was 55 56 00:05:26,200 --> 00:05:33,820 LSD_ppm. And the third variable I'm going to create, I'm gonna call score, and I'm going to set it equal 56 57 00:05:33,820 --> 00:05:39,660 to data[] and then 57 58 00:05:45,070 --> 00:05:53,080 Avg_Math_Test_Score. I'm going to hit Shift+Enter to make sure I haven't made any typos and this gives me confidence 58 59 00:05:53,350 --> 00:05:56,440 that my Python code is working. 59 60 00:05:56,440 --> 00:06:01,120 Now we're ready to start looking at our data in a graphical way. 60 61 00:06:01,210 --> 00:06:10,360 If you recall from previous lessons, we've imported our plotting module as plt so we can refer to 61 62 00:06:10,480 --> 00:06:16,390 our pyplot as plt and visualize the data in these variables. 62 63 00:06:16,390 --> 00:06:23,560 So the easiest way to do this is to type plt.plot and supply the two things that we want to 63 64 00:06:23,560 --> 00:06:24,620 see. 64 65 00:06:24,640 --> 00:06:34,140 Let's say we want to look at the time vs. the amount of LSD in people's tissues. Hitting Shift+Enter doesn't 65 66 00:06:34,140 --> 00:06:35,660 show me anything. 66 67 00:06:35,660 --> 00:06:42,530 And that's because we need to give our pyplot the instruction to show us the graph, so I'm going to write 67 68 00:06:42,550 --> 00:06:44,460 plt.show() 68 69 00:06:47,880 --> 00:06:56,770 and this gives me a graph that looks like this. What this plot is showing us is the LSD 69 70 00:06:56,770 --> 00:07:00,180 tissue concentration over time. 70 71 00:07:00,220 --> 00:07:08,230 So we've got seven data points and we're simply plotting a line chart. 
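The two steps just described, plotting and then showing, can be sketched as below. The lists are illustrative stand-ins for the time and LSD columns, and the variable name lines is only there so the result can be inspected.

```python
import matplotlib.pyplot as plt

# Illustrative stand-ins for the two columns pulled out of the data frame.
time = [5, 15, 30, 60, 120, 240, 480]
lsd = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]

# The first argument goes on the x-axis, the second on the y-axis.
lines = plt.plot(time, lsd)

# Ask pyplot to actually render the figure.
plt.show()
```

plt.plot hands back a list with one line object per plotted series, which is why a bare call can print some text in a notebook even when no chart appears.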
For the x-axis on this chart we 71 72 00:07:08,230 --> 00:07:15,140 have supplied our time and for the y-axis we have supplied our LSD variable. 72 73 00:07:15,160 --> 00:07:23,570 Now, check this out - at the moment our plot call takes two arguments, right, time and LSD. 73 74 00:07:23,650 --> 00:07:30,760 However, we're not limited to just providing these two. We can actually provide more arguments, including 74 75 00:07:31,240 --> 00:07:34,030 an argument for the color of the graph. 75 76 00:07:34,450 --> 00:07:36,940 So I'm going to put a comma here and write 76 77 00:07:36,940 --> 00:07:47,470 "color=" and then a lowercase 'g'. If I hit Shift+Enter now, we see the line color change to 77 78 00:07:47,470 --> 00:07:48,880 green. 78 79 00:07:48,880 --> 00:07:57,400 Now, a reasonable question to ask is "How did you know that you can supply a color argument with a keyword 79 80 00:07:57,640 --> 00:07:59,450 just like this?" 80 81 00:07:59,620 --> 00:08:06,010 And the answer is - I didn't know until I looked at the official documentation that's available for 81 82 00:08:06,040 --> 00:08:06,830 pyplot. 82 83 00:08:07,300 --> 00:08:14,380 Being able to read and interpret the official documentation that's available for each of these components 83 84 00:08:14,710 --> 00:08:20,910 is one of the key skills for getting good at Python programming. So check this out, 84 85 00:08:20,930 --> 00:08:27,710 we're gonna head down to the part of the documentation that gives us information on the plotting method. 85 86 00:08:27,710 --> 00:08:32,780 This is where we can find out how it's meant to be called. 86 87 00:08:32,780 --> 00:08:37,870 We've got the name of our method up here, namely plot, and inside the parentheses, 87 88 00:08:37,910 --> 00:08:40,020 we can see it takes some arguments. 88 89 00:08:40,280 --> 00:08:45,980 Now the documentation provides some examples as well as information about what these arguments could 89 90 00:08:45,980 --> 00:08:47,150 be. 
90 91 00:08:47,150 --> 00:08:54,680 The first thing that we did was that we plotted time vs. drug concentration in tissue, so we just had 91 92 00:08:54,800 --> 00:08:58,670 two arguments. And in the documentation, 92 93 00:08:58,670 --> 00:09:04,420 we can see that this is the way to create a plot using the default styles and colors. 93 94 00:09:05,240 --> 00:09:13,640 However, in addition to these standard x and y arguments we can supply some additional arguments. 94 95 00:09:13,640 --> 00:09:24,080 This is where we come across this term called kwargs. kwargs is an abbreviation for keyword arguments. 95 96 00:09:24,550 --> 00:09:30,540 And an example of a keyword argument is our color. Scrolling down, 96 97 00:09:30,610 --> 00:09:38,590 we can also see that 'g' is a supported color abbreviation. And scrolling down even further, we can see 97 98 00:09:38,650 --> 00:09:47,880 a handy list of the keyword arguments that we can use on our plot method. So right here, we can see that 98 99 00:09:47,880 --> 00:09:52,200 color is one of these keyword arguments, but there are others too. 99 100 00:09:52,230 --> 00:09:57,060 For example, linestyle or linewidth. 100 101 00:09:57,060 --> 00:10:00,630 Let's add the linewidth keyword argument to our Python code, 101 102 00:10:00,630 --> 00:10:02,960 now. Here's how we can do it. 102 103 00:10:03,030 --> 00:10:06,910 We'll just add a comma and then write 103 104 00:10:06,960 --> 00:10:09,360 linewidth = 104 105 00:10:09,360 --> 00:10:10,620 I don't know, say 10 105 106 00:10:13,290 --> 00:10:17,360 Hitting Shift+Enter updates the chart. And, 106 107 00:10:17,370 --> 00:10:22,690 voila! We can see that now it has a very, very thick green line in it. 107 108 00:10:22,770 --> 00:10:24,980 Ten is probably a little bit too thick for my taste, 108 109 00:10:24,990 --> 00:10:27,170 I'm going to go with three. 109 110 00:10:27,180 --> 00:10:27,960 There you go. 110 111 00:10:27,960 --> 00:10:30,200 This is not so bad. 
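Here is a small sketch of the plot call with both keyword arguments in place. The data lists are illustrative stand-ins; color and linewidth are the keyword names discussed above, and the single-letter abbreviation is written in lowercase.

```python
import matplotlib.pyplot as plt

# Illustrative stand-ins for the time and LSD columns.
time = [5, 15, 30, 60, 120, 240, 480]
lsd = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]

# Positional arguments come first (x, then y); keyword arguments follow.
lines = plt.plot(time, lsd, color='g', linewidth=3)
plt.show()
```

Keyword arguments can be given in any order after the positional ones, so linewidth=3, color='g' would draw the same line.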
111 112 00:10:30,240 --> 00:10:37,260 The other thing to think about with visualization and looking at data is that design is actually quite 112 113 00:10:37,260 --> 00:10:45,900 important in data visualization because, if you think about it, the end goal is showing our data to people 113 114 00:10:46,560 --> 00:10:53,730 and that's why we need to make it look clear and we also need to make it look pretty because typically, 114 115 00:10:54,000 --> 00:10:59,610 you and I are gonna be in a position where we're going to have to impress both our bosses and our customers. 115 116 00:11:00,210 --> 00:11:06,300 Everybody judges a book by its cover I'm afraid, so our chart is going to have to look a lot better 116 117 00:11:06,510 --> 00:11:08,270 than it does right now. 117 118 00:11:09,210 --> 00:11:13,160 The first thing to note is that we're going to add text to our chart. 118 119 00:11:13,170 --> 00:11:20,340 We have to add labels to these axes and a title to the chart, so that people know what it is they're 119 120 00:11:20,340 --> 00:11:22,800 actually looking at. 120 121 00:11:22,800 --> 00:11:26,540 The second thing is that we don't actually have to stick to the default colors. 121 122 00:11:26,550 --> 00:11:29,800 We don't have to stick to a generic green or red. 122 123 00:11:29,820 --> 00:11:37,090 We can add a custom color and we do that by supplying a color's hex code. 123 124 00:11:37,440 --> 00:11:42,500 So here's a mini crash course in design. A hex code looks something like this: 124 125 00:11:42,500 --> 00:11:50,100 it has a little hash sign and then it has six characters, digits or the letters a to f, that come after it. 125 126 00:11:51,520 --> 00:11:57,240 In other words, a hex code is just like a particular ID for a color. 
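To make the idea of a hex code as a color ID a bit more concrete, here is a small sketch that splits one into its red, green and blue parts. The sample value '#e74c3c' is an assumption chosen for illustration (a red of the kind found on curated palette sites), and the variable names are made up for this example.

```python
# A hex color code: a hash sign followed by six hexadecimal characters.
hex_code = '#e74c3c'  # assumed sample value, used only for illustration

digits = hex_code.lstrip('#')

# Each pair of characters is one channel, written in base 16 (00 to ff).
red = int(digits[0:2], 16)
green = int(digits[2:4], 16)
blue = int(digits[4:6], 16)

print(red, green, blue)  # 231 76 60
```

Each channel runs from 0 to 255, so this particular code describes a color that is mostly red with a little green and blue mixed in.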
126 127 00:11:57,240 --> 00:12:02,700 Now what you'll want to do is go to a website, like say Flat UI Colors, to grab the 127 128 00:12:02,700 --> 00:12:09,900 hex codes of colors that have already been curated for you, because, quite frankly, these colors look really, 128 129 00:12:09,900 --> 00:12:11,060 really good. 129 130 00:12:11,190 --> 00:12:13,350 So let's grab the hex code for this 130 131 00:12:13,370 --> 00:12:20,940 red, this alizarin here and then, when we go back to our plot, we can supply this hex code instead of 131 132 00:12:20,940 --> 00:12:24,670 this 'g'. Hitting Shift+Enter, 132 133 00:12:24,720 --> 00:12:29,800 I can see that now my line has been updated to this beautiful red. 133 134 00:12:29,800 --> 00:12:35,430 Another great resource to be aware of for finding beautiful colors, by the way, is a website called Material 134 135 00:12:35,430 --> 00:12:36,950 Palette. 135 136 00:12:36,960 --> 00:12:41,000 If you want to include more than one color in your chart, you can click on two of these, 136 137 00:12:43,150 --> 00:12:48,920 and you'll get an eight-color palette that you can use in your designs. These kinds of tricks are 137 138 00:12:48,920 --> 00:12:53,300 really handy for making something look beautiful really, really quickly. 138 139 00:12:53,300 --> 00:12:56,340 So let's add some text to our graph. 139 140 00:12:56,360 --> 00:13:06,410 First off, I'm going to give this graph a title, so I'm going to say 140 141 00:13:08,540 --> 00:13:09,920 plt.title('Tissue concentration of LSD over time') 141 142 00:13:15,890 --> 00:13:25,220 and then I'll hit Shift+Enter. Excellent! 142 143 00:13:25,330 --> 00:13:29,920 But I do think that this title is a little bit small. 143 144 00:13:29,980 --> 00:13:35,100 So again, we're gonna use a keyword argument to set the font size here. 144 145 00:13:35,230 --> 00:13:39,980 So the argument's name is fontsize, 145 146 00:13:40,090 --> 00:13:46,550 no surprises there. 
And I'm going to set the font size to 17. Hitting Shift+Enter updates 146 147 00:13:46,600 --> 00:13:48,090 my chart. 147 148 00:13:48,250 --> 00:13:50,200 Now let's put some labels on these axes. 148 149 00:13:52,950 --> 00:14:06,780 For the x-axis, we will write plt.xlabel('Time in Minutes') and we'll set 149 150 00:14:06,780 --> 00:14:16,380 the font size for that equal to, say 14. Hitting Shift+Enter, we can see our change right away. 150 151 00:14:17,570 --> 00:14:25,120 So now our x-axis has a nice looking label. We're going to do the same thing for our y-axis. 151 152 00:14:25,150 --> 00:14:31,950 And notice how every time we write plt, which is our object, put a dot after it and then we're calling 152 153 00:14:32,760 --> 00:14:35,160 a method on our object. 153 154 00:14:37,890 --> 00:14:43,550 So to put that y label on there, we're going to say 154 155 00:14:46,390 --> 00:14:48,780 plt.ylabel('Tissue LSD ppm'), 155 156 00:14:51,730 --> 00:14:57,320 font size equals 14. 156 157 00:14:57,320 --> 00:14:57,830 There you go. 157 158 00:15:00,300 --> 00:15:04,340 Now, these aren't the only ways of adding text to our graph by the way. 158 159 00:15:04,440 --> 00:15:11,640 We can also add text in freeform, more or less. So if you want to add a little footer to the graph, we 159 160 00:15:11,640 --> 00:15:16,750 can do that and have it display where our data came from. 160 161 00:15:16,860 --> 00:15:23,450 So, let's take a look at the documentation for adding arbitrary text to our graph. 161 162 00:15:24,480 --> 00:15:30,930 Here we can see that our text method has multiple arguments that it takes - some of these are optional 162 163 00:15:31,410 --> 00:15:35,030 and some of these are required. 163 164 00:15:35,100 --> 00:15:40,660 The first thing you'll notice is that we have to supply an x value and a y value. 164 165 00:15:40,710 --> 00:15:46,980 These are the coordinates of where the text should be in the picture. 
And the second thing that you'll 165 166 00:15:46,980 --> 00:15:50,520 notice is that the s stands for string. 166 167 00:15:50,700 --> 00:15:53,940 And this is just the piece of text that should be displayed. 167 168 00:15:53,940 --> 00:16:00,840 So we have to supply coordinates for the text as well as what the text should be. 168 169 00:16:00,840 --> 00:16:02,070 Let's try this out. 169 170 00:16:02,970 --> 00:16:05,640 So coming back here we can say 170 171 00:16:05,730 --> 00:16:07,830 plt.text() 171 172 00:16:07,830 --> 00:16:15,980 Now, I don't know exactly where the text should go so I'm going to say x is gonna be equal to zero, 172 173 00:16:16,230 --> 00:16:26,340 y is gonna be equal to zero and our string, our text itself, is gonna read "Wagner et al., 173 174 00:16:26,520 --> 00:16:29,060 1968". 174 175 00:16:31,980 --> 00:16:33,500 Maybe some parentheses here as well. 175 176 00:16:34,950 --> 00:16:42,360 Let's press Shift+Enter now and see where these 0,0 coordinates actually are. Scrolling down, 176 177 00:16:42,420 --> 00:16:45,690 I can see that it places the text right here 177 178 00:16:45,690 --> 00:16:55,400 if x is equal to zero and y is equal to zero. Let's see what happens when y is equal to, say 5. When y 178 179 00:16:55,400 --> 00:17:05,690 is equal to 5, it moves our text up here, and when y is equal to, say, negative 1, I can see that it 179 180 00:17:05,690 --> 00:17:08,470 moves our text down here. 180 181 00:17:08,480 --> 00:17:12,150 So this is pretty good, we know where the origin is. 181 182 00:17:12,180 --> 00:17:19,080 We know that a positive number moves the text up and a negative number on the y value moves the text down. 182 183 00:17:19,160 --> 00:17:29,010 I reckon that a good amount for this y value might be something like 0.5. Let's try that. This arranges 183 184 00:17:29,010 --> 00:17:31,050 it very nicely. 
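Putting the text-related calls from this part of the lesson side by side, a sketch might look like the following. The data lists and the hex color are illustrative stand-ins, and the names title and footer exist only so the results can be inspected afterwards.

```python
import matplotlib.pyplot as plt

# Illustrative stand-ins for the time and LSD columns.
time = [5, 15, 30, 60, 120, 240, 480]
lsd = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]

plt.plot(time, lsd, color='#e74c3c', linewidth=3)  # assumed sample hex color

# Title and axis labels, each with its own font size.
title = plt.title('Tissue concentration of LSD over time', fontsize=17)
plt.xlabel('Time in Minutes', fontsize=14)
plt.ylabel('Tissue LSD ppm', fontsize=14)

# Free-form text: x and y are positions on the data axes, s is the string.
footer = plt.text(x=0, y=0.5, s='Wagner et al. (1968)')

plt.show()
```

Because x and y here are measured in the same units as the data, a y of 0.5 sits below the lowest data point, which is why it works as a footer position for this chart.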
184 185 00:17:31,440 --> 00:17:38,270 Now, at the moment our font size for this piece of text is the default font size. 185 186 00:17:39,030 --> 00:17:47,590 We can set our own value again by providing the fontsize argument and, say, setting it to 12. There 186 187 00:17:47,600 --> 00:17:52,320 we go. Now, since we're setting the font size on a lot of things, 187 188 00:17:52,440 --> 00:17:59,100 we can also specify the font size of these numbers that are on the axes. 188 189 00:17:59,130 --> 00:18:04,760 Say we wanted the numbers on the axes to be the same font size as the label here. 189 190 00:18:04,860 --> 00:18:16,530 So I could write something like plt.xticks(fontsize=14) and 190 191 00:18:16,620 --> 00:18:23,030 plt.yticks(fontsize=14) 191 192 00:18:23,310 --> 00:18:32,090 If I hit Shift+Enter now, we can see that the numbers now have the same size as our labels. Another 192 193 00:18:32,100 --> 00:18:40,370 thing that we can do on these axes is to set hard values for the range that they should cover. 193 194 00:18:40,380 --> 00:18:47,630 So for example, we know that our time delay in minutes is going to be between 0 and 500. 194 195 00:18:47,850 --> 00:18:54,450 And we also know that the parts per million are going to be between 1 and say 6.5 or 7. 195 196 00:18:59,740 --> 00:19:01,180 To set a hard range 196 197 00:19:01,210 --> 00:19:10,240 we can write something like plt.ylim(), lim for limit, and provide two values - a lower bound which is, 197 198 00:19:10,240 --> 00:19:16,180 say, 1 and an upper bound which is, say, I don't know, 7. Hitting Shift+Enter, 198 199 00:19:16,560 --> 00:19:23,110 we can now see that our chart has updated, so it's no longer determining automatically what the range 199 200 00:19:23,110 --> 00:19:24,390 of values should be 200 201 00:19:24,430 --> 00:19:26,820 that should be displayed on the y axis. 
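As a rough sketch of the tick and range settings just described, with illustrative stand-in data and a made-up variable y_range for checking the result:

```python
import matplotlib.pyplot as plt

# Illustrative stand-ins for the time and LSD columns.
time = [5, 15, 30, 60, 120, 240, 480]
lsd = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]

plt.plot(time, lsd, linewidth=3)

# Font size of the numbers along each axis.
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

# Fix the y-axis range instead of letting matplotlib choose it.
plt.ylim(1, 7)

# Called with no arguments, ylim reports the current range.
y_range = plt.ylim()
print(y_range)

plt.show()
```

Setting a hard range is a trade-off: the chart stops rescaling itself, so any value outside 1 to 7 would be cut off rather than shown.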
201 202 00:19:26,920 --> 00:19:33,130 We have now specified our own range that we'd like to show, namely we'd like to have the chart show 202 203 00:19:33,130 --> 00:19:41,230 the values between 1 and 7. We can do the same thing for the x axis by writing plt.xlim() and 203 204 00:19:41,230 --> 00:19:47,020 providing a lower bound of 0 and an upper bound of 500. 204 205 00:19:47,030 --> 00:19:52,820 The reason I'm trying to show all this is to get this idea across that we're kind of breaking up our 205 206 00:19:52,820 --> 00:20:01,660 Python code into two sections, if you will, for the chart. We're doing the styling and then down here we're 206 207 00:20:01,670 --> 00:20:03,160 actually plotting the data. 207 208 00:20:03,220 --> 00:20:09,410 And what's interesting to note about this is that the amount of code that's written to style the chart 208 209 00:20:10,010 --> 00:20:13,020 is actually getting longer and longer and longer. 209 210 00:20:13,040 --> 00:20:20,450 This is one of the reasons why matplotlib comes with something called built-in styles. These built-in 210 211 00:20:20,450 --> 00:20:26,450 styles can act as a shortcut or a template to help you reduce the amount of code that you need to 211 212 00:20:26,450 --> 00:20:28,490 write to make something look pretty. 212 213 00:20:28,880 --> 00:20:37,100 In our previous lessons, we've used the 'fivethirtyeight' style for our chart, but there are actually many other styles 213 214 00:20:37,190 --> 00:20:42,500 that you can choose from. One of the handy things to look at is actually the style gallery. 
214 215 00:20:42,560 --> 00:20:45,120 And here you can see for example the name of the style; 215 216 00:20:45,140 --> 00:20:49,980 so the first one it's called 'bmh' and the second one's called 'classic', 216 217 00:20:50,240 --> 00:20:56,750 and the third one is called 'dark_background' and you can scroll down to see what these 217 218 00:20:56,750 --> 00:21:01,370 different styles translate into if you're actually going to plot a chart. 218 219 00:21:01,370 --> 00:21:05,500 Now mind you this is actually an unofficial style guide that I found very, very helpful. 219 220 00:21:05,510 --> 00:21:12,890 The official one looks something like this. Now I find both of these really, really helpful in deciding 220 221 00:21:13,160 --> 00:21:19,790 what kind of template I should go with for my charts. To apply any of these particular styles to your 221 222 00:21:19,790 --> 00:21:20,730 chart, 222 223 00:21:20,750 --> 00:21:24,200 all you have to do is write plt.style.use 223 224 00:21:24,260 --> 00:21:30,060 and then provide the name of the particular template. 224 225 00:21:30,080 --> 00:21:33,770 So say we use 'classic'. If I hit Shift+Enter, 225 226 00:21:34,310 --> 00:21:36,410 I can see my chart update. 226 227 00:21:36,530 --> 00:21:40,470 This is what it would look like using the 'classic' style. 227 228 00:21:40,550 --> 00:21:42,190 Of course you can put anything you want in here. 228 229 00:21:42,470 --> 00:21:49,850 So if you wanted to use say 'dark_background' which is the name of another style then we'll 229 230 00:21:49,850 --> 00:21:50,990 get this. 230 231 00:21:55,670 --> 00:22:01,010 I think one of the other names that I saw on the gallery was 'ggplot'. 231 232 00:22:01,010 --> 00:22:03,610 So let me just hit Shift+Enter here. 232 233 00:22:03,610 --> 00:22:09,190 Now one thing you'll notice is that maybe the style won't be applied right away. 
233 234 00:22:09,190 --> 00:22:14,740 So if you see that happen to you, just click into the cell and hit Shift+Enter again. 234 235 00:22:15,100 --> 00:22:17,480 Then you should see the style applied. 235 236 00:22:17,990 --> 00:22:19,520 So I'm going to go with 'classic'. 236 237 00:22:23,210 --> 00:22:30,250 Now the last thing we'll do with our plotting code is add a line of code into the cell so that our beautiful 237 238 00:22:30,250 --> 00:22:34,150 charts can be exported when we export our notebook. 238 239 00:22:34,510 --> 00:22:40,900 You see, at the moment our charts are generated when this code is run, but if we add a percent sign and 239 240 00:22:40,900 --> 00:22:52,500 then write matplotlib inline, so %matplotlib inline, then we are telling Jupyter notebook to export these charts 240 241 00:22:53,040 --> 00:23:00,930 along with the notebook when we go to File > Download as > Notebook. 241 242 00:23:01,650 --> 00:23:06,750 In other words, this line of code here is Jupyter notebook specific. 242 243 00:23:06,780 --> 00:23:08,640 Now it's time to run our regression.
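As a recap of the style step from this lesson, here is a small sketch that checks the style names mentioned against what the installed matplotlib reports and then applies one. The data lists are illustrative stand-ins and the names mentioned and found are made up for this example. The notebook magic appears only as a comment because it is not plain Python.

```python
import matplotlib.pyplot as plt

# In a Jupyter cell, the magic would go on its own line before plotting:
# %matplotlib inline

# Style names mentioned in the lesson, checked against the installed list.
mentioned = ['bmh', 'classic', 'dark_background', 'ggplot', 'fivethirtyeight']
found = [name for name in mentioned if name in plt.style.available]
print(found)

# Apply one style to every chart drawn from here on.
plt.style.use('classic')

# Illustrative stand-ins for the time and LSD columns.
time = [5, 15, 30, 60, 120, 240, 480]
lsd = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]
plt.plot(time, lsd, linewidth=3)
plt.show()
```

plt.style.use changes the settings for the rest of the session. If a style should apply to a single chart only, wrapping the plotting code in a with plt.style.context('classic'): block is one way to limit it.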