0 1 00:00:00,630 --> 00:00:06,480 Functions that execute a bunch of code are all well and good but often we want to get a value back 1 2 00:00:06,480 --> 00:00:08,050 from a function. 2 3 00:00:08,220 --> 00:00:13,580 In this lesson we're going to talk about how functions return a value and how we get a result back. 3 4 00:00:14,920 --> 00:00:20,110 Returning a particular value is actually why we often call a function in the first place. 4 5 00:00:20,140 --> 00:00:28,780 The reason we're calling the read_csv function here is because we want to pass a CSV file and get our 5 6 00:00:28,780 --> 00:00:33,770 data as an output and store that data in a variable. 6 7 00:00:33,850 --> 00:00:41,030 Let's look at the mechanics of returning values more closely in our get_milk example. In this case we're 7 8 00:00:41,070 --> 00:00:43,230 supplying an amount of money as an input 8 9 00:00:44,370 --> 00:00:48,570 and we're using our input to make a calculation. 9 10 00:00:48,570 --> 00:00:54,750 This part is the same as before - we're dividing the amount of money by the price of milk to calculate 10 11 00:00:54,870 --> 00:00:57,920 the litres of milk to buy. 11 12 00:00:57,930 --> 00:01:02,240 So the question is - how do we return the result of this calculation? 12 13 00:01:02,520 --> 00:01:10,230 And the answer is that we use a keyword - the "return" keyword, followed by the value that we want to 13 14 00:01:10,230 --> 00:01:10,980 return. 14 15 00:01:10,980 --> 00:01:15,570 We write "return litres" - that word "return", 15 16 00:01:15,570 --> 00:01:22,620 it's actually a Python keyword. When Python hits this line of code where it says return, it will exit 16 17 00:01:22,620 --> 00:01:29,180 the function and it will return whatever result comes after that keyword. 17 18 00:01:29,220 --> 00:01:35,250 So in our example here get_milk is outputting the value inside litres.
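As a rough sketch, the get_milk function described above might look like the code below. The parameter name `money` and the constant `PRICE_PER_LITRE` are assumptions made for illustration, and the price of 1.15 is only an example value. The last two lines show one way to keep the returned value by assigning it to a variable.

```python
# Assumed price of one litre of milk; the value is illustrative only.
PRICE_PER_LITRE = 1.15


def get_milk(money):
    # Divide the money available by the price to get the litres we can buy.
    litres = money / PRICE_PER_LITRE
    # `return` exits the function and hands the value of `litres` back to the caller.
    return litres


# Store the returned value in a variable so that it is not lost.
amount = get_milk(20.5)
print(round(amount, 1))  # → 17.8
```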
18 19 00:01:35,340 --> 00:01:38,670 The call to get_milk is actually the same as before - 19 20 00:01:38,940 --> 00:01:44,250 we have the function name and the value for an argument, say 20.5. 20 21 00:01:44,850 --> 00:01:50,170 However, if we run our Python code just like this our output is lost. 21 22 00:01:50,300 --> 00:01:50,690 Our 22 23 00:01:50,700 --> 00:01:55,550 get_milk function is returning a value but we're not storing this value anywhere. 23 24 00:01:56,070 --> 00:02:01,350 So, what you'll typically find is that when a value is returned by a function we're gonna be storing 24 25 00:02:01,350 --> 00:02:09,780 that value in a variable. In this case, our variable called amount is now holding on to 20.5 25 26 00:02:09,810 --> 00:02:17,160 divided by 1.15 or around 17.8. But so much for the theory, 26 27 00:02:17,200 --> 00:02:19,330 let's put this into practice. 27 28 00:02:19,330 --> 00:02:25,690 Suppose you're working for a cutting edge Silicon Valley company and you're building a calculator. 28 29 00:02:25,690 --> 00:02:32,350 You are the engineering lead on the team and you have been assigned the incredibly important task of 29 30 00:02:32,350 --> 00:02:36,740 programming the multiplication button. As a challenge, 30 31 00:02:36,760 --> 00:02:39,920 can you do the following - in the Jupyter notebook, 31 32 00:02:40,000 --> 00:02:46,600 create a function called "times" and have this function take two inputs - it needs to multiply these two 32 33 00:02:46,600 --> 00:02:53,430 inputs together and provide the result as an output of the function. 33 34 00:02:53,490 --> 00:03:02,030 Also you need to test this function that it works. So call your function and multiply 3.14 34 35 00:03:02,390 --> 00:03:13,380 and 5.09, store this value in a variable called test and print out the value of test below 35 36 00:03:13,380 --> 00:03:17,200 the cell. I'll give you a few seconds to pause the video. 36 37 00:03:20,350 --> 00:03:21,280 You ready? 
37 38 00:03:21,280 --> 00:03:23,220 Here's the solution. 38 39 00:03:23,300 --> 00:03:25,930 The first thing we're gonna do is define our function. 39 40 00:03:26,090 --> 00:03:29,000 So we're gonna use the keyword "def". 40 41 00:03:29,270 --> 00:03:32,570 We're gonna give the function a name and call it "times". 41 42 00:03:32,570 --> 00:03:39,260 Now we're going to provide the two parameters, say x and y; you can call these parameters anything you 42 43 00:03:39,260 --> 00:03:41,160 want to by the way. 43 44 00:03:41,390 --> 00:03:51,620 We then have the colon, hit Enter and say we want to store the result of the multiplication in a variable 44 45 00:03:51,980 --> 00:04:02,810 so we'll say result = x * y and then to output the value of this function we'll go with return 45 46 00:04:03,380 --> 00:04:05,000 result. 46 47 00:04:05,000 --> 00:04:10,040 Notice how return is colored in green because it's a Python keyword. 47 48 00:04:10,040 --> 00:04:16,360 Now suppose you called your parameters a and b instead of x and y, then you've got to be consistent. 48 49 00:04:16,360 --> 00:04:21,980 You've got to have a and b inside the body of the function as well. 49 50 00:04:22,030 --> 00:04:23,660 So now I'm done with my function. 50 51 00:04:23,740 --> 00:04:27,760 I'm going to hit Shift+Enter and I'm ready to test it. 51 52 00:04:27,820 --> 00:04:37,310 So I'm going to call times(3.14, 5.09) and hit Shift+Enter. 52 53 00:04:37,770 --> 00:04:40,440 Our function is working as expected. 53 54 00:04:40,440 --> 00:04:47,250 Now we said we'd store the value of the output in a variable so we'd say test is equal to the call to the times 54 55 00:04:47,250 --> 00:04:56,040 function and then we can print out test. If I hit Shift+Enter now we see the value of our test variable 55 56 00:04:56,070 --> 00:04:58,750 printed below the cell. 56 57 00:04:58,770 --> 00:05:04,350 Now we can actually simplify the code inside our times function a little bit.
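Put together, the solution walked through so far can be sketched like this. It uses the parameter names x and y from the walkthrough; any other pair of names works as long as the body uses the same ones.

```python
def times(x, y):
    # Store the product in a variable, then hand it back to the caller.
    result = x * y
    return result


# Call the function, keep the returned value, and print it below the cell.
test = times(3.14, 5.09)
print(test)
```

The printed value is a float close to 15.98. The exact digits depend on floating-point rounding, so they are not spelled out here.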
57 58 00:05:04,350 --> 00:05:08,690 In fact we don't need this extra line of code right here. 58 59 00:05:08,760 --> 00:05:18,220 So I can comment it out with a hash symbol and instead of returning result I can return the calculation 59 60 00:05:18,220 --> 00:05:24,970 directly. I can say a*b. If I hit Shift+Enter on this cell and then hit Shift+Enter again on the 60 61 00:05:24,970 --> 00:05:26,320 cell below, 61 62 00:05:26,320 --> 00:05:33,610 I can see that my function is working - my output is unchanged. If you've decided to write your function 62 63 00:05:33,640 --> 00:05:40,900 just like this, then you're doing the calculation on the same line as the return keyword. 63 64 00:05:40,900 --> 00:05:42,610 And this is absolutely fine. 64 65 00:05:42,970 --> 00:05:44,170 Okay, excellent. 65 66 00:05:44,260 --> 00:05:51,010 So let's continue our little story of working at the Silicon Valley tech company and connect this lesson 66 67 00:05:51,250 --> 00:05:56,530 with our previous lesson on data types. Now that you've written your function, 67 68 00:05:56,590 --> 00:06:03,340 you've probably submitted your calculator code by now to the project team and a few moments later your 68 69 00:06:03,340 --> 00:06:10,660 boss, an accomplished Python programmer, walks over and says "Hey, I've looked at your code and I've 69 70 00:06:10,660 --> 00:06:12,480 got a question for you - 70 71 00:06:12,670 --> 00:06:17,860 have you watched the Monty Python movie about the quest for the Holy Grail?" 71 72 00:06:17,860 --> 00:06:28,610 He then pushes you aside from your keyboard and types in times('Ni', 72 73 00:06:28,610 --> 00:06:37,500 4), and after typing this code into your Jupyter notebook he smiles and walks away. 73 74 00:06:37,580 --> 00:06:40,790 Let's see what happens when we run this code. 74 75 00:06:40,840 --> 00:06:47,230 If I hit Shift+Enter, I see the word "Ni" displayed four times below our cell. 75 76 00:06:47,230 --> 00:06:49,400 Now this is unexpected, right?
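Here is a sketch of the simplified function together with the boss's call. The parameter names a and b follow the simplified version mentioned above; the commented-out line is kept only to show what was removed.

```python
def times(a, b):
    # result = a * b   # no longer needed: the calculation moves onto the return line
    return a * b


print(times(3.14, 5.09))  # same numeric result as before the simplification
print(times('Ni', 4))     # → NiNiNiNi
```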
76 77 00:06:49,420 --> 00:06:57,520 The times function does something completely different now. That multiplication symbol is not multiplying 77 78 00:06:57,600 --> 00:06:58,990 when we're passing in a string. 78 79 00:07:00,460 --> 00:07:04,390 The multiplication sign in fact is repeating a sequence. 79 80 00:07:04,630 --> 00:07:10,330 And what this shows is that the Python code doesn't really care about the specific data types 80 81 00:07:10,330 --> 00:07:11,900 it's running on. 81 82 00:07:11,980 --> 00:07:21,250 In this case, both numbers and strings support this times symbol - both numbers and strings have some sort 82 83 00:07:21,250 --> 00:07:25,450 of operation that uses this multiplication symbol. 83 84 00:07:25,450 --> 00:07:31,540 Now what this means in practice is that we as the developers when we're writing our Python code have 84 85 00:07:31,540 --> 00:07:37,690 to run some tests, both to be aware of the kinds of data types that we're working with and to detect 85 86 00:07:37,750 --> 00:07:39,130 errors. 86 87 00:07:39,130 --> 00:07:46,320 A reasonable question to ask as well is - is this a bad thing? And why was our boss not angry when our 87 88 00:07:46,320 --> 00:07:48,690 code did something really weird? 88 89 00:07:48,730 --> 00:07:56,190 And the answer to that question is: Python is really all about flexibility. Ideally, Python code should 89 90 00:07:56,190 --> 00:07:58,040 not really care about the types. 90 91 00:07:58,210 --> 00:08:03,060 And this is in contrast to other programming languages like C++ or Java. 91 92 00:08:03,270 --> 00:08:05,920 Python is meant to be flexible. 92 93 00:08:05,970 --> 00:08:12,480 This is a core part of the Python philosophy in fact. Returning to our topic of functions, 93 94 00:08:12,490 --> 00:08:18,610 this is a good time to review and summarize the three flavors of functions that we've encountered so 94 95 00:08:18,610 --> 00:08:19,560 far.
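To make the point about data types concrete, the short sketch below shows the same * symbol applied to numbers, a string, and a list. The function `times_checked` is a hypothetical variant that is not part of the lesson's notebook; it is only one possible way to insist on numeric inputs if that is what a calculator button needs.

```python
from numbers import Number

# The same symbol means different things for different data types.
print(2 * 4)        # → 8
print('Ni' * 4)     # → NiNiNiNi
print([1, 2] * 2)   # → [1, 2, 1, 2]


def times_checked(a, b):
    # Hypothetical variant: refuse anything that is not a number.
    if not isinstance(a, Number) or not isinstance(b, Number):
        raise TypeError("times_checked expects two numbers")
    return a * b


print(times_checked(2, 4))  # → 8
```

Whether a check like this is worth adding is a design choice. The plain version stays flexible, which is the behaviour the lesson describes as typical for Python.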
95 96 00:08:19,990 --> 00:08:24,640 We've seen functions with no inputs and no outputs. 96 97 00:08:24,640 --> 00:08:32,680 We've seen functions with inputs and we've seen functions with both outputs and inputs. 97 98 00:08:33,010 --> 00:08:35,670 When defining a function you always have to use that 98 99 00:08:35,680 --> 00:08:42,310 "def" keyword. Then you have the name of the function followed by the parentheses and then the colon. 99 100 00:08:44,020 --> 00:08:51,780 Also very important is that the next line, the one below the header in the body of the function, is indented. 100 101 00:08:51,820 --> 00:08:58,920 I recommend using four spaces, but most text editors will indent this automatically for you. 101 102 00:08:59,150 --> 00:09:06,280 Now, when you have inputs or parameters, they go between the parentheses following the function name. The 102 103 00:09:06,280 --> 00:09:08,740 parameters are like placeholders. 103 104 00:09:08,860 --> 00:09:15,580 They're essentially a variable with a name that gets a value when the function is called. The cool thing 104 105 00:09:15,580 --> 00:09:21,700 about parameters is that you can use a parameter inside the body of the function. And when you have more 105 106 00:09:21,700 --> 00:09:28,810 than one parameter, you separate each parameter with a comma. Finally we encountered functions that have 106 107 00:09:28,810 --> 00:09:31,930 both inputs and outputs. 107 108 00:09:31,960 --> 00:09:40,190 The key here is that functions with outputs have a return statement. The output is whatever comes after 108 109 00:09:40,280 --> 00:09:40,940 the return 109 110 00:09:40,940 --> 00:09:48,640 keyword. And the way to remember this is that if you're ever dating a programmer and you tell them to 110 111 00:09:48,640 --> 00:09:55,120 go to a supermarket and buy some milk and they never come back it's because you haven't told them to 111 112 00:09:55,120 --> 00:09:57,800 return anything.
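The three flavors summarized above can be sketched side by side. The function names and the printed messages below are hypothetical and chosen only for illustration; they are not taken from the notebook.

```python
def say_hello():
    # No inputs, no outputs: the function just runs its instructions.
    print("Hello")


def greet(name):
    # One input (the parameter `name`), still no return statement.
    print("Hello, " + name)


def add(a, b):
    # Two inputs separated by a comma, and an output after `return`.
    return a + b


say_hello()
greet("Tim")
total = add(2, 3)
print(total)  # → 5
```

A function without a return statement still hands something back: Python returns the special value None, which is why the first two flavors are described as having no output.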
112 113 00:09:57,970 --> 00:10:04,570 So over the past couple of lessons we've worked quite a bit with different kinds of functions. Before 113 114 00:10:04,570 --> 00:10:05,590 we move on, 114 115 00:10:05,860 --> 00:10:09,220 let's talk a little bit about the big picture for a second. 115 116 00:10:09,400 --> 00:10:15,640 I'm sure that at this point you're starting to see how functions really help with structuring code. Instead 116 117 00:10:15,640 --> 00:10:22,210 of copy pasting a bunch of code, we can stick a bunch of instructions into a single function and then 117 118 00:10:22,210 --> 00:10:24,790 just call that one function. 118 119 00:10:24,790 --> 00:10:33,650 That's what's called code reuse and code reuse basically helps you avoid copy pasting. Also, because functions 119 120 00:10:33,650 --> 00:10:41,670 can have parameters, we can call them over and over again while supplying different kinds of inputs. 120 121 00:10:41,720 --> 00:10:48,770 So in scikit-learn we wrote a linear regression that looked at movie budgets and movie revenue and 121 122 00:10:48,770 --> 00:10:56,330 we can run linear regressions over and over again on different kinds of data - that X and that y in that 122 123 00:10:56,330 --> 00:11:03,580 fit function can have different values and we can reuse the code that fits our regression model. 123 124 00:11:03,650 --> 00:11:12,450 Again this is code reuse in action. But there's also a more subtle advantage - functions allow us to split 124 125 00:11:12,450 --> 00:11:20,110 up complex tasks and by splitting up a complex task we get more manageable parts. 125 126 00:11:20,240 --> 00:11:28,520 For example, in this notebook, theoretically we could have had one function that, I don't know, parsed all 126 127 00:11:28,520 --> 00:11:35,990 our data and then generated the graph and then did the regression but we didn't have one line of God 127 128 00:11:35,990 --> 00:11:38,180 code that did everything. 
128 129 00:11:38,180 --> 00:11:46,730 Instead we had different functions that were responsible for different pieces of the workflow. So functions 129 130 00:11:47,270 --> 00:11:51,300 are a design tool for organizing our code. 130 131 00:11:51,350 --> 00:11:55,940 We've actually encountered this idea before with the packages that we looked at. 131 132 00:11:56,000 --> 00:12:02,280 So our pandas package had a bunch of different files that made up the package. 132 133 00:12:02,600 --> 00:12:10,760 Our pandas module did not consist of one enormous .py file. The code instead was split up into 133 134 00:12:10,760 --> 00:12:17,730 a bunch of smaller files and each of these files had a specific job. 134 135 00:12:18,000 --> 00:12:23,340 So as we're writing our own code we're going to be battling the same enemies as every other developer 135 136 00:12:23,340 --> 00:12:24,270 out there. 136 137 00:12:24,270 --> 00:12:31,260 We're gonna be fighting change and complexity. Change and complexity are the reason why good design and 137 138 00:12:31,260 --> 00:12:33,970 good structure for code is key. 138 139 00:12:34,860 --> 00:12:40,860 And one thing that developers have learned over the past decades is that splitting up a task into smaller, 139 140 00:12:40,920 --> 00:12:47,290 more manageable parts is really important and it helps a lot. On a small scale, 140 141 00:12:47,310 --> 00:12:53,820 this splitting up of different functionality is accomplished by putting code into functions. And on a 141 142 00:12:53,820 --> 00:12:54,740 bigger scale, 142 143 00:12:54,750 --> 00:12:59,250 this is accomplished by splitting up bits of code into different files. 143 144 00:12:59,340 --> 00:13:02,380 The rationale behind both of these strategies is the same - 144 145 00:13:02,460 --> 00:13:08,670 they're about managing complexity. And on that note I'm going to leave you with a little Python Easter 145 146 00:13:08,670 --> 00:13:12,170 egg. 
In your Jupyter notebook type 146 147 00:13:12,360 --> 00:13:22,240 "import this" and hit Shift+Enter. You will be greeted by the Zen of Python by Tim Peters. 147 148 00:13:22,320 --> 00:13:25,440 My two favorite lines by a long shot are 148 149 00:13:25,710 --> 00:13:30,960 "Simple is better than complex" and "Complex is better than complicated". 149 150 00:13:32,360 --> 00:13:35,540 And on that bombshell I'll see you in the next lesson.