0 1 00:00:00,550 --> 00:00:05,320 In this lesson we're going to wrap up our introduction to Python programming with something a little 1 2 00:00:05,320 --> 00:00:06,880 bit more fluffy. 2 3 00:00:07,300 --> 00:00:12,130 A hot topic on the Internet is always coding style. 3 4 00:00:12,490 --> 00:00:20,230 But what do I mean by coding style? Coding style is whether you use capital or lowercase letters, whether you 4 5 00:00:20,230 --> 00:00:24,520 use tabs or spaces, where you put your curly brackets and so on. 5 6 00:00:24,520 --> 00:00:31,930 It's how you arrange your code and what sort of syntax you use. Now the thing about programming style 6 7 00:00:32,020 --> 00:00:35,890 is that it's the classic bike shedding problem, right? 7 8 00:00:35,920 --> 00:00:39,010 Everybody has an opinion on what color the bike should be. 8 9 00:00:39,010 --> 00:00:44,530 And so everyone gets in a heated and lengthy argument about it in the same way. 9 10 00:00:44,530 --> 00:00:49,280 People also have very strong feelings on how code should be formatted. 10 11 00:00:49,390 --> 00:00:54,430 Now the irony is that the standards vary hugely between programming languages. 11 12 00:00:54,430 --> 00:01:00,880 So mainstream Python programming style is a little different from the mainstream Java programming style, 12 13 00:01:01,090 --> 00:01:04,690 which again is different from C++ or Swift. 13 14 00:01:05,650 --> 00:01:11,990 So the question is how do we find out what good Python programming style is or should be? 14 15 00:01:12,160 --> 00:01:18,160 And what are the guidelines that we should follow or what are the guidelines that people tend to follow? 15 16 00:01:18,160 --> 00:01:24,040 Well, lucky for us there are so-called style guides out there on the Internet. 16 17 00:01:24,040 --> 00:01:30,880 And the most popular style guide I think these days for Python is the official one from the Python Software 17 18 00:01:30,880 --> 00:01:32,070 Foundation. 18 19 00:01:32,260 --> 00:01:33,650 And it looks like this. 19 20 00:01:33,700 --> 00:01:37,790 It's called PEP 8 - the style guide for python code. 20 21 00:01:38,020 --> 00:01:45,780 And if you ask me this this makes excellent bedtime reading. Yeah, it's pretty long. 21 22 00:01:46,020 --> 00:01:54,810 It's very comprehensive and it's also not something that needs to be followed religiously. 22 23 00:01:54,810 --> 00:02:00,870 It's a suggestion for writing readable Python code, right? Readable for humans. 23 24 00:02:00,870 --> 00:02:04,440 The computers don't really care. 24 25 00:02:04,520 --> 00:02:08,730 The thing I'll say is that this isn't the only style guide out there. 25 26 00:02:08,730 --> 00:02:15,180 In fact many companies have their own style guides too, that, you know, might be slightly different from 26 27 00:02:15,180 --> 00:02:18,730 the ones of the Python Foundation for example. 27 28 00:02:18,750 --> 00:02:23,490 So here is the style guide from the engineers at Google. 28 29 00:02:23,490 --> 00:02:26,700 The Python style guide, to be specific. 29 30 00:02:26,700 --> 00:02:34,440 Now, I was actually looking through this and trying to spot a difference with the foundation's style 30 31 00:02:34,440 --> 00:02:42,870 guide and the thing I noticed is that these style guides are actually very, very similar and there are 31 32 00:02:42,970 --> 00:02:51,330 maybe like some differences; Python Software Foundation says line length should be 80 characters, 32 33 00:02:51,330 --> 00:02:55,860 Google says it should be 79. 33 34 00:02:56,040 --> 00:02:58,530 Most of the suggestions are actually really, really similar. 34 35 00:02:59,220 --> 00:03:04,940 So one of the differences I actually found seems to me more philosophical than anything else, right? 35 36 00:03:04,950 --> 00:03:08,620 So at the end Google says their parting words. 36 37 00:03:08,730 --> 00:03:10,070 BE CONSISTENT. 37 38 00:03:10,080 --> 00:03:10,260 Right? 38 39 00:03:10,260 --> 00:03:12,890 This is this is the formatting they use - all caps. 39 40 00:03:12,900 --> 00:03:14,490 Right? 40 41 00:03:14,850 --> 00:03:22,470 In contrast, the PEP 8 style guide from the Python Foundation says "A Foolish Consistency is the Hobgoblin 41 42 00:03:22,560 --> 00:03:25,040 of Little Minds". 42 43 00:03:25,500 --> 00:03:30,240 So what they're saying here is that you can be a bit more flexible in your coding style if you need 43 44 00:03:30,240 --> 00:03:31,200 to be. 44 45 00:03:31,230 --> 00:03:35,750 Now as I said before, in the end none of these style guides are hard and fast rules. 45 46 00:03:35,760 --> 00:03:39,710 They're more about aesthetics and readability of the code. 46 47 00:03:39,870 --> 00:03:41,730 So you yourself you can make a choice, right? 47 48 00:03:41,730 --> 00:03:45,600 You can choose one of these style guides and just stick to it. 48 49 00:03:45,630 --> 00:03:47,580 That's really all there is to it. 49 50 00:03:47,610 --> 00:03:53,370 That said, let's have a look that we've written so far in the lessons and see how it stacks up against 50 51 00:03:53,370 --> 00:03:57,330 the style guide from the Python Software Foundation. 51 52 00:03:57,330 --> 00:04:03,660 I'm not going to ask you to read through all of PEP 8 and play dpot the difference, but truth be told 52 53 00:04:03,810 --> 00:04:08,770 we've actually violated some of the style recommendations of PEP 8. 53 54 00:04:08,820 --> 00:04:15,060 So what I want to do in this video is kind of give an introduction to familiarize ourselves with what 54 55 00:04:15,060 --> 00:04:16,700 is considered good practice. 55 56 00:04:16,740 --> 00:04:23,560 And afterwards we can return to our Python code and make changes if necessary. 56 57 00:04:23,610 --> 00:04:29,070 One thing that we've already alluded to in the previous lessons is the indentation - 57 58 00:04:29,070 --> 00:04:37,840 So the layout of our code. And we recommended that we use four spaces for each indentation level. 58 59 00:04:38,070 --> 00:04:43,770 If we look back at our Jupyter notebook, we can see that this formatting has already been taken care 59 60 00:04:43,770 --> 00:04:45,590 of for us, right? 60 61 00:04:45,600 --> 00:04:52,860 So we can see that this print statement 'Open Door' is one, two, three, four spaces indented. 61 62 00:04:52,920 --> 00:04:59,890 So in many ways the editor that we're using - Jupyter notebook or other text editors - modern text editors, 62 63 00:05:00,060 --> 00:05:06,120 not Notepad actually shoulder some of this code formatting burden for us. 63 64 00:05:06,150 --> 00:05:10,090 Another thing that I want to point out is naming conventions. 64 65 00:05:10,200 --> 00:05:16,620 So I'm gonna dwell on this a little bit because it's very, very handy for reading and interpreting the 65 66 00:05:16,620 --> 00:05:19,070 code that other people have written. 66 67 00:05:19,170 --> 00:05:27,450 So naming conventions are handy for being able to read Python code and know a little bit more about 67 68 00:05:27,660 --> 00:05:34,900 what it is that you're looking at, just from how it's typed. So let's start with variables. 68 69 00:05:34,970 --> 00:05:41,630 Typically you'll see variables written as single letters, especially if you're doing math or physics 69 70 00:05:41,720 --> 00:05:43,440 or machine learning, right? 70 71 00:05:43,460 --> 00:05:51,740 So lowercase y, Capital X, these are variable names that you'll see very, very often. Now short variable 71 72 00:05:51,740 --> 00:05:57,230 names like this are good when there's like an accepted convention in place. 72 73 00:05:57,230 --> 00:06:03,170 So for example with scikit-learn, you'll often see the variable name for the training data as capital 73 74 00:06:03,230 --> 00:06:08,200 X and you'll see that target values as lower case y. 74 75 00:06:08,330 --> 00:06:13,490 And this is a convention that they adopt and try to be consistent with across, like, their whole code 75 76 00:06:13,490 --> 00:06:14,260 base. 76 77 00:06:14,510 --> 00:06:23,180 But there is also a big, big advantage from not using single letters for variable names because if you 77 78 00:06:23,180 --> 00:06:29,750 have a descriptive variable name, right, as opposed to a single letter you can help make your code a lot 78 79 00:06:29,840 --> 00:06:34,600 more readable. For a nice, long, descriptive variable name, 79 80 00:06:34,610 --> 00:06:40,370 you'll want to write it in all lowercase and you want to separate, if there's multiple words, you want 80 81 00:06:40,370 --> 00:06:46,760 to separate the words with an underscore, so lightning_mcqueen was an example of this kind 81 82 00:06:46,760 --> 00:06:53,720 of formatting. And if you really, really want to use a single letter variable names then please, please 82 83 00:06:53,720 --> 00:07:01,730 please do not use lowercase l or uppercase I or uppercase O as your variable name. 83 84 00:07:01,730 --> 00:07:07,970 It's really, really hard to tell these apart with smaller fonts, right, 84 85 00:07:07,970 --> 00:07:10,450 if you're looking at a screen. So yeah, 85 86 00:07:10,460 --> 00:07:18,140 single letter variable names, maybe just exclude these ones for readability. Now module and package 86 87 00:07:18,140 --> 00:07:22,670 names, they actually tend to follow the same naming conventions as variables really. 87 88 00:07:22,670 --> 00:07:22,870 Right? 88 89 00:07:22,880 --> 00:07:28,610 They're written in lowercase and the longer names have the words separated again with an underscore. 89 90 00:07:29,580 --> 00:07:35,860 And again, for functions and methods, we typically use lower case words separated by underscores as well. 90 91 00:07:36,910 --> 00:07:43,000 If the function in question has parameters, then the parameters tend to follow the same naming conventions 91 92 00:07:43,090 --> 00:07:45,390 as with variables. 92 93 00:07:45,460 --> 00:07:52,150 So with this in mind, we can actually quickly spot the coding style in our Python intro notebook where 93 94 00:07:52,150 --> 00:07:55,210 we've run afoul of these guidelines. 94 95 00:07:55,210 --> 00:08:01,570 For example, at the very top of our Python intro notebook we had some variables called myAge, restaurantBill 95 96 00:08:01,570 --> 00:08:06,970 and serviceCharge and these are written in camel case. 96 97 00:08:06,970 --> 00:08:13,060 So camel case means starting with a lowercase letter and then capitalizing each word. 97 98 00:08:13,150 --> 00:08:19,420 And this is quite a popular style in programming languages like Java but you see it much much less in 98 99 00:08:19,420 --> 00:08:21,830 Python. Cool. 99 100 00:08:21,840 --> 00:08:26,150 So that pretty much covers our introduction to Python programming. 100 101 00:08:26,160 --> 00:08:32,610 We'll be diving deeper into Python programming techniques and the subsequent lessons because programming 101 102 00:08:32,730 --> 00:08:37,390 is actually strongly linked to both data science and machine learning. 102 103 00:08:37,620 --> 00:08:44,190 And the reason is that programming goes well beyond how to call functions from scikit-learn. Knowing 103 104 00:08:44,190 --> 00:08:47,770 programming develops your mindset as an engineer. 104 105 00:08:48,030 --> 00:08:53,520 And to succeed in this course and with machine learning more generally, having an engineering mindset 105 106 00:08:53,700 --> 00:08:56,780 is key as part of this course. 106 107 00:08:56,820 --> 00:09:02,460 And in the workplace you're going to be putting a lot of different skills together from different fields. 107 108 00:09:02,640 --> 00:09:09,360 And those include mathematics, statistics, design and of course programming. 108 109 00:09:09,360 --> 00:09:14,400 And the reason is that outside of academia you're gonna be in a position where you're going to be trying 109 110 00:09:14,400 --> 00:09:22,560 to solve a real life business problem and figuring out how to solve this problem will test your engineering 110 111 00:09:22,560 --> 00:09:23,790 mindset. 111 112 00:09:23,980 --> 00:09:30,630 You know ,a common misconception these days is that to solve problems you need to use the most sophisticated 112 113 00:09:30,630 --> 00:09:34,910 model possible or, I don't know, artificial intelligence. 113 114 00:09:35,010 --> 00:09:37,050 Everybody is talking about A.I., right? 114 115 00:09:37,050 --> 00:09:38,910 Artificial intelligence. 115 116 00:09:38,910 --> 00:09:44,120 However, in reality A.I. is not the only thing that matters. 116 117 00:09:44,280 --> 00:09:47,700 And A.I. doesn't solve all problems. 117 118 00:09:47,700 --> 00:09:53,340 Now depending on your specific problem that you're faced with, A.I. might not even add all that much 118 119 00:09:53,340 --> 00:10:00,840 value, especially considering the high cost and the high amount of data required to build an A.I. system. 119 120 00:10:00,870 --> 00:10:06,960 So, you know, with your engineering mindset as an engineer you're going to be thinking about how to solve 120 121 00:10:06,960 --> 00:10:09,210 your problem in an efficient way. 121 122 00:10:09,210 --> 00:10:16,500 So maybe a simple model will do for your purposes and this is why this course will teach you a variety 122 123 00:10:16,500 --> 00:10:21,480 of tools and that way you can use the right tool for the job. 123 124 00:10:21,510 --> 00:10:27,310 After all, if all you've got is a hammer then everything starts to look like a nail. 124 125 00:10:27,420 --> 00:10:34,920 And coming back to programming, and programming in Python specifically, Python is actually a really, really 125 126 00:10:34,920 --> 00:10:44,160 nice language to start learning to program in contrast to other languages like C or C++, because Python 126 127 00:10:44,310 --> 00:10:50,560 actually takes care a lot of the error prone grunt work that goes on behind the scenes for you. 127 128 00:10:50,940 --> 00:10:56,970 And I don't think this is a bad thing because, unlike with C or C++, with Python you don't have to do 128 129 00:10:56,970 --> 00:11:03,810 chores like laying out the memory structure, managing the memory allocation or implementing access routines. 129 130 00:11:04,980 --> 00:11:05,980 Now, don't get me wrong, 130 131 00:11:06,050 --> 00:11:11,180 I think there is a time and place for all those things and becoming familiar with like how computer 131 132 00:11:11,180 --> 00:11:17,960 hardware actually works will definitely, definitely help down the line but I don't think it's useful 132 133 00:11:18,020 --> 00:11:23,510 to get confronted with this right away, especially if you've got so many other things to worry about. 133 134 00:11:24,290 --> 00:11:28,230 Which brings me to our next topic. Where are we going next? 134 135 00:11:28,340 --> 00:11:35,150 The next couple of lessons and sections will introduce you to the core of machine learning, namely loss 135 136 00:11:35,150 --> 00:11:39,320 functions and a technique called gradient descent. 136 137 00:11:39,350 --> 00:11:44,660 This is where the actual learning takes place and we're going to take a look at how this works under 137 138 00:11:44,660 --> 00:11:47,840 the hood for regression techniques. 138 139 00:11:47,960 --> 00:11:54,400 Also, we're going to be looking more closely at regressions that use more than one explanatory variable. 139 140 00:11:54,440 --> 00:11:57,910 In other words, we're gonna spice up our regression model. 140 141 00:11:57,980 --> 00:12:04,250 For example, if you were trying to predict a house price then you probably have to look at a number of 141 142 00:12:04,250 --> 00:12:11,480 different factors like the number of bedrooms, the location, the total area of the flat or the house, the 142 143 00:12:11,480 --> 00:12:14,970 proximity to amenities like a park or a good school. 143 144 00:12:14,970 --> 00:12:21,470 There's a number of factors that contribute to price of a house and the technique that will tease 144 145 00:12:21,470 --> 00:12:25,120 out the individual contribution of each of these factors. 145 146 00:12:25,220 --> 00:12:32,420 It's called multivariable regression and on that bombshell, it's time for me to get some more coffee 146 147 00:12:32,630 --> 00:12:35,150 and then I'll see you in the next lesson. 147 148 00:12:35,150 --> 00:12:36,080 Take care.