1 00:00:00,320 --> 00:00:02,280 Look at this beautiful framework. 2 00:00:02,280 --> 00:00:05,910 Now we've covered each of these steps briefly. 3 00:00:05,910 --> 00:00:09,100 We're going to dive into each of them one by one. 4 00:00:09,240 --> 00:00:12,090 The first one is problem definition. 5 00:00:12,090 --> 00:00:19,210 The question you're trying to answer in the first step is: what problem are we trying to solve? 6 00:00:19,420 --> 00:00:26,290 But before we get into the different types of machine learning problems, it's important to note machine learning 7 00:00:26,500 --> 00:00:28,880 isn't the solution to every problem. 8 00:00:29,020 --> 00:00:33,460 And I know that sounds weird in a machine learning course, but this is an important concept 9 00:00:33,460 --> 00:00:38,060 to remember. So when shouldn't you use machine learning? 10 00:00:38,170 --> 00:00:46,900 Well, if a simple hand-coded, instruction-based system will work, then you should favor the simpler system 11 00:00:47,050 --> 00:00:53,740 over the machine learning system. Take the favorite chicken dish we used before 12 00:00:53,740 --> 00:01:00,040 as an example: say you had the ingredients and you knew the exact steps you had to take to create your favorite 13 00:01:00,040 --> 00:01:01,090 chicken dish. 
14 00:01:01,090 --> 00:01:05,590 It's probably best that you choose a simple system over using machine learning to try and figure the 15 00:01:05,590 --> 00:01:11,890 steps out. Other than these kinds of scenarios, where you know the simple hand-coded, instruction-based 16 00:01:12,040 --> 00:01:20,670 system already, most of the time you can probably find value using machine learning. Now comes the first 17 00:01:20,670 --> 00:01:27,390 step in identifying the problem we're trying to solve as a machine learning problem. We can do this by 18 00:01:27,390 --> 00:01:32,940 matching our problem, the one we're working on (it might be a business problem or some other kind of problem), 19 00:01:33,420 --> 00:01:36,690 to the main types of machine learning problem. 20 00:01:36,780 --> 00:01:45,630 These are supervised learning, unsupervised learning, transfer learning and reinforcement learning. 21 00:01:45,810 --> 00:01:51,930 We're going to be focused on supervised learning, unsupervised learning and transfer learning. 22 00:01:52,290 --> 00:01:53,100 Why? 23 00:01:53,310 --> 00:01:58,560 Because these are the most common ones you'll come across in practice, and they're the ones that, 24 00:01:58,740 --> 00:02:03,990 when I was a machine learning engineer working on machine learning problems, have proven time 25 00:02:04,020 --> 00:02:10,380 and time again to be useful. Supervised learning is called supervised learning because you have data 26 00:02:10,860 --> 00:02:18,600 and labels. A machine learning algorithm tries to use the data to predict a label. If it guesses the label 27 00:02:18,600 --> 00:02:22,540 wrong, the algorithm corrects itself and tries again. 28 00:02:22,590 --> 00:02:26,460 This act of correction is why it's called supervised. 29 00:02:26,490 --> 00:02:32,970 It's like if you were trying to guess the steps it took to turn a set of ingredients (the data) into 30 00:02:32,970 --> 00:02:35,620 your favorite chicken dish (the label). 
31 00:02:35,820 --> 00:02:40,470 If you tried once and got it wrong, you'd tell yourself: this was wrong. 32 00:02:40,470 --> 00:02:42,960 Maybe next time I'll try something different. 33 00:02:43,200 --> 00:02:50,160 A supervised learning algorithm repeats this process over and over and over again, trying to get better. 34 00:02:51,540 --> 00:02:58,500 The main types of supervised learning problems are classification and regression. Classification involves 35 00:02:58,500 --> 00:03:05,400 predicting if something is one thing or another, such as if you wanted to predict whether or not a patient 36 00:03:05,400 --> 00:03:15,140 had heart disease based on their medical records, or what type of dog breed was in an image. If 37 00:03:15,140 --> 00:03:16,580 there are only two options, 38 00:03:16,580 --> 00:03:18,730 it's called binary classification. 39 00:03:18,830 --> 00:03:23,440 If there are more than two options, it's called multi-class classification. 40 00:03:23,600 --> 00:03:29,630 So trying to predict heart disease or not heart disease would be binary classification, because there are 41 00:03:29,690 --> 00:03:36,890 only two classes, heart disease or not heart disease, and trying to predict different dog breeds based 42 00:03:36,890 --> 00:03:43,160 on photos would be multi-class classification, because there are many different kinds of 43 00:03:43,160 --> 00:03:50,460 dog breeds. Regression problems involve trying to predict a number. You might hear it referred to as a 44 00:03:50,460 --> 00:03:57,510 continuous number as well, which just means a number which can go up or down. A classic regression problem 45 00:03:57,540 --> 00:04:03,780 is trying to predict the sale price of a house based on things like the number of rooms, the area it's in and 46 00:04:03,930 --> 00:04:10,110 how many bathrooms it has, or trying to predict how many people will buy a new app based on website 47 00:04:10,110 --> 00:04:16,110 visits and clicks. Unsupervised learning has data but no 
labels. 48 00:04:16,160 --> 00:04:22,430 For example, you might have the purchase history of all customers at your store, and your marketing team 49 00:04:22,640 --> 00:04:28,280 wants to send out a promotion for next summer, but they know not everyone will be interested in new summer 50 00:04:28,280 --> 00:04:28,990 clothes. 51 00:04:29,150 --> 00:04:35,070 So they come to you as the in-house data scientist and machine learning engineer and ask: do you know who 52 00:04:35,070 --> 00:04:36,880 is interested in summer clothes? 53 00:04:36,980 --> 00:04:42,150 The thing is, you don't either, but you know you can figure it out from the data you have. 54 00:04:42,530 --> 00:04:48,860 So you decide to run an algorithm to find patterns in the data and group customers who purchase similar 55 00:04:48,890 --> 00:04:50,270 things together. 56 00:04:50,600 --> 00:04:57,530 Once it's finished, you notice two groups: one group of customers who purchase only during wintertime 57 00:04:58,070 --> 00:05:01,890 and one group of customers who purchase mostly during summertime. 58 00:05:01,970 --> 00:05:08,870 You label them winter customers and summer customers and send them to your marketing team, and they 59 00:05:08,870 --> 00:05:12,500 thank you for saving them from sending out thousands of unwanted emails. 60 00:05:12,500 --> 00:05:18,090 I'm sure you've gotten some of those kinds of emails in your inbox before. What's important 61 00:05:18,090 --> 00:05:24,420 to note here is that you provided the labels. They weren't there to begin with, but the patterns were, 62 00:05:24,870 --> 00:05:30,800 and that's what the machine learning algorithm found. After inspecting the groups, you're the one 63 00:05:30,800 --> 00:05:37,980 who saw the commonalities and applied the labels, such as summer or winter. Problems like this are also 64 00:05:37,980 --> 00:05:43,440 called clustering, or putting groups of similar examples together. 
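To make the store example concrete, here is a minimal sketch of clustering in plain Python. The lecture does not name an algorithm, so k-means is an assumption here, and the customer names, purchase counts and fixed starting centroids are all made up for illustration. The point it tries to show is the one made above: the algorithm only finds the groups, and a person names them afterwards.

```python
from math import dist

# Hypothetical data (not from the lecture): purchases per customer as
# (winter_purchases, summer_purchases). Note that there are no labels.
purchases = {
    "ana": (9, 0), "ben": (7, 1), "cho": (8, 0),
    "dee": (1, 8), "eli": (2, 9), "fay": (0, 7),
}

def nearest(point, centroids):
    """Return the index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))

def k_means(points, centroids, iterations=10):
    """Repeatedly assign points to their nearest centroid, then move each
    centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest(point, centroids)].append(point)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids

points = list(purchases.values())
# Simplification: start from the first and last customer rather than random points.
centroids = k_means(points, [points[0], points[-1]])

# The algorithm only produces "group 0" and "group 1".
# A person inspects the groups and decides what to call them.
group_names = {0: "winter customers", 1: "summer customers"}
groups = {name: [] for name in group_names.values()}
for customer, point in purchases.items():
    groups[group_names[nearest(point, centroids)]].append(customer)

print(groups)
```

In practice you would more likely reach for a library implementation and many more features per customer; the hand-rolled version is only meant to show the two alternating steps.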
65 00:05:43,440 --> 00:05:48,960 Recommendation problems, such as recommending what music someone should listen to based on their previous 66 00:05:48,960 --> 00:05:57,650 music choices, often start out as unsupervised learning problems like this. Transfer learning leverages 67 00:05:57,740 --> 00:06:04,480 what one machine learning model has learned in another machine learning model. For example, say you're trying 68 00:06:04,480 --> 00:06:08,430 to predict what dog breed appears in a photo. That's a cute dog. 69 00:06:08,500 --> 00:06:12,050 That's my puppy, Seven, and that's Bella in the background. 70 00:06:12,970 --> 00:06:13,610 She's posing. 71 00:06:13,900 --> 00:06:19,160 She knows she's in this lecture. You could find an existing model which has learned to 72 00:06:19,160 --> 00:06:24,150 decipher different car types and fine-tune it for your task. 73 00:06:24,830 --> 00:06:26,600 Why is this valuable? 74 00:06:26,840 --> 00:06:31,490 Because training a machine learning algorithm, which means letting it find all of the patterns in data, 75 00:06:32,210 --> 00:06:36,100 can be a very expensive task. To find patterns in data, 76 00:06:36,230 --> 00:06:39,660 a machine learning algorithm has to make millions of calculations. 77 00:06:39,860 --> 00:06:45,320 And although computers are very fast at making calculations, making calculations isn't free. 78 00:06:46,280 --> 00:06:53,240 So instead of learning everything about different photos from scratch, such as what patterns different 79 00:06:53,240 --> 00:07:00,440 trees look like, what different shapes are, like the rectangle down here, what grass looks like, the car 80 00:07:00,440 --> 00:07:01,220 type model, 81 00:07:01,220 --> 00:07:06,740 the machine learning model which has figured out what different kinds of cars look like, has already done 82 00:07:06,740 --> 00:07:10,210 most of these things. If you've already trained a model, it might have already figured out: okay, 
83 00:07:10,220 --> 00:07:12,110 These are trees, not cars; this is grass. 84 00:07:12,110 --> 00:07:15,460 And so it kind of has an idea of what different patterns look like. 85 00:07:16,600 --> 00:07:23,230 Now, you can think of this as being the same as when you write an essay versus writing poetry. Although 86 00:07:23,230 --> 00:07:25,390 the writing styles are different, 87 00:07:25,390 --> 00:07:29,530 the writing that you do uses the same fundamental principles. 88 00:07:29,860 --> 00:07:38,500 So we can take this car model that identifies different cars, use its foundational patterns and apply 89 00:07:38,500 --> 00:07:42,280 them to our dog breed problem. Of course, there are a few more steps involved here, 90 00:07:42,300 --> 00:07:48,190 but that's the basic premise of transfer learning. Reinforcement learning involves having a computer 91 00:07:48,190 --> 00:07:54,700 program perform some actions within a defined space and rewarding it for doing well or punishing 92 00:07:54,700 --> 00:07:56,410 it for doing poorly. 93 00:07:56,410 --> 00:08:00,670 A good example is teaching a machine learning algorithm to play chess. 94 00:08:00,670 --> 00:08:05,810 The chess board is the defined space and the actions are moving pieces. 95 00:08:05,920 --> 00:08:11,950 And when I say punishment or reward, these things could be as simple as updating a score with plus one 96 00:08:12,130 --> 00:08:20,680 if it wins or negative one if it loses. The machine learning algorithm's goal could be to maximize the score. 97 00:08:20,680 --> 00:08:27,610 So this means, if you've done it right, it should learn moves which lead to wins. Reinforcement learning 98 00:08:27,610 --> 00:08:31,530 is what was used for DeepMind's AlphaGo to become the best Go 
99 00:08:31,650 --> 00:08:38,260 (a complicated Chinese board game, far more complicated than chess) player of all time, defeating many Go 100 00:08:38,260 --> 00:08:44,560 world champions. And although promising, reinforcement learning has yet to find its way into too many 101 00:08:44,560 --> 00:08:45,910 practical applications. 102 00:08:46,480 --> 00:08:52,690 And since we're focused on building practical solutions, we've decided to focus on the other kinds of 103 00:08:52,690 --> 00:09:00,040 learning, supervised learning, unsupervised learning and transfer learning, throughout this course. 104 00:09:00,070 --> 00:09:05,790 Now that you know the major types of learning, you've got the tools to tackle step one in the framework, 105 00:09:06,280 --> 00:09:12,440 problem definition: aligning the problem you're trying to solve to a machine learning problem. 106 00:09:12,610 --> 00:09:18,760 So for supervised learning you might say: I know my inputs and outputs. For example, patient records 107 00:09:18,790 --> 00:09:26,230 could be the inputs, and the output is whether or not the patient has heart disease. Or your inputs could be 108 00:09:26,530 --> 00:09:31,810 the parameters of a house, such as the number of rooms, where it's located and how many bathrooms there 109 00:09:31,810 --> 00:09:35,900 are, and your output is how much the house costs. 110 00:09:35,920 --> 00:09:41,470 So that's a regression problem. And for unsupervised learning you might say: I'm not sure of the outputs, 111 00:09:41,500 --> 00:09:47,080 but I do have inputs, such as customer purchases, and you're trying to figure out which customers are 112 00:09:47,080 --> 00:09:51,900 most similar to each other. Or for transfer learning, 113 00:09:51,970 --> 00:09:55,330 you might think: my problem might be similar to something else. 114 00:09:55,450 --> 00:10:01,930 Can I leverage what an existing machine learning model has learned and use it in my own? Now, 
115 00:10:01,930 --> 00:10:06,160 Don't worry if these kinds of learning are going over your head at the moment. We're going to 116 00:10:06,160 --> 00:10:12,730 be building a hands-on project for each of these learning types, supervised, unsupervised and transfer, 117 00:10:13,060 --> 00:10:14,600 throughout the course. 118 00:10:14,650 --> 00:10:18,640 In the meantime, have a think about some of the problems you face day to day. 119 00:10:18,640 --> 00:10:21,930 Could any of them be classified as a machine learning problem? 120 00:10:21,940 --> 00:10:25,120 Are you trying to classify whether one thing is something or another? 121 00:10:25,180 --> 00:10:26,980 That's a classification problem. 122 00:10:26,980 --> 00:10:31,290 Do you ever try to predict what a number might be? That could be a regression problem.
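As a companion to the supervised learning description earlier in the lecture (guess a label, get corrected when wrong, try again), here is a rough sketch of that loop for a binary classification problem. It uses the perceptron learning rule, which is an assumption chosen for simplicity rather than anything the lecture specifies, and the patient records, their two measurements and the labels are invented for illustration.

```python
# Hypothetical, made-up patient records: two already-scaled measurements each.
records = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (4.0, 5.0), (5.0, 4.0), (5.0, 5.0)]
labels = [0, 0, 0, 1, 1, 1]  # 0 = no heart disease, 1 = heart disease

def predict(weights, bias, record):
    """Guess a label: 1 if the weighted sum of the inputs is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, record)) + bias
    return 1 if score > 0 else 0

def train(records, labels, max_epochs=100):
    """Perceptron rule: whenever a guess is wrong, nudge the weights toward
    the correct answer, and repeat until a full pass has no mistakes."""
    weights, bias = [0.0] * len(records[0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for record, label in zip(records, labels):
            error = label - predict(weights, bias, record)  # 0 when the guess is right
            if error != 0:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, record)]
                bias += error
        if mistakes == 0:
            break
    return weights, bias

weights, bias = train(records, labels)
predictions = [predict(weights, bias, record) for record in records]
print(predictions)
```

The labels are what make this supervised: without them there would be nothing to compare each guess against, and the correction step could not happen. Real patient data would rarely be this cleanly separable, so treat this as an illustration of the loop rather than a usable model.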