1 00:00:00,690 --> 00:00:04,140 You might be thinking, why are we going through all these steps? 2 00:00:04,140 --> 00:00:05,170 I've got my computer. 3 00:00:05,160 --> 00:00:06,030 Why do I have to download 4 00:00:06,030 --> 00:00:11,110 Miniconda with Python and then install these packages? 5 00:00:11,280 --> 00:00:15,030 I just want to get started coding. And so you should be. 6 00:00:15,060 --> 00:00:16,860 It's good to hear you want to code. 7 00:00:16,860 --> 00:00:22,680 We'll be doing lots of that soon, but getting our computers set up with 8 00:00:22,680 --> 00:00:30,360 Miniconda and Conda are important steps which will save a lot of time in the future. When you download 9 00:00:30,420 --> 00:00:38,320 Miniconda, and subsequently Conda, Conda gives you the ability to create environments. 10 00:00:38,430 --> 00:00:41,890 You'll hear a lot about these in the different projects that you're working on. 11 00:00:41,940 --> 00:00:49,760 Let me give you an example. A workflow for our classifying heart disease project might involve starting 12 00:00:49,850 --> 00:00:55,730 with our computer and then creating a project folder which contains the data we're using, 13 00:00:55,730 --> 00:01:00,320 patient records, and the tools we're using for our machine learning project. 14 00:01:01,100 --> 00:01:07,700 After a little bit of experience, we know there's a collection of tools we might want to use, such as 15 00:01:07,700 --> 00:01:16,100 pandas and Matplotlib for data analysis. A collection of tools or packages is called an environment. 16 00:01:16,100 --> 00:01:23,200 We use Conda to create the environment and then use it again to install different packages, 17 00:01:23,390 --> 00:01:31,480 data science and machine learning tools, into it. Then every time we work on our project, say, classifying 18 00:01:31,480 --> 00:01:35,790 heart disease, we use the tools within our environment.
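To make that workflow a little more concrete, here is a minimal sketch in Python. It creates a project folder with a place for the data and prints the Conda commands you might run inside it. The folder names `heart-disease-project`, `data` and `env` and the helper `set_up_project` are assumptions made up for illustration, and the commands are only printed, not executed.

```python
from pathlib import Path
import tempfile


def set_up_project(base_dir: Path, project_name: str) -> dict:
    """Create a project folder with a data subfolder and return the Conda
    commands you might run for it (the commands are not executed here)."""
    project_dir = base_dir / project_name
    data_dir = project_dir / "data"  # hypothetical folder for the patient records
    data_dir.mkdir(parents=True, exist_ok=True)

    env_dir = project_dir / "env"  # hypothetical location for the Conda environment
    commands = [
        # Create an environment inside the project folder with two packages.
        ["conda", "create", "--prefix", str(env_dir), "pandas", "matplotlib", "--yes"],
        # Switch to that environment before working on the project.
        ["conda", "activate", str(env_dir)],
    ]
    return {"project_dir": project_dir, "data_dir": data_dir, "commands": commands}


if __name__ == "__main__":
    # A temporary directory stands in for the desktop so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        layout = set_up_project(Path(tmp), "heart-disease-project")
        for command in layout["commands"]:
            print(" ".join(command))
```

Keeping the environment under the project folder with `--prefix` is one way to keep the data and the tools side by side. A named environment created with `--name` would work as well.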
19 00:01:35,890 --> 00:01:37,210 So let's walk through this. 20 00:01:37,270 --> 00:01:38,620 You've got your computer. 21 00:01:38,620 --> 00:01:42,040 You want to start a new project, such as working on heart disease. 22 00:01:42,040 --> 00:01:45,780 You might create a separate folder, maybe on the desktop or something like that. 23 00:01:46,330 --> 00:01:51,400 And then within that one single folder you keep everything that you're going to use with that project, 24 00:01:51,400 --> 00:01:53,420 such as the data. 25 00:01:53,440 --> 00:01:58,600 Remember, the patient records could be the data, and the Conda environment, which is all the 26 00:01:58,600 --> 00:02:05,770 tools you're going to be working with to find insights in that data to try and predict heart disease. Conda 27 00:02:05,860 --> 00:02:08,590 is what you use to create this environment, 28 00:02:08,590 --> 00:02:11,590 and to install and update the tools that you need. 29 00:02:11,590 --> 00:02:14,380 Now, why is this helpful? 30 00:02:14,380 --> 00:02:17,170 Why would we do it in this kind of way? 31 00:02:17,170 --> 00:02:23,920 Well, here's where it comes in handy. Say, for example, you're working on your heart disease project. Since 32 00:02:23,980 --> 00:02:26,520 much of machine learning is experimental, 33 00:02:26,680 --> 00:02:29,680 you might be using a number of different tools for a single project. 34 00:02:30,890 --> 00:02:38,200 Now what if someone else wanted to start working on what you're working on with you? Instead of them setting 35 00:02:38,200 --> 00:02:40,990 up their own workbench from scratch, 36 00:02:41,200 --> 00:02:49,930 Conda allows you to share with them the exact same set of tools and packages you've been using, so they 37 00:02:49,930 --> 00:02:53,240 can get started straight away.
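One common way to share a set of tools is a small text file, usually called `environment.yml`, that lists the packages in the environment. The sketch below writes out such a file's contents by hand. The environment name `heart-disease` and the helper `render_environment_file` are made up for illustration, and no versions are pinned here. In practice you would more likely let Conda record the exact packages for you with `conda env export > environment.yml`.

```python
def render_environment_file(name: str, packages: list[str]) -> str:
    """Build the text of a minimal environment.yml with a name and a
    list of packages (no versions or channels are recorded here)."""
    lines = [f"name: {name}", "dependencies:"]
    lines += [f"  - {package}" for package in packages]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # pandas and matplotlib are the two tools mentioned for this project.
    print(render_environment_file("heart-disease", ["pandas", "matplotlib"]), end="")
```

A teammate who receives the file could then rebuild the same set of tools with `conda env create --file environment.yml`.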
38 00:02:53,260 --> 00:02:59,740 This may not seem helpful if you're working largely by yourself at the moment, but I can assure you it 39 00:02:59,740 --> 00:03:03,490 is one of the biggest problems when you start working on a team. 40 00:03:03,700 --> 00:03:06,120 There would be days where my teammates and I 41 00:03:06,340 --> 00:03:12,880 would literally spend all day trying to set up or trying to work on 42 00:03:12,880 --> 00:03:19,990 the same project, but we couldn't because we ran into dependency issues. A dependency issue is a term 43 00:03:20,020 --> 00:03:25,000 that's used when you don't have the right package for the problem that you're working on. Because this 44 00:03:25,000 --> 00:03:25,630 problem, 45 00:03:25,630 --> 00:03:31,900 this project, this classifying heart disease, will be dependent on these tools here. And when we ran into 46 00:03:31,900 --> 00:03:36,430 dependency issues, it meant that we couldn't install some of these tools. 47 00:03:36,700 --> 00:03:40,000 Conda helps to fix that dependency issue. 48 00:03:40,000 --> 00:03:45,670 It ensures that you can create this environment that's easily shareable and updatable, that you can 49 00:03:45,670 --> 00:03:47,320 share with someone else. 50 00:03:47,380 --> 00:03:52,900 So if you had multiple people on your team and you're all working on a similar project with similar 51 00:03:52,900 --> 00:03:59,560 data and similar data science and machine learning tools, the reason we do this workflow is so 52 00:03:59,560 --> 00:04:04,270 that you can create a project folder, and then if you had to share it with someone else's computer or 53 00:04:04,270 --> 00:04:11,320 someone else on your team, you could just send them the project folder that contains the data and your 54 00:04:11,320 --> 00:04:16,120 Conda environment, and they would be able to get started straight away.
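To see what a dependency issue can look like from inside Python, here is a rough sketch that checks whether the packages a project depends on can be found in the current environment. The helper `find_missing_packages` is made up for illustration. It only covers the simplest case, a package that is not installed at all, and says nothing about version conflicts.

```python
import importlib.util


def find_missing_packages(package_names):
    """Return the names that cannot be found for import in the current environment."""
    return [name for name in package_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # These are import names, which can differ from the name used to install a package.
    missing = find_missing_packages(["pandas", "matplotlib"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All packages are available.")
```

If the list comes back non-empty, the environment you are in does not have everything the project needs, which is the situation a shared Conda environment is meant to avoid.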
55 00:04:16,120 --> 00:04:22,150 Now again, if this sounds like a lot, and it is a lot to take in, we're going to have hands-on practice 56 00:04:22,360 --> 00:04:28,420 setting up a project folder as well as an environment for each of the hands-on projects we're going 57 00:04:28,420 --> 00:04:30,060 to cover throughout this course.