1 00:00:05,750 --> 00:00:09,230 Hey everyone, and welcome to this section. In this section 2 00:00:09,230 --> 00:00:14,690 we are going to work with data science: what data science actually is and which technologies are 3 00:00:14,690 --> 00:00:16,310 used in data science. 4 00:00:16,400 --> 00:00:22,360 Before that, this is a short video about what data science actually is. 5 00:00:23,060 --> 00:00:32,020 Here I have an image downloaded from the Internet that shows what we can call the big picture of data 6 00:00:32,020 --> 00:00:32,960 science. 7 00:00:33,110 --> 00:00:39,770 Let me begin with the definition of data science: what actually is data science? 8 00:00:39,770 --> 00:00:45,860 Most people think that data science means graphical plotting and visualization 9 00:00:45,860 --> 00:00:47,360 of data. 10 00:00:47,960 --> 00:00:57,360 That consists of analyzing the data and plotting it on graphs. But this 11 00:00:57,420 --> 00:01:05,010 is just one part of data science. Data science is actually all the steps involved in forming these graphs 12 00:01:05,070 --> 00:01:10,670 and in using them. A common example: you have used YouTube, 13 00:01:11,250 --> 00:01:15,400 and you have also used Facebook and Instagram. 14 00:01:15,540 --> 00:01:19,800 There you see something like this whenever there is a post, or a video like this one here. 15 00:01:19,800 --> 00:01:26,620 This video has 54K likes, 16K dislikes, and a number of comments. 16 00:01:26,670 --> 00:01:31,620 These things are actually data. You, the viewers, just like or dislike the video, 17 00:01:31,800 --> 00:01:38,010 but the program or the people sitting at the back end analyze that data: this thing is 18 00:01:38,010 --> 00:01:40,680 getting so many likes, this thing is getting popular,
19 00:01:40,680 --> 00:01:47,970 this thing is not going to work in the future. So they analyze this data to learn what the 20 00:01:47,970 --> 00:01:53,150 demand of the market is, and then they act according to that data. 21 00:01:53,150 --> 00:02:00,380 They make changes, put in effort, and see how that affects the company's data. 22 00:02:02,640 --> 00:02:10,140 A more common example: let me consider, not a state, but a small city that consists 23 00:02:10,140 --> 00:02:11,730 of 1000 people. 24 00:02:12,210 --> 00:02:20,520 Think of mobile phone companies; let me take two examples, Apple and Mi. Apple provides expensive 25 00:02:20,520 --> 00:02:27,440 mobiles that nowadays cost a thousand dollars or more, like the iPhone XS, and the other 26 00:02:27,450 --> 00:02:36,330 company, Mi, provides cheap mobiles that cost just 150 bucks. Now, in this small city of 27 00:02:36,330 --> 00:02:42,000 a thousand people, I put a data scientist. What will they go for? 28 00:02:42,000 --> 00:02:48,210 First they collect data. This is the hierarchy of needs of data science that is shown in these graphs. 29 00:02:48,210 --> 00:02:51,930 You can also search for it: the data science hierarchy of needs. 30 00:02:52,350 --> 00:02:59,610 It consists of all the steps in data science, and the first is to collect the data. Of these thousand 31 00:02:59,610 --> 00:03:06,990 people, just assume that nine hundred are not very rich; they earn 200 to 300 dollars a month, 32 00:03:07,050 --> 00:03:14,490 or at most five hundred dollars a month. The other hundred are upper-middle-class people, or we can assume 33 00:03:14,490 --> 00:03:18,780 upper-class people, who earn thousands of dollars. 34 00:03:18,780 --> 00:03:26,120 So if we talk about the expensive mobiles, only these hundred upper-class people can afford those 35 00:03:26,120 --> 00:03:26,960 mobiles.
36 00:03:27,270 --> 00:03:34,590 A person who earns 300 or 500 dollars a month cannot afford a thousand-dollar mobile phone. 37 00:03:34,590 --> 00:03:41,760 So this is data. The company will analyze the data to see what the market is and what the demand 38 00:03:41,760 --> 00:03:45,100 of the people is, and work according to that, like the 39 00:03:45,150 --> 00:03:46,500 Mi company. 40 00:03:46,500 --> 00:03:53,160 There is one more company we can take as an example, and that is Nokia. As you can see, nowadays Nokia 41 00:03:53,160 --> 00:03:59,910 has a smaller market than Mi. That is because Nokia once focused on Windows Mobile, 42 00:04:00,240 --> 00:04:05,530 which is something people did not like, but they did not change according to the needs of the market. 43 00:04:05,550 --> 00:04:09,870 They just went on with their product, making it more and more superior. 44 00:04:10,290 --> 00:04:20,500 I believe that at the time Nokia was good, but having a good product and having a need for the product 45 00:04:21,550 --> 00:04:23,440 are different things. 46 00:04:23,500 --> 00:04:28,790 There are some products that are really great, what we can call ideal products, 47 00:04:28,900 --> 00:04:31,210 but there is no market for them. 48 00:04:31,240 --> 00:04:38,920 Maybe some of you have also experienced this in practice. Suppose you are an 49 00:04:38,920 --> 00:04:41,560 engineering or technical type of student. 50 00:04:41,560 --> 00:04:47,620 Sometimes you have posted a new video on YouTube that is related to some particular 51 00:04:47,620 --> 00:04:54,250 concept, something like hacking the password of a Wi-Fi network, 52 00:04:54,850 --> 00:04:59,000 and you get just 100 or 200 views.
53 00:04:59,100 --> 00:05:06,730 But on the other side, take something like a funny video or any 54 00:05:06,730 --> 00:05:11,190 nonsense that doesn't have any meaning and doesn't make any sense, 55 00:05:11,200 --> 00:05:20,590 but that entertains people: that video gets millions of views. So the point is that in data 56 00:05:20,590 --> 00:05:22,960 science we do not just focus on developing the product. 57 00:05:22,960 --> 00:05:28,780 This is a misconception among students: that data science is something by learning which we 58 00:05:28,810 --> 00:05:35,760 make products that are ideal, so that no other product can stand in front of them. 59 00:05:35,950 --> 00:05:37,910 That is not data science. 60 00:05:38,170 --> 00:05:45,490 Now, we are taking the example of a thousand people. Just assume that Mi was the company at 61 00:05:45,490 --> 00:05:46,170 that time, 62 00:05:46,180 --> 00:05:53,050 say Mi starting in 2010. This is just an example, not something that actually happened. 63 00:05:53,710 --> 00:06:00,690 Mi put a person in charge of collecting all the data and observed that, 64 00:06:00,720 --> 00:06:05,950 out of these thousand people, nine hundred can afford the cheaper price. So the company decided to offer the 65 00:06:05,950 --> 00:06:13,960 cheaper price. Maybe the margin comes down a few dollars on each mobile, but the 66 00:06:13,960 --> 00:06:20,830 demand is very high, and it is really good that out of a thousand people, 900 are 67 00:06:20,830 --> 00:06:22,400 preferring their product.
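The counting in this thousand-person example can be sketched in a few lines of Python. This is only an illustration: the individual income values and the `PREMIUM_PRICE` threshold are assumed figures chosen to match the 900/100 split described in the video, not real market data.

```python
# Hypothetical monthly incomes (in dollars) for a city of 1000 people.
# 900 people earn between $200 and $500; 100 people earn $2000 or more.
incomes = [200 + (i % 4) * 100 for i in range(900)] + [2000 + i * 10 for i in range(100)]

PREMIUM_PRICE = 1000  # assumed price of the expensive phone


def segment(incomes, premium_price):
    """Count people below and at-or-above the premium price threshold."""
    budget = [x for x in incomes if x < premium_price]
    premium = [x for x in incomes if x >= premium_price]
    return len(budget), len(premium)


budget_count, premium_count = segment(incomes, PREMIUM_PRICE)
print(f"budget segment: {budget_count}, premium segment: {premium_count}")
```

Comparing a monthly income directly with a phone price is a deliberately crude rule; a real analysis would need a better model of what people can afford.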
68 00:06:22,510 --> 00:06:32,300 These things put the company at the top of the market. Now let us talk about collecting 69 00:06:32,300 --> 00:06:40,850 the data. These are the steps in data science: first we collect the data and then we store that 70 00:06:40,850 --> 00:06:47,360 data. Collecting means that we gather all the data; for the thousand people, just assume we actually have 71 00:06:47,550 --> 00:06:56,060 a thousand and fifty records. Then we store all the data, for example in a database. Earlier we used to store data in notebooks. 72 00:06:56,090 --> 00:07:01,820 You too have stored some data in notebooks at some point in your life, when you were not able 73 00:07:01,820 --> 00:07:07,070 to use a mobile or did not have one. For example, when you were playing something with 74 00:07:07,220 --> 00:07:12,470 your friends or selling something, you wrote down in a notebook what you had done. 75 00:07:13,400 --> 00:07:21,930 Sometimes you also make a list of the routines that you follow daily. That is also data, and 76 00:07:21,930 --> 00:07:26,340 you work on that data: if you follow that routine, maybe you will have a 77 00:07:26,340 --> 00:07:33,570 perfect body, perfect shape, perfect job, perfect mind. You are analyzing all the things 78 00:07:33,570 --> 00:07:36,110 that you need in order to be perfect, and working on them. 79 00:07:37,200 --> 00:07:40,880 It is similar in the case of computers. 80 00:07:40,890 --> 00:07:47,820 Now the data has been collected, the thousand and fifty records. The next step is cleaning the data. Maybe out 81 00:07:47,820 --> 00:07:51,370 of that thousand and fifty, fifty are children who do not have money. 82 00:07:51,370 --> 00:07:52,090 They do not, 83 00:07:52,650 --> 00:07:55,840 they cannot even afford any mobile. 84 00:07:55,920 --> 00:07:59,250 So companies will never take all the data.
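The cleaning step described here, dropping the fifty children from the 1050 collected records, might look like this in Python. The record layout (`age`, `income`) and the individual values are assumptions made up for this sketch; only the counts come from the example.

```python
# Hypothetical survey records: 1050 rows, 50 of which are children with no income.
records = (
    [{"age": 30, "income": 300} for _ in range(900)]
    + [{"age": 45, "income": 3000} for _ in range(100)]
    + [{"age": 10, "income": 0} for _ in range(50)]
)


def clean(records):
    """Keep only people who earn money, i.e. potential buyers."""
    return [r for r in records if r["income"] > 0]


cleaned = clean(records)
print(len(records), "->", len(cleaned))
```

Filtering on income is just one possible rule; what counts as "inside the market" depends on the question the company is asking.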
85 00:07:59,250 --> 00:08:05,320 They take only the data that falls within the market, such as those who are earning. 86 00:08:06,750 --> 00:08:13,050 Then the company analyzes the data to find the demand of the market: 900 can afford the cheap one, 87 00:08:13,230 --> 00:08:22,680 100 can afford the expensive one. That is the analytics. After that come testing, experimentation, and algorithms. 88 00:08:22,710 --> 00:08:29,810 Then they design the things that can affect their company in a positive way. For example, based on 89 00:08:29,810 --> 00:08:32,340 demand, they are producing cheap mobiles. 90 00:08:32,460 --> 00:08:39,550 They decided to make cheap items. Take again the example of Mi: Mi provides cheap mobiles. 91 00:08:39,600 --> 00:08:42,010 The margin is low on their mobiles. 92 00:08:42,180 --> 00:08:49,650 But what did they do? They stopped providing earphones. An earphone is a product that costs 93 00:08:49,650 --> 00:08:52,290 nearly 10 dollars per piece. 94 00:08:52,680 --> 00:08:57,360 So they removed it to increase their profit, because their margin is already low. 95 00:08:57,360 --> 00:09:00,790 But they provided the product according to the market. 96 00:09:00,830 --> 00:09:06,500 They never charged something like a thousand dollars for their mobiles. 97 00:09:06,630 --> 00:09:12,180 They looked at the market: what the nature of the market is and what is going to work in that market. 98 00:09:12,180 --> 00:09:14,500 They did not just produce something ideal. 99 00:09:14,610 --> 00:09:16,160 Take the example of YouTube videos. 100 00:09:16,230 --> 00:09:21,990 There are some perfect videos that do not get views because there is no market for that particular 101 00:09:21,990 --> 00:09:23,050 topic. 102 00:09:23,400 --> 00:09:30,940 If you are, again, a technical person posting a video about some technical stuff, then if you consider one 103 00:09:30,940 --> 00:09:33,940 lakh people, out of that lakh, at most
104 00:09:33,960 --> 00:09:42,120 there will be 5000 technical people. Others belong to fields like marketing, commerce, or arts. 105 00:09:42,120 --> 00:09:45,790 Some are old people and some are children. 106 00:09:45,810 --> 00:09:54,120 So the market for technical stuff is only 5000, but out of that full lakh, I believe ninety thousand prefer 107 00:09:54,180 --> 00:09:55,580 entertainment. 108 00:09:55,710 --> 00:10:04,310 They all need entertainment and something they can be happy about, so they watch and like such videos. 109 00:10:04,500 --> 00:10:09,500 Now, not all of this is the work of data scientists. 110 00:10:09,500 --> 00:10:11,720 All of it is data science, 111 00:10:11,810 --> 00:10:19,610 but now we are talking about the jobs in data science. At present, data science is one of the best fields 112 00:10:19,640 --> 00:10:27,500 for salary, and for job purposes data science is near the top of the list and among the highest-paid. 113 00:10:27,500 --> 00:10:32,260 Artificial intelligence and deep learning are going to be better than that, 114 00:10:32,270 --> 00:10:39,910 but at present they are not at the level of data science, because they are not yet so advanced. 115 00:10:39,990 --> 00:10:46,110 Maybe in 2030 or 2050 they will be on top. 116 00:10:46,200 --> 00:10:47,940 But for now we have data science. 117 00:10:48,310 --> 00:10:56,680 Now, in this hierarchy, the first step, collecting the data, is the work of software engineers. That 118 00:10:56,680 --> 00:11:02,920 is what most students in computer science or computer engineering go for. If they do not learn 119 00:11:02,920 --> 00:11:08,650 data science and do not focus on it, they will remain on this first step, collecting 120 00:11:08,650 --> 00:11:15,450 the data. And collecting the data doesn't mean going from person to person, writing things down and making notes. 121 00:11:15,550 --> 00:11:17,170 It is something like this here.
122 00:11:17,200 --> 00:11:24,490 Again, take this video: the likes and dislikes data is collected by a program, an algorithm. They collect this 123 00:11:24,490 --> 00:11:26,570 data and work on that data. 124 00:11:26,730 --> 00:11:31,420 If the likes are very high, the video starts trending. 125 00:11:31,420 --> 00:11:34,900 If the views are increasing very fast, the video again starts trending. 126 00:11:36,310 --> 00:11:46,350 After that comes this part, cleaning and storing the data, as in the example of the 1050 people. 127 00:11:46,360 --> 00:11:51,450 Storing the data and cleaning the data is the job of the data engineer. 128 00:11:51,550 --> 00:11:57,910 This part involves the data engineers, and what I am telling you is what most companies 129 00:11:57,910 --> 00:11:58,690 prefer. 130 00:11:58,690 --> 00:12:06,280 They assign software engineers to collecting data, and for storing and cleaning the data they have data engineers, 131 00:12:06,280 --> 00:12:09,560 because data scientists are paid more than these two. 132 00:12:09,610 --> 00:12:13,450 These two tasks do not require as much logic. 133 00:12:13,450 --> 00:12:21,280 Collecting and storing the data is something an engineer can do, so they bring in the data 134 00:12:21,280 --> 00:12:23,590 scientist for the next two steps, 135 00:12:23,590 --> 00:12:25,710 that is, for analyzing the data. 136 00:12:25,780 --> 00:12:28,480 Now the data is collected, cleaned, and stored. 137 00:12:28,480 --> 00:12:33,700 We have the list of a thousand people. Now they will analyze how many of them need this 138 00:12:33,700 --> 00:12:40,270 and how many need that: 900 require this, 100 require that. They will analyze the data in these steps 139 00:12:40,720 --> 00:12:45,670 and make a graph like this one.
140 00:12:45,880 --> 00:12:53,140 Here we have 900 people preferring the cheap mobiles and 100 preferring the expensive mobile. 141 00:12:53,560 --> 00:12:56,540 Don't look at the complete chart; just focus on these two. 142 00:12:56,740 --> 00:13:02,000 So they will make the visualization so that every person can understand it. 143 00:13:02,200 --> 00:13:08,290 Graphical visualization is the best way to do that. I think most of 144 00:13:08,290 --> 00:13:11,330 you have seen charts like these. 145 00:13:11,350 --> 00:13:18,970 This one is an example of customer analysis, but you have seen the growth of companies like Apple and Amazon 146 00:13:18,970 --> 00:13:20,580 in these kinds of graphs. 147 00:13:20,800 --> 00:13:25,370 You cannot just write out every name. Here we have 14,474, 148 00:13:25,390 --> 00:13:28,180 and here we have 25,043. 149 00:13:28,180 --> 00:13:30,850 You cannot just write the names of everyone. 150 00:13:30,880 --> 00:13:37,510 So there must be a way by which you can represent all the data, and that is done with graphical 151 00:13:37,510 --> 00:13:38,740 representations. 152 00:13:38,770 --> 00:13:46,630 People use these graphical representations to show what they need, and they set their x and y 153 00:13:46,630 --> 00:13:48,670 axes according to their needs. 154 00:13:48,670 --> 00:13:54,880 If we are talking in lakhs or millions, then the ticks will be in millions, like 1 million, 2 million, 155 00:13:54,880 --> 00:13:55,900 3 million. 156 00:13:55,900 --> 00:14:02,380 If we are talking in ordinary numbers, they will be ordinary; here we have a step of 50 157 00:14:02,380 --> 00:14:06,250 on the y axis and a step of 20 on the x axis. 158 00:14:07,630 --> 00:14:12,160 So that is what visualization is. 159 00:14:12,160 --> 00:14:17,070 Then we have the testing step.
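To make the idea of visualization concrete, here is a minimal sketch that draws the 900 versus 100 comparison as a text bar chart using only the Python standard library. It is not the chart shown in the video; the function name `text_bar_chart` and the labels are hypothetical, and a plotting library would normally be used for real graphs.

```python
# Counts from the example: how many people prefer each kind of phone.
counts = {"cheap phone": 900, "expensive phone": 100}


def text_bar_chart(counts, width=40):
    """Return one text line per category, with bars scaled to the largest value."""
    largest = max(counts.values())
    lines = []
    for label, value in counts.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<16} {bar} {value}")
    return lines


chart = text_bar_chart(counts)
for line in chart:
    print(line)
```

Scaling every bar against the largest value plays the same role as choosing the axis range in a real chart: the reader sees proportions at a glance instead of reading a thousand individual names.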
160 00:14:17,110 --> 00:14:19,740 Now you have analyzed the data; what do you do with it? 161 00:14:19,810 --> 00:14:22,030 You don't analyze the data without any purpose. 162 00:14:22,510 --> 00:14:28,960 In this step the data scientist applies that data in such a way that the company's growth will increase. 163 00:14:29,760 --> 00:14:36,130 Again the example: they provide cheap mobiles, so their market is going to grow, 164 00:14:36,760 --> 00:14:41,530 because most people are going to buy them. The stuff at the top, that is, AI and deep learning, 165 00:14:41,530 --> 00:14:43,400 is also this type of data science. 166 00:14:43,570 --> 00:14:44,590 It is something like that. 167 00:14:44,710 --> 00:14:46,710 And one more thing. 168 00:14:46,720 --> 00:14:54,040 You may have noticed that in YouTube videos and in the media, people rarely talk about 169 00:14:54,040 --> 00:14:55,180 data science. 170 00:14:55,470 --> 00:14:58,750 Artificial intelligence comes up, machine learning comes up. 171 00:14:59,860 --> 00:15:07,390 So people also get this misconception that data science is all about machine learning and artificial 172 00:15:07,560 --> 00:15:08,100 intelligence. 173 00:15:08,920 --> 00:15:16,200 Yes, it is also about that, but not only that. Machine learning and artificial intelligence 174 00:15:16,240 --> 00:15:18,810 are just applications of data science. 175 00:15:19,000 --> 00:15:26,350 If you want to develop your own AI
and want to go for machine learning, you must learn data 176 00:15:26,350 --> 00:15:32,530 science, because without data science you cannot work on that. What actually 177 00:15:32,530 --> 00:15:39,220 is machine learning? It is making the machine learn something by itself. And how can the machine learn 178 00:15:39,220 --> 00:15:47,840 anything? Here is a common example. Take a businessman. It is not that he 179 00:15:48,360 --> 00:15:50,260 just picked a product and 180 00:15:50,310 --> 00:15:53,990 that product simply worked. 181 00:15:53,990 --> 00:15:57,080 The thing is, he analyzed the data. 182 00:15:57,590 --> 00:16:01,070 He looked at what the need is. Take a fruit market. 183 00:16:01,190 --> 00:16:06,860 If I sell soap there, maybe it will not work, because it is a fruit market. 184 00:16:07,160 --> 00:16:17,990 But if I go for proper fruit, fruit without chemicals that is fresh and good-looking, then maybe 185 00:16:17,990 --> 00:16:18,580 that will work. 186 00:16:19,370 --> 00:16:25,100 So the point is that in machine learning a machine learns from all the data that 187 00:16:25,100 --> 00:16:25,720 it has. 188 00:16:25,790 --> 00:16:33,020 Take a washing machine: it works out what it has to do with a particular load, how much 189 00:16:33,020 --> 00:16:34,030 to provide, 190 00:16:34,150 --> 00:16:41,640 according to the clothes that are in it, such as the amount of water in the washing machine. 191 00:16:42,080 --> 00:16:48,260 So these are just applications of data science: the machines 192 00:16:48,530 --> 00:16:52,990 analyze the data and work according to the data. And to analyze the data, 193 00:16:53,030 --> 00:17:00,750 the machine also needs data science; without the data the machine can never learn by itself.
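The claim that a machine can only learn from data can be illustrated with a deliberately tiny Python sketch. The training pairs below are made-up values, and the "learning" rule (taking the midpoint between the two group averages) is a simplification chosen for this illustration, not a method named in the video.

```python
# Hypothetical training data: (monthly income in dollars, phone the person bought).
training = [
    (200, "cheap"),
    (300, "cheap"),
    (500, "cheap"),
    (3000, "expensive"),
    (5000, "expensive"),
]


def learn_threshold(examples):
    """Learn an income cutoff as the midpoint between the two group averages."""
    cheap = [income for income, label in examples if label == "cheap"]
    expensive = [income for income, label in examples if label == "expensive"]
    return (sum(cheap) / len(cheap) + sum(expensive) / len(expensive)) / 2


def predict(income, threshold):
    """Predict which phone a person buys from the learned cutoff."""
    return "expensive" if income >= threshold else "cheap"


threshold = learn_threshold(training)
print(predict(400, threshold), predict(4500, threshold))
```

Without the `training` list there is nothing to compute a threshold from, which is the point being made: the program has no rule of its own until it has seen examples.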
194 00:17:00,760 --> 00:17:03,990 It is not as if machines understand everything on their own. 195 00:17:04,060 --> 00:17:08,800 Even we humans do not understand, and cannot learn, everything by ourselves. 196 00:17:08,810 --> 00:17:15,070 When you started to walk, you noticed other people walking. You analyzed that people 197 00:17:15,070 --> 00:17:21,810 walk with their legs. You do not know how to walk when you 198 00:17:22,480 --> 00:17:29,530 are born; when you are six to seven months old you do not know how to walk. If no one shows 199 00:17:29,520 --> 00:17:32,150 you, maybe you think that you can walk on your head. 200 00:17:32,540 --> 00:17:35,830 But that is not possible. You can also walk on your hands; 201 00:17:35,840 --> 00:17:40,770 that is another thing, but you do not walk on your hands because the legs are made for that. 202 00:17:40,860 --> 00:17:43,090 And how did you learn that the legs are for that? 203 00:17:43,250 --> 00:17:49,880 By observing other people, seeing that they are walking and using their legs. 204 00:17:50,300 --> 00:17:51,440 So that is 205 00:17:51,590 --> 00:17:55,750 how machine learning works and what machine learning actually is. 206 00:17:55,850 --> 00:18:03,290 So to go for machine learning, you must go for data science first and then for that one. 207 00:18:03,620 --> 00:18:10,250 After that, what are we going to do in this section? We are going to work on these two parts. 208 00:18:10,400 --> 00:18:17,260 We will have predefined data that is already cleaned, stored, and structured. 209 00:18:17,690 --> 00:18:24,350 We will just use that data and analyze it in the form of these beautiful graphs, like these.
210 00:18:24,350 --> 00:18:31,640 You cannot imagine how you could make this one without programming. 211 00:18:31,640 --> 00:18:40,040 And if you focus on this one, you can just imagine trying to plot 212 00:18:40,040 --> 00:18:44,570 this thing here without programming. 213 00:18:44,720 --> 00:18:51,550 You cannot do that either. So in this section we are going to learn how we can analyze the data, 214 00:18:52,060 --> 00:19:00,960 use that data, and visualize it in such a way that, sorry, you can work 215 00:19:01,000 --> 00:19:03,370 efficiently on the data in that step. 216 00:19:03,370 --> 00:19:09,070 For example, if you made a graph and this bar is a product that has the maximum demand, then 217 00:19:09,220 --> 00:19:17,460 in this step you will go for making this product in larger quantity than the others, so that it increases 218 00:19:17,460 --> 00:19:18,530 your growth, 219 00:19:18,570 --> 00:19:21,030 rather than just making any product. 220 00:19:22,050 --> 00:19:28,170 So we will learn all these things: how you can get the data, how you can analyze the data, and how 221 00:19:28,170 --> 00:19:30,810 you can work on that data 222 00:19:30,900 --> 00:19:32,670 so that efficiency increases. 223 00:19:33,480 --> 00:19:39,930 I think you are now getting interested in this section, because learning these things, data science and machine 224 00:19:39,930 --> 00:19:40,680 learning, 225 00:19:40,950 --> 00:19:48,060 is, I believe, the dream of every software engineer and every computer science student. Even for me, in 226 00:19:48,060 --> 00:19:48,850 Python, 227 00:19:48,990 --> 00:19:57,320 if I talk about Python, the field that I find most interesting is data science,
228 00:19:57,360 --> 00:20:01,870 even more than machine learning, because machine learning is something that requires a number of calculations 229 00:20:01,870 --> 00:20:06,150 of different kinds, but it also works on data science, by analyzing the data. 230 00:20:06,990 --> 00:20:11,990 And this is going to be an interesting section for you as well. 231 00:20:12,300 --> 00:20:19,290 That is all about data science: what data science is, the jobs in data science, the scope of 232 00:20:19,290 --> 00:20:22,980 data science, and how all these things actually work. 233 00:20:23,160 --> 00:20:28,170 I hope you got a little idea; you will get more as we go through this section. 234 00:20:28,170 --> 00:20:35,580 Now, this picture is also small, but it is a block diagram for data science. Generally, 235 00:20:35,580 --> 00:20:45,120 someone first prepares the data by cleaning and transforming it. Then they store 236 00:20:45,120 --> 00:20:50,580 the data, analyze it, and build models. Then they visualize those models and communicate 237 00:20:50,580 --> 00:20:56,670 what the demand is and what the need is, and then they deploy all of these things. 238 00:20:57,570 --> 00:21:02,530 So that is it for this video. I hope you got the idea. 239 00:21:03,060 --> 00:21:04,200 Thanks for watching. 240 00:21:04,200 --> 00:21:07,050 We will continue in the next module. Till then, bye.