1 00:00:02,000 --> 00:00:10,730 In this session, let's understand the different types of data. We all know the two broad categorizations 2 00:00:10,940 --> 00:00:15,000 when it comes to data: qualitative data and quantitative data. 3 00:00:15,290 --> 00:00:22,110 But if you look closely, there is a further subdivision within both qualitative and quantitative data. 4 00:00:22,850 --> 00:00:25,780 Let's see quantitative data. 5 00:00:26,390 --> 00:00:30,200 We have continuous and discrete under quantitative. 6 00:00:31,070 --> 00:00:36,400 The key difference is that continuous data can be measured on a continuum or scale. 7 00:00:37,580 --> 00:00:40,270 You can subdivide it meaningfully. 8 00:00:40,610 --> 00:00:47,900 For example, if someone says 169 centimeters is their height, you can represent it as one point six nine 9 00:00:47,950 --> 00:00:48,710 meters also. 10 00:00:48,710 --> 00:00:48,970 Right? 11 00:00:49,960 --> 00:00:55,340 That kind of subdividing of data is not possible in discrete. 12 00:00:55,600 --> 00:00:58,090 Here, I'm primarily referring to count data. 13 00:00:58,630 --> 00:01:05,280 Suppose in a class I can say there are 30 students; I cannot say thirty point one students or point 14 00:01:05,290 --> 00:01:05,890 three students. 15 00:01:05,890 --> 00:01:06,150 Right? 16 00:01:07,480 --> 00:01:10,060 Do you understand the difference between discrete and continuous? 17 00:01:11,120 --> 00:01:15,360 Right. Now, let's see qualitative data. In qualitative, 18 00:01:15,830 --> 00:01:19,850 let's start with binary. In binary, two values are possible, right? 19 00:01:20,210 --> 00:01:21,090 Pass, fail. 20 00:01:21,140 --> 00:01:21,620 Yes, 21 00:01:21,620 --> 00:01:22,850 no. OK. 22 00:01:24,880 --> 00:01:30,510 Apart from binary, you have nominal and ordinal in qualitative data. 23 00:01:31,400 --> 00:01:38,850 Ordinal is about an ordered series; in nominal, the inherent ranking or order is not there. 24 00:01:39,620 --> 00:01:41,960 Examples are gender, right? 
25 00:01:42,230 --> 00:01:42,850 Race. 26 00:01:43,340 --> 00:01:47,310 You can't say that male is better than female, right? 27 00:01:47,480 --> 00:01:54,530 So there is no inherent ranking or order, whereas in the case of ordinal, that order or ranking is 28 00:01:54,530 --> 00:01:56,780 possible. When it comes to performance, 29 00:01:58,250 --> 00:02:05,150 grade A is better than grade B, grade B is better than grade C, so on and so forth, right? 30 00:02:05,630 --> 00:02:13,550 So you need to ascertain what the different types of data are in the dataset that you are 31 00:02:13,550 --> 00:02:16,910 taking for analysis. Why is this important? 32 00:02:17,750 --> 00:02:23,980 You have to understand that algorithms are developed only for quantitative data. 33 00:02:24,620 --> 00:02:29,780 So the qualitative data must be converted into quantitative data. 34 00:02:29,990 --> 00:02:35,280 That is why we need to isolate the qualitative data in our dataset. 35 00:02:35,660 --> 00:02:37,770 So please keep this in mind. 36 00:02:38,300 --> 00:02:41,480 OK, now let's see some real-life examples. 37 00:02:43,520 --> 00:02:50,340 In the case of insurance, we did this as part of dependent and independent variables, right? 38 00:02:51,290 --> 00:02:52,850 I'm taking the same example. 39 00:02:52,950 --> 00:02:59,510 If you see this, the insurance charges that must be charged by the insurance company are dependent on 40 00:02:59,510 --> 00:03:02,240 so many factors. But if you really see, 41 00:03:02,750 --> 00:03:07,850 Y is numeric, X1 is numeric, [unclear] is numeric, [unclear]. 42 00:03:07,850 --> 00:03:10,520 Normally the rest are non-numeric. 43 00:03:11,440 --> 00:03:15,410 OK, you must convert them into numeric. 44 00:03:16,260 --> 00:03:16,630 Right. 45 00:03:17,250 --> 00:03:24,060 OK, the other example is the banking example. In this case, if you see, the Y itself is not numeric, 46 00:03:24,060 --> 00:03:24,420 right? 
47 00:03:24,600 --> 00:03:27,180 It's a case of whether the loan should be granted or not. 48 00:03:27,180 --> 00:03:35,040 Also, only number of dependents, applicant's income, co-applicant's income, loan tenure and amount, 49 00:03:35,040 --> 00:03:36,610 these are numeric data. 50 00:03:37,260 --> 00:03:39,810 The rest are all non-numeric. 51 00:03:40,060 --> 00:03:42,350 OK, that is one, two, three, four, five. 52 00:03:42,360 --> 00:03:47,020 Only five out of 11 factors will have numeric data. 53 00:03:47,040 --> 00:03:48,990 The remaining will have non-numeric. 54 00:03:50,460 --> 00:03:50,830 Right. 55 00:03:51,420 --> 00:03:56,790 So do you understand the different types of data that will be there in your dataset? 56 00:03:57,890 --> 00:03:58,420 OK.
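
The idea from the session (isolate the qualitative columns, then convert them to numbers) can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the rows, the column names (`grade`, `region`, `approved`), and the helpers `is_numeric`, `split_columns`, and `encode_row` are all hypothetical and are not taken from the insurance or banking datasets mentioned in the session. The ordinal column is mapped to ranks, the binary column to 0/1, and the nominal column to one indicator per level so that no artificial order is introduced.

```python
# Hypothetical rows loosely inspired by the loan example; all values are made up.
rows = [
    {"dependents": 0, "applicant_income": 5000.0, "grade": "A", "region": "urban", "approved": "yes"},
    {"dependents": 2, "applicant_income": 3200.5, "grade": "C", "region": "rural", "approved": "no"},
    {"dependents": 1, "applicant_income": 4100.0, "grade": "B", "region": "urban", "approved": "yes"},
]


def is_numeric(value):
    """True for int/float values; bool is excluded because it subclasses int."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def split_columns(rows):
    """Return (numeric, non_numeric) lists of column names."""
    numeric, non_numeric = [], []
    for name in rows[0]:
        if all(is_numeric(r[name]) for r in rows):
            numeric.append(name)
        else:
            non_numeric.append(name)
    return numeric, non_numeric


# Ordinal: the order matters, so each level maps to a rank (assumed scale C < B < A).
GRADE_RANK = {"C": 0, "B": 1, "A": 2}


def encode_row(row, region_levels):
    """Convert one row so that every value is numeric."""
    encoded = {
        "dependents": row["dependents"],              # discrete count, kept as is
        "applicant_income": row["applicant_income"],  # continuous, kept as is
        "grade": GRADE_RANK[row["grade"]],            # ordinal -> rank
        "approved": 1 if row["approved"] == "yes" else 0,  # binary -> 0/1
    }
    # Nominal: no inherent order, so use one 0/1 indicator per level.
    for level in region_levels:
        encoded[f"region_{level}"] = 1 if row["region"] == level else 0
    return encoded


numeric_cols, qualitative_cols = split_columns(rows)
region_levels = sorted({r["region"] for r in rows})
encoded_rows = [encode_row(r, region_levels) for r in rows]

print("numeric:", numeric_cols)
print("qualitative:", qualitative_cols)
print(encoded_rows[0])
```

Ranks are used only for the ordinal column; giving the nominal column a single integer code would imply a ranking that, as noted in the session, does not exist.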