1 00:00:00,210 --> 00:00:07,350 So let's talk about ASCII, Unicode, and encodings. First, let me discuss character 2 00:00:07,350 --> 00:00:07,860 encoding. 3 00:00:08,370 --> 00:00:16,770 So if you type a key, A let's say, then the computer does not know what you typed, because the 4 00:00:16,770 --> 00:00:19,200 computer only understands zeros 5 00:00:19,200 --> 00:00:28,200 and ones. So what they did is create a mapping such that when you type A, a number will 6 00:00:28,200 --> 00:00:30,620 be sent to the computer. 7 00:00:32,170 --> 00:00:38,770 And according to that number, the application treats that number as a character input. 8 00:00:39,580 --> 00:00:43,750 So if that seems confusing, OK, let me show you. 9 00:00:44,890 --> 00:00:51,910 So this is the ASCII encoding scheme; you can see the decimal values here from zero to 127. 10 00:00:56,090 --> 00:01:01,790 So ASCII is seven-bit, and you can see the characters. If the descriptions mean nothing to you, don't 11 00:01:01,790 --> 00:01:02,300 worry. 12 00:01:02,330 --> 00:01:06,490 Let's go over the alphabet first, then you can easily understand everything. 13 00:01:06,770 --> 00:01:08,540 So you can see the capital A. 14 00:01:09,860 --> 00:01:15,400 If you look at the next character, you see the capital B. 15 00:01:15,860 --> 00:01:23,140 So we send this decimal value in binary, and the computer, on seeing this, 16 00:01:23,150 --> 00:01:26,330 will treat the 66 as capital B. 17 00:01:26,570 --> 00:01:34,580 So when you type B into the application, the computer sends this 66 in binary, and 18 00:01:34,580 --> 00:01:39,450 then we treat this 66 as capital B, according to ASCII. 19 00:01:39,770 --> 00:01:47,630 So that's how you map a number to a letter, and you send that number to the 20 00:01:47,630 --> 00:01:47,990 receiver.
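The character-to-number mapping described above can be tried directly. This is a minimal Python sketch, not something shown in the lecture; the variable names are only illustrative, and `ord`, `chr`, and `format` are standard built-ins.

```python
# Look up the number behind a character and the character behind a number.
code = ord("B")              # character -> number
bits = format(code, "07b")   # the same number written as seven binary digits
back = chr(66)               # number -> character

print(code, bits, back)
```

The seven-digit binary form is what actually travels to the receiver, which then maps it back to the letter.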
21 00:01:47,990 --> 00:01:53,630 So the receiver will treat it as that letter. The same goes for these. 22 00:01:54,330 --> 00:01:57,100 There are also the lowercase letters. 23 00:01:57,140 --> 00:02:00,200 Ninety-seven means a, ninety-eight means b, and so on. 24 00:02:00,920 --> 00:02:06,560 So in this way you can represent 128 characters, 25 00:02:06,590 --> 00:02:13,040 I mean values up to 127, with this seven-bit encoding. 26 00:02:13,520 --> 00:02:14,770 Why were they using seven 27 00:02:14,780 --> 00:02:17,420 bits? Well, in the earlier days, 28 00:02:17,450 --> 00:02:20,320 these were very helpful. 29 00:02:20,330 --> 00:02:23,140 And also these were enough, right? 30 00:02:23,230 --> 00:02:29,560 Everyone used English, and everyone wanted only these numbers, letters, and some special characters. 31 00:02:29,570 --> 00:02:37,880 And these newlines, null values, and this acknowledgement, horizontal tab, backspace, etc. These were more 32 00:02:37,880 --> 00:02:39,530 than enough in the olden days. 33 00:02:39,830 --> 00:02:46,070 But when the internet came, I mean, when websites started, 34 00:02:46,640 --> 00:02:55,250 they had this whole ASCII as an issue, because everyone wanted to include their own language in applications. 35 00:02:55,520 --> 00:03:00,400 Like if you want to send your message, you have to send it with these characters. 36 00:03:00,410 --> 00:03:01,640 So with these characters, 37 00:03:01,640 --> 00:03:08,510 you can only include English, but there are so many languages out there as well: Arabic, Chinese, 38 00:03:08,540 --> 00:03:09,500 Japanese, so many. 39 00:03:09,620 --> 00:03:13,010 And they have their own characters and they have their own alphabets. 40 00:03:13,010 --> 00:03:15,760 They have their own symbols, extra symbols. 41 00:03:16,130 --> 00:03:24,800 So how do you include these? After this, they made this extended ASCII code for the main values, 42 00:03:25,190 --> 00:03:26,060 main symbols.
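The seven-bit limit described above can be checked with a short Python sketch. The helper name `fits_in_ascii` is hypothetical and only for illustration; `str.encode("ascii")` is the standard call, and it raises `UnicodeEncodeError` for any character with a code above 127.

```python
# Seven bits give 2**7 = 128 values, numbered 0 through 127.
ascii_count = 2 ** 7
lowercase_a = ord("a")
lowercase_b = ord("b")


def fits_in_ascii(text: str) -> bool:
    """Return True if every character has a code below 128."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(ascii_count, lowercase_a, lowercase_b)
print(fits_in_ascii("Hello"), fits_in_ascii("\u65e5\u672c\u8a9e"))
```

The second string is three Japanese characters written as escapes, which is exactly the kind of text plain ASCII has no room for.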
43 00:03:26,300 --> 00:03:27,500 This is called ANSI. 44 00:03:27,950 --> 00:03:35,260 So ANSI did not come out officially, but they just extended this one up to 255. 45 00:03:35,750 --> 00:03:38,540 They added another 128 special characters. 46 00:03:38,780 --> 00:03:46,730 But even these mathematical symbols and other symbols did not help much, because everyone 47 00:03:46,730 --> 00:03:53,030 wants their own language. Japanese people want their own Japanese, because if you do not 48 00:03:53,030 --> 00:04:01,010 include it, it becomes a real problem. If a Chinese person wants to send a letter to another Chinese person, send a message, 49 00:04:01,280 --> 00:04:02,960 then it should be in Chinese. 50 00:04:02,970 --> 00:04:03,210 Right? 51 00:04:03,740 --> 00:04:05,660 They can't convert English to Chinese. 52 00:04:06,140 --> 00:04:07,570 Not everyone knows English. 53 00:04:07,940 --> 00:04:11,710 That's where this Unicode came into play. 54 00:04:12,380 --> 00:04:14,070 What they did in Unicode is this: 55 00:04:14,540 --> 00:04:24,470 they kept these 128 ASCII characters at the start, within the first 256 characters, and then they 56 00:04:24,470 --> 00:04:27,520 added the other languages as well. 57 00:04:30,000 --> 00:04:39,300 So UTF-8 is compatible with ASCII, so it includes ASCII, but this Unicode 58 00:04:39,300 --> 00:04:42,470 includes one-to-four-byte encodings. 59 00:04:43,890 --> 00:04:49,920 So if the character is this A, with decimal value 65, 60 00:04:50,640 --> 00:04:58,640 in binary it requires at most seven bits, so Unicode will encode this A in one byte. 61 00:04:59,130 --> 00:05:06,170 And if the number of binary digits increases, then Unicode adds another byte. 62 00:05:06,540 --> 00:05:13,330 So Unicode, especially UTF-8, is very famous on the web, and UTF-8 is variable-length.
63 00:05:13,530 --> 00:05:17,010 So according to the particular character we want to encode, this 64 00:05:17,010 --> 00:05:24,180 Unicode defines how many bits to use. We can see: for any character equal to or below 127, 65 00:05:24,180 --> 00:05:26,580 the UTF-8 representation is one byte. 66 00:05:26,880 --> 00:05:28,920 So these are the same as ASCII. 67 00:05:29,250 --> 00:05:34,350 These can be represented using one byte; any of these characters can be represented using one byte. 68 00:05:35,610 --> 00:05:38,100 And that is really not difficult for Unicode. 69 00:05:38,490 --> 00:05:40,280 This is the same as the ASCII value. 70 00:05:40,770 --> 00:05:46,920 So they copied this ASCII and they extended it. For the characters equal to or below 71 00:05:47,250 --> 00:05:55,800 2047, the UTF-8 representation is two bytes, and above 2047 72 00:05:56,700 --> 00:06:05,250 it is three bytes, and beyond that it is four bytes. As you can see, there are some starting 73 00:06:05,490 --> 00:06:12,870 magic numbers, all right, starting bits which identify that these bytes are in Unicode format, 74 00:06:13,020 --> 00:06:20,890 because otherwise, you can see, if you set all ones, the byte could be taken as 127, as an ASCII character; 75 00:06:21,030 --> 00:06:22,020 that would be wrong. 76 00:06:23,910 --> 00:06:26,160 So in one byte there are only eight bits. 77 00:06:26,340 --> 00:06:32,530 So these are the ASCII ones, and the first bit is always zero, so the number of free bits is seven. Two to the power 78 00:06:32,550 --> 00:06:35,070 seven is 128, right? 79 00:06:35,310 --> 00:06:37,680 We can represent up to 127. 80 00:06:37,940 --> 00:06:38,260 Fine. 81 00:06:38,760 --> 00:06:45,240 So if there are two bytes, then the first byte's first three places are filled with one one zero, and the 82 00:06:45,240 --> 00:06:47,610 second byte's first two places are filled with one zero.
83 00:06:47,820 --> 00:06:56,250 And you can only use these X places; you replace all these X places with your bits. 84 00:06:56,610 --> 00:07:02,490 So the number of free bits is eleven, so you can go from zero to two to the power eleven minus one. 85 00:07:03,330 --> 00:07:03,720 All right. 86 00:07:03,900 --> 00:07:10,620 In the same way, if there is a three-byte encoding, the first four bits are filled 87 00:07:10,620 --> 00:07:16,590 with one one one zero, and the second byte starts with one zero and the third byte also with one zero, and the free 88 00:07:16,590 --> 00:07:18,720 bits are 16. With 16, 89 00:07:18,720 --> 00:07:28,110 you get two to the power 16 minus one. In the same way, if you have four bytes, the first five bits are one one one one 90 00:07:28,110 --> 00:07:34,500 zero, and then these three bytes are the same, starting with one zero. The total free bits are 21, and you get two to the 91 00:07:34,500 --> 00:07:35,690 power 21 minus one. 92 00:07:36,930 --> 00:07:40,410 So with UTF-8, you can encode all of these. 93 00:07:40,710 --> 00:07:43,550 And there you also have UTF-16 and UTF-32 as well. 94 00:07:45,940 --> 00:07:51,300 And you can see this in a UTF-8 test; I'm going to this website. 95 00:07:51,940 --> 00:07:55,480 OK, these are the characters. 96 00:07:55,480 --> 00:08:02,590 If you want to get the value of A, you look it up here: U+0041, and B is 97 00:08:02,920 --> 00:08:03,460 U+0042. 98 00:08:05,850 --> 00:08:10,380 And I want to show you one more in Unicode. 99 00:08:15,690 --> 00:08:16,620 This is [unclear]. 100 00:08:16,650 --> 00:08:18,050 That's my mother tongue. 101 00:08:18,480 --> 00:08:26,690 These are some symbols, and if you want to get the value of this one, it is [unclear]; if you want to see 102 00:08:27,090 --> 00:08:27,510 this one. 103 00:08:28,380 --> 00:08:33,890 And if you want to see [unclear]. 104 00:08:34,230 --> 00:08:38,830 So you can also open it and see the character.
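The bit layouts walked through above (0xxxxxxx, 110xxxxx 10xxxxxx, and so on) can be sketched as a hand-written encoder. This is an illustrative Python sketch, not the tool used in the lecture: `utf8_encode` is a hypothetical helper, it skips validation such as rejecting surrogates or values above U+10FFFF, and Python's built-in `str.encode("utf-8")` is used only to check the result.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point by hand using the UTF-8 bit layouts."""
    if code_point <= 0x7F:            # 0xxxxxxx: 7 free bits
        return bytes([code_point])
    if code_point <= 0x7FF:           # 110xxxxx 10xxxxxx: 11 free bits
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:          # 1110xxxx 10xxxxxx 10xxxxxx: 16 free bits
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx: 21 free bits
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# One sample character for each length: A, e-acute, euro sign, an emoji.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    encoded = utf8_encode(ord(ch))
    assert encoded == ch.encode("utf-8")  # agrees with the built-in encoder
    print(f"U+{ord(ch):04X}", len(encoded), encoded.hex())
```

The thresholds 0x7F, 0x7FF, and 0xFFFF are the same limits as two to the power 7, 11, and 16, minus one.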
105 00:08:40,410 --> 00:08:46,400 So if you type the U+ code, you'll get this character. 106 00:08:47,850 --> 00:08:50,160 So that's how this Unicode works. 107 00:08:50,760 --> 00:08:53,990 OK, let's move on to an important topic again. 108 00:08:54,390 --> 00:08:56,460 Another one: this UTF-8. 109 00:08:56,640 --> 00:08:59,900 It is mostly used in web pages. 110 00:09:00,540 --> 00:09:01,410 OK, that's fine. 111 00:09:01,620 --> 00:09:08,820 But if you are doing this for Windows, these Windows systems use UTF-16. 112 00:09:10,980 --> 00:09:17,040 The main problem comes when you are sending some buffer into the application. You will send 113 00:09:17,070 --> 00:09:17,920 A's, right? 114 00:09:17,970 --> 00:09:20,310 A's, or some other normal value. 115 00:09:20,850 --> 00:09:27,320 So each of these A's is one byte, which you can represent in seven bits. 116 00:09:27,720 --> 00:09:34,370 But this UTF-16, what it does is encode everything as two bytes at minimum. 117 00:09:35,250 --> 00:09:36,800 So your A is 65; 118 00:09:36,910 --> 00:09:39,930 that goes into two bytes: 119 00:09:40,380 --> 00:09:41,880 zeros, that is the first byte, 120 00:09:41,880 --> 00:09:47,140 and then in the second byte, it will include the actual value. 121 00:09:47,760 --> 00:09:52,230 So there is one extra byte placed alongside the value. 122 00:09:53,100 --> 00:10:03,360 So every A becomes zero zero zero zero zero zero zero zero and then its value, and for B too it will add 123 00:10:04,110 --> 00:10:11,160 one more byte in front of the actual value, because UTF-16 encodes with a minimum of two bytes. 124 00:10:11,640 --> 00:10:12,180 They designed 125 00:10:12,180 --> 00:10:18,170 it like that. UTF-16 does not have a one-byte form, and that's why it takes two bytes, I mean, 126 00:10:18,180 --> 00:10:18,750 for every such value.
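The extra zero byte that UTF-16 adds to each ASCII character can be seen by encoding a short buffer. This is a minimal Python sketch with an assumed sample string `"AAAB"`; the codec names `utf-16-be` and `utf-16-le` are standard Python codecs. The lecture describes the zero byte coming first (big-endian order); Windows normally stores UTF-16 in little-endian order, where the value byte comes first and the zero byte second.

```python
text = "AAAB"

utf8_bytes = text.encode("utf-8")      # one byte per ASCII character
utf16_be = text.encode("utf-16-be")    # zero byte first, then the value
utf16_le = text.encode("utf-16-le")    # value first, then the zero byte

# A character outside the 16-bit range takes four bytes in UTF-16.
emoji_len = len("\U0001F600".encode("utf-16-le"))

print(utf8_bytes.hex(" "))
print(utf16_be.hex(" "))
print(utf16_le.hex(" "))
print(emoji_len)
```

Two bytes is the minimum per character; characters above U+FFFF are stored as a pair of two-byte units.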
127 00:10:20,440 --> 00:10:27,390 So it takes two bytes; that's why you get another value, that is zero zero, a null value. If you are 128 00:10:27,400 --> 00:10:29,110 using Windows applications, 129 00:10:29,300 --> 00:10:37,300 if you send the buffer and it's taken as a string, then the input, the data after this null byte, gets 130 00:10:37,300 --> 00:10:45,730 truncated, because in C, and in many other languages, this is the null byte. 131 00:10:46,360 --> 00:10:47,650 You need to remember this one. 132 00:10:47,980 --> 00:10:51,540 If you are from C programming, you can easily understand this. 133 00:10:51,660 --> 00:11:04,980 This null byte is identified as the string terminator. Say you type A, A, A, then a null byte, and then type B, C, D, sorry, BCD. 134 00:11:06,220 --> 00:11:14,490 Then the application takes only the input A, A, A, and it will ignore the rest of the input because of this 135 00:11:14,490 --> 00:11:15,050 null byte. 136 00:11:15,100 --> 00:11:16,740 This is the string terminator. 137 00:11:17,200 --> 00:11:24,240 That means the application reads the input from left to right: A, A, A. 138 00:11:24,250 --> 00:11:29,350 Upon seeing this null byte, the application just ignores the rest. That's what happens in string 139 00:11:29,370 --> 00:11:37,000 operations. So, normally, most of our modern applications 140 00:11:37,000 --> 00:11:45,370 have Unicode encoding, because there are so many users using different languages. That way, you 141 00:11:45,370 --> 00:11:51,670 need to check whether Unicode is there in the application. 142 00:11:51,670 --> 00:11:58,370 Then you need to write the exploit according to the Unicode conversion of the bytes. 143 00:12:00,010 --> 00:12:06,060 So there are some Unicode exploits; we will discuss those in a later section, 144 00:12:06,070 --> 00:12:14,230 if time permits. These Unicode exploits are not necessary for [unclear] or many other certifications.
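The null-byte truncation described above can be imitated without writing any C. This is a small Python sketch under stated assumptions: `read_c_string` is a hypothetical helper that mimics how a C-style string routine stops at the first null byte, and the two sample buffers are made up for illustration.

```python
def read_c_string(buffer: bytes) -> bytes:
    """Mimic a C-style string read: stop at the first null byte."""
    end = buffer.find(b"\x00")
    return buffer if end == -1 else buffer[:end]


plain = b"AAA\x00BCD"                  # AAA, a null byte, then BCD
wide = "AAAB".encode("utf-16-le")      # UTF-16 adds a null byte after each A

print(read_c_string(plain))
print(read_c_string(wide))
```

In the first buffer everything after AAA is dropped; in the second, the UTF-16 conversion itself supplies the null byte, so a byte-oriented string routine sees only the first A.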
145 00:12:14,710 --> 00:12:21,730 But I just wanted to say that there will be this Unicode encoding, and if you suddenly see something like 146 00:12:21,730 --> 00:12:26,980 some extra bytes coming out, or some other transformations occurring, then you need to check it; 147 00:12:26,980 --> 00:12:36,090 it may be the Unicode encoding at work. Mainly Windows and Java applications use this UTF-16, 148 00:12:36,100 --> 00:12:39,580 and so we need to deal with this UTF-16. 149 00:12:39,790 --> 00:12:43,410 And if time permits, I will add these Unicode exploits also. 150 00:12:44,710 --> 00:12:46,520 So that's what this character encoding is. 151 00:12:46,540 --> 00:12:47,640 I hope you have understood. 152 00:12:48,010 --> 00:12:52,870 We have discussed ASCII and extended ASCII, or ANSI, and then the UTF encodings.