1 00:00:04,980 --> 00:00:06,600 Scrapy items. 2 00:00:07,170 --> 00:00:07,990 Hello, everyone. 3 00:00:08,010 --> 00:00:14,330 Today, we're going to talk about Scrapy items and, first of all, before jumping into the coding, 4 00:00:15,030 --> 00:00:22,320 let's talk about what an item is. So usually in Scrapy, when you create a spider, which you're already familiar 5 00:00:22,320 --> 00:00:22,620 with, 6 00:00:22,920 --> 00:00:32,790 sometimes the conversion from the HTML to the output data is not very structured unless we 7 00:00:32,790 --> 00:00:33,450 define it. 8 00:00:33,780 --> 00:00:39,120 It is quite easy to make a typo in a name and to return very incorrect data. 9 00:00:39,540 --> 00:00:45,000 And this is especially true if you work with quite large projects, which I hope you will work on at some 10 00:00:45,000 --> 00:00:45,430 point. 11 00:00:45,870 --> 00:00:53,760 So in order to define a common output data format, Scrapy uses items. 12 00:00:54,240 --> 00:01:01,130 So items are simply containers used to collect the scraped data from the web. 13 00:01:01,560 --> 00:01:04,070 So they'll provide you with a dictionary-like interface. 14 00:01:04,890 --> 00:01:09,930 And they're quite convenient because their syntax is quite easy to grasp. 15 00:01:10,260 --> 00:01:12,270 And you'll see this in a second. 16 00:01:12,600 --> 00:01:17,680 So here are some examples of how you can use items in Python. 17 00:01:18,060 --> 00:01:22,890 So the first thing I will do is I will actually go to item mode. 18 00:01:23,590 --> 00:01:25,880 So let's enter the Python mode. 19 00:01:26,640 --> 00:01:27,140 OK. 20 00:01:27,480 --> 00:01:35,400 And the first thing that we need to do is to actually create a class. 21 00:01:35,430 --> 00:01:41,980 OK, so here we are going to define the different fields that we are going to work with later on. 22 00:01:42,270 --> 00:01:45,570 And you can write import scrapy.
23 00:01:46,290 --> 00:01:53,250 OK, and then let's write class Products, or Product actually. 24 00:01:54,090 --> 00:01:54,930 Product. 25 00:01:56,410 --> 00:01:59,530 scrapy dot Item. 26 00:02:01,530 --> 00:02:06,810 OK, and then let's write name equals 27 00:02:08,240 --> 00:02:24,020 scrapy dot Field, that's it, and after that, let's write price equals scrapy dot Field as well. 28 00:02:24,670 --> 00:02:34,280 And then let's write stock equals scrapy dot Field, OK? 29 00:02:34,580 --> 00:02:41,420 And it is very important that you use the exact same indentation level as me, because otherwise the 30 00:02:41,420 --> 00:02:45,530 program will not accept your code, as Python is an indentation-sensitive language. 31 00:02:45,860 --> 00:02:50,770 So please make sure that the code looks exactly like mine. Anyways, 32 00:02:51,170 --> 00:02:52,910 let's do our last 33 00:02:54,210 --> 00:02:56,190 updated. 34 00:02:57,160 --> 00:02:58,050 Updated. 35 00:02:58,240 --> 00:03:00,070 That's it, equals 36 00:03:01,700 --> 00:03:08,990 scrapy dot Field, and then serializer 37 00:03:10,460 --> 00:03:14,570 OK, equals str, for string. 38 00:03:15,060 --> 00:03:16,680 OK, that's it. 39 00:03:17,330 --> 00:03:19,890 So we're ready here and I'll press Enter. 40 00:03:20,270 --> 00:03:27,660 We have created a class and now, since we created the class, we can actually create our first product. 41 00:03:27,950 --> 00:03:29,780 So you saw that in the class 42 00:03:29,780 --> 00:03:33,680 we specified the metadata for each of the fields. 43 00:03:33,740 --> 00:03:37,640 OK, so now let's create a product, and then we'll do product 44 00:03:38,360 --> 00:03:42,660 OK, equals Product. 45 00:03:43,910 --> 00:03:55,400 So here we create an instance of the Product class, and then let's write name equals Desktop PC, price 46 00:03:55,400 --> 00:03:58,520 equals one thousand. 47 00:03:58,770 --> 00:03:59,250 OK.
48 00:04:00,260 --> 00:04:02,510 OK, we've got an error here. 49 00:04:02,870 --> 00:04:09,820 Well, this is because in the class, instead of price, I wrote prints, so we'll have to use it that way. 50 00:04:10,400 --> 00:04:12,300 And now we're not getting any errors. 51 00:04:12,860 --> 00:04:14,000 So for the people. 52 00:04:15,290 --> 00:04:20,600 So if I now write print product. 53 00:04:22,410 --> 00:04:27,450 OK, actually, let me put it in parentheses, product, OK? 54 00:04:28,110 --> 00:04:37,200 You can see that our product name is the same as we named it and the price is one thousand dollars. 55 00:04:37,390 --> 00:04:42,220 OK, so this is basically our content here in our container. 56 00:04:43,120 --> 00:04:52,860 Anyways, let's now actually choose to display only the product name, and we do that simply by writing 57 00:04:52,860 --> 00:04:55,500 product and, in the brackets, name. 58 00:04:56,580 --> 00:04:57,080 OK. 59 00:04:58,490 --> 00:05:07,820 And if I do it like that, I get Desktop PC, and if I do product dot get 60 00:05:09,380 --> 00:05:10,850 and again, name, 61 00:05:14,010 --> 00:05:17,320 I will get the same thing. 62 00:05:17,670 --> 00:05:27,000 Desktop PC. So if I do, on the other hand, product and in here price, actually with em. 63 00:05:27,720 --> 00:05:33,440 So here is product price, and then we'll get one thousand, which is expected. 64 00:05:33,810 --> 00:05:39,900 And also you can see when it was last updated. 65 00:05:40,410 --> 00:05:41,620 So let's try it out. 66 00:05:41,640 --> 00:05:47,940 So if I do product and then, in the quotes, last 67 00:05:49,690 --> 00:05:52,830 updated, we should get an error here. 68 00:05:54,340 --> 00:05:57,140 Let's see, last updated. 69 00:05:57,830 --> 00:06:01,180 Let's try now product 70 00:06:02,380 --> 00:06:09,800 dot get, and in here I can write last. So if I write product 71 00:06:10,820 --> 00:06:11,870 dot get
72 00:06:13,710 --> 00:06:15,900 and here I write last 73 00:06:18,110 --> 00:06:19,130 updated, 74 00:06:22,400 --> 00:06:34,160 not set, we tell the program that if we don't have last updated, it should not return an error, 75 00:06:34,430 --> 00:06:36,260 but return not set. 76 00:06:36,710 --> 00:06:41,790 So if I run this, instead of the error we can see this time the not set message. 77 00:06:41,830 --> 00:06:47,920 So this is a very good way to display a message to the user instead of an error that is not understandable. 78 00:06:48,320 --> 00:06:55,890 So we can do the same in another way: if we do product and then we do 79 00:06:55,990 --> 00:06:58,760 lalala, for example, 80 00:07:00,200 --> 00:07:06,500 obviously, since we don't have this field, we get a KeyError, and it says that we actually don't have 81 00:07:06,500 --> 00:07:15,320 this type of key. But instead, what we can do here is do product and then get, 82 00:07:15,680 --> 00:07:24,260 actually put it in parentheses, get, comma, unknown field. 83 00:07:25,420 --> 00:07:25,970 OK. 84 00:07:27,070 --> 00:07:38,650 And when I run that... actually, let me fix this, so product, actually here it's product dot get, 85 00:07:38,950 --> 00:07:42,730 and in here instead I will write lalala. 86 00:07:43,570 --> 00:07:50,500 OK, so you can see that if we don't have lalala, we actually get the unknown field message, which is 87 00:07:50,500 --> 00:07:52,270 basically the aim of the whole exercise. 88 00:07:52,810 --> 00:08:02,800 So also we can check if certain names are actual fields inside our product. 89 00:08:03,130 --> 00:08:09,310 So, you know, our fields... actually, let me go back to the fields so you can see them. 90 00:08:09,330 --> 00:08:11,690 So we've got name, prints and stock. 91 00:08:12,160 --> 00:08:16,030 So, for example, if I write name 92 00:08:17,590 --> 00:08:21,760 OK, in product, 93 00:08:23,860 --> 00:08:25,060 we will get True.
94 00:08:26,110 --> 00:08:35,710 But for example, if I write lalala in product, we will get False because this is actually not in our 95 00:08:35,710 --> 00:08:37,670 names, OK. 96 00:08:38,230 --> 00:08:45,100 So now let's learn how you can set field values. For example, if I do product 97 00:08:47,170 --> 00:08:57,430 and then I actually want to add to the field, so I do last updated equals today. 98 00:08:59,450 --> 00:09:00,950 This is now valid. 99 00:09:01,310 --> 00:09:08,480 So now we set the field last updated, and we set it to be equal to our value today. 100 00:09:08,660 --> 00:09:14,240 So if we now do product and then write last 101 00:09:15,820 --> 00:09:17,360 updated, 102 00:09:18,410 --> 00:09:18,820 OK. 103 00:09:20,600 --> 00:09:24,970 I obviously get the value today, so this is how you can set certain fields. 104 00:09:25,760 --> 00:09:26,280 OK. 105 00:09:26,300 --> 00:09:30,540 The next thing I want to show you is how to access all the populated values. 106 00:09:30,890 --> 00:09:37,970 So, for example, here, obviously you can see that we have the name and we have the price. 107 00:09:38,240 --> 00:09:43,440 So the populated values are price and name. 108 00:09:43,880 --> 00:09:51,290 So if I want to check only the populated values, I can do product dot keys. 109 00:09:51,680 --> 00:09:57,740 And after I write that, I can see that the populated values are name and price. 110 00:09:57,740 --> 00:10:04,410 But also we get last updated, because we just set it and we added a value to it equal to today. 111 00:10:04,940 --> 00:10:05,380 OK. 112 00:10:06,850 --> 00:10:09,730 And, for example, if I write product 113 00:10:11,210 --> 00:10:12,950 dot items, 114 00:10:14,870 --> 00:10:15,400 OK. 115 00:10:16,340 --> 00:10:19,880 the items will actually get
116 00:10:21,880 --> 00:10:30,350 the whole dictionary of our product, so we get last updated today, name Desktop PC, and we also get 117 00:10:30,370 --> 00:10:31,360 the value of the price. 118 00:10:31,360 --> 00:10:35,900 So we get all the values that we need in this dictionary here. 119 00:10:36,010 --> 00:10:37,750 So this is a good way to obtain them. 120 00:10:38,380 --> 00:10:43,510 Let's check some additional things we can do with items. 121 00:10:43,750 --> 00:10:50,860 So, for example, if I want to create a new product from the previous product, I can do product2 122 00:10:51,970 --> 00:10:53,620 equals 123 00:10:54,700 --> 00:10:55,510 Product, 124 00:10:56,820 --> 00:10:58,860 and then in here, product. 125 00:10:59,790 --> 00:11:00,450 That's it. 126 00:11:02,330 --> 00:11:07,520 And now we've created the new product, which is a product from product one, and it should be absolutely the 127 00:11:07,520 --> 00:11:07,920 same. 128 00:11:08,120 --> 00:11:16,340 So if I print product2 here, OK, I will basically get the same thing as product one. 129 00:11:16,370 --> 00:11:20,210 So I will get absolutely the same values because this is a copy. 130 00:11:20,690 --> 00:11:25,190 Also, there's another way you can actually copy items here, by doing 131 00:11:27,950 --> 00:11:28,910 product3, 132 00:11:30,760 --> 00:11:37,420 for example, equals product2 dot copy. 133 00:11:38,670 --> 00:11:44,310 OK, and here, if I do print product3, 134 00:11:45,700 --> 00:11:49,520 you can see that product3 is a direct copy of product2. 135 00:11:49,990 --> 00:11:55,350 So there are two ways to copy this and it doesn't matter which one you use. 136 00:11:55,570 --> 00:12:00,820 One of them is by using the class call and the other one is simply by using the copy command. 137 00:12:01,090 --> 00:12:03,970 Both of them basically work in exactly the same way.
138 00:12:04,510 --> 00:12:10,610 OK, and the final thing I want to show you is how to create items from dictionaries. 139 00:12:11,020 --> 00:12:13,690 So let's do now Product. 140 00:12:15,350 --> 00:12:21,390 So Product, and here I will define a name. 141 00:12:21,860 --> 00:12:22,360 OK. 142 00:12:23,810 --> 00:12:28,640 Then the name will be Laptop PC. 143 00:12:28,930 --> 00:12:36,260 OK, so this will be the name, then the price 144 00:12:37,650 --> 00:12:40,720 is going to be one thousand five hundred. 145 00:12:41,300 --> 00:12:49,760 OK, so I think here we've got an error, guys, just because I didn't add quotes around the name. 146 00:12:50,520 --> 00:12:55,080 So let me do that right now, and around price add them too. 147 00:12:55,080 --> 00:13:02,340 And also, you remember that I have to repeat the same typo here, because when I created the class it was 148 00:13:02,340 --> 00:13:02,940 a typo. 149 00:13:03,270 --> 00:13:11,280 But if I hit Enter here, you can see that we actually created an item just from the dictionary that 150 00:13:11,280 --> 00:13:12,090 we created here. 151 00:13:12,090 --> 00:13:18,900 And a dictionary usually has a key and a value; here name is the key and the value is Laptop PC, 152 00:13:18,920 --> 00:13:20,670 the same with the price. 153 00:13:22,050 --> 00:13:26,210 So this is all about items that I wanted to share with you. 154 00:13:26,460 --> 00:13:31,100 This is how you create and actually collect items. 155 00:13:31,410 --> 00:13:36,690 So thank you very much for watching my video. 156 00:13:36,870 --> 00:13:38,790 And I will see you in the next one.