1 00:00:05,190 --> 00:00:07,850 Extracting quotes and authors. 2 00:00:08,640 --> 00:00:09,470 Hello, everyone. 3 00:00:09,780 --> 00:00:18,110 Last time we learned how to extract titles with Scrapy. Today, we're going to expand our skills. 4 00:00:18,120 --> 00:00:25,670 So I will show you how to extract headers, and specifically quotes and authors, from the website. 5 00:00:26,010 --> 00:00:30,000 So we're going to go to our quotes website again. 6 00:00:30,010 --> 00:00:32,240 But first, let's exit this shell here. 7 00:00:32,550 --> 00:00:36,900 So we're going to exit the interactive console, as you can see. 8 00:00:37,080 --> 00:00:41,470 And I will write again scrapy shell. 9 00:00:41,760 --> 00:00:48,870 And it is very important to know that when you write scrapy shell, this means: get the information 10 00:00:48,870 --> 00:00:54,660 from the website and give me a tool to interact with the information from this website. 11 00:00:54,690 --> 00:01:00,960 So whatever website you write after scrapy shell, you're going to access all the information for that 12 00:01:00,960 --> 00:01:06,000 website, or the code of this website, so you can extract whatever you want. 13 00:01:07,290 --> 00:01:09,480 So if I write http:// 14 00:01:11,230 --> 00:01:17,200 OK, and then quotes.to 15 00:01:19,280 --> 00:01:21,500 scrape.com. 16 00:01:23,440 --> 00:01:31,860 I'm actually running here the shell of Scrapy, as you can see, so let me write a command. 17 00:01:32,350 --> 00:01:40,480 Let's write response.css and then I write div 18 00:01:41,700 --> 00:01:43,620 dot quote, OK.
19 00:01:45,620 --> 00:01:52,130 And you can see that we're getting actually all the quotes from the website, so you can see here that 20 00:01:52,130 --> 00:02:00,110 we have data equals div class quote, and then you see itemscope, and you can see the different quotes 21 00:02:00,380 --> 00:02:07,850 that are listed in this website, which are quite useful. Actually, it expands, because the blocks of information 22 00:02:08,060 --> 00:02:10,280 are sometimes quite huge. 23 00:02:10,640 --> 00:02:11,450 So. 24 00:02:12,860 --> 00:02:19,680 Let me actually bring it a little bit up and let's try another thing, so I will write here our 25 00:02:19,780 --> 00:02:26,110 quote variable equal to response.css. 26 00:02:27,250 --> 00:02:29,270 And then div 27 00:02:30,690 --> 00:02:31,250 dot 28 00:02:32,370 --> 00:02:41,070 quote, OK, let's close the bracket and then zero, and here is all the response that we've got. 29 00:02:41,090 --> 00:02:49,350 So the quote is going to be stored in the quote variable. If I write quote, you can actually see that this is 30 00:02:49,350 --> 00:02:54,770 the expected variable, because the XPath part is actually the foundation of the extraction. 31 00:02:54,780 --> 00:03:00,660 You can see that there is more information here that's accessible for us in this variable. 32 00:03:01,410 --> 00:03:06,750 Now let's extract the title, the author and the actual quote. 33 00:03:06,970 --> 00:03:16,560 OK, so if I write title equals quote.css and then if you write 34 00:03:17,160 --> 00:03:19,590 span.text 35 00:03:21,190 --> 00:03:21,880 and then 36 00:03:22,810 --> 00:03:32,200 ::text, OK, let's close here, then dot extract underscore 37 00:03:33,650 --> 00:03:36,510 first, and let's close here.
38 00:03:37,130 --> 00:03:45,560 So if I run this, of course, nothing happens, but if I display the title variable, OK, you 39 00:03:45,560 --> 00:03:53,800 can see the actual title, which is: The world as we have created it is a process of our thinking. 40 00:03:54,410 --> 00:03:57,630 It cannot be changed without changing our thinking. 41 00:03:58,100 --> 00:04:05,210 OK, so this is the main quote or the title of the website, and this is how you can actually extract 42 00:04:05,210 --> 00:04:05,390 it. 43 00:04:05,990 --> 00:04:08,450 And now let's see about the author. 44 00:04:08,870 --> 00:04:10,760 So let's see who said that. 45 00:04:11,050 --> 00:04:13,550 OK, so if I write 46 00:04:14,930 --> 00:04:21,380 author, author equals, OK, and then quote 47 00:04:23,540 --> 00:04:34,280 dot css, and here inside I'll write small.author::text. 48 00:04:35,920 --> 00:04:44,370 OK, let's close it and of course, to extract it, I will write extract first, so let's write 49 00:04:44,510 --> 00:04:52,030 extract underscore first and that's it, and let's hit enter. 50 00:04:52,870 --> 00:04:59,230 And I misspelled author, but this doesn't matter because the variable is still there. 51 00:04:59,320 --> 00:05:05,770 So if I simply copy and paste this variable, OK, let me copy it. 52 00:05:06,160 --> 00:05:10,350 And so I paste it and you can see the name of the author. 53 00:05:10,360 --> 00:05:14,620 So the author that said that was Albert Einstein. 54 00:05:15,160 --> 00:05:23,260 And after we have the author, let's talk about the tags. So you can get the tags from that quote, and tags are 55 00:05:23,260 --> 00:05:29,710 usually things that people that are creating the quote are adding. 56 00:05:29,710 --> 00:05:33,210 These are some keywords with which the quote can be discovered.
57 00:05:33,550 --> 00:05:43,090 So the tags are pretty easy to obtain, but actually also very important, because if you get the right 58 00:05:43,090 --> 00:05:48,120 tags for something very popular, you can actually use them in order to appear in the search engine. 59 00:05:48,400 --> 00:06:04,030 So if we do tags, OK, equals quote.css, and then if I do div.tags and then a.tag 60 00:06:05,300 --> 00:06:07,700 and then ::text, OK. 61 00:06:09,380 --> 00:06:12,700 And then I will do dot extract. 62 00:06:13,310 --> 00:06:21,310 OK, so if you run that and then write tags, yes, I just fix the spelling mistake and write 63 00:06:21,320 --> 00:06:21,770 tags. 64 00:06:22,100 --> 00:06:24,930 You can see all the tags from this quote. 65 00:06:25,070 --> 00:06:26,120 So there are change, 66 00:06:27,170 --> 00:06:29,480 deep-thoughts, thinking and world. 67 00:06:29,580 --> 00:06:33,830 OK, so these are the tags that, for example, if you search in the search engine, 68 00:06:33,950 --> 00:06:40,160 this quote might appear in some of the results, obviously not in the first place, but it will be there. 69 00:06:40,640 --> 00:06:49,670 So now, guys, since we found how to extract each element of the quotes, we can actually put them 70 00:06:49,670 --> 00:06:52,490 together into a Python dictionary. 71 00:06:52,760 --> 00:06:55,600 So let's write the for loop. 72 00:06:56,090 --> 00:07:04,340 So let's write for quote in response.css. 73 00:07:05,120 --> 00:07:06,860 And you can write here 74 00:07:07,100 --> 00:07:11,000 div.quote. 75 00:07:12,880 --> 00:07:30,640 OK, and here we can start writing text equals quote.css, and then I can do span.text 76 00:07:33,130 --> 00:07:33,790 and then 77 00:07:34,710 --> 00:07:36,180 ::text, OK. 78 00:07:38,340 --> 00:07:42,210 Let's close it, dot extract
79 00:07:43,970 --> 00:07:53,590 first, and so this is how this works. This is how you can actually extract quotes, how you can extract 80 00:07:53,600 --> 00:07:58,180 tags from the quote and how you can extract the actual author of a quote. 81 00:07:58,910 --> 00:08:04,940 So thanks for watching, guys, and join me in the next video where we're going to do a very similar 82 00:08:04,940 --> 00:08:05,270 thing. 83 00:08:05,270 --> 00:08:12,050 But now we're going to do it in my channel with a script and then we'll run it from our terminal. 84 00:08:12,440 --> 00:08:15,710 Thanks very much for watching and I'll see you in the next video.