0 1 00:00:00,500 --> 00:00:01,320 Are you ready? 1 2 00:00:01,320 --> 00:00:03,130 Here's the solution. 2 3 00:00:03,390 --> 00:00:13,030 Going to the top of the file, I'll add another constant and I'll call this constant "SKULL_FILE" and it's going 3 4 00:00:13,030 --> 00:00:18,180 to point to the "skull-icon.png", 4 5 00:00:21,190 --> 00:00:31,290 this one right here, Shift+Enter on this cell, scroll back down and now it's time to put all of this text 5 6 00:00:31,530 --> 00:00:39,180 that's in the "shakespeare-hamlet.txt" file from the NLTK resources into a single string, 6 7 00:00:40,710 --> 00:00:53,610 so I'll call a variable "hamlet_corpus" and set that equal to "nltk.corpus.gutenberg. 7 8 00:00:53,850 --> 00:01:04,890 words('shakespeare-hamlet.txt')", the same file name as right 8 9 00:01:04,890 --> 00:01:17,740 here. The word list is going to be equal to "[''.join()]" and I'll join the 9 10 00:01:17,740 --> 00:01:31,030 word up from my loop, so "for word in hamlet_corpus" will give us our list of words and then I'll say "hamlet_ 10 11 00:01:31,450 --> 00:01:44,570 as_string = ' '.join(word_list)". That's our play as a string. 11 12 00:01:44,920 --> 00:01:54,810 Now all I need to do is maybe go "skull_icon = Image.open()", 12 13 00:01:54,880 --> 00:02:02,550 so here we're using Pillow, feeding the relative path to the file and now it's time to create that mask. 13 14 00:02:02,650 --> 00:02:15,730 So I'll say "image_mask" is equal to a new Pillow image object with mode equal to "RGB", 14 15 00:02:16,810 --> 00:02:33,640 the size equal to the "skull_icon.size" and color equal to white, "(255, 255, 255)", inside 15 16 00:02:33,700 --> 00:02:44,830 a tuple. The rgb_array for this skull image will simply be a NumPy array created from the image mask. 16 17 00:02:47,190 --> 00:02:56,310 Now we can create our word cloud, "word_cloud" is gonna be equal to "WordCloud( 17 18 00:02:56,790 --> 00:03:10,310 mask = rgb_array)" from Hamlet.
I'm going to choose my background color as "white" and then I'm 18 19 00:03:10,310 --> 00:03:18,560 going to pick a color map. The color map that I'm going to go for for my skull image is gonna be called 19 20 00:03:18,860 --> 00:03:30,080 "bone". I figure that's gonna be a nice combination. And for the maximum words, "max_words" argument, 20 21 00:03:30,300 --> 00:03:40,080 I'm going to pick... I'll start out with 50 and then I'm going to up it, just so you can see how a higher 21 22 00:03:40,080 --> 00:03:51,090 number will actually look a lot better. Then I'll use "word_cloud.generate()" to generate my image, but 22 23 00:03:51,480 --> 00:03:54,920 of course I have to feed in which string I want to use, 23 24 00:03:54,930 --> 00:03:58,290 so "hamlet_as_string". 24 25 00:03:58,290 --> 00:03:59,910 That's it for the word cloud. 25 26 00:03:59,910 --> 00:04:01,050 Now it's all gonna be 26 27 00:04:01,050 --> 00:04:07,230 matplotlib. "plt.figure()" sets the size of our figure, 27 28 00:04:07,230 --> 00:04:10,460 I'm gonna go again with 16 and 8. 28 29 00:04:10,470 --> 00:04:17,610 I think that's a good size for me and for the video. I'll use the "imshow()" method, feed in the word_ 29 30 00:04:17,610 --> 00:04:26,130 cloud along with an interpolation of "bilinear", 30 31 00:04:26,510 --> 00:04:29,700 so I think that's spelled right. 31 32 00:04:29,910 --> 00:04:30,830 Good. 32 33 00:04:30,960 --> 00:04:35,310 I'll remove the axes, so "axis('off')" 33 34 00:04:39,040 --> 00:04:41,650 and then I'll show my chart. 34 35 00:04:41,710 --> 00:04:44,930 The moment of truth of course is hitting Shift+Enter on this. 35 36 00:04:45,010 --> 00:04:45,970 Let's see what it looks like. 36 37 00:04:51,600 --> 00:04:52,010 All right. 37 38 00:04:52,020 --> 00:04:57,740 So I've got a value error: "canvas size is too small". 38 39 00:04:57,870 --> 00:05:00,180 Any idea what went wrong? 39 40 00:05:00,180 --> 00:05:03,540 The reason is I've got an entirely white canvas.
40 41 00:05:03,660 --> 00:05:06,740 The RGB array is fully white. 41 42 00:05:06,750 --> 00:05:14,440 There is no place for the word cloud to draw anything and that's because I haven't called 42 43 00:05:14,580 --> 00:05:20,460 "image_mask.paste()" with the skull icon. 43 44 00:05:20,460 --> 00:05:22,780 So I'll need to write "image_mask. 44 45 00:05:22,840 --> 00:05:33,550 paste(skull_icon, box = skull_icon)" and if I refresh my cell now, then this error 45 46 00:05:33,640 --> 00:05:34,390 should disappear. 46 47 00:05:35,570 --> 00:05:46,090 So here's my image and nobody in the world will be able to tell that this is indeed a skull image. 47 48 00:05:46,160 --> 00:05:53,050 The culprit this time is the maximum number of words. This number here is too small. 48 49 00:05:53,060 --> 00:05:57,690 It doesn't actually help us see any of the detail in the image. 49 50 00:05:57,920 --> 00:06:03,070 So I'm gonna change it from 50 to 600 and refresh my cell. 50 51 00:06:04,320 --> 00:06:10,830 For this many words your computer might actually run for a little while because it's doing a lot of 51 52 00:06:10,830 --> 00:06:15,690 work to generate this image, but I think the wait was worth it. 52 53 00:06:15,690 --> 00:06:22,410 We've got a beautiful word cloud here with a fantastic color scheme and a lot of the details in the 53 54 00:06:22,410 --> 00:06:27,600 word cloud being made visible by the high number of words in the word cloud. 54 55 00:06:27,600 --> 00:06:34,770 Now that we're done with our two practice word clouds and using the NLTK resources, it's time to create 55 56 00:06:34,890 --> 00:06:35,820 a word cloud 56 57 00:06:35,820 --> 00:06:44,390 for our ham and our spam messages. Those we will create from our dataset and I'll see you in the next lesson. 57 58 00:06:44,400 --> 00:06:44,940 Take care.
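
For reference, the steps dictated in this lesson can be pulled together roughly as follows. This is only a sketch, not the instructor's exact notebook cell. The function names (words_to_string, load_hamlet_text, build_mask, draw_hamlet_cloud) are made up for illustration, the relative path in SKULL_FILE is assumed, and the NLTK "gutenberg" corpus is assumed to be downloaded already. Two deliberate deviations: the intermediate word_list is skipped, since joining the tokens directly gives the same string, and the icon is converted to RGBA and passed as the "mask" argument of paste(), which is an assumption about what the dictated "box = skull_icon" call is meant to do.

```python
# Sketch of the lesson's steps. Third-party imports sit inside the functions so
# the pure-Python helper works even where nltk, Pillow, numpy, wordcloud or
# matplotlib are not installed.

SKULL_FILE = 'skull-icon.png'  # assumed relative path; the lesson only names the file


def words_to_string(words):
    """Join an iterable of tokens into one space-separated string."""
    return ' '.join(words)


def load_hamlet_text():
    """Return Hamlet as one string (requires the NLTK 'gutenberg' corpus)."""
    import nltk

    hamlet_corpus = nltk.corpus.gutenberg.words('shakespeare-hamlet.txt')
    return words_to_string(hamlet_corpus)


def build_mask(icon_path=SKULL_FILE):
    """Paste the icon onto a white RGB canvas and return it as a numpy array."""
    import numpy as np
    from PIL import Image

    # Assumption: converting to RGBA guarantees an alpha channel to paste with.
    skull_icon = Image.open(icon_path).convert('RGBA')
    image_mask = Image.new(mode='RGB', size=skull_icon.size, color=(255, 255, 255))
    # Skipping this paste leaves the canvas fully white, so nothing can be drawn.
    image_mask.paste(skull_icon, mask=skull_icon)
    return np.array(image_mask)


def draw_hamlet_cloud(max_words=600):
    """Generate the skull-shaped word cloud and display it with matplotlib."""
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud

    word_cloud = WordCloud(mask=build_mask(), background_color='white',
                           colormap='bone', max_words=max_words)
    word_cloud.generate(load_hamlet_text())

    plt.figure(figsize=[16, 8])
    plt.imshow(word_cloud, interpolation='bilinear')
    plt.axis('off')
    plt.show()
```

Calling draw_hamlet_cloud() in a notebook cell should reproduce the final image; calling draw_hamlet_cloud(max_words=50) shows the sparse version from the first attempt, where the skull shape is hard to recognise.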