1 00:00:00,360 --> 00:00:04,370 So I'd say we're pretty much done with training our base classifier. 2 00:00:04,710 --> 00:00:11,550 Let's put our code for testing it and evaluating our algorithm into a separate notebook. 3 00:00:11,560 --> 00:00:20,480 So come to my projects folder and create a new notebook and I will name this notebook. 4 00:00:20,480 --> 00:00:33,420 0 7 B's classifier hyphen testing inference and Evaluation at the top. 5 00:00:33,530 --> 00:00:37,520 Of course we'll add our notebook imports. 6 00:00:37,520 --> 00:00:39,900 These are gonna be our usual suspects. 7 00:00:39,980 --> 00:00:47,990 If you still have your training notebook open you can copy none pine pandas paste them in here and then 8 00:00:47,990 --> 00:00:49,730 just add Matt. 9 00:00:49,750 --> 00:00:55,960 Plot lib dot pi plot as BLT. 10 00:00:56,090 --> 00:01:01,850 We're gonna be doing some graphing and visualization in this notebook so we're gonna need map plot lib 11 00:01:02,270 --> 00:01:08,630 and Seabourn as S.A. because we've got map plot lib in here. 12 00:01:08,690 --> 00:01:19,430 We'll add some Python notebook magic with scent not plot lib in line just below we'll add our constants 13 00:01:20,360 --> 00:01:26,110 here right and grab some of the same file paths that we used in our training file. 14 00:01:26,180 --> 00:01:31,430 I'll grab all our constants copy them and I'll paste them over. 15 00:01:31,490 --> 00:01:37,200 The only thing I'll do is I'll delete the training unescorted data on a school file. 16 00:01:37,340 --> 00:01:44,990 So our training data and our test data all we need in this notebook are our probabilities that we've 17 00:01:44,990 --> 00:01:51,860 worked out for our tokens and our features and target for our test data set. 18 00:01:52,220 --> 00:01:57,710 And these two we've prepared in the previous notebook and we've got them right here. 19 00:01:57,710 --> 00:02:06,900 So let me hit shift enter on the cell and we're ready to load the data load the data. 20 00:02:06,950 --> 00:02:07,950 There we go. 21 00:02:08,150 --> 00:02:13,470 All our stuff is in text files so this should be fairly easy. 22 00:02:13,490 --> 00:02:23,810 Our features will store in a variable called X test and this will be equal to NDP dot load t 60 parentheses 23 00:02:25,100 --> 00:02:32,100 test feature matrix comma the limit to single quotes with a space. 24 00:02:32,290 --> 00:02:40,390 Our target will be y at a school test and that will be NDP dot low to T. 25 00:02:40,820 --> 00:02:41,930 You guessed it. 26 00:02:42,260 --> 00:02:47,870 Test target file to limit a single quotes space. 27 00:02:48,260 --> 00:02:49,380 Next one up. 28 00:02:49,700 --> 00:02:53,060 Call this one token problem. 29 00:02:53,150 --> 00:02:55,190 Bill at TS. 30 00:02:55,250 --> 00:03:00,770 Need the probability that a token is spam. 31 00:03:00,770 --> 00:03:11,270 So if we end p dot low T T token spam probability limit a space. 32 00:03:11,630 --> 00:03:24,030 Copy this added two more times for the other two probability files the ham and the probability all on 33 00:03:24,470 --> 00:03:26,790 the school tokens 34 00:03:29,500 --> 00:03:35,710 these two of course will point to the other file paths home on a school probability on a score file 35 00:03:36,430 --> 00:03:40,910 and took in all probability file. 36 00:03:40,990 --> 00:03:41,970 There we go. 37 00:03:41,980 --> 00:03:45,870 Just let me hit shift enter and that's it. 38 00:03:45,940 --> 00:03:48,680 We're all set up and ready to go. 39 00:03:48,700 --> 00:03:50,950 This is where the real work begins. 40 00:03:50,960 --> 00:03:52,300 I'll see you in the next lesson. 41 00:03:52,300 --> 00:03:52,770 Take care.