So far, we've split the text into chunks and embedded them into vectors, which were then inserted into a Pinecone index. Now, let's see how to ask questions and do similarity searches.

Here's how it works. The user defines a query. The query is embedded into a vector. A similarity search is performed in the vector database, and the text behind the most similar vectors is the answer to the user's question.

I am defining a query. So, query equals... I've used the famous speech by Winston Churchill, "We shall fight on the beaches", and the query is "Where should we fight?"

Now, I'm extracting all the relevant chunks. So, result equals vector_store.similarity_search(query), and I'm printing the result.

We see how it extracted the chunks relevant to the query. I can also iterate over the chunks and print only the chunk text. So, for r in result, print r.page_content, and for readability I'm adding 50 dashes after each chunk.

I've got four chunks. These chunks represent the answer, but you can't give them to users like this. We need the answer in natural language. That's where an LLM comes in. We'll retrieve the most relevant chunks of text and feed them to the language model for the final answer.

Let's do this. I'm importing the RetrievalQA chain: from langchain.chains import RetrievalQA. I'm also importing ChatOpenAI, the LLM: from langchain.chat_models import ChatOpenAI.

I'm defining the LLM: ChatOpenAI with model equals "gpt-3.5-turbo" (or "gpt-4") and temperature equals 1.

Next, I'll expose this index in a retriever interface. The retriever interface is a generic interface that makes it easy to combine documents with language models. So, retriever equals vector_store.as_retriever, with search_type equals "similarity" as a string, and the second argument, search_kwargs, equals a dictionary where the key is "k" and the value is the integer 3. k equals 3 means that it will return the three most similar chunks to the user's query.
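Here's a minimal sketch of these steps in code, assuming the vector_store object built earlier in this section and the classic (pre-0.1) LangChain imports used in this course:

from langchain.chat_models import ChatOpenAI

# Assumes `vector_store` is the Pinecone vector store created earlier.
query = 'Where should we fight?'

# Raw similarity search: returns the most similar chunks as Document
# objects (the default k is 4, which is why we got four chunks).
result = vector_store.similarity_search(query)
print(result)

# Print only the chunk text, with 50 dashes after each chunk for readability.
for r in result:
    print(r.page_content)
    print('-' * 50)

# The LLM that will turn the retrieved chunks into a natural-language answer.
llm = ChatOpenAI(model='gpt-3.5-turbo', temperature=1)

# Expose the index as a retriever returning the 3 most similar chunks.
retriever = vector_store.as_retriever(search_type='similarity', search_kwargs={'k': 3})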
And finally, I'm creating a chain to answer questions: chain equals RetrievalQA.from_chain_type, with llm equals llm, chain_type equals "stuff", and retriever equals retriever. The default chain_type, "stuff", uses all of the text from the retrieved documents in the prompt.

I'm running the code in this cell. And I've got this warning because I have misspelled the word model. It's model, not mode. I'm running the cell again. There are no warnings.

Let's ask some questions about the content of the document: query equals "Where should we fight?", and the answer equals chain.run(query). I'm running the chain. And I'm printing the answer. And I've got the correct answer from the document, in natural language.

Let's ask something different: query equals "Who was the king of Belgium at that time?" It's in the document. Perfect. This is the correct answer.

And what about the French armies? Great. I'm running the code again. If you ask it again, you'll get a different answer. Each answer is original; that's the generative, sampling-based nature of these transformer models, especially with the temperature set to 1.

Great. I hope you enjoyed this section. You've learned a lot. You've learned how to install LangChain and Pinecone and how to set up the environment. You've also learned about LLMs and how to call them from LangChain, prompt templates and how to create dynamic prompts, simple and sequential chains, and LangChain agents. We then moved on to embeddings and vector stores. At the end, we saw how to combine everything together to create the backbone of an OPL application. OPL stands for OpenAI, Pinecone, and LangChain.

In the next section, we'll put together everything we've learned so far and develop an LLM-powered application that can answer questions about the content of private documents. We'll use Python, LangChain, Pinecone, and OpenAI. This will be really fun.
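To recap the question-answering backbone from this section in one place, here's a minimal sketch, assuming the llm and retriever objects defined above and the classic (pre-0.1) LangChain API, where chain.run was the single-input entry point:

from langchain.chains import RetrievalQA

# Build the QA chain: 'stuff' puts all retrieved chunks into a single prompt.
chain = RetrievalQA.from_chain_type(llm=llm, chain_type='stuff', retriever=retriever)

# Ask questions about the document's content.
for query in [
    'Where should we fight?',
    'Who was the king of Belgium at that time?',
    'What about the French armies?',
]:
    answer = chain.run(query)  # embeds the query, retrieves chunks, calls the LLM
    print(answer)
    print('-' * 50)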