So far, we've split the text into chunks and embedded them into vectors, which were then inserted into a Pinecone index. Now, let's see how to ask questions and do similarity searches.

Here's how it works. The user defines a query. The query is embedded into a vector. A similarity search is performed in the vector database, and the text behind the most similar vectors is the answer to the user's question.

I am defining a query. So, query equals... I've used the famous speech by Winston Churchill, "We shall fight on the beaches", and the query is "Where should we fight?"

Now, I'm extracting all the relevant chunks. So, result equals vector_store.similarity_search(query), and I'm printing the result.

We see how it extracted the chunks relevant to the query. I can also iterate over the chunks and print only the chunk text. So, for r in result, print r.page_content, and for readability I'm adding 50 dashes after each chunk.

I've got four chunks. These chunks represent the answer, but you can't give them to users like this. We need the answer in natural language. That's where an LLM comes in. We'll retrieve the most relevant chunks of text and feed them to the language model for the final answer.

Let's do this. I'm importing the RetrievalQA chain: from langchain.chains import RetrievalQA. I'm also importing ChatOpenAI, the LLM: from langchain.chat_models import ChatOpenAI.

I'm defining the LLM: ChatOpenAI with model equals "gpt-3.5-turbo" (or "gpt-4") and temperature equals 1.

Next, I'll expose this index in a retriever interface. The retriever interface is a generic interface that makes it easy to combine documents with language models. So, retriever equals vector_store.as_retriever, with search_type equals "similarity" as a string, and the second argument, search_kwargs, equals a dictionary where the key is "k" and the value is the integer 3. k equals 3 means that it will return the three most similar chunks to the user's query.
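Here's a minimal sketch of these steps in code, assuming the vector_store object built earlier in this section and the classic (pre-0.1) LangChain imports used in this course:

from langchain.chat_models import ChatOpenAI

# Assumes `vector_store` is the Pinecone vector store created earlier.
query = 'Where should we fight?'

# Raw similarity search: returns the most similar chunks as Document
# objects (the default k is 4, which is why we got four chunks).
result = vector_store.similarity_search(query)
print(result)

# Print only the chunk text, with 50 dashes after each chunk for readability.
for r in result:
    print(r.page_content)
    print('-' * 50)

# The LLM that will turn the retrieved chunks into a natural-language answer.
llm = ChatOpenAI(model='gpt-3.5-turbo', temperature=1)

# Expose the index as a retriever returning the 3 most similar chunks.
retriever = vector_store.as_retriever(search_type='similarity', search_kwargs={'k': 3})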
And finally, I'm creating a chain to answer questions: chain equals RetrievalQA.from_chain_type, with llm equals llm, chain_type equals "stuff", and retriever equals retriever. The default chain_type, "stuff", uses all of the text from the retrieved documents in the prompt.

I'm running the code in this cell. And I've got this warning because I have misspelled the word model. It's model, not mode. I'm running the cell again. There are no warnings.

Let's ask some questions about the content of the document: query equals "Where should we fight?", and the answer equals chain.run(query). I'm running the chain. And I'm printing the answer. And I've got the correct answer from the document, in natural language.

Let's ask something different: query equals "Who was the king of Belgium at that time?" It's in the document. Perfect. This is the correct answer.

And what about the French armies? Great. I'm running the code again. If you ask it again, you'll get a different answer. Each answer is original; that's the generative, sampling-based nature of these transformer models, especially with the temperature set to 1.

Great. I hope you enjoyed this section. You've learned a lot. You've learned how to install LangChain and Pinecone and how to set up the environment. You've also learned about LLMs and how to call them from LangChain, prompt templates and how to create dynamic prompts, simple and sequential chains, and LangChain agents. We then moved on to embeddings and vector stores. At the end, we saw how to combine everything together to create the backbone of an OPL application. OPL stands for OpenAI, Pinecone, and LangChain.

In the next section, we'll put together everything we've learned so far and develop an LLM-powered application that can answer questions about the content of private documents. We'll use Python, LangChain, Pinecone, and OpenAI. This will be really fun.
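To recap the question-answering backbone from this section in one place, here's a minimal sketch, assuming the llm and retriever objects defined above and the classic (pre-0.1) LangChain API, where chain.run was the single-input entry point:

from langchain.chains import RetrievalQA

# Build the QA chain: 'stuff' puts all retrieved chunks into a single prompt.
chain = RetrievalQA.from_chain_type(llm=llm, chain_type='stuff', retriever=retriever)

# Ask questions about the document's content.
for query in [
    'Where should we fight?',
    'Who was the king of Belgium at that time?',
    'What about the French armies?',
]:
    answer = chain.run(query)  # embeds the query, retrieves chunks, calls the LLM
    print(answer)
    print('-' * 50)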