Hi guys, in the previous video we used the RetrievalQA chain to ask questions against a vector store. That works well, but it has one disadvantage: it fails to preserve conversational history.

A common requirement for RAG, or retrieval-augmented generation, apps is support for follow-up questions. Follow-up questions can contain references to past chat history. In this video, I'll show you how to save the chat history so that you can ask follow-up questions.

I am importing ChatOpenAI to instantiate the LLM object. Instead of using the RetrievalQA chain, we'll use another chain called ConversationalRetrievalChain. I am importing it: from langchain.chains import ConversationalRetrievalChain. This chain is used to have a conversation based on the retrieved documents. I will also import the ConversationBufferMemory class, which acts as a buffer for storing conversation memory: from langchain.memory import ConversationBufferMemory.

I am creating the LLM object. I'll use gpt-4-turbo, but you can also use gpt-3.5-turbo if you are using a free OpenAI account. The model is gpt-4-turbo-preview, and the temperature equals 0.

In RAG systems, external data is retrieved and then passed to the LLM during the generation step. I am creating the retriever: retriever equals vector_store.as_retriever, and the arguments are search_type equals "similarity" and search_kwargs equals {"k": 5}. A retriever is a crucial component that helps LLMs find and access relevant information. It does this by searching for relevant data and retrieving it. In this example, the retriever will search by similarity and will retrieve the top k most similar chunks of data.

I am creating a memory object that will be passed to the conversational retrieval chain as an argument. The memory will be automatically updated with the questions and the answers: memory equals ConversationBufferMemory with memory_key equals "chat_history" and return_messages equals True. Memory is specifically designed to store and manage conversation history within the LangChain application. memory_key equals "chat_history" gives your memory a label; when retrieving or interacting with the stored conversation, you'll use the key chat_history.
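Putting the setup so far together, here is a minimal sketch. It assumes the vector_store variable built in the previous video already exists, and that the classic langchain package layout is in use (in newer releases, ChatOpenAI is imported from langchain_openai instead).

from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# LLM object: gpt-4-turbo-preview with deterministic output.
llm = ChatOpenAI(model_name="gpt-4-turbo-preview", temperature=0)

# Retriever: similarity search returning the top 5 most similar chunks.
# vector_store is assumed to be the store created in the previous video.
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 5},
)

# Memory buffer that stores the conversation under the "chat_history" key.
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
)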
I am creating the conversational retrieval chain. The arguments are the LLM, retriever equals retriever, memory equals memory, chain_type equals "stuff", and verbose equals True. chain_type equals "stuff" means use all of the text from the retrieved documents. I am running the code.

Next, I will define a function called ask_question to make it easier to send the questions. The parameters will be q, for the question, and the chain. I am running the chain by calling chain.invoke. It takes a dict that contains the question as an argument and returns a result.

Just to ensure that the data is loaded, split into chunks, and embedded, I will copy and paste the code that does that from a previous cell.

I am sending the first question. It is the same question we asked before: how many pairs of questions and answers did the Stack Overflow dataset have? Then result equals ask_question of q and crc, the chain, and I am printing the result. I am running the code.

It is running the chain. Very well. The answer is in the document and will be returned: the Stack Overflow dataset had 8 million pairs of questions and answers.

Note that the result is a dictionary that contains three key-value pairs: the question sent by the user, the chat history, and the answer. If you only want the answer, use result["answer"].

Let's test if it remembers the last question. q equals "Multiply that number by 10", and result equals ask_question of q and crc. I am running it, and I am also printing the result. Note that it knew exactly what I was referring to: the result of multiplying the number of pairs of questions and answers in the dataset, which is 8 million, by 10 would be 80 million. Very well.

Let's try another one: "Divide the result by 80", and I am running it. The result of dividing 80 million by 80 is 1 million.
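Here is a sketch of the chain, the ask_question helper, and the questions sent in this part of the video. It is a minimal version that may differ slightly from the notebook; in particular, it assumes the chain is built with ConversationalRetrievalChain.from_llm and that the questions are worded as above.

# Conversational retrieval chain that combines the LLM, retriever, and memory.
crc = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    chain_type="stuff",  # "stuff" = pass all retrieved text to the LLM
    verbose=True,
)

def ask_question(q, chain):
    # The chain expects a dict with a "question" key; the chat history
    # is supplied automatically from the attached memory.
    result = chain.invoke({"question": q})
    return result

q = "How many pairs of questions and answers did the Stack Overflow dataset have?"
result = ask_question(q, crc)
print(result["answer"])

# Follow-up questions can refer back to earlier answers because the
# memory is updated after every call.
result = ask_question("Multiply that number by 10.", crc)
result = ask_question("Divide the result by 80.", crc)
print(result["answer"])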
To display the chat history, which contains all the questions and their answers, iterate over the content of the chat_history key, like this: for item in result["chat_history"], print item. You will see all the questions and the answers from the conversation.

That's it. In this video, you learned how to use the ConversationalRetrievalChain and ConversationBufferMemory classes to add memory to your RAG system. In the next video, we will dive deeper into it and see how to use a custom prompt with prompt templates.
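For reference, a sketch of that final loop, assuming result still holds the output of the last ask_question call:

# Each item is a HumanMessage or AIMessage because return_messages=True.
for item in result["chat_history"]:
    print(item)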