Knowledge Retention
Knowledge Retention is a multi-turn metric to determine if your chatbot remembers details well.
Overview
The knowledge retention metric is a multi-turn metric that uses LLM-as-a-judge to evaluate whether your chatbot remembers important information provided by the user throughout the conversation.
Required Parameters
These are the parameters you must supply in your test case to run evaluations for the knowledge retention metric:
turns: A list of Turns representing the exchanges between user and assistant.
Parameters of Turn:
role: The role of the speaker, either user or assistant.
content: The content provided by that role for the turn.
How Is It Calculated?
The knowledge retention metric first uses an LLM to extract information from the content of all user turns, then uses the same LLM to check if any assistant turns contain content that fails to recall this information.
The final score is the proportion of assistant turns in the conversation in which no knowledge attrition was found, so a higher score means better retention.
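The aggregation step can be sketched in plain Python. This is a minimal illustration, not deepeval's implementation: the per-turn verdicts are supplied by hand, whereas the metric obtains them from the LLM judge, and the function name knowledge_retention_score is hypothetical. It assumes a higher score means better retention (consistent with a minimum passing threshold), so it counts the assistant turns in which no attrition was found.

```python
from typing import List


def knowledge_retention_score(attrition_flags: List[bool]) -> float:
    """Return the share of assistant turns with no knowledge attrition.

    attrition_flags holds one boolean per assistant turn: True means the
    judge found that the turn failed to recall something the user said.
    """
    if not attrition_flags:
        # Assumption: with no assistant turns there is nothing to penalize.
        return 1.0
    retained = sum(1 for flagged in attrition_flags if not flagged)
    return retained / len(attrition_flags)


# Four assistant turns, one of which asks again for the user's name.
flags = [False, False, True, False]
print(knowledge_retention_score(flags))  # 0.75
```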
Create Locally
You can create the KnowledgeRetentionMetric in deepeval as follows:
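A minimal sketch of what this might look like, assuming the import paths and argument names (turns, role, content, threshold) of recent deepeval releases; check them against your installed version. The conversation content and the helper name run_knowledge_retention are made up for illustration.

```python
from typing import List, Tuple

# (role, content) pairs; the assistant's last turn forgets the user's name.
CONVERSATION: List[Tuple[str, str]] = [
    ("user", "Hi, my name is Alex and I'd like to open an account."),
    ("assistant", "Nice to meet you, Alex. Which account type do you want?"),
    ("user", "A savings account, please."),
    ("assistant", "Sure. Could you tell me your name?"),
]


def run_knowledge_retention(
    conversation: List[Tuple[str, str]], threshold: float = 0.5
):
    # Imported lazily so this file can be loaded without deepeval installed.
    from deepeval.metrics import KnowledgeRetentionMetric
    from deepeval.test_case import ConversationalTestCase, Turn

    test_case = ConversationalTestCase(
        turns=[Turn(role=role, content=content) for role, content in conversation]
    )
    metric = KnowledgeRetentionMetric(threshold=threshold)
    # measure() calls the judge LLM, so model credentials must be configured.
    metric.measure(test_case)
    return metric.score, metric.reason
```

Calling run_knowledge_retention(CONVERSATION) returns the score and the judge's reason. It needs a configured evaluation model (for example an OpenAI API key), so it is not invoked here.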
Here’s a list of parameters you can configure when creating a KnowledgeRetentionMetric:
threshold: A float representing the minimum passing threshold.
model: A string specifying which of OpenAI’s GPT models to use, OR any custom LLM model of type DeepEvalBaseLLM.
include_reason: A boolean to enable the inclusion of a reason for the evaluation score.
async_mode: A boolean to enable concurrent execution within the measure() method.
strict_mode: A boolean to enforce a binary metric score: 1 for perfection, 0 otherwise.
verbose_mode: A boolean to print the intermediate steps used to calculate the metric score.
This can be used for multi-turn E2E testing.
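How the passing threshold and the binary (strict) option interact can be sketched as follows. This is an illustration under stated assumptions rather than deepeval's code: the helper finalize_score is hypothetical, a higher score is taken to be better, and the binary score is taken to be 1 for a perfect result and 0 otherwise.

```python
from typing import Tuple


def finalize_score(
    raw_score: float, threshold: float = 0.5, strict_mode: bool = False
) -> Tuple[float, bool]:
    """Return (score, passed) after applying the threshold and strict mode."""
    if strict_mode:
        # Assumption: strict mode only accepts a perfect score.
        score = 1.0 if raw_score >= 1.0 else 0.0
        return score, score == 1.0
    # Otherwise the raw score is kept and compared with the threshold.
    return raw_score, raw_score >= threshold


print(finalize_score(0.75))                    # (0.75, True)
print(finalize_score(0.75, strict_mode=True))  # (0.0, False)
print(finalize_score(1.0, strict_mode=True))   # (1.0, True)
```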
Create Remotely
If you are not using deepeval in Python, or want to run evals remotely on Confident AI, you can use the knowledge retention metric by adding it to a multi-turn metric collection. This will allow you to use the knowledge retention metric for:
- Multi-turn E2E testing
- Online and offline evals for threads