Evaluating Summary Quality with Arize's Tools

Jul 19, 2024 02:11 AM

Hi, I'd like to try the summary eval from Arize and was wondering what happens when the {input} is larger than the context window of the GPT call. This is the prompt template:

You are comparing the summary text and it's original document and trying to determine
if the summary is good. Here is the data:
    [BEGIN DATA]
    ************
    [Summary]: {output}
    ************
    [Original Document]: {input}
    [END DATA]
Compare the Summary above to the Original Document and determine if the Summary is
comprehensive, concise, coherent, and independent relative to the Original Document.
Your response must be a single word, either "good" or "bad", and should not contain any text
or characters aside from that. "bad" means that the Summary is not comprehensive,
concise, coherent, and independent relative to the Original Document. "good" means the
Summary is comprehensive, concise, coherent, and independent relative to the Original Document.
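One way to see whether a given document would hit the limit is to estimate the size of the filled-in prompt before calling the eval. The sketch below is only an illustration and not Arize's own behavior: the 4-characters-per-token heuristic, the 8,192-token window, the shortened template, and the helper names `estimate_tokens` and `fit_document` are all assumptions. For real use, a proper tokenizer and the actual limit of the model in question would replace them.

```python
# Rough pre-check: will the filled-in eval prompt fit in the context window?
# Everything here is an assumption for illustration (not from Arize's docs):
# the chars-per-token ratio, the window size, and the helper names.

CHARS_PER_TOKEN = 4        # crude heuristic for English text
CONTEXT_WINDOW = 8192      # assumed model limit, in tokens
RESERVED_FOR_REPLY = 16    # the eval only has to answer "good" or "bad"

# Shortened stand-in for the real template; only the placeholders matter here.
TEMPLATE = (
    "[BEGIN DATA]\n"
    "[Summary]: {output}\n"
    "[Original Document]: {input}\n"
    "[END DATA]\n"
)


def estimate_tokens(text: str) -> int:
    """Estimate the token count by rounding up len(text) / CHARS_PER_TOKEN."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fit_document(summary: str, document: str, template: str = TEMPLATE):
    """Return (document, truncated) where document fits the estimated budget."""
    budget = CONTEXT_WINDOW - RESERVED_FOR_REPLY
    # Tokens used by the template text plus the summary, with no document.
    overhead = estimate_tokens(template.format(output=summary, input=""))
    room = budget - overhead
    if room <= 0:
        raise ValueError("summary and template alone exceed the token budget")
    max_chars = room * CHARS_PER_TOKEN
    truncated = len(document) > max_chars
    return document[:max_chars], truncated


if __name__ == "__main__":
    doc, was_cut = fit_document("A short summary.", "word " * 20000)
    print(f"kept {len(doc)} characters, truncated={was_cut}")
```

Note that if the document has to be cut, the model only sees the part that was kept, so a "good" or "bad" verdict on comprehensiveness applies to that part rather than to the whole document.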
