https://docs.arize.com/phoenix/llm-evals/quickstart-retrieval-evals
What we recommend for RAG (with a code sketch after this list) is:
Retrieval Eval (Chunk Level) - Is each retrieved chunk relevant to the query? These chunk-level relevance labels drive NDCG/MRR for retrieval evaluation
Q&A Eval - Did you answer the question correctly?
Hallucination - Did you make up information in the answer?
Human vs AI - If you have ground truth answers, this comparison can be very helpful for tuning your system
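Below is a minimal sketch of wiring up the first three of these checks with `llm_classify` from `phoenix.evals` and deriving MRR from the chunk-level labels. The example rows, the column names (`input`, `reference`, `output`), the `gpt-4o` model choice, and the exact rail strings are assumptions; template variable names and rails maps can differ between Phoenix versions, so check the templates shipped with your install.

```python
# A minimal sketch, assuming the phoenix.evals package (arize-phoenix-evals)
# and an OPENAI_API_KEY in the environment. Example data, column names, and
# the gpt-4o model choice are illustrative assumptions.
import pandas as pd

from phoenix.evals import (
    HALLUCINATION_PROMPT_RAILS_MAP,
    HALLUCINATION_PROMPT_TEMPLATE,
    QA_PROMPT_RAILS_MAP,
    QA_PROMPT_TEMPLATE,
    RAG_RELEVANCY_PROMPT_RAILS_MAP,
    RAG_RELEVANCY_PROMPT_TEMPLATE,
    OpenAIModel,
    llm_classify,
)

model = OpenAIModel(model="gpt-4o", temperature=0.0)

# Retrieval Eval (chunk level): one row per (query, retrieved chunk) pair,
# with chunks listed in the order the retriever returned them.
retrieval_df = pd.DataFrame(
    {
        "input": ["What is Phoenix?", "What is Phoenix?"],
        "reference": [
            "Phoenix is an open-source tool for LLM observability and evals.",
            "Bananas are a good source of potassium.",
        ],
    }
)
relevance_evals = llm_classify(
    dataframe=retrieval_df,
    model=model,
    template=RAG_RELEVANCY_PROMPT_TEMPLATE,
    rails=list(RAG_RELEVANCY_PROMPT_RAILS_MAP.values()),
    provide_explanation=True,
)

# Turn the chunk-level labels into a ranking metric such as MRR per query.
# The "relevant" rail string is an assumption; use whatever your rails map emits.
retrieval_df["relevant"] = (relevance_evals["label"] == "relevant").astype(int)

def reciprocal_rank(labels):
    # labels: 0/1 relevance flags in retrieval order
    for rank, is_relevant in enumerate(labels, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0

mrr_per_query = retrieval_df.groupby("input")["relevant"].apply(
    lambda s: reciprocal_rank(s.tolist())
)

# Q&A Eval and Hallucination Eval: one row per (query, retrieved context, answer).
qa_df = pd.DataFrame(
    {
        "input": ["What is Phoenix?"],
        "reference": ["Phoenix is an open-source tool for LLM observability and evals."],
        "output": ["Phoenix is an open-source LLM observability and eval tool."],
    }
)
qa_evals = llm_classify(
    dataframe=qa_df,
    model=model,
    template=QA_PROMPT_TEMPLATE,
    rails=list(QA_PROMPT_RAILS_MAP.values()),
)
hallucination_evals = llm_classify(
    dataframe=qa_df,
    model=model,
    template=HALLUCINATION_PROMPT_TEMPLATE,
    rails=list(HALLUCINATION_PROMPT_RAILS_MAP.values()),
)
```

Each `llm_classify` call returns a dataframe of labels (plus explanations when requested) aligned to the input rows, so the results can be joined back onto your queries, chunks, or traces.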
We have focused on testable metrics that we know work and give signal. There are other content Evals out there, and you can add a custom template to tackle something specific to your use case (see the sketch below); custom templates just won't be vetted or tested the way the built-in ones are.
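As a hedged sketch of what a custom template can look like, assuming your Phoenix version's `llm_classify` accepts a plain-string template: the template text, the rails, the column names, and the reuse of `qa_df` and `model` from the sketch above are all illustrative assumptions, not a vetted Eval.

```python
# An unvetted custom eval sketch: a tone check over question/response pairs.
# The {input}/{output} variables must match columns in the dataframe you pass.
CUSTOM_TONE_TEMPLATE = """
You are evaluating the tone of a response to a user.
[Question]: {input}
[Response]: {output}
Is the response polite and professional? Answer with a single word,
either "polite" or "impolite".
"""

tone_evals = llm_classify(
    dataframe=qa_df,
    model=model,
    template=CUSTOM_TONE_TEMPLATE,
    rails=["polite", "impolite"],
)
```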