Thank you very much for such good recommendations, Aparna D.!
Yes, I'm using retrieval. I built my bot with LlamaIndex's OpenAIAgent and a query engine.
The bot works well overall, but there are some clear areas for improvement, which I believe prompt engineering will resolve. Before making those improvements, I need to focus on an evaluation system...
Do you have any specific guidance or reference for my case?
Thank you in advance for so generously sharing your knowledge.
Perfect content, Aparna D. Thank you very much!
Do you have any guidance on how to run LLM evals for a customer service bot? I couldn't see how the standard Q&A templates would apply. Could you give me some ideas?
Should the "context" include the previous chat messages and the relevant information needed to answer the question? Are you planning to add a template for this?
Hello everyone, I'm new to this community!
I'm building an LLM-based chatbot for customer service support, backed by a private knowledge base.
I have two questions:
1. What are the best approaches to evaluating a chatbot end to end?
2. How can I do this with Phoenix? (I'm having trouble building my golden dataset.)
Could someone help me, please?