Hey guys, my organisation is just getting started with Arize. I'm struggling to set up tracing/evals for document retrieval and was wondering if someone could help. I'm sure I'm missing something obvious, but I want to use UI-configured evals (running on Arize compute) to evaluate the relevance of retrieved documents that will be fed into an LLM application. The retrieval takes an input query and returns a list of chunks/documents. I want to run the relevance eval for every input-output pair (i.e. the same input with each of the output documents). However, all I can figure out how to do is run the eval on one document only; I can't see how to iterate through them. Is this possible with evals, or do I need to write custom tracing to flatten the output into many spans?
Hi Tom W. To make sure I understand your question: for each retrieval span, multiple documents are retrieved, and you want to run a relevance eval for each document retrieved in that span. Correct? Could you share a screenshot of your retrieval span, along with its attributes from the span attributes tab? I can help once I see the format of your spans.
Thanks for getting back guys. 🔒[private user], yes, that's exactly correct. The span attributes are shown in the following JSON. I've simplified the output documents and removed most of them to make it more readable, but essentially this list can have an arbitrary number of elements.
Here is a screenshot; hopefully it conveys that the output value is a list of documents. The fields I want to use in each document are title and body:
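For reference, the per-document pairing described above (the same input query paired with each retrieved document's title and body) could be sketched in plain Python roughly as follows. This is only an illustration of the flattening step, not an Arize API: the function name `flatten_retrieval`, the record keys `input`, `document_position` and `reference`, and the sample data are all hypothetical, and it assumes the span's output value is a JSON list of objects with `title` and `body` fields.

```python
import json
from typing import Any, Dict, List, Union


def flatten_retrieval(
    query: str, output_value: Union[str, List[Dict[str, Any]]]
) -> List[Dict[str, Any]]:
    """Return one record per (query, document) pair (hypothetical helper)."""
    # Assumption: the output value is either a JSON string or an already parsed list.
    documents = json.loads(output_value) if isinstance(output_value, str) else output_value
    rows = []
    for position, doc in enumerate(documents):
        title = doc.get("title", "")
        body = doc.get("body", "")
        rows.append(
            {
                "input": query,  # the same query is repeated for every document
                "document_position": position,  # index within the retrieved list
                "reference": f"{title}\n\n{body}".strip(),  # text to judge for relevance
            }
        )
    return rows


# Hypothetical sample data; the real list can have an arbitrary number of elements.
example_output = json.dumps(
    [
        {"title": "Doc A", "body": "First chunk."},
        {"title": "Doc B", "body": "Second chunk."},
    ]
)
pairs = flatten_retrieval("example query", example_output)
```

Each resulting record could then be scored independently by a relevance eval. Whether this flattening has to happen in custom tracing or can be handled by the eval configuration depends on the span format, which is the open question in this thread.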
🔒[private user] Very happy to do that to the extent timezones permit! I'm UTC +1.
Thanks a lot 🔒[private user]. Tomorrow would work well, but unfortunately Thursday I can't do. My email address is tom.whitehead@9fin.com
