How can I add annotation to phoenix spans?
Hey Rémi C., thanks for checking out the project! Right now we only have one annotation mechanism, which is evaluations (https://docs.arize.com/phoenix/tracing/how-to-tracing/llm-evaluations#span-evaluations). My guess is that you are looking to hand-annotate the spans - would that be a correct assumption? What specific annotations are you looking to add to spans? We definitely have this on the roadmap, so we'd love to know more about your use case.
My guess is that you are looking to hand-annotate the spans - would that be a correct assumption?
Yes absolutely
At first I'd like to add two columns:
the true expected result (as a JSON string)
a boolean saying whether the two match
Ideally I'd like to do it programmatically
But for the cases where I don't have access to the ground truth, I'd like to use the prediction as a pre-annotation and then add a second column where I can enter the correct version
I have already instrumented my LLM call manually (because some of the extra args I need were not picked up by the auto-instrumentation for OpenAI [I'm running against vLLM])
From https://docs.arize.com/phoenix/evaluation/how-to-evals/online-evals and https://docs.arize.com/phoenix/evaluation/how-to-evals/bring-your-own-evaluator I gather there's at least a private mechanism to do so, but I'm not sure if it's exposed or if I have to dig
I see, very interesting - so we are actually thinking of tackling some of the above via golden datasets, over which you would do a sweep of your application or LLM. In the traces generated from that sweep (or run), the expected "golden" result would be shown alongside each trace (along with any evaluations and annotations).
If I'm reading the above correctly, however, I think you could do some of this using the log_evaluations flow: have the evaluation contain the true result in the explanation, and have the label be whether or not they match. You can do this programmatically, as you found above. Unfortunately we don't have a way to add these annotations via the UI yet, but I believe you could use our eval logging to accomplish some of what you are doing.
Very cool
Is it something I can do after the initial run?
Yup, so the basic flow would be: log the traces, then query the traces for the data to evaluate (https://docs.arize.com/phoenix/tracing/how-to-tracing/extract-data-from-spans), and then log back your evaluation (annotation) result. You can do these in batches if you would like.
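In case it helps, here's a rough sketch of that flow with the ground-truth comparison described above. The span filter, column names, eval name, and the `expected` lookup are all assumptions on my end - adjust them to whatever your spans actually contain. The Phoenix client calls are shown in comments since they need a running Phoenix instance:

```python
import json

import pandas as pd

# Hypothetical ground-truth lookup keyed by span id - replace with your own data.
expected = {
    "span-1": '{"answer": 42}',
    "span-2": '{"answer": "foo"}',
}

# In a real run you would pull the spans from Phoenix, e.g.:
#   import phoenix as px
#   spans_df = px.Client().get_spans_dataframe("span_kind == 'LLM'")
# Here we fake a minimal spans dataframe for illustration; the index of a real
# one is the span id ("context.span_id").
spans_df = pd.DataFrame(
    {"attributes.output.value": ['{"answer": 42}', '{"answer": "bar"}']},
    index=pd.Index(["span-1", "span-2"], name="context.span_id"),
)

def matches(span_id: str, prediction: str) -> bool:
    """Compare the prediction to the ground truth as parsed JSON (ignores key order)."""
    truth = expected.get(span_id)
    if truth is None:
        return False
    return json.loads(prediction) == json.loads(truth)

# Build the evaluations dataframe: one row per span, indexed by span id,
# with the match result as the label and the true result in the explanation.
eval_df = pd.DataFrame(
    {
        "label": [
            "match" if matches(sid, pred) else "mismatch"
            for sid, pred in spans_df["attributes.output.value"].items()
        ],
        "explanation": [expected.get(sid, "") for sid in spans_df.index],
    },
    index=spans_df.index,
)

# Log the evaluations back so the labels show up alongside the traces:
#   from phoenix.trace import SpanEvaluations
#   px.Client().log_evaluations(
#       SpanEvaluations(eval_name="ground_truth_match", dataframe=eval_df)
#   )
```

You can run this at any point after the traces are collected, and in batches - only the final log_evaluations call talks to Phoenix.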
Your UI will end up looking like this: https://phoenix-demo.arize.com/projects/UHJvamVjdDow
and you can filter by those labels you add
One last thing: is there a way to add the correct label?
