Hi, I’m running experiments and logging evaluation results to Phoenix. Sometimes the feedback isn’t updated. How can I debug the problem? The code is here: https://gist.github.com/goofansu/d01464f95ead23695e8075c0020f4a61 (arize-phoenix version: 8.6.1)
these evals are sent before the span is received by the server, so there’s a chance the two miss each other by just enough that the server has nowhere to put the evals when they arrive
also, these are evals of the experiment “evaluator” spans
so maybe there could be a better approach here since they are capturing the same info
i guess the output of the “evaluator” could be surfaced better in the UI
i think these “evaluator” spans shouldn’t need to be annotated (for a second time), since they’re mostly there for record keeping after running the evaluators. The output of the evaluator should have already been used to annotate the experiment runs themselves, which are the real subjects of evaluation.
maybe we should surface the output of the evaluators in the output column of the table. that would show the same info as what the screenshot is attempting to do
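For the timing issue described above (evals arriving before the server has the span), one possible workaround is to wait until the span is visible before logging against it. This is only a minimal sketch: `log_when_span_visible`, `span_exists` and `log_evals` are hypothetical names, not part of Phoenix, and the retry counts are arbitrary.

```python
import time
from typing import Callable


def log_when_span_visible(
    span_exists: Callable[[], bool],
    log_evals: Callable[[], None],
    attempts: int = 5,
    delay_s: float = 0.5,
) -> bool:
    """Call log_evals() once span_exists() returns True.

    Returns True if the evals were logged, False if the span never
    showed up within the given number of attempts.
    """
    for attempt in range(attempts):
        if span_exists():
            log_evals()
            return True
        if attempt < attempts - 1:
            # Exponential backoff between checks: delay, 2*delay, 4*delay, ...
            time.sleep(delay_s * 2 ** attempt)
    return False
```

Here `span_exists` could be a small function that queries the server for the span ID, and `log_evals` could wrap the existing `log_evaluations` call. A `False` return value at least tells you which evals were dropped, which helps with debugging.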
Dustin N. No worries, maybe I’m just logging feedback the wrong way. Roger Y. I also think it shouldn’t be necessary to call log_evaluations manually in the evaluator. But if I just run the evaluation in experiments without log_evaluations, there is no feedback in the evaluator traces. By feedback I mean what’s shown in the screenshot:
Roger Y. For my requirement, I switched to evaluating the AGENT spans instead of running an experiment:
1. Log traces of my agent with the question, answer and reference.
2. Call run_evals against the AGENT kind spans to get the evaluation results.
3. Call log_evaluations with those evaluation results.
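The steps above could look roughly like the sketch below. It is not the code from the gist. It assumes a running Phoenix server, an OpenAI key, and the `QAEvaluator` from `phoenix.evals`. The helper names, the `gpt-4o` model choice, the eval name, and the `column_map` argument are assumptions made for illustration.

```python
from typing import Dict, Optional


def span_kind_filter(kind: str) -> str:
    """Build a filter string such as: span_kind == 'AGENT'."""
    return f"span_kind == '{kind}'"


def evaluate_agent_spans(
    column_map: Dict[str, str], project_name: Optional[str] = None
) -> None:
    # Imported lazily so the module loads even without Phoenix installed.
    import phoenix as px
    from phoenix.evals import OpenAIModel, QAEvaluator, run_evals
    from phoenix.trace import SpanEvaluations

    client = px.Client()

    # Step 1 happened earlier (tracing). Fetch the AGENT spans it produced.
    spans = client.get_spans_dataframe(
        span_kind_filter("AGENT"), project_name=project_name
    )
    if spans is None or spans.empty:
        return

    # Rename span columns to the names the evaluator template expects
    # (input, output, reference). The mapping depends on how you traced.
    eval_input = spans.rename(columns=column_map)

    # Step 2: run the evaluator, asking for explanations as well.
    (qa_results,) = run_evals(
        dataframe=eval_input,
        evaluators=[QAEvaluator(OpenAIModel(model="gpt-4o"))],
        provide_explanation=True,
    )

    # Step 3: attach the results to the spans (index is the span ID).
    client.log_evaluations(
        SpanEvaluations(eval_name="QA Correctness", dataframe=qa_results)
    )
```

A call might pass something like `{"attributes.input.value": "input", "attributes.output.value": "output"}` plus an entry for wherever the reference is stored on the span; that last column name depends on how the agent was instrumented, so it is left out here.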
Now there is feedback with explanations on each span. My understanding now is:
Experiment evaluations are used to track the overall benchmarks.
Span evaluations are used to see the details of the feedback.
Correct me if I’m wrong. Thanks
Yea i think that’s correct. Incidentally, the Feedback section in the UI also has another purpose: you can annotate manually via the “Annotate” button on the upper right hand side.
I see. That’s probably a good convenience enhancement that we can add to smooth this process for you. We appreciate you bringing this to our attention.
can i ask what method you use to download the root spans?
