Hi, I followed https://docs.arize.com/phoenix/evaluation/how-to-evals/bring-your-own-evaluator#logging-evaluations-to-phoenix and am trying to save the evaluation results to Phoenix, but it reports: ValueError: The dataframe index must be ['context.span_id'] but was '[None]'. My question is: where does context.span_id come from? Even after I added the index manually, it still cannot save the results. The code is: https://gist.github.com/goofansu/da6f1a7e07a3099021d93b60367411fa
It seems likely that context.span_id is the identifier Phoenix uses to associate evaluation results with specific spans. According to the Phoenix documentation, the span ID must be present as the index (or a column) of the evaluation DataFrame for evaluations to be logged correctly. This identifier is generated during the tracing process and should be extracted from the trace data.

If manually adding context.span_id does not resolve the issue, the span IDs may be incorrect or missing. You should ensure that the span IDs in your DataFrame match those in the Phoenix trace data. You can retrieve the correct span IDs using methods like get_qa_with_reference or get_retrieved_documents from the Phoenix library, as mentioned in the Phoenix session evaluation code.

Would you like more detail on how to extract and use context.span_id correctly? Sources:
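For context, a minimal pandas sketch of what that ValueError is checking for. The span IDs below are made-up stand-ins, not real Phoenix IDs; logging only succeeds when the index values match spans that actually exist in Phoenix.

```python
import pandas as pd

# Phoenix's log_evaluations requires the eval DataFrame to be indexed by
# "context.span_id". These IDs are hypothetical placeholders; in practice
# they must come from real traces (e.g. the index of the spans DataFrame
# pulled from Phoenix).
eval_df = pd.DataFrame(
    {"label": ["correct", "incorrect"], "score": [1.0, 0.0]},
    index=pd.Index(
        ["7e2f08cb43bbbf71", "f8e5d9a1b2c3d4e5"], name="context.span_id"
    ),
)

print(eval_df.index.name)  # context.span_id
```

Naming the index satisfies the check that raised the ValueError, but fabricated IDs still have nothing to attach to, which is why pulling real spans matters.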
- Phoenix session evaluation code
- Phoenix documentation
But I didn't use data from Phoenix; I just faked some data and tried the evaluation.
Hey James S., we definitely recommend pulling that dataframe from Phoenix rather than constructing it manually. Can you add some traces to Phoenix and then try trace_df = px.Client(endpoint="http://127.0.0.1:6006").get_spans_dataframe()?
Ah, yes, I鈥檓 just trying:
from phoenix.otel import register
from openinference.instrumentation.litellm import LiteLLMInstrumentor
tracer_provider = register(
    endpoint="http://localhost:6006/v1/traces",
)
LiteLLMInstrumentor().instrument(tracer_provider=tracer_provider)
pulling that dataframe from Phoenix rather than constructing it manually.
Two questions:
Can I pull a dataset from Phoenix and convert it to a DataFrame? llm_classify requires the data to be a DataFrame or a list.
If llm_classify can run on the pulled dataset, is context.span_id added to the result automatically?
Thanks
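A rough offline sketch of the idea behind question 2 (fake_eval and the span IDs here are made-up stand-ins, not Phoenix APIs): the spans DataFrame pulled from Phoenix is indexed by context.span_id, so any per-row evaluation that reuses that index carries the span IDs through into the result with no manual bookkeeping.

```python
import pandas as pd

# Toy stand-in for a spans DataFrame pulled from Phoenix: in the real case
# px.Client().get_spans_dataframe() returns a frame indexed by context.span_id.
spans_df = pd.DataFrame(
    {"input": ["q1", "q2"], "output": ["a1", "b2"]},
    index=pd.Index(["aaa111", "bbb222"], name="context.span_id"),
)

def fake_eval(row: pd.Series) -> str:
    # Placeholder for a real evaluator such as llm_classify.
    return "correct" if row["output"].startswith("a") else "incorrect"

# Row-wise evaluation over spans_df keeps the span-ID index intact.
eval_df = spans_df.apply(fake_eval, axis=1).to_frame("label")

print(list(eval_df.index))  # ['aaa111', 'bbb222']
```

Because the result is still indexed by context.span_id, it is in the shape Phoenix expects when logging evaluations back.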
Can I pull a dataset from Phoenix and convert it to a DataFrame? llm_classify requires the data to be a DataFrame or a list.
There is dataset.as_dataframe. In fact, the problem is that I don't know which function I should use to run evaluations; there are several:
phoenix.experiments.run_experiment against a dataset. The result is saved in Phoenix.
phoenix.experiments.eval_experiment against an experiment. The result is saved in Phoenix.
phoenix.evals.llm_classify against a DataFrame or list. The result isn't saved in Phoenix.
When should I use phoenix.evals.llm_classify, given that it doesn't save the result to Phoenix? Should I use Phoenix's built-in evaluators in experiments instead?
Experiments already have inputs, outputs, and expected values defined on each experiment run, so defining a function that takes those values and evaluates them is much more straightforward. We provide examples in our documentation and supply some evaluators by default: https://docs.arize.com/phoenix/datasets-and-experiments/how-to-experiments/using-evaluators
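A minimal sketch of such an evaluator (exact_match here is a made-up example, not a Phoenix built-in): as described above, experiment runs already carry the output and expected values, so an evaluator can be a plain function whose parameters are bound by name from each run.

```python
# Hypothetical evaluator in the style Phoenix experiments expect: a plain
# function whose parameters (e.g. output, expected) are filled in by name
# from each experiment run. It would be passed via evaluators=[exact_match]
# to run_experiment; here it is just called directly for illustration.
def exact_match(output: str, expected: str) -> float:
    # Score 1.0 when the task output equals the labeled answer, else 0.0.
    return float(output == expected)

print(exact_match(output="Paris", expected="Paris"))  # 1.0
print(exact_match(output="Lyon", expected="Paris"))   # 0.0
```

This is why experiments sidestep the context.span_id bookkeeping entirely: Phoenix matches results to runs itself and stores the scores.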
