Hi, in this tutorial: https://docs.arize.com/phoenix/evaluation/how-to-evals/running-pre-tested-evals/q-and-a-on-retrieved-data how can I send the results back to Phoenix?
1. Prepare Evaluation Data: Ensure your evaluation results are in a DataFrame format, with necessary columns like context.span_id to associate evaluations with specific spans.
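As a rough illustration of the step above, here is a minimal sketch of an evaluation DataFrame indexed by context.span_id. The span IDs, the helper names build_eval_frame and log_to_phoenix, and the eval name are hypothetical placeholders, not taken from the tutorial. The logging call assumes an arize-phoenix version that exposes px.Client().log_evaluations and phoenix.trace.SpanEvaluations, and a running Phoenix server that already holds traces with matching span IDs.

```python
import pandas as pd

# Hypothetical span IDs for illustration only; real values would have to
# match spans that already exist in Phoenix.
span_ids = ["span-aaa", "span-bbb", "span-ccc"]


def build_eval_frame(span_ids, labels, explanations=None):
    """Return an evaluation DataFrame indexed by context.span_id."""
    if len(span_ids) != len(labels):
        raise ValueError("span_ids and labels must have the same length")
    frame = pd.DataFrame(
        {
            "label": labels,
            # Map the label to a numeric score: 1.0 for "correct", else 0.0.
            "score": [1.0 if label == "correct" else 0.0 for label in labels],
        },
        index=pd.Index(span_ids, name="context.span_id"),
    )
    if explanations is not None:
        frame["explanation"] = explanations
    return frame


def log_to_phoenix(frame, eval_name="Q&A Correctness"):
    """Send the evaluations to Phoenix (assumed API, see the note above)."""
    # Imported lazily so the rest of the sketch runs without Phoenix installed.
    import phoenix as px
    from phoenix.trace import SpanEvaluations

    px.Client().log_evaluations(
        SpanEvaluations(eval_name=eval_name, dataframe=frame)
    )


evals_df = build_eval_frame(span_ids, ["correct", "incorrect", "correct"])
print(evals_df)
```

Calling log_to_phoenix(evals_df) would then attach each row to the span named in its index, provided those spans exist on the server.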
Where does the context.span_id come from? The dataset is:
from phoenix.evals import download_benchmark_dataset

N_EVAL_SAMPLE_SIZE = 3
df = download_benchmark_dataset(
    task="qa-classification", dataset_name="qa_generated_dataset"
)
# Sample a few rows and rename the columns to the names the evaluator expects.
df_sample = (
    df.sample(n=N_EVAL_SAMPLE_SIZE)
    .reset_index(drop=True)
    .rename(
        columns={
            "question": "input",
            "context": "reference",
            "sampled_answer": "output",
        }
    )
)
I found that I already asked a similar question here: https://arize-ai.slack.com/archives/C04R3GXC8HK/p1738917892261619?thread_ts=1737099011.156239&cid=C04R3GXC8HK This one can be ignored.
