Evaluations most often fail to show up on examples when the logged results are not associated with the correct spans. According to <https://docs.arize.com/phoenix/evaluation/how-to-evals/bring-your-own-evaluator#%3AR6cd9uucqfkvfa%3A|Phoenix documentation>, when logging evaluations, ensure that your `test_results` dataframe includes a `context.span_id` column with the corresponding span ID for each row. This is what allows Phoenix to attach each evaluation to the correct example.
Additionally, if you are manually constructing dataframes for evaluations, ensure they match the schema Phoenix's logging methods expect. The <https://github.com/Arize-ai/phoenix/issues/6525|log_evaluations method> is intended for dataframes computed via evaluation helpers like `run_evals`, so a hand-built dataframe must mirror that structure (span IDs plus label/score/explanation columns). If evaluations still are not appearing, verify that both the data structure and the logging call align with Phoenix's requirements.
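As a rough illustration, a manually constructed dataframe might look like the sketch below. The span IDs, evaluation name, and column values are placeholders (you would use the real `context.span_id` values from your traces); the Phoenix logging call is shown commented as a sketch of the `SpanEvaluations` / `log_evaluations` pattern described in the docs.

```python
import pandas as pd

# Placeholder span IDs for illustration -- replace with the actual
# context.span_id values from your traced examples.
test_results = pd.DataFrame(
    {
        "context.span_id": ["7e2f08cb43bbbf5c", "0f1a2b3c4d5e6f70"],
        "label": ["correct", "incorrect"],
        "score": [1, 0],
        "explanation": [
            "Answer matches the reference.",
            "Answer contradicts the reference.",
        ],
    }
).set_index("context.span_id")  # Phoenix matches evals to spans via this index

# Sketch of the logging step (assumes a running Phoenix instance):
# import phoenix as px
# from phoenix.trace import SpanEvaluations
#
# px.Client().log_evaluations(
#     SpanEvaluations(eval_name="My Eval", dataframe=test_results)
# )
```

If the index (or a `context.span_id` column) is missing, Phoenix has no way to associate the rows with spans, which matches the symptom of evaluations not appearing on examples.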
Would you like more detail on how to ensure your evaluations are correctly logged and displayed?
Sources:
- <https://github.com/Arize-ai/phoenix/issues/6525|GitHub Issue #6525>
- <https://docs.arize.com/phoenix/evaluation/how-to-evals/bring-your-own-evaluator#%3AR6cd9uucqfkvfa%3A|Phoenix documentation>