Hey guys, when logging evaluations, what is supposed to be displayed where it shows --? Is there a column I should have in my df? Thanks!
The -- in the Phoenix UI typically indicates missing or undefined data for a column in your evaluation DataFrame. Phoenix uses the span_id, label, score, and explanation columns to associate evaluations with spans and display them in the UI, so your DataFrame should include them. If you are seeing --, the likely causes are missing values in these columns or column names that don't match the expected ones. Check that your DataFrame is correctly formatted and that every required column is populated. Would you like more detail on how to format your DataFrame for logging evaluations in Phoenix?
Sources:
- Evaluating Phoenix Traces
- Log Evaluation Results
💡 Hint: Mention RunLLM in the thread for followups.
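For reference, here is a minimal sketch of an evaluation DataFrame with the four columns named in the answer above. The span IDs, labels, and explanations are made-up placeholders, and whether span_id should be a regular column or the index may depend on your Phoenix version, so treat the layout as an assumption rather than the definitive format.

```python
import pandas as pd

# Placeholder values for illustration only; real span IDs come from your traces.
eval_df = pd.DataFrame(
    {
        "span_id": ["span-a", "span-b", "span-c"],
        "label": ["correct", "incorrect", "correct"],
        "score": [1.0, 0.0, 1.0],
        "explanation": [
            "Answer matches the reference.",
            "Answer contradicts the retrieved context.",
            "Answer matches the reference.",
        ],
    }
)

# Every cell is populated, so no column here should be a source of "--".
all_populated = bool(eval_df.notna().all().all())

# Assumption: the frame is then logged with something along the lines of
#   px.Client().log_evaluations(SpanEvaluations(eval_name="...", dataframe=eval_df))
# Check the Log Evaluation Results page for the exact call in your version.
```

If `all_populated` is False, the columns containing nulls are the first place to look.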
You're welcome! Yes, making sure the score column is correctly populated can help resolve the issue. If the score column is missing or contains empty values, metrics like ndcg and precision might not be computed, which leads to -- being displayed. Make sure your DataFrame includes a score column with appropriate values. If you need further assistance, feel free to ask!
Sources:
- Evaluating Phoenix Traces
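A quick way to spot the gaps described above before logging is to scan the DataFrame for absent columns and null values. This is only a sketch: `report_gaps` and `REQUIRED_COLUMNS` are hypothetical names, not part of Phoenix, and the required column list is taken from the answers in this thread.

```python
import pandas as pd

# Column names taken from the thread; adjust if your eval uses different ones.
REQUIRED_COLUMNS = ("span_id", "label", "score", "explanation")


def report_gaps(df, required=REQUIRED_COLUMNS):
    """Return {column: problem} for required columns that are absent or contain nulls."""
    gaps = {}
    for col in required:
        if col not in df.columns:
            gaps[col] = "missing column"
        else:
            null_count = int(df[col].isna().sum())
            if null_count:
                gaps[col] = f"{null_count} null value(s)"
    return gaps


# Sample frame with no score column and one empty label.
sample = pd.DataFrame(
    {
        "span_id": ["span-a", "span-b"],
        "label": ["relevant", None],
        "explanation": ["Directly answers the query.", "Off topic."],
    }
)
gaps = report_gaps(sample)
```

An empty result means none of the checked columns is absent or has nulls, so the -- would have to come from somewhere else.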
In this feedback tab, it would be nice to be able to populate custom data. For this eval specifically, the first thing I'd put there would be the question & answer, so I don't have to find the span that actually has them. There are also non-technical people working with us... we engineers find our way to those, but having all the info in that tab would help non-technical users analyze large batches of conversations.
It's nice that hovering over the feedback tag displays the options to filter by matching or excluding that label. It would be nice for the chips at the top to have that same functionality. Sometimes we have to scroll a lot until we find the feedback we want to filter by. It's possible to just type the filter, but I'm thinking about the non-technical users again.
