Hi, I'm currently running some custom evaluations using llm_classify from phoenix.evals as part of an LLM-as-a-judge workflow. Our goal is to evaluate model performance on a local DataFrame and visualize the results in Phoenix. How can we log these custom evaluations to Phoenix so that they appear in the dashboard for performance tracking? Thanks in advance!
We restructured our workflow and ended up with this solution:
1. Make a brand new project
2. Trace the LLM calls
3. Fetch the traced spans with px_client.query_spans
4. Run our custom eval function on the output
5. Finally, use px.Client().log_evaluations with the eval DataFrame to log it into Phoenix 😄

Are there any other solutions that would work better?
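A minimal sketch of steps 4–5 above, assuming your judge (e.g. llm_classify) yields a label and score per span. The `build_eval_dataframe` helper and the sample span IDs are hypothetical; the key point is that Phoenix expects the eval DataFrame to be indexed by `context.span_id` so results can be joined back to the traced spans:

```python
import pandas as pd

def build_eval_dataframe(results):
    """Build an evals DataFrame indexed by span ID.

    results: list of (span_id, label, score) tuples, e.g. the
    per-row output of a custom judge such as llm_classify.
    """
    df = pd.DataFrame(results, columns=["context.span_id", "label", "score"])
    # Phoenix matches evaluations to spans via the DataFrame index.
    return df.set_index("context.span_id")

evals_df = build_eval_dataframe([
    ("span-1", "correct", 1.0),    # hypothetical span IDs
    ("span-2", "incorrect", 0.0),
])

# Logging into Phoenix (requires a running Phoenix instance, so it is
# commented out here):
# import phoenix as px
# from phoenix.trace import SpanEvaluations
# px.Client().log_evaluations(
#     SpanEvaluations(eval_name="Judge Quality", dataframe=evals_df)
# )
```

After logging, the evaluation shows up in the Phoenix UI attached to each span, so you can filter and track judge scores alongside the traces.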
Yes 🔒[private user], I can join a huddle in 5 minutes?
