Hello Phoenix Team, I have a dataframe that has AI predictions as well as the ground truth label columns in it. I am able to upload it as a dataset. Now, I calculate accuracy, precision, recall, and f1 using the scikit-learn library. How can I log those results as an Experiment in Phoenix for that dataset?
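(For context, the metrics are computed along these lines; the "prediction" and "ground_truth" column names below are just assumptions for illustration, not the actual column names in my dataframe:)

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# df holds the AI predictions and the ground truth labels
accuracy = accuracy_score(df["ground_truth"], df["prediction"])
precision = precision_score(df["ground_truth"], df["prediction"], average="macro")
recall = recall_score(df["ground_truth"], df["prediction"], average="macro")
f1 = f1_score(df["ground_truth"], df["prediction"], average="macro")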
I have logged my dataset to Phoenix, but I am struggling to implement the rest of the code because I don't understand how to retrieve the entire dataframe inside the task function. Based on the docs, it seems the task iterates over each row of the dataframe, yet when calculating metrics I compute them over the entire dataframe at once.
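(For reference, the experiments docs seem to describe a per-row pattern roughly like the sketch below; the dataset name, the column keys, and my_model_predict are assumptions, not from my actual project:)

import phoenix as px
from phoenix.experiments import run_experiment

client = px.Client()
dataset = client.get_dataset(name="my-dataset")  # assumed dataset name

def task(input):
    # Phoenix calls this once per dataset example; `input` is that row's input dict.
    return my_model_predict(input["question"])  # hypothetical model call

def exact_match(output, expected):
    # Also called once per example; the per-row scores are then aggregated by Phoenix.
    return float(output == expected["label"])  # assumed key in the expected output

run_experiment(dataset=dataset, task=task, evaluators=[exact_match], experiment_name="baseline")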
import pandas as pd
from phoenix.trace import SpanEvaluations
import phoenix as px

# accuracy, precision, recall, and f1 are computed earlier with scikit-learn
metrics_df = pd.DataFrame({
    'span_id': [1],  # dummy span_id for the dataset-level metrics
    'accuracy': [accuracy],
    'precision': [precision],
    'recall': [recall],
    'f1': [f1],
})

client = px.Client()
client.log_evaluations(SpanEvaluations(eval_name="Global Metrics", dataframe=metrics_df))

RunLLM I am trying to use this code snippet; however, I am struggling to use the new self-contained client in place of px.Client() in this code. Can you please tell me how I can log evaluations with the new self-contained client?
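(For what it's worth, my understanding from the evals docs is that a SpanEvaluations dataframe is normally indexed by real span IDs and carries score/label/explanation columns rather than a single row with four metric columns; here is a rough sketch with the legacy px.Client, where the eval name and the use of get_spans_dataframe are my assumptions:)

import phoenix as px
from phoenix.trace import SpanEvaluations

client = px.Client()
spans_df = client.get_spans_dataframe()  # spans indexed by context.span_id

evals_df = spans_df[[]].copy()           # keep only the span-id index
evals_df["score"] = accuracy             # attach the dataset-level score to every span

client.log_evaluations(SpanEvaluations(eval_name="accuracy", dataframe=evals_df))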
RunLLM If I use the code above with log_evaluations, where in the Phoenix UI can I find them?
Hello Priyan, Thank you for the response. Can you send me some pointers or code snippets on how to log metrics to all datapoints in my dataset?
Priyan, quick question... I am using this as an example: https://arize.com/docs/phoenix/cookbook/tracing-and-annotations/generating-synthetic-datasets-for-llm-evaluators-and-agents#upload-agent-dataset What I don't understand is how the task_function in this experiment receives the input and reference arguments while the experiment runs. I don't see anything being passed in the run_experiment command in that example. It is very confusing to me how I actually pass any arguments to the task and evaluator functions.
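(My current understanding is that run_experiment inspects the task and evaluator signatures and binds arguments by parameter name from each dataset example; a rough sketch, where the dataset name, field keys, and answer_question are assumptions:)

import phoenix as px
from phoenix.experiments import run_experiment

dataset = px.Client().get_dataset(name="agent-dataset")  # assumed dataset name

def task_function(input, expected, metadata):
    # Bound by parameter name: `input` and `metadata` come from the example's
    # input and metadata dicts, `expected` from its reference output dict.
    return answer_question(input["question"])  # hypothetical function

def evaluator(output, expected):
    # `output` is whatever task_function returned for this example.
    return float(output == expected["answer"])  # assumed key

run_experiment(dataset=dataset, task=task_function, evaluators=[evaluator])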
Ok, got it. And how do I access metadata columns in the task and evaluator? Do I just use the word metadata?
def task(input, metadata):
    question = input["question"]
    answer = metadata["experiment_id"]
    return answer
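(And, as far as I can tell from the docs, an evaluator can declare the same metadata parameter to receive that example's metadata dict; the key below is just an assumption:)

def evaluator(output, metadata):
    # `metadata` is this example's metadata dict from the dataset
    return float(output == metadata["experiment_id"])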