Newbie. Sorry if I missed it in the docs. How can I retrieve annotation explanations in Phoenix? I'd like to be able to show, for each span, its eval name, label, score, and explanation, as well as human annotations, but I don't see where this lives. For instance, if I am pulling down spans of interest like in the code below:
spans_df = client.get_spans_dataframe("evals['Q&A Correctness'].label == 'incorrect'")
Appreciate the help in advance :)
To retrieve annotation explanations in Phoenix, you need to join the spans dataframe with the evaluations dataframe. Phoenix does not directly support selecting evals as part of the select statement in a query. Instead, you can retrieve the spans and evaluations separately and then join them using pandas. Here's how you can do it:
1. Retrieve Spans DataFrame: Use get_spans_dataframe to get the spans DataFrame.
2. Retrieve Evaluations: Use get_evaluations to get the evaluations DataFrame.
3. Join DataFrames: Join the spans and evaluations DataFrames on the span_id.
Here's a code snippet to achieve this:
import phoenix as px
import pandas as pd

client = px.Client()

# Retrieve spans dataframe
spans_df = client.get_spans_dataframe("evals['Q&A Correctness'].label == 'incorrect'")

# Retrieve evaluations and pick out the one named "Q&A Correctness"
span_evals = client.get_evaluations()
qa_correctness_evals = [span_eval for span_eval in span_evals if span_eval.eval_name == "Q&A Correctness"]
assert len(qa_correctness_evals) == 1
qa_correctness_eval = qa_correctness_evals[0]
evals_df = qa_correctness_eval.dataframe

# Join spans and evaluations dataframes
# (join aligns on the index, which is expected to be the span ID in both)
joined_df = spans_df.join(
    evals_df.rename(columns={
        'label': 'qa_correctness_label',
        'score': 'qa_correctness_score',
        'explanation': 'qa_correctness_explanation',
    }),
    how='inner',
)

print(joined_df)

This will give you a DataFrame with the span's eval name, label, score, explanation, and any human annotations.
Would you like more detail?
Sources:
- Querying Annotations in Phoenix with SpanQuery
- Span Evaluations
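To make the join step concrete, here is a minimal sketch on toy data. It assumes both dataframes are indexed by span ID (named context.span_id here); the helper join_evals, the span IDs, and all values are hypothetical and only stand in for what Phoenix would return.

```python
import pandas as pd

# Toy stand-ins for the two dataframes; real ones would come from Phoenix.
spans_df = pd.DataFrame(
    {"name": ["query", "retrieve", "llm"]},
    index=pd.Index(["s1", "s2", "s3"], name="context.span_id"),
)
evals_df = pd.DataFrame(
    {
        "label": ["incorrect", "correct"],
        "score": [0, 1],
        "explanation": ["Answer contradicts the context.", "Answer matches."],
    },
    index=pd.Index(["s1", "s3"], name="context.span_id"),
)


def join_evals(spans: pd.DataFrame, evals: pd.DataFrame, prefix: str) -> pd.DataFrame:
    """Inner-join eval columns onto spans by index, prefixing each eval column."""
    renamed = evals.rename(columns=lambda col: f"{prefix}_{col}")
    return spans.join(renamed, how="inner")


joined_df = join_evals(spans_df, evals_df, "qa_correctness")
print(joined_df)
```

Because the join is an inner join, spans without a matching eval row (s2 in this toy data) drop out of the result. Prefixing the columns keeps several evals from colliding if you repeat the join once per eval name.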
To retrieve details of human annotations in Phoenix, you can use the GraphQL API to query for span annotations. This will allow you to get the name, label, score, and explanation for human annotations associated with each span. Here's a step-by-step approach:
1. GraphQL Query: Use a GraphQL query to fetch span annotations, including both LLM and human annotations.
Here's a code snippet to achieve this:
from httpx import Client

query = """
query ($spanId: GlobalID!) {
  span: node(id: $spanId) {
    ... on Span {
      id
      spanAnnotations {
        name
        score
        label
        explanation
        annotatorKind
      }
    }
  }
}
"""

span_id = "U3BhbjoxMA=="  # replace with your span id
client = Client(base_url="http://localhost:6006")
response = client.post(
    "/graphql",
    json={
        "query": query,
        "variables": {
            "spanId": span_id,
        },
    },
)

assert response.status_code == 200
response_json = response.json()
assert response_json.get("errors") is None
print(response_json["data"])

This will return the annotations for the specified span, including the name, label, score, explanation, and annotator kind (HUMAN or LLM).
Would you like more detail?
Sources:
- GraphQL Query for Span Annotations
- Span Annotations
Can you give this code a try? project_id is the string in the url, e.g. `UHJvamVjdDox` in http://localhost:6006/projects/UHJvamVjdDox
import pandas as pd
from httpx import Client

# Fetch the project's spans along with their annotations.
query = """
query ($projectId: GlobalID!) {
  project: node(id: $projectId) {
    ... on Project {
      spans {
        edges {
          node {
            context {
              spanId
            }
            spanAnnotations {
              name
              score
              label
              explanation
              annotatorKind
            }
          }
        }
      }
    }
  }
}
"""

project_id = "UHJvamVjdDox"  # replace with your project id
base_url = "http://localhost:6006"
response = Client(base_url=base_url).post(
    "/graphql",
    json={
        "query": query,
        "variables": {
            "projectId": project_id,
        },
    },
)
response_json = response.json()

# One row per span, indexed by span ID.
df = pd.json_normalize(response_json["data"]["project"]["spans"]["edges"])
df = df.rename({"node.context.spanId": "span_id"}, axis=1)
df = df.set_index("span_id")

# Keep only spans that have at least one annotation.
df = df.loc[df.loc[:, "node.spanAnnotations"].apply(len) > 0]

# One row per annotation, still indexed by span ID.
exploded = df.loc[:, "node.spanAnnotations"].explode()
df = pd.json_normalize(exploded).set_index(exploded.index)
df
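For anyone adapting the flattening step, here is a small offline sketch of the same idea: turning the nested edges into one row per annotation. It builds the rows with a plain comprehension instead of explode, which is a swapped-in approach rather than the code above. The mock_response payload and the helper annotations_frame are hypothetical; only the field names are taken from the query.

```python
import pandas as pd

# Hypothetical payload shaped like the GraphQL response; all values are made up.
mock_response = {
    "data": {"project": {"spans": {"edges": [
        {"node": {"context": {"spanId": "a1"}, "spanAnnotations": [
            {"name": "Q&A Correctness", "score": 0.0, "label": "incorrect",
             "explanation": "Contradicts the context.", "annotatorKind": "LLM"},
            {"name": "helpfulness", "score": 1.0, "label": "good",
             "explanation": None, "annotatorKind": "HUMAN"},
        ]}},
        {"node": {"context": {"spanId": "b2"}, "spanAnnotations": []}},
    ]}}}
}

COLUMNS = ["span_id", "name", "score", "label", "explanation", "annotatorKind"]


def annotations_frame(response_json: dict) -> pd.DataFrame:
    """Return one row per annotation, indexed by span ID."""
    edges = response_json["data"]["project"]["spans"]["edges"]
    rows = [
        {"span_id": edge["node"]["context"]["spanId"], **annotation}
        for edge in edges
        for annotation in edge["node"]["spanAnnotations"]
    ]
    # Passing explicit columns keeps the frame well-formed when rows is empty.
    return pd.DataFrame(rows, columns=COLUMNS).set_index("span_id")


annotations_df = annotations_frame(mock_response)
print(annotations_df)
```

A span with several annotations appears once per annotation, and a span with none does not appear at all, so the row count tracks annotations rather than spans.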
Thank you, Roger Y. I substituted in my project_id (the same demo project you had) and set base_url = "https://app.phoenix.arize.com", but I get the following error when running the client: <Response [401 Unauthorized]>
oh that's a bummer… this is complicated then because it would need the cookies, which expire periodically
I think it would work to send the request with an API key header.
ok i tried it and it does work
response = Client(base_url=base_url, headers={"api_key": "..."}).post(
thanks, Xander!
Should the headers be {"Authorization": "Bearer <api-key>"}?
for hosted it's api_key=
Got it, good call.
Forgot about that difference with hosted vs. self-hosted.
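As a sketch of the header difference discussed above: the helper below picks the header shape based on whether the target is the hosted service. The function name auth_headers and the hosted flag are hypothetical. The api_key form for hosted is what was confirmed to work in this thread; the Bearer form for self-hosted is the one suggested above and is an assumption to check against your own deployment.

```python
def auth_headers(api_key: str, hosted: bool) -> dict:
    """Build request headers for the GraphQL call.

    hosted=True  -> {"api_key": ...}, the form reported to work in this thread
    hosted=False -> {"Authorization": "Bearer ..."}, assumed for self-hosted
    """
    if hosted:
        return {"api_key": api_key}
    return {"Authorization": f"Bearer {api_key}"}


# Placeholder key for illustration only.
hosted_headers = auth_headers("my-key", hosted=True)
self_hosted_headers = auth_headers("my-key", hosted=False)
print(hosted_headers)
print(self_hosted_headers)
```

The resulting dict can be passed as headers= when constructing the httpx Client, as in the snippet earlier in the thread.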
Thank you. Much appreciated. That fixed the authentication. I will have to work on the structure a bit. The final df only shows eight rows coming back from the demo project "demo_llama_index". Must be something in the explode or normalize lines.
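One way to narrow down a small row count is to count things before any filtering. The sketch below is a hypothetical diagnostic (summarize_edges and the sample edges are made up) that reports how many spans came back, how many carry annotations, and how many annotation rows to expect after exploding. It only checks the reshaping; it cannot tell you whether the server returned every span in the project.

```python
def summarize_edges(edges: list) -> dict:
    """Count spans and annotations in the raw GraphQL edges before reshaping."""
    annotation_counts = [len(edge["node"]["spanAnnotations"]) for edge in edges]
    return {
        "spans_returned": len(edges),
        "spans_with_annotations": sum(1 for n in annotation_counts if n > 0),
        "annotation_rows": sum(annotation_counts),
    }


# Made-up edges in the same shape as response_json["data"]["project"]["spans"]["edges"].
sample_edges = [
    {"node": {"context": {"spanId": "a1"}, "spanAnnotations": [{"name": "x"}, {"name": "y"}]}},
    {"node": {"context": {"spanId": "b2"}, "spanAnnotations": []}},
    {"node": {"context": {"spanId": "c3"}, "spanAnnotations": [{"name": "x"}]}},
]

summary = summarize_edges(sample_edges)
print(summary)
```

If annotation_rows matches the length of the final df, the explode and normalize lines are behaving as written, and the small count reflects how many annotations exist on the spans that were returned.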
