Troubleshooting Retrieval Evaluation Logging Errors in Phoenix
Hi team, I’m trying to set up a retrieval relevance evaluator, but I get an error when logging the evaluation results to the server. My script is:
import phoenix as px
from phoenix.trace import DocumentEvaluations
from phoenix.session.evaluation import get_retrieved_documents
from phoenix.evals import (
    RAG_RELEVANCY_PROMPT_RAILS_MAP,
    RAG_RELEVANCY_PROMPT_TEMPLATE,
    VertexAIModel,
    llm_classify,
)

px_client = px.Client(endpoint="http://localhost:6006")
retrieved_documents_df = get_retrieved_documents(px_client)
print(retrieved_documents_df)

retrieved_documents_eval = llm_classify(
    dataframe=retrieved_documents_df,
    model=VertexAIModel(project="b6i-mainline", location="us-central1"),
    template=RAG_RELEVANCY_PROMPT_TEMPLATE,
    rails=list(RAG_RELEVANCY_PROMPT_RAILS_MAP.values()),
    provide_explanation=True,
)
retrieved_documents_eval["score"] = (
    retrieved_documents_eval.label[~retrieved_documents_eval.label.isna()] == "relevant"
).astype(int)
print(retrieved_documents_eval)

px_client.log_evaluations(
    DocumentEvaluations(eval_name="Relevance", dataframe=retrieved_documents_eval)
)
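For reference, the score derivation in the script can be checked on its own, without Phoenix or an LLM. This is a minimal sketch on a made-up DataFrame: the span id `span-a` and the three labels are illustrative placeholders, not real evaluation output, and only the index names match what the script prints.

```python
import pandas as pd

# Toy stand-in for the llm_classify output; the span id and labels are made up.
index = pd.MultiIndex.from_tuples(
    [("span-a", 0), ("span-a", 1), ("span-a", 2)],
    names=["context.span_id", "document_position"],
)
evals = pd.DataFrame({"label": ["relevant", "unrelated", None]}, index=index)

# Compare only the non-null labels against "relevant" and cast to 0/1.
# Rows whose label is missing are absent from the right-hand side, so
# index alignment leaves their score as NaN.
evals["score"] = (evals.label[~evals.label.isna()] == "relevant").astype(int)
print(evals)
```

Because the row without a label gets a missing score, the column ends up as floats rather than integers whenever any label is null. In the output posted here every row has a label, so the scores stay 0 and 1.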
And then I get:
Traceback (most recent call last):
  File "/Users/gsolovev/PycharmProjects/bsci/benchsci/services/gen_ai/common/instrumentation/evals.py", line 65, in <module>
    px_client.log_evaluations(
  File "/Users/gsolovev/PycharmProjects/bsci/venv/lib/python3.11/site-packages/phoenix/session/client.py", line 200, in log_evaluations
    ).raise_for_status()
      ^^^^^^^^^^^^^^^^^^
  File "/Users/gsolovev/PycharmProjects/bsci/venv/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 415 Client Error: Unsupported Media Type for url: http://localhost:6006/v1/evaluations
The last print statement returns:
                                        label  ...  score
context.span_id  document_position             ...
421618672b42bde7 0                   relevant  ...      1
                 1                   relevant  ...      1
                 2                   relevant  ...      1
                 3                  unrelated  ...      0
                 4                  unrelated  ...      0
                 5                   relevant  ...      1
                 6                  unrelated  ...      0
                 7                  unrelated  ...      0
                 8                  unrelated  ...      0
                 9                  unrelated  ...      0
                 10                 unrelated  ...      0
                 11                 unrelated  ...      0
                 12                 unrelated  ...      0
                 13                 unrelated  ...      0
                 14                 unrelated  ...      0
                 15                  relevant  ...      1
                 16                 unrelated  ...      0
                 17                 unrelated  ...      0
                 18                 unrelated  ...      0
                 19                 unrelated  ...      0
Any ideas what could be wrong?
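
One possible cause, offered as an assumption rather than a confirmed diagnosis: HTTP 415 means the server rejected the Content-Type of the request body, which can happen when the installed client library and the running server are different versions and disagree on the payload format. A minimal sketch for checking the installed client version, using only the standard library; `arize-phoenix` is assumed to be the distribution name, and `installed_version` is a hypothetical helper, not part of Phoenix.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Assumption: "arize-phoenix" is the name the Phoenix client is installed under.
print("phoenix client version:", installed_version("arize-phoenix"))
```

If the version printed here differs from the version of the server listening on http://localhost:6006 (for example, a server started from another environment or an older container image), aligning the two would be a reasonable first thing to try.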
