Hi team, I’m trying to set up a retriever relevancy evaluator, but I get an error when logging the evaluator results to the server. My script is:
import phoenix as px
from phoenix.trace import DocumentEvaluations
from phoenix.session.evaluation import get_retrieved_documents
from phoenix.evals import (
    RAG_RELEVANCY_PROMPT_RAILS_MAP,
    RAG_RELEVANCY_PROMPT_TEMPLATE,
    VertexAIModel,
    llm_classify,
)

px_client = px.Client(endpoint="http://localhost:6006")
retrieved_documents_df = get_retrieved_documents(px_client)
print(retrieved_documents_df)

retrieved_documents_eval = llm_classify(
    dataframe=retrieved_documents_df,
    model=VertexAIModel(project="b6i-mainline", location="us-central1"),
    template=RAG_RELEVANCY_PROMPT_TEMPLATE,
    rails=list(RAG_RELEVANCY_PROMPT_RAILS_MAP.values()),
    provide_explanation=True,
)
retrieved_documents_eval["score"] = (
    retrieved_documents_eval.label[~retrieved_documents_eval.label.isna()] == "relevant"
).astype(int)
print(retrieved_documents_eval)

px_client.log_evaluations(
    DocumentEvaluations(eval_name="Relevance", dataframe=retrieved_documents_eval)
)
And then I get:
Traceback (most recent call last):
  File "/Users/gsolovev/PycharmProjects/bsci/benchsci/services/gen_ai/common/instrumentation/evals.py", line 65, in <module>
    px_client.log_evaluations(
  File "/Users/gsolovev/PycharmProjects/bsci/venv/lib/python3.11/site-packages/phoenix/session/client.py", line 200, in log_evaluations
    ).raise_for_status()
      ^^^^^^^^^^^^^^^^^^
  File "/Users/gsolovev/PycharmProjects/bsci/venv/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 415 Client Error: Unsupported Media Type for url: http://localhost:6006/v1/evaluations
print statement returns:
                                       label  ...  score
context.span_id  document_position            ...
421618672b42bde7 0                  relevant  ...      1
                 1                  relevant  ...      1
                 2                  relevant  ...      1
                 3                 unrelated  ...      0
                 4                 unrelated  ...      0
                 5                  relevant  ...      1
                 6                 unrelated  ...      0
                 7                 unrelated  ...      0
                 8                 unrelated  ...      0
                 9                 unrelated  ...      0
                 10                unrelated  ...      0
                 11                unrelated  ...      0
                 12                unrelated  ...      0
                 13                unrelated  ...      0
                 14                unrelated  ...      0
                 15                 relevant  ...      1
                 16                unrelated  ...      0
                 17                unrelated  ...      0
                 18                unrelated  ...      0
                 19                unrelated  ...      0
Any ideas what could be wrong?
Hi Gregory S., I think the score might need to be a float instead of an int.
retrieved_documents_eval["score"] = (
    retrieved_documents_eval.label[~retrieved_documents_eval.label.isna()] == "relevant"
).astype(int)
Unfortunately that didn’t work. Seeing the same error.
Gregory S. Are you running the Phoenix client and the Phoenix server from the same runtime?
You appear to be hitting this line in the handler for eval ingestion.
Our px.Client.log_evaluations method sends with "application/x-pandas-arrow" content-type headers. https://github.com/Arize-ai/phoenix/blob/276d8dc884662f429eff9b683559a1261949ecaa/src/phoenix/session/client.py#L191
This suggests there is a mismatch between server and client.
I run the server through docker. And the Client() is only being used as in the script above. Any suggestions how I can troubleshoot deeper on my end?
Gregory S., could you try running a simple debug server and creating a client that points to it, so we can check the request payload? Here's an example debug server using Flask:
from flask import Flask, request
import logging

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

app = Flask(__name__)


@app.route("/", defaults={"path": ""})
@app.route("/<path:path>", methods=["GET", "POST", "PUT", "DELETE", "PATCH"])
def catch_all(path):
    logging.info(f"Received a {request.method} request at /{path}")
    logging.info(f"Headers: {request.headers}")
    body = request.get_data(as_text=True)
    logging.info(f"Body: {body}")
    return f"Caught {request.method} request at /{path}\n"


if __name__ == "__main__":
    app.run(debug=True, port=5000)  # Runs on http://localhost:5000
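Since log_evaluations sends the payload with an "application/x-pandas-arrow" content type, the body is binary and does not read well when decoded as text. As a rough alternative, here is a minimal sketch of the same idea using only the Python standard library. It records the method, path, Content-Type and body length, and logs a short hex preview instead of the raw body. The names DebugHandler, start_debug_server, post_sample and captured are made up for this illustration, and the sample payload is a placeholder, not a real Arrow stream.

```python
import logging
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")

# Requests seen so far, kept in memory so they can be inspected afterwards.
captured = []

# Placeholder bytes for illustration only; this is NOT a valid Arrow payload.
SAMPLE_PAYLOAD = b"\xff\xff\xff\xff sample"


class DebugHandler(BaseHTTPRequestHandler):
    def _record(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else b""
        entry = {
            "method": self.command,
            "path": self.path,
            "content_type": self.headers.get("Content-Type"),
            "body_length": len(body),
        }
        captured.append(entry)
        # Log a short hex preview rather than decoding the binary body as text.
        logging.info("%s preview=%s", entry, body[:16].hex())
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    # Handle GET and POST the same way.
    do_GET = do_POST = _record

    def log_message(self, format, *args):
        pass  # silence the default per-request line on stderr


def start_debug_server(port=0):
    """Start the server on a background thread; port=0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), DebugHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_sample(port, payload=SAMPLE_PAYLOAD, content_type="application/x-pandas-arrow"):
    """Send a small POST that mimics the Content-Type header discussed in this thread."""
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/v1/evaluations",
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Bypass any proxy settings from the environment for this local request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req) as resp:
        return resp.status
```

To use it with Phoenix, start the server on a fixed port (for example start_debug_server(5000)) and point px.Client at that endpoint, as with the Flask version. The entries in captured then show which Content-Type the client actually sent.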
2024-05-08 10:48:44,074 - INFO - Body:
2024-05-08 10:48:44,075 - INFO - 127.0.0.1 - - [08/May/2024 10:48:44] "GET /arize_phoenix_version HTTP/1.1" 200 -
2024-05-08 10:49:03,047 - INFO - Received a POST request at /v1/evaluations
2024-05-08 10:49:03,047 - INFO - Headers: Host: localhost:5000
User-Agent: python-requests/2.31.0
Accept-Encoding: gzip, deflate, br
Accept: */*
Connection: keep-alive
Content-Type: application/x-pandas-arrow
Project-Name: default
Content-Length: 6512
2024-05-08 10:49:03,047 - INFO - Body: �����
P�p����q{"eval_id": "4b9140fd-2570-437b-adfb-1763f6071503", "eval_name": "Relevance", "eval_type": "DocumentEvaluations"}arize
��{"index_columns": ["context.span_id", "document_position"], "column_indexes": [{"name": null, "field_name": null, "pandas_type": "unicode", "numpy_type": "object", "metadata": {"encoding": "UTF-8"}}], "columns": [{"name": "label", "field_name": "label", "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": "explanation", "field_name": "explanation", "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": "score", "field_name": "score", "pandas_type": "float64", "numpy_type": "float64", "metadata": null}, {"name": "context.span_id", "field_name": "context.span_id", "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": "document_position", "field_name": "document_position", "pandas_type": "int64", "numpy_type": "int64", "metadata": null}], "creator": {"library": "pyarrow", "version": "10.0.1"}, "pandas_version": "1.5.0"}pandas���L0���,document_position
@t��� context.span_idl�������scor����
explanation����
label����x
@
TT`hT`@��!*2;DMV_hqz������relevantrelevantrelevantunrelatedunrelatedrelevantunrelatedunrelatedunrelatedunrelatedunrelatedunrelatedunrelatedunrelatedunrelatedrelevantunrelatedunrelatedunrelatedunrelatedX�:�p�L�/�
�
}
�
Y
�
The reference text discusses the relationship between TLR4 and caspase1, specifically mentioning that TAK-242 inhibits TLR4 and restores the expression of NLRP3 inflammasome-related proteins, including pro-caspase1 and caspase1. This information is directly relevant to understanding the relationship between caspase1 and TLR4.
LABEL: relevant The reference text discusses the relationship between TLR4 and caspase1, specifically mentioning the downregulation of TLR4 and NLRP3 inflammasome markers, including pro-caspase1 and caspase1. This information is directly relevant to the question, which asks about the relationship between Caspase1 and tlr4.
LABEL: relevant The reference text discusses the relationship between TLR4 and NLRP3 inflammasome markers, including caspase1. It mentions that HMGB1 treatment increases the expression of caspase1 and the ratio of caspase1/pro-caspase1, while TLR4 antagonist TAK-242 induces the opposite effects. This information is directly relevant to the question, which asks about the relationship between caspase1 and TLR4.
LABEL: relevant The reference text does not mention Caspase1 or tlr4, so it cannot help answer the question.
LABEL: unrelated The reference text does not mention Caspase1 or tlr4, so it cannot help answer the question.
LABEL: unrelated The reference text mentions "caspase1", which is related to the question.
LABEL: relevant The reference text does not mention Caspase1 or tlr4, so it cannot help answer the question.
LABEL: unrelated The reference text does not mention Caspase1 or tlr4, so it cannot help answer the question.
LABEL: unrelated The reference text does not mention Caspase1 or tlr4, so it cannot help answer the question.
LABEL: unrelated The reference text discusses the relationship between TLR2 and TLR4, but does not mention Caspase1.
LABEL: unrelated The reference text does not mention Caspase1 or tlr4, so it cannot help answer the question.
LABEL: unrelated The reference text does not mention Caspase1 or tlr4, so it cannot help answer the question.
LABEL: unrelated The reference text mentions "cleaved-caspase1" and "total-caspase1", which are related to Caspase1. However, it does not mention tlr4.
LABEL: unrelated The reference text does not mention Caspase1 or tlr4, so it cannot help answer the question.
LABEL: unrelated The reference text discusses the relationship between COLEC12 and TLR4, while the question asks about the relationship between Caspase1 and tlr4. The reference text does not mention Caspase1, so it cannot help answer the question.
LABEL: unrelated The Question asks about the relationship between Caspase1 and tlr4. The Reference text mentions that the production of IL-1beta is regulated by TLR4-MyD88-IL-1beta pathway and NLRP3-ASC-Caspase1-IL1beta pathway. This means that Caspase1 is involved in the TLR4 pathway. Therefore, the Reference text contains information that can help answer the Question.
LABEL: relevant The reference text does not mention Caspase1 or tlr4, so it cannot help answer the question.
LABEL: unrelated The reference text does not mention Caspase1 or tlr4, so it cannot help answer the question.
LABEL: unrelated The reference text discusses the relationship between TLR4 and ROS/RNS, but does not mention Caspase1.
LABEL: unrelated The reference text discusses the relationship between TLR4 and regulatory T cells and Th17 cells, while the question asks about the relationship between Caspase1 and TLR4. The reference text does not mention Caspase1, so it does not contain information that can help answer the question.
LABEL: unrelated�?�?�?�?�? 0@P`p�������� 0@421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7421618672b42bde7
����
2024-05-08 10:49:03,066 - INFO - 127.0.0.1 - - [08/May/2024 10:49:03] "POST /v1/evaluations HTTP/1.1" 200 -
What docker image are you using?
arizephoenix/phoenix:version-2.9.3
What version of the client are you using, just to double check?
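One way to check this is to compare the installed client package version with what the server reports. The sketch below assumes the PyPI package name is "arize-phoenix" and that the /arize_phoenix_version route seen in the debug log above returns the version as plain text. The helper names are made up for this illustration.

```python
import urllib.request
from importlib import metadata


def installed_client_version(package="arize-phoenix"):
    """Return the installed version of the client package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def server_version(endpoint="http://localhost:6006"):
    """Fetch the server version (assumes the route returns plain text)."""
    with urllib.request.urlopen(f"{endpoint}/arize_phoenix_version") as resp:
        return resp.read().decode("utf-8").strip()


def version_from_image_tag(image):
    """Extract '2.9.3' from an image reference such as 'arizephoenix/phoenix:version-2.9.3'."""
    tag = image.rsplit(":", 1)[-1]
    prefix = "version-"
    return tag[len(prefix):] if tag.startswith(prefix) else tag


def versions_match(client, server):
    """True only when both versions are known and identical."""
    return client is not None and server is not None and client == server


if __name__ == "__main__":
    client = installed_client_version()
    image = version_from_image_tag("arizephoenix/phoenix:version-2.9.3")
    print(f"client={client} image={image} match={versions_match(client, image)}")
```

With the server running, server_version() can replace the image tag in that comparison. A strict equality check is the simplest option here; whether nearby releases are compatible with each other is not something this sketch can tell you.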
