Hello team! Can you help me resolve the following issue? I pulled spans from my instance of Phoenix and ran the evaluation using run_evals. The evaluation only partially completed; it was interrupted by the following error:
Retries exhausted after 11 attempts: Error code: 400 - {'error': {'message': "Sorry, I can't answer that question.", 'type': 'invalid_request_error', 'code': '400', 'display_message': "Sorry, I can't answer that question."}}

And now I can't open the required project in the Phoenix web app. This error appears:
Error fetching GraphQL query 'ProjectPageQuery' with variables '{"id":"UHJvamVjdDoy","timeRange":{"start":"2024-10-20T14:00:00.000Z","end":"2025-10-21T14:00:00.000Z"}}': [{"message":"an unexpected error occurred","locations":[{"line":64,"column":11}],"path":["project","rootSpans","edges",0,"rootSpan","spanAnnotations",0,"score"]},{"message":"an unexpected error occurred","locations":[{"line":64,"column":11}],"path":["project","rootSpans","edges",17,"rootSpan","spanAnnotations",1,"score"]},{"message":"an unexpected error occurred","locations":[{"line":64,"column":11}],"path":["project","rootSpans","edges",36,"rootSpan","spanAnnotations",0,"score"]}]

Could you suggest how I can resolve this? Thanks in advance!
The error message indicates an issue with the GraphQL 'ProjectPageQuery'. This could be due to a mismatch in the expected data structure or a problem with the data itself. The message "an unexpected error occurred" suggests the underlying error is being masked, which is consistent with the behavior defined in the exceptions.py file, where unexpected errors are replaced with a generic message.

To resolve this issue, you can try the following steps:
1. Clear browser cache: As seen in similar issues, clearing the browser cache can sometimes resolve unexpected errors in the Phoenix web app, since cached data might cause discrepancies between the client and server.
2. Check data integrity: Ensure that the data being evaluated and logged back to Phoenix is complete and correctly formatted. The error during the evaluation process might have resulted in incomplete or corrupted data being logged.
3. Review the evaluation process: Since the evaluation was interrupted, review the steps and data used in the run_evals function to ensure there are no issues that could cause the evaluation to fail.

Would you like more detailed guidance on any of these steps?

Sources:
- exceptions.py file
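Since the GraphQL error paths all end in a spanAnnotations score field, a quick sanity check on the eval dataframes before logging them back may help locate the bad rows. This is only a sketch: the helper name `bad_score_rows` and the `"score"`/`"label"` column names are assumptions based on the usual shape of Phoenix eval output, so verify them against your own frames.

```python
import numpy as np
import pandas as pd


def bad_score_rows(eval_df: pd.DataFrame, score_col: str = "score") -> pd.DataFrame:
    """Return rows whose score is missing or non-finite (NaN/inf)."""
    # Coerce anything non-numeric to NaN, then keep only the problem rows.
    scores = pd.to_numeric(eval_df[score_col], errors="coerce")
    return eval_df[~np.isfinite(scores.to_numpy())]


# Synthetic frame resembling an eval result: the last two rows are suspect.
demo = pd.DataFrame(
    {"label": ["factual", "hallucinated", None],
     "score": [1.0, float("inf"), None]}
)
print(len(bad_score_rows(demo)))  # → 2
```

If this flags rows, inspecting them before calling log_evaluations should show whether malformed scores are what breaks the project page.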
💡 Hint: Mention RunLLM in the thread for follow-ups.
🔒 [private user] Thanks for the answer! I use arize-phoenix-evals==0.16.1, and Phoenix is deployed in my local environment. The only unusual part is that I invoke a model from a GSL instance:
from phoenix.evals import (
    HallucinationEvaluator,
    OpenAIModel,
    QAEvaluator,
    RelevanceEvaluator,
    run_evals,
)

# DEFINE EVALUATION MODEL
eval_model = OpenAIModel(
    api_version=env_config.openai_api_version,
    azure_deployment=env_config.model_name,
    azure_endpoint=env_config.gsl_endpoint,
    api_key=api_key,
    temperature=0.0,
    max_tokens=3000,
)

# Define your evaluators
hallucination_evaluator = HallucinationEvaluator(eval_model)
qa_evaluator = QAEvaluator(eval_model)
relevance_evaluator = RelevanceEvaluator(eval_model)

And the evaluation:
import nest_asyncio
import phoenix as px
from phoenix.trace import DocumentEvaluations, SpanEvaluations

nest_asyncio.apply()

# RUN EVALUATIONS
[hallucination_eval_df] = run_evals(
    dataframe=qa_with_reference,
    evaluators=[hallucination_evaluator],
    provide_explanation=True,
)
[qa_eval_df] = run_evals(
    dataframe=qa_with_reference,
    evaluators=[qa_evaluator],
    provide_explanation=True,
)
[relevancy_eval_df] = run_evals(
    dataframe=retrieved_documents,
    evaluators=[relevance_evaluator],
    provide_explanation=True,
)

# ANALYZE RESULTS
results_qa = qa_with_reference.copy()
results_qa["qa_eval"] = qa_eval_df["label"]
results_qa["qa_explanation"] = qa_eval_df["explanation"]
results_qa.head()

results_hal = qa_with_reference.copy()
results_hal["hallucination_eval"] = hallucination_eval_df["label"]
results_hal["hallucination_explanation"] = hallucination_eval_df["explanation"]
results_hal.head()

results_rel = retrieved_documents.copy()
results_rel["relevance_eval"] = relevancy_eval_df["label"]
results_rel["relevance_explanation"] = relevancy_eval_df["explanation"]
results_rel.head()

# PUSH RESULTS TO PHOENIX
px.Client().log_evaluations(
    SpanEvaluations(eval_name="Hallucination", dataframe=hallucination_eval_df),
    SpanEvaluations(eval_name="QA Correctness", dataframe=qa_eval_df),
    DocumentEvaluations(eval_name="Relevance", dataframe=relevancy_eval_df),
)

During the evaluation run there are errors like:
Exception in worker on attempt 1: raised BadRequestError('Error code: 400 - {\'error\': {\'message\': "Sorry, I can\'t answer that question.", \'type\': \'invalid_request_error\', \'code\': \'400\', \'display_message\': "Sorry, I can\'t answer that question."}}')

I'm still not sure whether the issue is related to Phoenix or to GSL (still investigating), but the evaluation was interrupted after 11 attempts, and the required project can no longer be loaded in the Phoenix web app.
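One way to keep a partially failed run from corrupting the project is to log only the rows that actually completed. A hedged sketch follows: it assumes failed evaluator calls come back with a missing label and NaN score, which you should verify against your version of run_evals, and the helper name `keep_completed` is made up for illustration.

```python
import numpy as np
import pandas as pd


def keep_completed(eval_df: pd.DataFrame) -> pd.DataFrame:
    # Keep only rows with a usable label and a finite score; rows whose
    # evaluator call failed (e.g. a 400 refusal) typically have neither.
    scores = pd.to_numeric(eval_df["score"], errors="coerce")
    mask = eval_df["label"].notna() & np.isfinite(scores.to_numpy())
    return eval_df[mask]


# Synthetic example: one completed row, one failed row.
demo = pd.DataFrame({"label": ["factual", None], "score": [1.0, np.nan]})
print(len(keep_completed(demo)))  # → 1
```

Filtering each eval dataframe this way before px.Client().log_evaluations(...) should prevent incomplete rows from reaching the server in the first place.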
Hello 🔒 [private user], the issue was related to a prompt-length limit on the Bedrock side. As an experiment, I reduced the length of the data in the 'content' and 'reference' fields, and the errors went away. Still, it seems worth thinking about how to handle such failures so they don't leave the project 'broken' on the Phoenix server. Thanks for the quick response and your help!
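For anyone hitting the same limit: a preventive guard can be applied to the long columns before calling run_evals. This is only a sketch; `truncate_field`, the 8,000-character budget, and the column names are assumptions to adapt to your data, and counting tokens (rather than characters) against your Bedrock/GSL model's context window would be more precise.

```python
import pandas as pd


def truncate_field(text, max_chars: int = 8_000):
    """Crude character budget to stay under the provider's prompt limit."""
    if isinstance(text, str) and len(text) > max_chars:
        return text[:max_chars] + " [truncated]"
    return text


# Hypothetical frame standing in for qa_with_reference.
qa = pd.DataFrame({"input": ["q"], "reference": ["r" * 20_000]})
for col in ("input", "reference"):
    qa[col] = qa[col].map(truncate_field)
print(len(qa.loc[0, "reference"]))  # → 8012 (8000 chars + " [truncated]")
```

Truncation loses information, so a smarter alternative is chunking or summarizing over-long references, but even this crude cap keeps single oversized rows from exhausting the retry budget and aborting the whole run.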
