Hi, in Arize AI, while trying to log evals back to traces I am getting a weird error in the pyarrow Flight logger that says it "cannot write WriteSpanRequest".
The error you're seeing when logging evals ("cannot write WriteSpanRequest" or similar pyarrow logger errors) typically occurs when the evaluation DataFrame is missing required fields, most importantly the span_id column, which is needed to associate evaluations with traces. The span_id must be present and in the correct format (a hex string, not an integer or bytes) for Phoenix to log evaluations successfully. If you create the DataFrame manually, ensure span_id is included and properly formatted; otherwise Phoenix cannot join the evals to the correct spans, leading to serialization or write errors in pyarrow or the logger. See this Slack thread and the Phoenix evaluation docs for details.

Would you like more detail on how to extract or format span_id, or on troubleshooting pyarrow errors?

Sources:
- https://arize-ai.slack.com/archives/C04R3GXC8HK/p1721332908.765159
- Phoenix evaluation docs
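On the span_id formatting point, here is a minimal sketch of normalizing IDs to hex strings before logging. The helper name and the sample values are illustrative only, not part of the Arize or Phoenix SDKs:

```python
import pandas as pd

def to_hex_span_id(value) -> str:
    """Normalize a span ID to the lowercase hex string format expected
    when joining evals to spans (not an int, not raw bytes)."""
    if isinstance(value, bytes):
        return value.hex()
    if isinstance(value, int):
        return format(value, "016x")  # zero-padded 16-char hex
    return str(value)

# Illustrative inputs: the same span ID as an int, as bytes, and as a string.
raw_ids = [0x34DA4809FA1BBCA3, b"\x34\xda\x48\x09\xfa\x1b\xbc\xa3", "34da4809fa1bbca3"]
eval_df = pd.DataFrame({"context.span_id": [to_hex_span_id(v) for v in raw_ids]})
```

All three rows end up as the same "34da4809fa1bbca3" string, which is the shape you want before handing the DataFrame to the logger.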
💡 Hint: Mention RunLLM in the thread for followups.
Some snippets from your code, particularly for how you ran the evals on traces, would be helpful. For now, I suggest validating your code against the following checklist:
context.span_id should contain strings
label columns (eval.<name>.label) should only contain strings; score columns (eval.<name>.score) should be ints/floats; explanation columns (eval.<name>.explanation) should be strings
manually cross-check some context.span_id values against your actual traces in the Phoenix UI.
there are a few more complicated cases you could be hitting but it's hard to diagnose without your code.
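The checklist above can be mechanized with a small helper. A sketch — the function name and return convention are mine, and the column patterns are the ones from the checklist, not an official schema:

```python
import pandas as pd

def validate_eval_df(df: pd.DataFrame) -> list:
    """Check an evals DataFrame against the checklist:
    string span IDs, string labels/explanations, numeric scores."""
    problems = []
    if "context.span_id" not in df.columns:
        problems.append("missing context.span_id column")
    elif not df["context.span_id"].map(lambda v: isinstance(v, str)).all():
        problems.append("context.span_id must contain strings")
    for col in df.columns:
        if col.endswith((".label", ".explanation")):
            if not df[col].map(lambda v: isinstance(v, str)).all():
                problems.append(f"{col} must contain strings")
        elif col.endswith(".score"):
            if not pd.api.types.is_numeric_dtype(df[col]):
                problems.append(f"{col} must contain ints/floats")
    return problems
```

Running it on your DataFrame right before the log call gives you a concrete list of violations (empty list means the basic checks pass), which is easier than eyeballing dtypes.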
```python
def _log_to_arize_sync(self, eval_df: pd.DataFrame) -> bool:
    """
    Synchronous logging to Arize using the log_evaluations_sync API.

    Follows the pattern from Arize's trace-level-evals tutorial:
        arize_client.log_evaluations_sync(dataframe=log_df, model_id=model_id)
    """
    try:
        # Arize Platform SDK: client.log_evaluations_sync()
        response = self.client.log_evaluations_sync(
            dataframe=eval_df,
            model_id=self.model_id
        )
        # Check response status
        if response:
            logger.debug(f"Arize log_evaluations_sync successful: {response}")
            return True
        else:
            # Note: response is falsy here (likely None), so don't
            # dereference attributes like response.status_code on it.
            logger.error("Arize log_evaluations_sync returned no response")
            return False
    except Exception as e:
        logger.error(f"Arize API error: {e}", exc_info=True)
        return False

def _build_arize_evaluations_dataframe(
    self,
    trace_id: str,
    runtime_id: str,
    evaluation_result: Dict[str, Any],
    trace_metadata: Dict[str, Any]
) -> pd.DataFrame:
    """
    Build DataFrame in Arize Platform trace-level evaluations format.

    Following the pattern from Arize's trace-level-evals tutorial:
    https://docs.arize.com/arize/llm-large-language-models/llm-traces/trace-evaluations

    Column naming pattern: trace_eval.{EvalName}.label and
    trace_eval.{EvalName}.explanation

    Example columns:
    - context.trace_id: The trace ID
    - context.span_id: The root span ID
    - trace_eval.StepBudgeting.label: "correct" or "incorrect"
    - trace_eval.StepBudgeting.explanation: Explanation text

    Creates ONE ROW per trace with all guardrail evaluations as columns.
    """
    guardrail_results = evaluation_result.get("guardrail_results", [])

    span_id = trace_metadata.get("span_id")
    if not span_id:
        # Try alternative field names
        span_id = trace_metadata.get("root_span_id")
        if not span_id:
            span_id = trace_id[:16] if len(trace_id) > 16 else trace_id
            logger.warning(f"span_id not found in trace_metadata, using derived value: {span_id}")

    # Start with trace identifiers
    row_data = {
        "context.trace_id": trace_id,
        "context.span_id": span_id,
    }

    # Add each guardrail as separate columns (label + explanation)
    for gr in guardrail_results:
        guardrail_name = gr.get("guardrail_name", "unknown")
        status = gr.get("status", "unknown")

        # Map status to Arize label format (correct/incorrect)
        label = "correct" if status == "passed" else "incorrect"
        explanation = gr.get("message", "No details provided")

        # Convert guardrail_name to PascalCase for column name
        # e.g., "step_budgeting" -> "StepBudgeting"
        eval_name = "".join(word.capitalize() for word in guardrail_name.split("_"))

        # Add columns following Arize naming pattern
        row_data[f"trace_eval.{eval_name}.label"] = label
        row_data[f"trace_eval.{eval_name}.explanation"] = explanation

    # Create single-row DataFrame (one row per trace)
    df = pd.DataFrame([row_data])

    logger.debug(f"Built Arize evaluations: {len(guardrail_results)} guardrails for trace {trace_id}")
    logger.debug(f"Columns: {list(df.columns)}")
    return df
```

🔒[private user] Below is how the Arize client is created:

```python
self.client = ArizeClient(
    space_id=self.space_id,
    api_key=self.api_key
)
```

using https://app.arize.com/
```
Testing SPAN-Level Evaluations
================================================================================
1️⃣ Configuration:
   Space ID:
   API Key:
   Model ID:
   Span ID: 34da4809fa1bbca3

2️⃣ Creating SPAN-level evaluation DataFrame...
   Format: eval.{Name}.label
✅ DataFrame created:
   Shape: (1, 3)
   Columns: ['context.span_id', 'eval.StepBudgeting.label', 'eval.StepBudgeting.explanation']

📋 Data preview:
   context.span_id   eval.StepBudgeting.label  eval.StepBudgeting.explanation
   34da4809fa1bbca3  correct                   Agent used 5 out of 10 allowed steps - PASS

3️⃣ Initializing Arize Client...
✅ Client initialized

4️⃣ Logging evaluations to Arize...
📤 Calling log_evaluations_sync()...
Error logging evaluation data to Arize
Traceback (most recent call last):
  File "C:\Users\bbb\Desktop\G-sdk\ag-sdk\gs\Lib\site-packages\arize\pandas\logger.py", line 1810, in _log_arrow_flight
    with flight_writer:
         ^^^^^^^^^^^^^
  File "pyarrow/ipc.pxi", line 630, in pyarrow.lib._CRecordBatchWriter.__exit__
  File "pyarrow/_flight.pyx", line 1276, in pyarrow._flight.MetadataRecordBatchWriter.close
  File "pyarrow/_flight.pyx", line 55, in pyarrow._flight.check_flight_status
pyarrow._flight.FlightInternalError: Flight returned internal error, with message: failed to write WriteSpanEvaluationRequest. gRPC client debug context: UNKNOWN:Error received from peer ipv4:34.117.96.240:443 {created_time:"2025-11-04T06:04:20.273324+00:00", grpc_status:13, grpc_message:"failed to write WriteSpanEvaluationRequest"}. Client context: OK

📥 Response: None
⚠️ Response is None
Check Arize dashboard to verify if evaluations appeared
```
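One difference worth noting: the failing test builds a DataFrame with only context.span_id, while the _build_arize_evaluations_dataframe helper earlier in the thread also includes context.trace_id. Whether that is the cause of this particular gRPC failure is not certain, but adding it to the test costs little. A sketch, where the trace ID below is a placeholder, not a real value:

```python
import pandas as pd

# Same row as the failing test, plus a context.trace_id column
# (placeholder 32-char hex; substitute the real trace ID from Phoenix/Arize).
eval_df = pd.DataFrame([{
    "context.trace_id": "0123456789abcdef0123456789abcdef",
    "context.span_id": "34da4809fa1bbca3",
    "eval.StepBudgeting.label": "correct",
    "eval.StepBudgeting.explanation": "Agent used 5 out of 10 allowed steps - PASS",
}])
```

If the write still fails with both ID columns present and correctly typed, that points away from DataFrame shape and toward the span/trace IDs not matching anything already ingested on the platform.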
