Hello! After deploying our app to the production environment (where we have a high volume of traces/spans), we started seeing these errors in evaluation tasks:
TaskRuntimeError: Approximately 1000 spans were skipped due to error: rpc error: code = DeadlineExceeded desc = context deadline exceeded
TaskEvaluationUpdateError: There was an error updating the data with the evaluations. Reach out to support for help.
Could you help us understand what’s causing the deadline exceeded error and how to prevent spans from being skipped?
🔒[private user] Thanks for the explanation. One clarification: each evaluation task is scheduled to run every 2–10 minutes, so individual tasks are not long-running by design. Given that, the context deadline exceeded error seems to be related to span volume and processing throughput during each run rather than to task duration itself. We’ll double-check that we’re using the Batch Span Processor everywhere (no SimpleSpanProcessor anywhere), and review batch size, export delay, and timeouts. Question: is there a recommended upper bound on spans per evaluation run, or a best practice for splitting evaluations by time window or filtering spans in high-throughput production environments, to avoid these deadlines?
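To make the time-window question concrete, here is a minimal sketch in Python of what we mean by splitting one run's span range into smaller windows. The function name split_time_window and the step size are hypothetical and not part of any platform API. How each sub-window would actually be handed to an evaluation task is exactly what we are asking about.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def split_time_window(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive half-open [window_start, window_end) sub-windows.

    Hypothetical helper for illustration only: the idea is that each
    sub-window covers fewer spans than the full range would.
    """
    if step <= timedelta(0):
        raise ValueError("step must be a positive duration")
    cursor = start
    while cursor < end:
        # Clamp the last window so it never runs past `end`.
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end
```

For example, a 10-minute range with a 4-minute step would give three sub-windows, with the last one shortened to 2 minutes so it stops at the end of the range.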
Hello, 🔒[private user]! It's happening in a task called "general evaluator", where I have 2 evals: one for hallucination and another for user frustration.
I have two other tasks that are working fine, but this one fails in most runs.
