`Transient error Internal Server Error encountered while exporting span batch, retrying in 8s.`

Hello, I am seeing this error after integrating arize-otel for LiteLLM. Our agents run inside ECS Fargate on AWS. Any help with this would be really appreciated.
According to these two threads, this looks like a server-side error: https://github.com/langfuse/langfuse/issues/7741 and https://github.com/open-telemetry/opentelemetry-python/discussions/3110. We would need the Arize team's input to figure out what's wrong.
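For context, the "retrying in 8s" in the log is the exporter's exponential backoff kicking in after a transient (retryable) server response; the batch is not lost until retries are exhausted. A minimal sketch of that pattern (the `export` callable and delay values here are illustrative, not the actual SDK internals):

```python
import time

def export_with_retry(export, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry a span-batch export with exponential backoff.

    `export` is a hypothetical callable returning True on success; the real
    OTel SDK implements this loop internally, which is where the
    "retrying in 8s" log line comes from.
    """
    delay = base_delay
    for _ in range(max_retries):
        if export():
            return True          # batch accepted by the collector
        sleep(delay)             # SDK logs "retrying in {delay}s" here
        delay *= 2               # 1s, 2s, 4s, 8s, ...
    return False                 # retries exhausted; batch is dropped
```

So repeated "retrying in 8s" messages mean the endpoint keeps returning 5xx across several attempts, pointing at the server side rather than the client config.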
```python
# Import OpenTelemetry dependencies
from arize.otel import Transport, register

# Import the instrumentors from OpenInference
from openinference.instrumentation.langchain import LangChainInstrumentor
from openinference.instrumentation.litellm import LiteLLMInstrumentor


def initialize_arize_otel(env: Env, api_key: str, dist: str):
    # Set up OTel via the Arize convenience function
    tracer_provider = register(
        # Use HTTP instead of the default gRPC transport, which fails in prod:
        # UNKNOWN:Error received from peer ipv4:34.49.228.251:443
        # {created_time:"2025-08-20T03:28:58.201124279+00:00", grpc_status:2,
        #  grpc_message:"All spans failed. Found 13 errors: "}
        endpoint="https://otlp.arize.com/v1/traces",  # the HTTPS endpoint
        transport=Transport.HTTP,
        # Default configuration
        space_id="<SPACE_ID>",  # from the in-app space settings page
        api_key=api_key,        # from the in-app space settings page
        project_name=f"{dist}-dev",  # name this whatever you would like
    )
    # Instrument LangChain (which includes LangGraph)
    LangChainInstrumentor().instrument(tracer_provider=tracer_provider)
    # Instrument LiteLLM
    LiteLLMInstrumentor().instrument(tracer_provider=tracer_provider)
```
This is the code we use to initialize tracing.
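One thing worth checking on ECS Fargate: tasks receive SIGTERM with a short grace period at shutdown, and spans still buffered in the batch processor are lost unless the provider is flushed first, which can look like export failures. A small sketch of registering a flush on exit, assuming `tracer_provider` is the object returned by `register()` (an `opentelemetry.sdk.trace.TracerProvider`, which does expose `force_flush()` and `shutdown()`):

```python
import atexit

def register_flush_on_exit(tracer_provider):
    """Flush buffered spans before the process exits.

    `tracer_provider` is assumed to be the provider returned by
    arize.otel.register(); any object with force_flush()/shutdown() works.
    """
    def _flush():
        tracer_provider.force_flush()  # push any buffered span batches now
        tracer_provider.shutdown()     # stop processors and exporters cleanly
    atexit.register(_flush)
    return _flush  # returned so it can also be called manually
```

This does not fix a server-side 500, but it rules out shutdown-time buffering as a confounding factor when reproducing the error.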
