RunLLM I'm having some issues getting traces for my completion calls. I am using LiteLLM to make calls to Claude hosted on Bedrock. What instrumentors do I need to use? I used LiteLLMInstrumentor and it didn't result in any traces. To confirm, I used LiteLLM with an OpenAI model and OpenAIInstrumentor and it worked. Should I instead be using an AnthropicInstrumentor or a BedrockInstrumentor? Ideally I can just use LiteLLM and it'll support all models that LiteLLM supports.
To instrument LiteLLM calls for models hosted on AWS Bedrock, you should use the BedrockInstrumentor rather than the LiteLLMInstrumentor. The BedrockInstrumentor is specifically designed to handle AWS Bedrock models, including those from Anthropic like Claude. According to the Bedrock Integration Guide, you can instrument your boto3 client to send traces to Phoenix.

Here's a basic setup:

1. **Install the necessary packages**:
```
pip install arize-phoenix-otel openinference-instrumentation-bedrock boto3
```

2. **Instrument your application**:
```
import boto3
from openinference.instrumentation.bedrock import BedrockInstrumentor
from phoenix.otel import register

# Register the Phoenix tracer
tracer_provider = register(project_name="my-llm-app")

# Instrument the boto3 client
BedrockInstrumentor().instrument(tracer_provider=tracer_provider)

# Create a boto3 client for Bedrock
session = boto3.session.Session()
client = session.client("bedrock-runtime")
```

3. **Make calls using the instrumented client**:
```
response = client.invoke_model(modelId="anthropic.claude-v2", body=prompt)
```

This setup will ensure that your calls to AWS Bedrock models are traced and sent to Phoenix for observability.

Would you like more detail on any specific part?

Sources:
- [https://docs.arize.com/phoenix/tracing/integrations-tracing/bedrock#P%3A1](https://docs.arize.com/phoenix/tracing/integrations-tracing/bedrock#P%3A1)
- [https://github.com/Arize-ai/phoenix/blob/main/tutorials/integrations/bedrock_tracing_tutorial.ipynb](https://github.com/Arize-ai/phoenix/blob/main/tutorials/integrations/bedrock_tracing_tutorial.ipynb)
💡 Hint: Mention RunLLM in the thread for followups.
The LiteLLMInstrumentor is designed to work with the LiteLLM library, which allows you to call various LLM APIs using the OpenAI format. However, it may not fully support all the intricacies of AWS Bedrock's specific implementation, especially when it comes to tracing. The BedrockInstrumentor is specifically tailored to handle the AWS Bedrock environment, ensuring that all calls to Bedrock models are properly traced and sent to Phoenix.

While the LiteLLMInstrumentor can be used for models that are directly compatible with the OpenAI API format, it might not capture all the necessary details for Bedrock models due to differences in how these models are accessed and managed. If you are using LiteLLM to switch between different model providers, you might need to use the specific instrumentor for each provider to ensure full compatibility and traceability.

Would you like more detail on any specific part?

Sources:
- https://github.com/Arize-ai/openinference/issues/1303
- https://github.com/Arize-ai/openinference/issues/1233
I am calling completion and acompletion with stream=True/False, but none of the 4 combinations leads to any traces for me when I use Claude 3.7 on AWS Bedrock.
Ah, makes sense. Then I'll wait for that change before trying the LiteLLM instrumentor and will go with Bedrock for now. Let me know if you want any examples from my side.
Actually, this is interesting: we seem to have solved the issue for us specifically. Some context:
- In our production setting we have defined a CachedLitellm class that essentially adds a cache check for any completion or acompletion call to LiteLLM.
- We have kept the method names of CachedLitellm as completion and acompletion, the same as litellm.

I tested 2 things:
1. A script that calls litellm.completion and acompletion directly
   - attached as a snippet in the next message
   - This actually brings the traces and spans I need (see first image)
   - This only uses LiteLLMInstrumentor
2. A script that calls my cached_litellm.completion and acompletion
   - not attached, but pretty much the same
   - This doesn't bring in any LLM spans (see second image)
3. Finally, I changed my cached_litellm.acompletion and completion to have a handle_ prefix and tested the script again with cached_litellm
   - Huzzah! The LLM spans are now showing up (see third image)

I guess the instrumentor didn't like us overriding the function names in our cached LiteLLM class?
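For what it's worth, the behavior above is consistent with how these instrumentors generally work: they monkey-patch the module-level litellm.completion / litellm.acompletion functions in place. A minimal, library-free sketch of the pitfall, assuming the wrapper class binds the original function too early (the class names and the instrument() stand-in here are hypothetical, not the real OpenInference internals):

```python
# Sketch of the monkey-patching pitfall: an instrumentor reassigns the
# module-level attribute, so a reference captured *before* instrument()
# runs bypasses the patched (traced) function.
import types

# Stand-in for the litellm module.
litellm = types.SimpleNamespace(completion=lambda **kw: "raw response")

class CachedLitellmEarlyBind:
    # BAD: binds the original function at class-definition time.
    completion = staticmethod(litellm.completion)

class CachedLitellmLateBind:
    # GOOD: looks the function up at call time, so it sees the patch.
    def completion(self, **kwargs):
        return litellm.completion(**kwargs)

def instrument():
    # Roughly what an instrumentor does: wrap and reassign the attribute.
    original = litellm.completion
    def traced(**kwargs):
        return "traced:" + original(**kwargs)
    litellm.completion = traced

instrument()

print(CachedLitellmEarlyBind().completion())  # raw response (patch bypassed)
print(CachedLitellmLateBind().completion())   # traced:raw response
```

If a wrapper captures a reference to the original function before LiteLLMInstrumentor().instrument() runs (or otherwise shadows the patched attribute), the calls go to the unpatched function and no spans are emitted; looking the function up at call time picks up the patch.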
Curious, what was the issue you also replicated? Was it the first point (1) above? On a separate note, should I be seeing a span with the final accumulated streaming output, or should stream events/chunks show up as separate spans?
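On the streaming question: OpenInference instrumentors typically record one LLM span per completion call and attach the accumulated output once the stream is exhausted, rather than one span per chunk (chunks may surface as events on that span). A rough, library-free sketch of that accumulation pattern, with hypothetical names (traced_stream, on_finish):

```python
# Hypothetical sketch: a streaming wrapper yields chunks through to the
# caller while accumulating them, then reports one final value when the
# stream is exhausted: one span per call, not one span per chunk.

def traced_stream(chunks, on_finish):
    pieces = []
    for chunk in chunks:
        pieces.append(chunk)    # accumulate for the span's final output
        yield chunk             # the caller still sees every chunk live
    on_finish("".join(pieces))  # attach the accumulated output at the end

captured = []
for chunk in traced_stream(iter(["Hel", "lo", "!"]), captured.append):
    pass
print(captured)  # ['Hello!']
```

This also explains why the stream has to be fully consumed before the span can carry the final output.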
Here is the script for all of 1-3; the only difference in 2 and 3 is that instead of litellm.completion I call cached_litellm_obj.completion / cached_litellm_obj.handle_completion.
# %%
import asyncio
from hx_platform.tools.local_settings_util import load_settings_to_env
load_settings_to_env()
# %%
import os
import logging
from openinference.instrumentation.litellm import LiteLLMInstrumentor
from openinference.semconv.resource import ResourceAttributes
from opentelemetry import trace as trace_api
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from openinference.instrumentation.openai import OpenAIInstrumentor
import litellm
# LlamaIndexInstrumentor().instrument()
LiteLLMInstrumentor().instrument()
# OpenAIInstrumentor().instrument()
# Use service_name if provided, otherwise fall back to project_name
resource_attributes = {}
resource_attributes[ResourceAttributes.PROJECT_NAME] = 'test4'
resource = Resource.create(resource_attributes)
tracer_provider = trace_sdk.TracerProvider(resource=resource)
multi_span_processor = trace_sdk.ConcurrentMultiSpanProcessor()
if os.environ.get("PHOENIX_COLLECTOR_ENDPOINT"):
    # Add OTLP exporter
    multi_span_processor.add_span_processor(
        SimpleSpanProcessor(
            OTLPSpanExporter("http://localhost:6006/v1/traces")
        )
    )
else:
    logging.info(
        "PHOENIX_COLLECTOR_ENDPOINT not set. Phoenix Tracing will not be enabled."
    )
tracer_provider.add_span_processor(multi_span_processor)
trace_api.set_tracer_provider(tracer_provider)
MODEL_NAME = 'bedrock/anthropic.claude-3-7-sonnet-20250219-v1:0'
MODEL_ID = 'arn:aws:bedrock:<YOUR_AWS_REGION>:<YOUR_AWS_ACCOUNT_ID>:inference-profile/<MODEL_NAME>:<MODEL_VERSION>'
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY")
AWS_REGION = os.environ.get("AWS_REGION_NAME")
# %%
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_knowledge_base",
            "description": "Search the knowledge base for relevant information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Maximum number of results to return",
                        "default": 5
                    }
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_incident_details",
            "description": "Get details about a specific incident",
            "parameters": {
                "type": "object",
                "properties": {
                    "incident_id": {
                        "type": "string",
                        "description": "The unique identifier for the incident"
                    }
                },
                "required": ["incident_id"]
            }
        }
    }
]
INPUT_MESSAGES = [
    {"role": "system", "content": "You are an AI assistant helping with incident management. You can search for information and retrieve incident details to help users."},
    {"role": "user", "content": "What are the recent critical incidents in our production environment?"}
]
tracer = trace_api.get_tracer(__name__)
# %%
## No Stream Sync Completion Claude With Thinking + Tools
with tracer.start_as_current_span("litellm.completion_sync_with_tools_and_thinking") as span:
    out = litellm.completion(
        model=MODEL_NAME,
        model_id=MODEL_ID,
        api_key=None,
        aws_access_key_id=AWS_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
        aws_region_name=AWS_REGION,
        messages=INPUT_MESSAGES,
        thinking={"type": "enabled", "budget_tokens": 2000},
        temperature=1,
        tools=TOOLS
    )
print(out)
# %%
# No Stream Async Completion Claude With Thinking + Tools
with tracer.start_as_current_span("litellm.acompletion_async_with_tools_and_thinking") as span:
    out2 = asyncio.run(
        litellm.acompletion(
            model=MODEL_NAME,
            model_id=MODEL_ID,
            api_key=None,
            aws_access_key_id=AWS_ACCESS_KEY_ID,
            aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
            aws_region_name=AWS_REGION,
            messages=INPUT_MESSAGES,
            thinking={"type": "enabled", "budget_tokens": 2000},
            temperature=1,
            tools=TOOLS
        )
    )
print(out2)
# %%
with tracer.start_as_current_span("litellm.completion_sync_with_tools_and_thinking") as span:
    out3 = litellm.completion(
        model=MODEL_NAME,
        model_id=MODEL_ID,
        api_key=None,
        aws_access_key_id=AWS_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
        aws_region_name=AWS_REGION,
        messages=INPUT_MESSAGES,
        temperature=1,
        thinking={"type": "enabled", "budget_tokens": 2000},
        stream=True,
        tools=TOOLS
    )
    # Consume the stream inside the span so the streamed output is captured
    for chunk in out3:
        print(chunk)
# %%
async def run():
    with tracer.start_as_current_span("litellm.acompletion_async_with_tools_and_thinking") as span:
        out4 = await litellm.acompletion(
            model=MODEL_NAME,
            model_id=MODEL_ID,
            api_key=None,
            aws_access_key_id=AWS_ACCESS_KEY_ID,
            aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
            aws_region_name=AWS_REGION,
            messages=INPUT_MESSAGES,
            temperature=1,
            thinking={"type": "enabled", "budget_tokens": 2000},
            stream=True,
            tools=TOOLS
        )
        async for chunk in out4:
            print(chunk)

asyncio.run(run())
# %%
