I ran into an issue connecting to localhost:4317 and would appreciate your help. I've put the full set of steps in the thread to make it easier to follow.
I'm using the LiteLLM framework for testing Phoenix, because that is the framework we use in our codebase. I'm using k8s and Helm charts to create a StatefulSet and a Service for Phoenix. In my Helm chart I'm using the latest Phoenix image, and I've passed the following environment variables to it:
- name: PHOENIX_WORKING_DIR
  value: /mnt/data
- name: PHOENIX_PORT
  value: "6006"
- name: PHOENIX_SQL_DATABASE_URL
  value: {{ .Values.phoenix.database.url | quote }}
- name: PHOENIX_COLLECTOR_ENDPOINT
  value: "http://localhost:6006"
I checked things after the k8s deployment: I can access the Phoenix UI at localhost:6006, and inside the pod arize-phoenix-otel is also installed as instructed. BTW, I port-forwarded the following ports to be accessible: "6006:6006", "4317:4317" (6006 is for Phoenix and 4317 is for OpenTelemetry). Then, as a next step, I went to the pod I use for LLM calls and installed the LiteLLM-related libraries one more time to make sure everything is set up correctly:
pip install openinference-instrumentation-litellm litellm arize-phoenix-otel
Then I opened Python and ran the following code based on the instructions here: https://docs.arize.com/phoenix/tracing/integrations-tracing/litellm
from phoenix.otel import register

# configure the Phoenix tracer
tracer_provider = register(
    project_name="my-llm-app",  # Default is 'default'
    auto_instrument=True,  # Auto-instrument your app based on installed OI dependencies
)
Below is what I get:
🔭 OpenTelemetry Tracing Details 🔭
| Phoenix Project: my-llm-app
| Span Processor: SimpleSpanProcessor
| Collector Endpoint: localhost:4317
| Transport: gRPC
| Transport Headers: {'user-agent': '****'}
|
| Using a default SpanProcessor. `add_span_processor` will overwrite this default.
|
| ⚠️ WARNING: It is strongly advised to use a BatchSpanProcessor in production environments.
|
| `register` has set this TracerProvider as the global OpenTelemetry default.
| To disable this behavior, call `register` with `set_global_tracer_provider=False`.Then, added my OpenAI API key:
import os
os.environ["OPENAI_API_KEY"] = "PASTE_YOUR_API_KEY_HERE"
and used litellm as normal:
import litellm

completion_response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"content": "What's the capital of China?", "role": "user"}],
)
print(completion_response)
Then I'm getting this error:
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 1s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 2s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 4s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 8s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 16s.
I can share my Helm templates as well if that would help.
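Since StatusCode.UNAVAILABLE usually just means the exporter cannot open a TCP connection, a quick stdlib-only check can rule out the network layer before touching the OpenTelemetry setup. This is a minimal sketch; the host and port values are simply the ones from this thread:

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Check the two forwarded ports from this thread: Phoenix UI/HTTP and OTLP gRPC.
for p in (6006, 4317):
    print(p, "open" if port_open("localhost", p) else "closed")
```

If 4317 reports closed here, the UNAVAILABLE retries are a plain connectivity problem (port-forward or Service routing), not an exporter misconfiguration.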
Can you try the 6006 port (HTTP) via the following to see if it works?
from phoenix.otel import register
tracer_provider = register(endpoint="http://localhost:6006/v1/traces")
sure
We generally recommend the gRPC port, but in this case it seems like there may be some kind of network problem.
I got the same error. As you can see, the tracing details point to localhost:6006 this time, but litellm still struggles with the connection:
>>> tracer_provider = register(endpoint="http://localhost:6006/v1/traces")
Overriding of current TracerProvider is not allowed
🔭 OpenTelemetry Tracing Details 🔭
| Phoenix Project: default
| Span Processor: SimpleSpanProcessor
| Collector Endpoint: http://localhost:6006/v1/traces
| Transport: HTTP + protobuf
| Transport Headers: {}
|
| Using a default SpanProcessor. `add_span_processor` will overwrite this default.
|
| ⚠️ WARNING: It is strongly advised to use a BatchSpanProcessor in production environments.
|
| `register` has set this TracerProvider as the global OpenTelemetry default.
| To disable this behavior, call `register` with `set_global_tracer_provider=False`.
>>> completion_response = litellm.completion(model="gpt-3.5-turbo",
... messages=[{"content": "What's the capital of China?", "role": "user"}])
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 1s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 2s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 4s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 8s.
That's very strange. Are you able to reach the webpage http://localhost:6006/?
and what if you try it this way?
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

# build the tracer provider by hand and point the HTTP exporter at Phoenix
endpoint = "http://127.0.0.1:6006/v1/traces"
tracer_provider = trace_sdk.TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))
I made some modifications to my manifest files. It looks like the ports for the HTTP and gRPC connections were not exposed internally. So I added those specifications to my manifest file and also made the modification you proposed, and in combination it works. I still need to dig deeper to see how these things are connected internally.
I'll share my manifest file here in case someone is interested.
{{- if .Values.phoenix.enabled }}
apiVersion: v1
kind: Service
metadata:
  name: {{ .Values.phoenix.name }}
  labels:
    app: {{ .Values.phoenix.name }}
    environment: {{ .Values.env.name | quote }}
spec:
  ports:
    - port: 443
      protocol: TCP
      targetPort: 6006
  selector:
    app: {{ .Values.phoenix.name }}
  type: ClusterIP
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ .Values.phoenix.name }}
  labels:
    app: {{ .Values.phoenix.name }}
    environment: {{ .Values.env.name | quote }}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: {{ .Values.phoenix.name }}
  serviceName: {{ .Values.phoenix.name }}
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "9090"
        prometheus.io/scrape: "true"
      labels:
        app: {{ .Values.phoenix.name }}
        environment: {{ .Values.env.name | quote }}
    spec:
      # Add init container to wait for PostgreSQL
      initContainers:
        - name: wait-for-postgresql
          image: postgres:17
          command: ['sh', '-c',
            'until pg_isready -h $POSTGRES_HOST -p 5432 -U $POSTGRES_USER;
            do echo waiting for postgresql; sleep 2; done;']
          env:
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: {{ include "asc-agents.fullname" . }}-secret
                  key: pg-user
            - name: POSTGRES_HOST
              valueFrom:
                secretKeyRef:
                  name: {{ include "asc-agents.fullname" . }}-secret
                  key: pg-fqdn
      containers:
        - name: {{ .Values.phoenix.name }}
          image: "{{ .Values.phoenix.image | default "docker.io/arizephoenix/phoenix:version-8.26.1" }}"
          args: ["-m", "phoenix.server.main", "serve"]
          env:
            - name: PHOENIX_WORKING_DIR
              value: /mnt/data
            - name: PHOENIX_PORT
              value: "6006"
            - name: PHOENIX_SQL_DATABASE_URL
              value: {{ .Values.phoenix.database.url | quote }}
            - name: PHOENIX_COLLECTOR_ENDPOINT
              value: "http://localhost:6006"
            - name: PHOENIX_ENDPOINT
              value: "http://asc-agents-phoenix:443"
          ports:
            - containerPort: 6006
            - containerPort: 4317
            - containerPort: 9090
          volumeMounts:
            - mountPath: /mnt/data
              name: {{ .Values.phoenix.name }}
          readinessProbe:
            httpGet:
              path: /metrics
              port: 6006
            initialDelaySeconds: 10
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /metrics
              port: 6006
            initialDelaySeconds: 15
            periodSeconds: 20
          resources:
            requests:
              memory: 1Gi
              cpu: 500m
            limits:
              memory: 4Gi
              cpu: 500m
  volumeClaimTemplates:
    - metadata:
        name: {{ .Values.phoenix.name }}
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: {{ .Values.phoenix.storage | default "8Gi" }}
        storageClassName: {{ .Values.phoenix.storageClass | default "standard" | quote }}
{{- end }}
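Note that the Service block in this template still only maps 443 to 6006, so in-cluster clients have no route to the OTLP gRPC port unless a 4317 entry is also rendered. A minimal sketch of a ports list that exposes both (the name fields are illustrative):

```yaml
ports:
  - name: http
    port: 443
    protocol: TCP
    targetPort: 6006
  - name: grpc
    port: 4317
    protocol: TCP
    targetPort: 4317
```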
The values are the following:
phoenix:
  name: xxx
  image: docker.io/arizephoenix/phoenix:latest
  enabled: true
  storage: 8Gi
  storageClass: standard
  env:
    - name: PHOENIX_COLLECTOR_ENDPOINT
      value: "http://xxx:443"
  spec:
    ports:
      - name: http
        port: 443
        protocol: TCP
        targetPort: 6006
      - name: grpc
        port: 4317
        protocol: TCP
        targetPort: 4317
The above are my local values; the general values are as follows:
phoenix:
  enabled: false
  name: xxx
  image: docker.io/arizephoenix/phoenix:version-8.26.1
  database:
    url: postgresql://anyxxx:anyxxx@api-postgresql.default:5432/phoenix
  storage: 8Gi
  storageClass: standard
I also created a database called phoenix in my Postgres.
