Hi all, does anyone know how to make evaluation traces (the LLM evaluator of an experiment) fully visible (not only the output)? I have this setup, which should log all traces:
```
tracer_provider = register(protocol="http/protobuf", project_name=project_name, auto_instrument=True)
```

To make evaluation traces (the LLM evaluator of an experiment) fully visible in Phoenix, including more than just the output column, log your evaluation results as annotations on spans. Evaluations can include `label`, `score`, and `explanation` fields, and when logged using `SpanEvaluations` these appear as annotations in the Phoenix UI, making them fully visible alongside the trace output. You can use built-in or custom evaluators, and log results with:

```
import phoenix as px
from phoenix.trace import SpanEvaluations

px.Client().log_evaluations(
    SpanEvaluations(eval_name="Your Eval Name", dataframe=your_eval_df)
)
```

Make sure your evaluation DataFrame includes the span_id and the desired columns (`label`, `score`, `explanation`) for full visibility. For more, see Log Evaluation Results and Evaluators Overview.

Would you like more detail on how to structure your DataFrame or customize the UI display?

Sources:
- Log Evaluation Results
- Evaluators Overview
💡 Hint: Mention RunLLM in the thread for followups.
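As a minimal sketch of the DataFrame shape the bot describes, assuming Phoenix's convention of indexing the evaluations DataFrame by `context.span_id` (the span IDs and eval values below are made up for illustration):

```python
import pandas as pd

# Hypothetical evaluation results; in practice the span IDs come from your
# traced spans and the values from your LLM evaluator.
eval_df = pd.DataFrame(
    {
        "label": ["correct", "incorrect"],
        "score": [1.0, 0.0],
        "explanation": ["answer matches reference", "answer contradicts reference"],
    },
    index=pd.Index(["span-id-1", "span-id-2"], name="context.span_id"),
)
print(eval_df)

# Logging it (requires a running Phoenix instance, so shown here commented out):
# import phoenix as px
# from phoenix.trace import SpanEvaluations
# px.Client().log_evaluations(
#     SpanEvaluations(eval_name="Correctness", dataframe=eval_df)
# )
```

With all three columns present, each annotated span shows the label, score, and explanation in the UI, not just the output.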
The evaluation run is itself an LLM invocation, which should be traced, but it shows only the output, not the full LLM call details.
yes, I see that project
but inside it, the trace consists only of the result, without the LLM invocation details
thanks, let me debug that
Hi John G., what versions of openinference-instrumentation-openai, openai, and arize-phoenix do you have? This sample successfully traces Groq, but not OpenAI:
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from phoenix.config import get_env_host, get_env_port
from phoenix.otel import register
tracer_provider = register(protocol="http/protobuf", project_name="test_project", auto_instrument=True)
trace.set_tracer_provider(tracer_provider)
tracer = trace.get_tracer(__name__)
collector_endpoint = f"http://{get_env_host()}:{get_env_port()}/v1/traces"
span_exporter = OTLPSpanExporter(endpoint=collector_endpoint)
simple_span_processor = SimpleSpanProcessor(span_exporter=span_exporter)
trace.get_tracer_provider().add_span_processor(simple_span_processor)
import openai

OPENAI_API_KEY = ""  # key redacted in the original thread
client = openai.OpenAI(api_key=OPENAI_API_KEY)
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Write a haiku."}],
)
print(response.choices[0].message.content)
# import os
# from groq import Groq
# GROQ_API_KEY = ""
# client = Groq(
# api_key=GROQ_API_KEY
# )
# chat_completion = client.chat.completions.create(
# messages=[
# {
# "role": "user",
# "content": "Explain the importance of low latency LLMs",
# }
# ],
# model="llama3-70b-8192",
# )
# print(chat_completion.choices[0].message.content)
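To answer the version question above, a quick stdlib-only check of the three packages named in the thread can be run in the same environment (package names taken from the question; a missing openinference-instrumentation-openai would explain why OpenAI calls produce no span details):

```python
from importlib.metadata import PackageNotFoundError, version

# Packages the thread asks about; report the installed version of each,
# or flag it as missing.
packages = ["openai", "arize-phoenix", "openinference-instrumentation-openai"]
versions = {}
for pkg in packages:
    try:
        versions[pkg] = version(pkg)
    except PackageNotFoundError:
        versions[pkg] = "not installed"

for pkg, ver in versions.items():
    print(f"{pkg}: {ver}")
```

Note that with `auto_instrument=True`, `register()` only instruments libraries whose OpenInference instrumentor package is actually installed, so "not installed" for the OpenAI instrumentor is the first thing to rule out.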