Hi - This is probably explained somewhere that I haven't found yet, but I'm wondering which types of LLM operations will appear in my active Phoenix session, and which will not be automatically sent. For instance, I'm using LlamaIndex, and when I generate questions from a corpus as follows, my session remains unchanged, even though I'm calling the LLM for each question:

questions_df = llm_generate(
    dataframe=document_chunks_df,
    template=generate_questions_template,
    model=OpenAIModel(
        model_name="gpt-3.5-turbo-instruct",
    ),
    output_parser=output_parser,
)

However, when I answer each question with the same LLM, the Phoenix session actively records this info:

# loop over the questions and generate the answers
for _, row in questions_with_document_chunk_df.iterrows():
    question = row["question"]
    response_vector = query_engine.query(question)
    print(f"Question: {question}\nAnswer: {response_vector.response}\n")

Is this because in the second case I assigned the LLM earlier in my code, as part of the query_engine definition, as opposed to explicitly calling out the LLM model as in the first case? Or am I just confusing the types of LLM operations that Phoenix tracks versus the ones it ignores? Thanks, David
Hey David K., good question. TLDR, in order for trace data to appear in Phoenix, the model or application has to be instrumented with an OpenInference tracer. In the first example using llm_generate, the OpenAIModel (an instance of phoenix.experimental.evals.OpenAIModel) is not instrumented by default, so we don't expect any traces or spans from those LLM calls to appear inside Phoenix. As a side note, it's possible to instrument the openai Python SDK client that is wrapped by phoenix.experimental.evals.OpenAIModel, in which case, you would see traces in Phoenix for these calls. You can give it a try with:
from phoenix.trace.exporter import HttpExporter
from phoenix.trace.openai import OpenAIInstrumentor
tracer = Tracer(exporter=HttpExporter())
OpenAIInstrumentor(tracer).instrument()

In the second case, you are using a LlamaIndex application that has been instrumented. There are multiple ways to instrument a LlamaIndex application; the most common and recommended way uses a call to set_global_handler("arize_phoenix"). This causes your LlamaIndex application to submit OpenInference traces corresponding to the various events within the application (embeddings, retrievals, LLM calls, etc.). It sounds like you might be passing the same phoenix.experimental.evals.OpenAIModel instance you used for llm_generate into your LlamaIndex application. If so, it's actually a coincidence that your LlamaIndex application is still working. LlamaIndex has its own set of LLM classes (e.g., from llama_index.llms import OpenAI, see here), and we would definitely recommend using those model classes rather than the Phoenix eval model classes when you're running your LlamaIndex application. Hope that helps!
Hi Xander -- FYI, I tried the tracer code below and it didn't generate any trace in my Phoenix session. (Note I added "from phoenix.trace.tracer import Tracer", as it was needed for proper operation):

import json

from phoenix.experimental.evals import OpenAIModel, llm_generate

### The following code should allow these generated questions to be traced by Phoenix
from phoenix.trace.exporter import HttpExporter
from phoenix.trace.openai import OpenAIInstrumentor
from phoenix.trace.tracer import Tracer

tracer = Tracer(exporter=HttpExporter())
OpenAIInstrumentor(tracer).instrument()
###

def output_parser(response: str, index: int):
    try:
        return json.loads(response)
    except json.JSONDecodeError as e:
        return {"__error__": str(e)}

questions_df = llm_generate(
    dataframe=document_chunks_df,
    template=generate_questions_template,
    model=OpenAIModel(
        model_name="gpt-3.5-turbo-instruct",
    ),
    output_parser=output_parser,
)

Regards,
Give this notebook a go 🙂
Thanks, Xander. This code works for me. However, it also illustrates the issue I've been seeing:
I run with gpt-4-1106-preview, and Phoenix creates the traces properly
Then I run with gpt-3.5-turbo-instruct, and output_df is still created, but Phoenix doesn't add any traces
Then I run again with gpt-4-1106-preview, and Phoenix adds 3 more traces properly
I don't understand why Phoenix doesn't recognize gpt-3.5-turbo-instruct in this context, but it does in some other contexts, where it has been successfully generating traces for me.
I should note that when I "run again", I'm not creating a new session. I'm just running the code block beginning with "dataframe = pd.DataFrame(" again.
Hey David K., that's a good callout. At this time, we only instrument the OpenAI chat completions API. gpt-3.5-turbo-instruct uses the legacy completions API, which is set for deprecation. If that particular model is important for your work, feel free to file an enhancement ticket on our GitHub!
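To make the distinction concrete, here is a small hypothetical helper (not part of Phoenix or the OpenAI SDK, just an illustration) that maps a model name to the API surface it uses; at the time of this thread, only chat-completions calls were instrumented:

```python
# Hypothetical illustration: instruct/legacy models use the completions
# API, which was not instrumented; chat models use chat.completions,
# which was. This is a rough heuristic, not an official mapping.
def api_surface(model_name: str) -> str:
    if model_name.endswith("-instruct") or model_name.startswith(("text-", "davinci")):
        return "completions (legacy, not traced)"
    return "chat.completions (traced)"

print(api_surface("gpt-4-1106-preview"))      # -> chat.completions (traced)
print(api_surface("gpt-3.5-turbo-instruct"))  # -> completions (legacy, not traced)
```

This matches the behavior you observed: gpt-4-1106-preview runs produce traces, while gpt-3.5-turbo-instruct runs silently do not, even though both complete successfully.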
We have a new instrumentor that can do this, but it's still in beta. You can give it a try using the code below.
pip install openai openinference-instrumentation-openai opentelemetry-sdk opentelemetry-exporter-otlp

Replace the old instrumentor with the code below. This will capture and send the spans to Phoenix:
from openinference.instrumentation.openai import OpenAIInstrumentor
from opentelemetry import trace as trace_api
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
resource = Resource(attributes={})
tracer_provider = trace_sdk.TracerProvider(resource=resource)
span_exporter = OTLPSpanExporter(endpoint="http://127.0.0.1:6006/v1/traces")
span_processor = SimpleSpanProcessor(span_exporter=span_exporter)
tracer_provider.add_span_processor(span_processor=span_processor)
trace_api.set_tracer_provider(tracer_provider=tracer_provider)
OpenAIInstrumentor().instrument()
Thanks, Roger - I'll definitely try it out! Is the intent that this will also work for non-OpenAI models? I'm particularly interested in llama-cpp, but it would be good to know more generally.

Xander -- Regarding the legacy Completions API being phased out, it looks like GPT-3.5-turbo-instruct is actually the newer model that OpenAI points to for people using those legacy Completions models: https://openai.com/blog/gpt-4-api-general-availability So I think the intent is for GPT-3.5-turbo-instruct to be around for a while. Regards, David
