Customizing Model Names in Phoenix Traces for Llama2CPP Usage
Hi – We’re using our own LLM (llama2cpp) behind an OpenAI wrapper, and we can successfully get Phoenix to produce traces for it. However, we’ve run into a couple of issues.

Question 1: Our LLM is always erroneously labeled “gpt-3.5-turbo” as the model name in the Phoenix trace. Is there a parameter we can override to change it to an arbitrary name? It seems like passing `model_name=` to `OpenAIModel` might actually force an OpenAI model to be used, but I could be wrong here.

We define our LLM as follows:

```python
llm = OpenAI(
    api_base="http://xxx.yyy.zzz.aaa/",
)
service_context = ServiceContext.from_defaults(llm=llm)
```

Then we generate answers to the questions in our vector store, using the LLM:

```python
vector_index = VectorStoreIndex(train_nodes, service_context=service_context)
query_engine = vector_index.as_query_engine(verbose=True)
```

We have verified that the answers are generated by llama2cpp, but the Phoenix trace reads:

```json
{
  "id": "chatcmpl-c2a035ab-488e-4e21-a6ac-dbc51a5397d3",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": " <block of content printed here>",
        "role": "assistant"
      }
    }
  ],
  "created": 1707240657,
  "model": "gpt-3.5-turbo",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 236,
    "prompt_tokens": 833,
    "total_tokens": 1069
  }
}
```

How can we change the "model" field above to reflect a user-defined name for our local LLM?

Question 2: As a second issue, as noted in the `id` field above, our local LLM is somehow defaulting to ChatCompletion instead of the Completion trace that we get when running, say, GPT-3.5. This might be something we need to address through LlamaIndex, but there’s still an issue on the Phoenix side: whereas Completion creates a Query/Retrieval span from which all retrieved documents can be recovered through Phoenix, ChatCompletion doesn’t allow access to any retrieved documents, as far as I can tell.
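For what it’s worth, if no built-in override exists, we could post-process the raw response payload ourselves before it reaches the trace. The sketch below is purely illustrative (the function name `rename_model` and the replacement string are our own inventions, not a Phoenix or LlamaIndex API):

```python
import json

def rename_model(raw_response: str, new_name: str) -> str:
    """Rewrite the 'model' field of an OpenAI-style completion payload.

    Hypothetical post-processing step; not part of Phoenix itself.
    """
    payload = json.loads(raw_response)
    payload["model"] = new_name  # e.g. a user-defined local model name
    return json.dumps(payload)

# Trimmed-down version of the trace payload shown above.
sample = json.dumps({
    "id": "chatcmpl-c2a035ab-488e-4e21-a6ac-dbc51a5397d3",
    "model": "gpt-3.5-turbo",
    "object": "chat.completion",
})

fixed = rename_model(sample, "llama2cpp-local")
print(json.loads(fixed)["model"])  # prints "llama2cpp-local"
```

This only patches the recorded payload, of course; ideally the wrapper itself would report the correct name.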
In other words, `get_retrieved_documents(px.active_session())` returns an empty dataframe (with only column headers showing) when using ChatCompletion. Is this expected behavior? Is there a different way to access the retrieved documents when using ChatCompletion?

Thanks and regards,
David
