I'm currently using phoenix.evals.llm_classify, and it would be awesome to see the LLM evaluation calls in a different project in Phoenix. I found a suggestion about this, but it was closed as not planned. I'm just confirming here: it's not possible to achieve this with the current Phoenix setup, is it? Thanks!
As per the GitHub issue, the feature to automatically instrument LLM evaluation calls into a different project in Phoenix using a phoenix_trace flag is not currently implemented. The issue was closed without this enhancement being planned, indicating that this functionality is not available in the current Phoenix setup. Would you like more details on how to manually manage evaluations or any other aspect of Phoenix? Sources:
- GitHub Issue #2551
Hi Arthur M., we've implemented projects as a source attribute, so they aren't really intended to change if the traces are coming from the same place. That being said, we do provide a using_project context manager, normally intended for use in notebook environments, that temporarily patches the source attributes and sends traces to another project.
Maybe give that a shot?
Hmm, I'm not sure I understood that correctly, but I'll try to explain my side better. Today I have a separate project just to run evals against a LlamaIndex application. We don't run evals in notebooks; we have plain scripts that integrate with the Phoenix client and Phoenix evals. Then we have something like this:
results = phoenix.evals.llm_classify(
dataframe=df,
template=template,
model=phoenix.evals.OpenAIModel(model=MODEL_NAME, temperature=TEMPERATURE, api_key=API_KEY),
rails=rails,
verbose=False,
provide_explanation=provide_explanation,
progress_bar_format=None,
)
I assume that for executing the evals, phoenix.evals.OpenAIModel is probably what's being used. My question is: is this OpenAIModel instrumented? I like to see LLM calls directly in Phoenix; for example, I would like to see a call in there for the HALLUCINATION_SPAN_PROMPT_BASE_TEMPLATE filled in with its parameters system_message, user_message, and output.
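If those eval calls turn out not to be instrumented, one manual fallback is to wrap the eval model yourself and record each prompt/response pair, then attach those records to whatever project you like. The sketch below is hypothetical, not Phoenix API: LoggingModel is an illustrative name, and it assumes the wrapped eval model is callable with a prompt string (a stand-in function is used here instead of a real phoenix.evals.OpenAIModel):

```python
# Hypothetical manual-logging wrapper -- not part of phoenix.evals.
# It delegates each call to the wrapped model and records the
# prompt/response pair as a pseudo-span tagged with a project name.
class LoggingModel:
    def __init__(self, wrapped, project_name, log):
        self._wrapped = wrapped    # e.g. a callable eval model
        self._project = project_name
        self._log = log            # list collecting pseudo-spans

    def __call__(self, prompt, **kwargs):
        output = self._wrapped(prompt, **kwargs)
        self._log.append(
            {"project": self._project, "input": prompt, "output": output}
        )
        return output

# Usage with a stand-in model (a real run would wrap the actual eval model).
log = []

def fake_model(prompt, **kwargs):
    return "factual"

model = LoggingModel(fake_model, "evals-project", log)
model("Is the answer grounded in the context?")
print(log[0]["project"])  # prints "evals-project"
```

Because the wrapper has the same call shape as the model it wraps, it could be passed to llm_classify in place of the bare model, assuming llm_classify only ever invokes the model as a callable.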
