Hi all, I am trying to set up Arize for an LLM API classification use case (using an LLM API to classify text). It seems that there's a Generative AI ModelType and a MultiLabel type. Is there a way to have both the LLM tracing and the classification metrics?
Phoenix currently supports LLM tracing and classification metrics, but they are handled separately. LLM tracing is designed to capture the execution of LLMs and the surrounding application context, while classification metrics are typically used for evaluating model performance on tasks like text classification.
For LLM tracing, you can use the OpenAIInstrumentor to capture prompt and response pairs. For classification metrics, you can use the Phoenix Evals library to evaluate model performance.
Would you like more detail on how to set up both LLM tracing and classification metrics in Phoenix?
Sources:
- AutoGen Support
- Evaluate Human vs AI Classifications
💡 Hint: Mention RunLLM in the thread for followups.
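For anyone following along, here is a minimal sketch of the tracing half described above. It assumes the `arize-phoenix-otel` and `openinference-instrumentation-openai` packages are installed and a Phoenix collector is reachable. The wrapper name `enable_openai_tracing` and the project name are illustrative and do not come from the thread.

```python
def enable_openai_tracing(project_name: str = "text-classification"):
    """Send OpenAI client calls to Phoenix as traces.

    Imports are deferred so this module still loads when the
    tracing packages are not installed.
    """
    # Assumed packages: arize-phoenix-otel, openinference-instrumentation-openai
    from phoenix.otel import register
    from openinference.instrumentation.openai import OpenAIInstrumentor

    # register() builds an OpenTelemetry tracer provider pointed at Phoenix.
    tracer_provider = register(project_name=project_name)

    # Once instrumented, requests made through the openai client are
    # recorded as spans that carry the prompt and the response.
    OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
    return tracer_provider
```

Call `enable_openai_tracing()` once at startup, before the first classification request, so every LLM call ends up in a trace.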
Ah, seems like I can attach evaluations to traces and then potentially create a custom metric.
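A rough sketch of that idea in plain Python, assuming each traced classification call can be paired with a ground-truth label. The record layout `(span_id, predicted, actual)` and both helper names are hypothetical. The first helper builds one evaluation row per span (the kind of row that would be attached to a trace), and the second computes per-label precision, recall, and F1 as the custom metric.

```python
from collections import Counter


def to_eval_rows(records):
    """Build one evaluation row per span: label 'correct'/'incorrect', score 1/0."""
    return [
        {
            "span_id": span_id,
            "label": "correct" if predicted == actual else "incorrect",
            "score": int(predicted == actual),
        }
        for span_id, predicted, actual in records
    ]


def per_label_metrics(records):
    """Compute precision, recall, and F1 per class from (span_id, predicted, actual)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for _span_id, predicted, actual in records:
        if predicted == actual:
            tp[predicted] += 1
        else:
            fp[predicted] += 1  # predicted this label, but it was wrong
            fn[actual] += 1     # missed the true label

    metrics = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical span ids and labels, for illustration only.
records = [
    ("span-a", "spam", "spam"),
    ("span-b", "spam", "ham"),
    ("span-c", "ham", "ham"),
    ("span-d", "ham", "ham"),
]
eval_rows = to_eval_rows(records)
metrics = per_label_metrics(records)
```

To my understanding, rows shaped like `eval_rows` (keyed by span id, with a label and a score) are what Phoenix expects when evaluations are logged against spans, but check the current Phoenix docs for the exact logging call.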
yes, feel free to let us know if you have more questions
