RunLLM is it possible to reference the reference output in the prompt playground within my prompt templates?
RunLLM in my case, I want to use the prompt playground to run a prompt over a dataset of traces of the retriever type. My retrieved documents are the reference outputs. Therefore, I want the LLM to see these outputs and judge the retrieved results. That's why I need to be able to reference the reference outputs (not just the inputs) in my prompt. Does that make sense? Or am I misusing Phoenix?
Hey Andrew, your request makes a lot of sense. We tend to think of the reference as the place where the ground truth lives, so it's where evaluators will eventually pull data from for reference-based evaluations. But I think making it possible to "chat" with your spans like this makes sense too. I filed an issue. Let me think through the design. It's sorta like you want the prompt variables to be scoped to the entire example, so you can use input / output / metadata in your prompt. https://github.com/Arize-ai/phoenix/issues/10912
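To make the "scoped to the entire example" idea concrete, here's a minimal sketch of what a judge prompt with example-wide variables could look like. This is purely illustrative: the variable names (`input`, `reference_output`, `metadata`) and the `render` helper are hypothetical, not a current Phoenix API.

```python
# Hypothetical sketch: a judge prompt template whose variables are scoped
# to the whole dataset example (input, reference output, metadata),
# not just the input. Names here are illustrative, not a Phoenix API.
JUDGE_TEMPLATE = """You are grading a retriever.
Query: {input}
Retrieved documents (reference output): {reference_output}
Metadata: {metadata}
Rate the relevance of the retrieved documents from 1-5 and explain briefly."""


def render(example: dict) -> str:
    """Fill the template from a full dataset example, not just its input."""
    return JUDGE_TEMPLATE.format(
        input=example["input"],
        reference_output=example["reference_output"],
        metadata=example.get("metadata", {}),
    )


example = {
    "input": "What is the capital of France?",
    "reference_output": "Doc 1: Paris is the capital of France.",
    "metadata": {"retriever": "bm25"},
}
print(render(example))
```

The rendered string could then be sent to any LLM as an ad-hoc judge prompt, which is the playground workflow described above.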
exactly! that would be amazing. It would dramatically improve the user experience for non-technical domain experts. My goal is to enable them to work comfortably and effectively entirely within Phoenix without needing developers in the loop. As it stands, there is little they can do in Phoenix without relying on developers to run scripts (experiments and processing traces to populate datasets). I want them to be able to use our application, then easily find the relevant traces within Phoenix, play with them in the prompt playground (i.e., running ad-hoc LLM judge prompts against traces), annotate traces, and add examples to datasets. Right now this workflow is high friction. thanks for considering 🙏
Makes too much sense. Will get something cooking
