Hi is there a prompt management feature that also connects to the code version and can execute that code from the prompt playground (the whole trace) instead of just instrumenting the llm calls? Is that planned for phoenix? Thanks
Phoenix's prompt management allows you to create, store, modify, and version prompts, and connect them to code via SDKs (Python/TypeScript). You can pull prompts by name, version, or tag and use them in your code, ensuring your code and prompt versions stay in sync. However, executing the entire code version (not just LLM calls) directly from the prompt playground is not currently a documented feature; Phoenix's playground focuses on prompt/LLM invocation, tracing, and span replay, not full code execution or workflow replay from source code. Planned and current features emphasize prompt iteration, span replay, and experiment tracking, but not full codebase execution from the playground interface as of now.

There are ongoing discussions and feature requests about deeper integration, such as replaying LLM spans, annotating runs, and linking prompt versions to traces, but executing arbitrary code versions from the playground is not explicitly supported or planned in the public roadmap or documentation. Prompt management and span replay are designed to facilitate prompt and LLM experimentation, not full workflow execution from code versions in the playground. For more, see the prompt management and playground docs, and related GitHub issues and specs on prompt management and replay features (docs, specs, issue).

Would you like more detail on how prompt management and span replay work, or on the current limitations regarding code execution from the playground?

Sources:
- Prompt Playground and Replay Issue
- docs
- specs
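For reference, pulling a versioned prompt into code looks roughly like this. This is a minimal sketch against the Phoenix Python client; the exact method names, the `tag` parameter, and the `support-bot` prompt name are assumptions drawn from the docs pattern, so verify against the current SDK reference:

```python
# Sketch: fetch a stored prompt version from Phoenix and format it for an LLM call.
# Assumes a running Phoenix instance and that endpoint/API key are set via environment.
from phoenix.client import Client

client = Client()

# Pull the latest version by name, or pin to a tag such as "production"
# (prompt name and tag below are hypothetical examples).
prompt = client.prompts.get(prompt_identifier="support-bot")
# prompt = client.prompts.get(prompt_identifier="support-bot", tag="production")

# Fill the stored template with runtime variables; the result can be passed
# on to your LLM SDK call, keeping code and prompt version in sync.
formatted = prompt.format(variables={"question": "How do I reset my password?"})
```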
I guess not... Also, the agent graph is okay, but a framework-agnostic "LangGraph Studio" would be much more powerful.
Here is an example of what I mean: https://langfuse.com/docs/evaluation/dataset-runs/native-run
