Hi guys, having a lot of phun ( 😉 ) using Phoenix in our dev workflow. We've been attempting to set up a longitudinal framework that shows us the performance of evals across multiple experiments. In the self-hosted Docker version this doesn't seem to be supported. Is this a premium feature?
It seems likely that Phoenix does not have premium features: according to a GitHub issue, Phoenix is entirely open-source and free to use, with no paid features. However, tracking eval performance across multiple experiments may not be directly supported in the self-hosted version. Would you like more detail on how to set up a framework for tracking performance across experiments using Phoenix? Sources: - GitHub Issue on Phoenix Features
💡 Hint: Mention RunLLM in the thread for followups.
Hey Damith S., lots of improvements to experiments are coming down the pipeline: https://github.com/Arize-ai/phoenix/issues/3398. Time series and histograms will be part of that upcoming work. Phoenix doesn't have any feature gates, so these will be available as we ship them! In the meantime, you can download and pull experiments via the API, which can be used to accomplish this right now (see the sketch below). Please let us know if you have any additional ideas on the issue and we can prioritize them :)
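For anyone else trying this before the dashboard work lands, here's a minimal sketch of the API approach Mikyo describes: list the experiments run against a dataset, pull each experiment's runs, and aggregate eval scores by experiment creation time. The endpoint paths and response fields (`data`, `annotations`, `score`, `created_at`) are assumptions based on Phoenix's REST API and may differ by version, so verify them against the OpenAPI docs served at `/docs` on your instance; `DATASET_ID` is a hypothetical placeholder.

```python
import requests
import pandas as pd

# Point these at your self-hosted instance; DATASET_ID is hypothetical.
PHOENIX_URL = "http://localhost:6006"
DATASET_ID = "RGF0YXNldDox"

# List the experiments that were run against the dataset.
# (Endpoint paths are assumptions -- check /docs on your instance.)
resp = requests.get(f"{PHOENIX_URL}/v1/datasets/{DATASET_ID}/experiments")
resp.raise_for_status()
experiments = resp.json()["data"]

# Pull each experiment's runs and collect per-eval scores, keyed by the
# experiment's creation time, to build a longitudinal view.
rows = []
for exp in experiments:
    runs = requests.get(f"{PHOENIX_URL}/v1/experiments/{exp['id']}/json")
    runs.raise_for_status()
    for run in runs.json():
        for annotation in run.get("annotations", []):
            rows.append(
                {
                    "created_at": exp.get("created_at"),
                    "experiment_id": exp["id"],
                    "eval_name": annotation["name"],
                    "score": annotation.get("score"),
                }
            )

df = pd.DataFrame(rows)
# Mean score per eval per experiment, ordered by time: one column per eval.
summary = df.groupby(["created_at", "eval_name"])["score"].mean().unstack()
print(summary)
```

From there, `summary.plot()` gives a quick time-series view of each eval across experiments, which is roughly what the upcoming dashboards should cover natively.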
Thanks Mikyo (sorry for the radio silence). Also wondering if anyone has used Phoenix for voice evals, e.g., for evaluating OpenAI's Realtime API.
Would be a really big plus to have
