are traces retained indefinitely in self-hosted Phoenix by default?
No, traces are not retained indefinitely by default in self-hosted Phoenix. According to this GitHub issue, Phoenix has a retention policy based on the number of spans, which means that as new traces are ingested, older spans may be removed to manage database size. This policy can be configured per project.

Would you like more detail on configuring retention policies in Phoenix?

Sources:
- GitHub Issue on Retention Policy
💡 Hint: Mention RunLLM in the thread for followups.
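For intuition, a span-count retention policy like the one described can be sketched as follows. This is a minimal illustration under stated assumptions, not Phoenix's actual implementation: the span records and their `start_time` field are invented for the example.

```python
# Minimal sketch of span-count retention (illustrative; not Phoenix's code).
# Each span is a dict with a 'start_time'; once the store exceeds max_spans,
# the oldest spans are dropped first.
def prune_spans(spans: list[dict], max_spans: int) -> list[dict]:
    """Return at most `max_spans` spans, keeping the newest ones."""
    newest_first = sorted(spans, key=lambda s: s["start_time"], reverse=True)
    return newest_first[:max_spans]
```

The same idea generalizes to a time-based policy by filtering on `start_time` against a cutoff instead of capping the count.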
To configure retention policies in Phoenix, you can set up a project-specific retention policy that determines how long traces are retained. According to this GitHub pull request, Phoenix allows trace retention policies to be set based on a number of spans or a specific timeframe: you can specify a maximum number of spans to retain, or a time period after which traces are purged.

1. Setting Retention Policy: Configure a retention policy for each project, which will automatically delete older traces once the specified limit is reached. This can be done through the Phoenix UI or via environment variables during the initial deployment.
2. Database Configuration: Phoenix uses a SQL database (SQLite by default, with PostgreSQL support) to store trace data. You can specify the database URL and schema using environment variables like PHOENIX_SQL_DATABASE_URL and PHOENIX_SQL_DATABASE_SCHEMA, as described in the configuration documentation.
3. Automatic Deletion: The retention policy can automatically delete traces after a certain number of spans or a specific time period, ensuring that the database does not grow indefinitely.

If you need further assistance on setting up these configurations, feel free to ask!

Sources:
- GitHub Pull Request on Trace Retention Policies
- configuration documentation
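As a concrete illustration of step 2, the database environment variables can be set before Phoenix is launched. This is a minimal sketch: the variable names come from the configuration documentation, but the connection URL, schema value, and the launch call shown in comments are placeholders.

```python
# Illustrative only: point Phoenix at PostgreSQL via environment variables.
# Variable names are from the configuration docs; the URL and schema are made up.
import os

os.environ["PHOENIX_SQL_DATABASE_URL"] = (
    "postgresql://phoenix:secret@db.internal:5432/phoenix"
)
os.environ["PHOENIX_SQL_DATABASE_SCHEMA"] = "phoenix"

# Phoenix would then be started in this environment, e.g.:
#   import phoenix as px
#   px.launch_app()
```

In a containerized deployment you would typically pass the same variables through the container runtime (e.g. a compose file) rather than setting them in code.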
ok, cool. thanks! should we be concerned about performance issues if we don't delete them? Just curious how quickly we will need to come up with a data retention solution, such as exporting traces and storing them in S3 or Redshift or something. We're getting over 3K traces/day and I expect that to grow a lot
using postgres
There is one performance issue we're aware of when the number of traces grows rapidly: fragmented traces (e.g. Phoenix never sees the root span). We're currently working on an index to fix that specific problem. Offloading your traces to S3 to keep your Phoenix instance snappy makes sense to us, so we're working on the retention policies first, and then we'll add triggers and webhooks for more automation.
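One way to start on the offloading idea above is a small export job that serializes span records to compressed NDJSON and pushes the payload to S3. This is a rough sketch under stated assumptions: the span fields, bucket, and object key are invented, and the boto3 upload step is shown only in comments since it needs AWS credentials.

```python
# Hypothetical offload helper: serialize span records to gzipped NDJSON.
# Span fields and the S3 destination are illustrative, not from the thread.
import gzip
import json
from typing import Iterable

def spans_to_ndjson_gz(spans: Iterable[dict]) -> bytes:
    """Serialize span dicts to gzip-compressed newline-delimited JSON."""
    lines = "\n".join(json.dumps(s, sort_keys=True) for s in spans)
    return gzip.compress(lines.encode("utf-8"))

# Upload step (requires boto3 and AWS credentials; bucket/key are made up):
#   import boto3
#   boto3.client("s3").put_object(
#       Bucket="my-phoenix-traces",
#       Key="2024/05/spans.ndjson.gz",
#       Body=spans_to_ndjson_gz(spans),
#   )
```

Gzipped NDJSON is convenient here because both Redshift (via COPY) and most query engines can ingest it directly, and it keeps each export batch a single immutable object.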
awesome
