For a self-hosted Phoenix instance, how do I scale it? How can I add more workers and web_concurrency?
It seems likely that Phoenix can be scaled both vertically (by increasing CPU and memory per container) and horizontally (by running multiple Phoenix containers behind a single PostgreSQL database). Phoenix is stateless, so you can add instances as needed to handle increased load. There is no documented "web_concurrency" setting: concurrency is determined by the number of running containers and the resources allocated to each, not by a single config variable for worker count. For production, enabling batch processing and using gRPC transport is recommended for efficiency. Monitor memory, disk, and database usage regularly, and enable database backups for reliability. Would you like more detail on deployment or resource tuning?
Sources:
- Production Guide | Arize Phoenix
- Phoenix container scaling discussion (Slack)
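For reference, here is a minimal, hedged sketch of the batch-plus-gRPC client setup recommended above, using plain OpenTelemetry APIs rather than anything Phoenix-specific. The endpoint assumes Phoenix's default gRPC collector port (4317); adjust it for your deployment. Recent versions of the arize-phoenix-otel package also expose a `batch=True` flag on `phoenix.otel.register` for the same purpose.

```python
# A hedged sketch, not an official Phoenix snippet: configure an app's
# tracing client to batch spans and ship them to Phoenix over gRPC.
# Requires the opentelemetry-sdk and opentelemetry-exporter-otlp packages.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
# BatchSpanProcessor buffers spans and exports them in batches, which
# cuts per-span overhead compared to exporting each span synchronously.
# "http://localhost:4317" assumes Phoenix's default gRPC collector port.
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317"))
)
trace.set_tracer_provider(provider)
```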
Hi Darshan, in line with what the bot is saying, I think there are a couple of ways to scale a self-hosted Phoenix instance:
- Run multiple instances (containers or VMs) behind a load balancer (see the sketch after this list)
- Use a production database like PostgreSQL (not SQLite), and enable batch processing of data to improve throughput
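Here's a rough docker-compose sketch of that multi-instance setup, assuming the arizephoenix/phoenix image, Phoenix's PHOENIX_SQL_DATABASE_URL environment variable, and an nginx config (not shown) that round-robins across the instances. Service names and credentials are placeholders, not recommendations.

```yaml
# A minimal sketch, not an official config: two stateless Phoenix
# containers sharing one Postgres database, fronted by nginx.
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: phoenix
      POSTGRES_PASSWORD: phoenix
      POSTGRES_DB: phoenix
    volumes:
      - pgdata:/var/lib/postgresql/data

  phoenix-1:
    image: arizephoenix/phoenix:latest
    environment:
      # Point every instance at the same Postgres database.
      PHOENIX_SQL_DATABASE_URL: postgresql://phoenix:phoenix@db:5432/phoenix
    depends_on:
      - db

  phoenix-2:
    image: arizephoenix/phoenix:latest
    environment:
      PHOENIX_SQL_DATABASE_URL: postgresql://phoenix:phoenix@db:5432/phoenix
    depends_on:
      - db

  lb:
    image: nginx:stable
    # nginx.conf (assumed, not shown) round-robins HTTP traffic to
    # phoenix-1:6006 and phoenix-2:6006; expose 4317 similarly for gRPC.
    ports:
      - "6006:6006"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro

volumes:
  pgdata:
```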
Regarding web_concurrency, there isn't a direct setting in Phoenix, but you can approximate it by running more instances and giving each sufficient capacity. 🔒[private user] might be able to give you more info.
When a PostgreSQL database is used, the Phoenix instance itself can serve a large number of concurrent requests asynchronously, because the requests are generally I/O bound. (The Python server doesn't have a workers or web_concurrency configuration.)
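To illustrate the point (this is a toy, not Phoenix code): a single async Python process can overlap many I/O-bound requests on one event loop, which is why there's no worker count to tune.

```python
# A toy illustration of why one async Python process needs no worker
# setting: I/O-bound requests overlap on a single thread.
import asyncio
import time

async def handle_request(i: int) -> None:
    # Simulate an I/O-bound database query with a non-blocking sleep.
    await asyncio.sleep(1)

async def main() -> None:
    start = time.perf_counter()
    # 100 "concurrent requests" finish in about 1 second, not 100,
    # because the event loop interleaves them while each awaits I/O.
    await asyncio.gather(*(handle_request(i) for i in range(100)))
    print(f"handled 100 requests in {time.perf_counter() - start:.2f}s")

asyncio.run(main())
```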
The Railway deployment option works great.
