In a multi-tenant LLM application, data isolation is your responsibility as the application builder, not the model or library provider's. Caching is an architectural decision, and any caching layer (proxy-level like LiteLLM, or implicit provider-side caching behind a shared API key) must be designed with tenant boundaries in mind.
If responses are cached purely on prompt similarity (or a prompt hash) without tenant/user scoping, then the scenario you described is real: one tenant's cached response can be served to another tenant who happens to send the same prompt.
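As a concrete illustration, here is a minimal sketch of tenant-scoped cache keys. The names (`make_cache_key`, `ResponseCache`, `tenant_id`) are illustrative assumptions, not any particular library's API; the point is simply that the tenant identifier becomes part of the key, so identical prompts from different tenants can never collide:

```python
# Minimal sketch of tenant-scoped response caching in front of an LLM call.
# ResponseCache is an in-memory stand-in for whatever backend (Redis, a proxy
# cache, etc.) the application actually uses.
import hashlib
import json


def make_cache_key(tenant_id: str, prompt: str, model: str) -> str:
    """Hash the prompt together with the tenant and model so identical
    prompts from different tenants map to different cache entries."""
    payload = json.dumps(
        {"tenant": tenant_id, "model": model, "prompt": prompt},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ResponseCache:
    """Toy in-memory cache; replace with the real cache backend."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def get(self, key: str) -> str | None:
        return self._store.get(key)

    def set(self, key: str, value: str) -> None:
        self._store[key] = value


if __name__ == "__main__":
    cache = ResponseCache()

    # Tenant A caches a response to a prompt.
    key_a = make_cache_key("tenant-a", "Summarize my account history", "gpt-4o")
    cache.set(key_a, "response containing tenant A's data")

    # Tenant B sends the exact same prompt; with tenant-scoped keys this is a
    # cache miss instead of a leak of tenant A's response.
    key_b = make_cache_key("tenant-b", "Summarize my account history", "gpt-4o")
    assert cache.get(key_b) is None
```

With a proxy-level or provider-side cache, the same idea applies at the configuration level: partition or namespace the cache per tenant (or per tenant-specific API key) rather than sharing one keyspace, though the exact mechanism depends on the proxy or provider you're using.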