Caching in LLM Apps: Risks to User Data Privacy and Security
Hi all, this isn't really an Arize AI question, but I believe this community would have valuable insights on the matter. I've been thinking about caching in LLM applications and its implications for user data privacy and security. Suppose I have a chat server application that serves multiple users simultaneously, with prompt caching at the proxy level (e.g., using LiteLLM), or even without a proxy but using a single API key for the entire application (which would rely on the provider's internal caching). Is there a risk that user-specific data could be inadvertently cached and then served to other users?

Example: user A's query invokes a tool call that retrieves his account balance, and the LLM-generated response is cached. User B then makes a similar query that hits the cache. Could the response containing user A's account balance be served to user B? If so, is there a mechanism to prevent this from happening?

Has anyone faced this kind of issue in production systems? Any insights on how developers are controlling LLM caching across different user sessions, or even between different scopes (i.e., application-wide cache versus per-user cache)?
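
To make the question concrete, this is roughly the kind of per-user cache scoping I have in mind at the application layer. It's just a sketch with made-up names (ScopedResponseCache, call_llm), not how LiteLLM or any provider's internal caching actually works:

```python
import hashlib
from typing import Optional


def call_llm(prompt: str) -> str:
    # Placeholder for the real provider / proxy completion call.
    return f"response to: {prompt}"


class ScopedResponseCache:
    """Hypothetical in-process response cache keyed per user, so a response
    generated for user A can never be returned to user B."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def _key(self, user_id: str, prompt: str) -> str:
        # Include the user/session ID in the cache key; an application-wide
        # cache would hash the prompt alone and risk cross-user leakage.
        return hashlib.sha256(f"{user_id}:{prompt}".encode()).hexdigest()

    def get(self, user_id: str, prompt: str) -> Optional[str]:
        return self._store.get(self._key(user_id, prompt))

    def set(self, user_id: str, prompt: str, response: str) -> None:
        self._store[self._key(user_id, prompt)] = response


def answer(cache: ScopedResponseCache, user_id: str, prompt: str) -> str:
    # Check the user-scoped cache first, otherwise call the model and store.
    cached = cache.get(user_id, prompt)
    if cached is not None:
        return cached
    response = call_llm(prompt)
    cache.set(user_id, prompt, response)
    return response
```

In other words, the user/session ID becomes part of the cache key, so identical prompts from different users never share entries (at the cost of a lower hit rate). What I'm unsure about is whether proxy-level or provider-level caching gives you equivalent control.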
