Hi guys, I'm working on a LlamaIndex agent and I've exposed two endpoints for chatting with it: one for a buffered response and another for a streamed response. The endpoints are identical apart from how they return their responses. In Phoenix, though, they're displayed differently: the streaming endpoint doesn't group the spans the way the buffered one does, and token counts aren't displayed for the streaming LLM calls either.
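(For context, here's a minimal sketch of the difference between the two endpoints. `MockAgent` is an illustrative stand-in, not my actual code; only the `chat` / `stream_chat` split mirrors llama-index's API.)

```python
# Illustrative sketch only: MockAgent stands in for the real LlamaIndex agent.
from typing import Iterator


class MockAgent:
    def chat(self, message: str) -> str:
        # Buffered endpoint: the whole completion is assembled before returning.
        return "".join(self._generate(message))

    def stream_chat(self, message: str) -> Iterator[str]:
        # Streaming endpoint: tokens are yielded one by one; the caller
        # decides when (or whether) to consume them.
        yield from self._generate(message)

    def _generate(self, message: str) -> Iterator[str]:
        for token in ["Echo", ": ", message]:
            yield token


agent = MockAgent()
buffered = agent.chat("hello")
streamed = "".join(agent.stream_chat("hello"))
assert buffered == streamed  # same content, different delivery
```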
Hi Arthur M., thanks so much for trying out Phoenix! LlamaIndex recently changed their instrumentation interfaces, and we're currently working to get everything sorted out. Grouping spans for streaming calls can be tricky, especially since an LLM streaming response can be consumed at any time, or even left unconsumed. As for token counts: not all of the libraries we instrument expose token counts for LLM calls in every case, though we do our best to surface that information to you. If you find that LlamaIndex returns token counts for streams and we still don't surface them, please let us know and we'll get that information into our tracing as soon as possible! We're also actively considering adding token counting as a separate feature, independent of the libraries we instrument.
Looking at the llama-index source code, this is caused by stream_chat missing a step that bundles the child spans into the same trace (see the arrows in the screenshot below).
Looks like this was fixed over the weekend: https://github.com/run-llama/llama_index/pull/12189
