The OpenAIInstrumentor does not explicitly mention capturing first-token latency in a stream response in the provided data sources. It is designed to instrument calls to OpenAI's API for observability, but specific metrics like first-token latency are not detailed. Would you like more information on how to potentially capture this metric manually?

Sources:
- https://github.com/Arize-ai/phoenix/issues/1977
💡 Hint: Mention RunLLM in the thread for followups.
Hey Xiaohan W., we don't currently capture time to first token, but this definitely makes sense as a metric you'd want to keep track of. Feel free to open an enhancement request on GitHub!
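In the meantime, one way to measure time to first token manually is to wrap the streaming iterator and record when the first chunk arrives. Below is a minimal sketch; the `time_to_first_token` helper and the `fake_stream` stand-in are illustrative names, not part of Phoenix or the OpenAI SDK. In practice you would pass it the iterator returned by `client.chat.completions.create(..., stream=True)`.

```python
import time

def time_to_first_token(stream):
    """Measure latency (seconds) until the first chunk arrives from a stream.

    `stream` is any iterable yielding response chunks, e.g. the iterator
    returned by an OpenAI streaming call. Returns (first_token_latency,
    list_of_chunks); latency is None if the stream was empty.
    """
    start = time.monotonic()
    first_latency = None
    chunks = []
    for chunk in stream:
        if first_latency is None:
            # First chunk observed: record elapsed time since the request started.
            first_latency = time.monotonic() - start
        chunks.append(chunk)
    return first_latency, chunks

# Simulated stream standing in for a real OpenAI streaming response.
def fake_stream():
    time.sleep(0.05)  # models the network + generation delay before the first token
    yield "Hello"
    yield " world"

latency, chunks = time_to_first_token(fake_stream())
```

The measured latency could then be attached to a span as a custom attribute alongside whatever the instrumentor already records.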