I am not getting token counts for Amazon Bedrock models (specifically Amazon Titan Text Express). I am deploying Phoenix with Docker.
Hi Chaitanya. Are you only using the LlamaIndex instrumentation? And are you using streaming? We do have some limitations with token counting across vendors.
I am using the llama-index instrumentation, and I am not using streaming responses. Where are the per-vendor limitations listed?
For Bedrock we should have token counts, since they are in the response headers. Let me check whether LlamaIndex just isn't parsing them correctly.
Hey Chaitanya - I understand the stacking of two LLM spans for a single call is annoying and we will definitely need to address it. A few questions so we can get your feedback:
Would you be okay with suppressing the LLM span from LlamaIndex or do you need the info from the templating and the prompt template variables?
Are you getting the proper token counts from the LLM spans from the bedrock instrumentation?
How are you using token counts today? Is it for cost measuring? Is it something different?
Let us know! In general, getting token counts from within LlamaIndex will take a bit more time, since it requires code changes to LlamaIndex, which is harder for us to control (and harder to ask you to upgrade for), so we want to make sure we have a mechanism to unblock you in the meantime. Thanks again for using Phoenix!
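For anyone following this thread, here is a minimal sketch of the workaround being discussed: enabling the Bedrock instrumentor alongside the LlamaIndex one, so token counts are captured on the Bedrock spans even when LlamaIndex does not surface them. The endpoint URL assumes the default Docker deployment of Phoenix; adjust for your setup.

```python
# Hypothetical setup sketch (not verbatim from this thread): register Phoenix
# as the OTLP trace endpoint and enable both OpenInference instrumentors.
from phoenix.otel import register
from openinference.instrumentation.bedrock import BedrockInstrumentor
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor

# "http://localhost:6006/v1/traces" assumes Phoenix running locally in Docker.
tracer_provider = register(endpoint="http://localhost:6006/v1/traces")
LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)
BedrockInstrumentor().instrument(tracer_provider=tracer_provider)
```

With both instrumentors active you get nested LLM spans (one from LlamaIndex, one from Bedrock), which is the stacking complained about above, but the Bedrock span carries the token counts.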
Hi Mikyo. I am getting proper token counts from the Bedrock instrumentation. I would be fine with suppressing either of the LLM spans. The token counts are useful for estimating cost and comparing chat mode against query mode. It's okay, I understand. Do you have any plans to integrate with Haystack?
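The cost estimate mentioned above can be sketched as a small helper that turns the token counts from the spans into a dollar figure. The per-1K-token prices below are placeholders, not real Bedrock pricing; substitute the current rates for your model.

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_price_per_1k: float = 0.0002,
                  output_price_per_1k: float = 0.0006) -> float:
    """Estimate the USD cost of one LLM call from its token counts.

    The default prices are illustrative placeholders only.
    """
    return (prompt_tokens / 1000.0) * input_price_per_1k \
         + (completion_tokens / 1000.0) * output_price_per_1k

# e.g. a call with 1,200 prompt tokens and 300 completion tokens:
cost = estimate_cost(1200, 300)  # 0.00042 at the placeholder rates
```

Summing this over the spans exported from Phoenix gives a per-session or per-mode cost comparison (e.g. chat mode vs. query mode).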
Chaitanya, we do not have Haystack support and probably don't have the bandwidth to tackle it at the moment. I filed a ticket: https://github.com/Arize-ai/openinference/issues/363 Definitely let us know your use case in the ticket. We'd like to support it in the long term.
