Hi Team, I have a FastAPI application and I am sending traces to the Phoenix UI using OpenTelemetry. I am able to get traces and spans, but why are the total tokens, input, and output still empty? Please help me out if I am missing something.
from opentelemetry import trace as trace_api
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.resources import Resource
from opentelemetry.instrumentation.openai import OpenAIInstrumentor
import os
def instrument():
    """Configure OpenTelemetry tracing so spans are exported to Phoenix.

    Reads the collector endpoint from the PHOENIX_COLLECTOR_ENDPOINT
    environment variable and installs a global tracer provider that ships
    every finished span synchronously over OTLP/HTTP.
    """
    collector_endpoint = os.getenv('PHOENIX_COLLECTOR_ENDPOINT')
    # Empty resource: no extra service attributes are attached to spans.
    provider = trace_sdk.TracerProvider(resource=Resource(attributes={}))
    # The endpoint URL must already include the correct path for
    # receiving traces.
    exporter = OTLPSpanExporter(endpoint=collector_endpoint)
    # SimpleSpanProcessor exports each span as soon as it ends (no batching).
    provider.add_span_processor(SimpleSpanProcessor(span_exporter=exporter))
    trace_api.set_tracer_provider(tracer_provider=provider)
# Install the tracer provider before instrumenting the OpenAI client so the
# instrumentor picks up the provider configured in instrument().
instrument()
OpenAIInstrumentor().instrument()
# NOTE(review): FastAPI is not imported in this snippet — confirm the real
# module has `from fastapi import FastAPI` (and defines `router`, used below).
app = FastAPI()
@router.post("/get_answer/")
def get_answer(chat: BotQuery, token: Annotated[str, Depends(verify_token)]):
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("get_answer"):
""" This function is used to get an answer to a given question from the LLM agent.
Args:
chat (BotQuery): A dataclass object containing the question and other related information.
user_id (Annotated[str, Depends): The user_id of the user who is asking the question.
Raises:
HTTPException: The function checks if user_id is an instance of HTTPException (which would indicate an error in the authentication process),
and raises that exception if so.
user_id: _description_
Returns:
Answer: An Answer object containing the answer to the question, its confidence score, and any relevant references.
"""
if type(token) == HTTPException:
raise token
span = trace.get_current_span()
user_id = token.username
question_id = chat.question_id
# Question ID is an optional field;
# For slack, we currently set this value as the unique idenitifer of the slack message
if not chat.question_id:
question_id = str(uuid.uuid4())
scope, question = chat.scope, chat.question
key = f"Scope: {scope}, Question: {question}"
value = get_key_value(key)
reference_list, confidence_score, answer_content = list(), 0.0, ""
if value:
logging.info(f"Found '{key}' in cache")
answer_obj = pickle.loads(value)
reference_list = answer_obj.references
confidence_score = answer_obj.confidence_score
answer_content = answer_obj.answer
else:
# Ask LLM Agent for an answer
logging.info(f"Fetching results from LLM for {question}")
vector_store = get_vector_store_path(scope)
app = ScopedDocQaApp(
vector_store, embedding_resource=embedding_resource, generative_model=generative_model)
answer = app.get_answer(question)
answer_references = answer.get("source_documents", [])
for _, reference in enumerate(answer_references):
ref = Reference(content=reference.page_content,
url_link=reference.metadata['source_url'],
relevance_score=reference.metadata['relevance_score'])
reference_list.append(ref)
answer_content = answer.get(
"result", "Sorry, we can not answer this question.")
confidence_score = answer.get("confidence_score", 0.0)
answer_obj = Answer(
question_id=question_id,
confidence_score=confidence_score,
answer=answer_content,
references=reference_list)
# Update our cache
answer_pickle_obj = pickle.dumps(answer_obj)
set_key_value(key, answer_pickle_obj)
# Update our db store with question and response
logging.debug('Push conversation to chat history store')
collection_name = os.getenv("GOOGLE_FIRESTORE_COLLECTION_NAME")
user_ref = db.collection(collection_name).document(user_id)
subcollection_name = os.getenv("GOOGLE_FIRESTORE_CHAT_SUBCOLLECTION_NAME")
document_ref = user_ref.collection(
subcollection_name).document(question_id)
document_ref.set({
"question": question,
"answer": answer_content,
"confidence_score": confidence_score,
"scope": scope,
"question_source": chat.source,
"reference": pickle.dumps(reference_list)
})
document_ref.update({"created_time": firestore.SERVER_TIMESTAMP})
span.set_attribute("output", answer_obj.answer)
span.set_attribute("input", question)
if answer_obj.answer:
span.set_status(trace.Status(trace.StatusCode.OK))
span.set_attribute("status", "OK")
else:
span.set_status(trace.Status(trace.StatusCode.ERROR))
span.set_attribute("status", "No Answer")
return answer_objHi Priya, thanks for trying out our OpenInference instrumentors! As far as I understand, the OpenAI instrumentor should try to determine the token count information from your quests if that information is available. Because the calls to the OpenAI SDK are abstracted away in the code sample you sent it's hard to tell exactly what's happening, but out of curiosity, are you using the chat API or the chat_completion API?
Actually, I am getting the count if I move inside the span info. But it should also be available under the columns in the UI and in the total at the top — how do we do that? Also, I am using AzureChatOpenAI.
also, you would want to change these two lines
span.set_attribute("output", answer_obj.answer)
span.set_attribute("input", question)

to
span.set_attribute("output.value", answer_obj.answer)
span.set_attribute("input.value", question)

for them to show up in the UI
Hey, thank you for pointing out the issue. It works. Can you tell me how to show the token count as well? As you can see, it comes here but does not appear at the top.
And also the span kind? "kind": "SpanKind.INTERNAL", "parent_id": null, "start_time": "2024-02-22T08:49:27.469885Z", "end_time": "2024-02-22T08:49:51.773638Z", "status": {
btw if i use from openinference.instrumentation.openai import OpenAIInstrumentor then am i getting only traces, but if i use opentelemetry i get the spans as well, is there any other thing i am missing out ?
Can you tell me how to show the tokencount as well ?
this should be resolved by using openinference.instrumentation.openai
then am i getting only traces, but if i use opentelemetry i get the spans as well
can you elaborate on what you mean by this? do you have a screenshot that shows the difference?
if i use opentelemetry i am getting this :
if i use openinference i am getting this : only the traces, no spans ?
Thanks for sharing the screenshots. Sorry for the inconvenience. Would you mind running openinference one more time but with the following code to turn on logging beforehand?
for name, logger in logging.root.manager.loggerDict.items():
if name.startswith("openinference.") and isinstance(logger, logging.Logger):
logger.setLevel(logging.DEBUG)
logger.handlers.clear()
logger.addHandler(logging.StreamHandler())

and just to confirm, you're using openai >= 1.0.0, correct? You can check this using the following code snippet
>>> from importlib.metadata import version
>>> version("openai")
'1.12.0'