Luca P.

Commented on https://arize-ai.slack.com/archives/C016XGKCG0P/p1... · Posted in Phoenix Support

No worries, thanks for the update 🙂

Things got better (we also have a nice call structure now). However, some spans are still repeated twice per call (the second ZeroCodeAgent.run, for example), and the total token count doesn't seem to add up as one would expect?

Luca P.

Sure, I will try and get back to you. Thanks!

Luca P.

`model = LiteLLMModel("gemini-2.0-flash")`
`model(messages)`

Luca P.

The calls in the picture above are generated by calling a LiteLLMModel, which is defined by smolagents.

Luca P.

Let me know if there is any other info I can share. We are mostly concerned that the innermost call takes half the time of the outermost call, so maybe we are calling things repeatedly without realising it.

Luca P.

(The worker is a Celery worker, in case that helps.) Then .query spawns some thread executors that run the smolagents (however, the pic I posted above should be executing directly in the .query thread, if I recall correctly).

Luca P.

import os

# Enable Phoenix tracing only when explicitly requested via the environment.
if os.environ.get("APP_USE_TELEMETRY") == "true":
    from phoenix.otel import register
    register(project_name="zero", auto_instrument=True)
worker.run()

...
# in worker.run()
from openinference.instrumentation import using_session
with using_session(telemetry_session_id or f"{workspace_id}-{uuid()}"):
    output = engine.query(task)

This is what we run to set everything up.
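
Since .query hands work to thread executors, here is a minimal sketch of one thing that might be worth ruling out: values stored in a `contextvars` context are not inherited by pool worker threads unless the context is copied explicitly. It uses only the standard library. The `session_id` variable is a hypothetical stand-in for whatever `using_session` stores, and whether this is what actually happens in this setup is an assumption, not something confirmed in the thread.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the session id that a context manager
# such as using_session would place in the current context.
session_id = contextvars.ContextVar("session_id", default=None)


def read_session():
    # Returns whatever session id is visible in the calling thread's context.
    return session_id.get()


def run_plain(executor):
    # The worker thread starts from an empty context, so it sees the default.
    return executor.submit(read_session).result()


def run_with_context(executor):
    # Copy the caller's context and run the task inside that copy.
    ctx = contextvars.copy_context()
    return executor.submit(ctx.run, read_session).result()


def demo():
    token = session_id.set("workspace-123")
    try:
        with ThreadPoolExecutor(max_workers=1) as executor:
            return run_plain(executor), run_with_context(executor)
    finally:
        session_id.reset(token)
```

If the session or parent span goes missing on spans created inside the executors, wrapping the submitted callable with a copied context as in `run_with_context` is one option to try.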

Luca P.

openinference-instrumentation==0.1.32
openinference-instrumentation-smolagents==0.1.12
openinference-semantic-conventions==0.1.17

This is my current environment.

Luca P.

Would a `pip freeze` answer your question?
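
In case a full freeze is too noisy, a shorter alternative could be a small script that prints only the relevant packages. This is a rough sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` and the list of prefixes are my own assumptions about which packages matter here.

```python
from importlib.metadata import distributions


def installed_versions(prefixes):
    """Return {name: version} for installed packages whose name starts with a prefix."""
    wanted = tuple(p.lower() for p in prefixes)
    found = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        # Skip distributions with broken metadata (no name recorded).
        if name and name.lower().startswith(wanted):
            found[name] = dist.version
    return dict(sorted(found.items()))


if __name__ == "__main__":
    # Assumed prefixes; adjust to whatever is relevant for the trace setup.
    prefixes = ["openinference", "arize-phoenix", "smolagents", "litellm"]
    for name, version in installed_versions(prefixes).items():
        print(f"{name}=={version}")
```

The output uses the same `name==version` shape as `pip freeze`, so it can be pasted straight into the thread.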