Phoenix does not currently support a timeline view similar to AWS X-Ray. According to a recent Slack discussion, users who need this functionality are encouraged to open a feature request on the Phoenix GitHub repository. Would you like more detail on how to submit a feature request, or on alternative ways to visualize traces in Phoenix?

Sources:
- Slack discussion: timeline view not supported
💡 Hint: Mention RunLLM in the thread for followups.
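In the meantime, a rough stand-in for a timeline is to export the spans and sort them by start time. A minimal sketch, assuming a running Phoenix instance and the pandas-style columns that get_spans_dataframe typically returns:

```
import phoenix as px

spans = px.Client().get_spans_dataframe()
timeline = spans[["name", "start_time", "end_time"]].sort_values("start_time")
# Durations in milliseconds, derived from the span start/end timestamps.
timeline["duration_ms"] = (
    timeline["end_time"] - timeline["start_time"]
).dt.total_seconds() * 1000
print(timeline.head(20))
```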
would a pip freeze answer your question?
```
openinference-instrumentation==0.1.32
openinference-instrumentation-smolagents==0.1.12
openinference-semantic-conventions==0.1.17
```
this is my current environment.
```
import os

if os.environ.get("APP_USE_TELEMETRY") == "true":
    from phoenix.otel import register
    register(project_name="zero", auto_instrument=True)
worker.run()

...

# in worker.run()
from openinference.instrumentation import using_session

with using_session(telemetry_session_id or f"{workspace_id}-{uuid()}"):
    output = engine.query(task)
```
this is what we run to set everything up.
(worker is a Celery worker, if that helps) Then .query spawns some thread executors that run the smolagents (though the pic I posted above should be executing directly in the .query thread, if I recall correctly).
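One thing worth checking, given that .query hands work to thread executors: OpenTelemetry context does not automatically cross into ThreadPoolExecutor threads, so spans started there can show up detached from (or duplicated alongside) the parent trace. A minimal sketch of propagating the context manually; run_with_ctx and agent.run(task) are illustrative names, not taken from the snippet above:

```
from concurrent.futures import ThreadPoolExecutor
from opentelemetry import context

def run_with_ctx(ctx, fn, *args, **kwargs):
    # Restore the captured trace context inside the worker thread,
    # then detach it once the callable finishes.
    token = context.attach(ctx)
    try:
        return fn(*args, **kwargs)
    finally:
        context.detach(token)

with ThreadPoolExecutor(max_workers=4) as executor:
    ctx = context.get_current()  # capture context in the .query thread
    future = executor.submit(run_with_ctx, ctx, agent.run, task)
    result = future.result()
```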
Okay that's great - trying to replicate this now!
Let me know if there is any other info I can share (we are mostly concerned that the innermost call takes half the time of the outermost call, so maybe we are calling things repeatedly without realising it)
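Not confirmed as the cause here, but a classic source of repeated spans with Celery's prefork pool is calling register() at import time, so the parent process and each forked child both end up instrumented. A sketch of scoping registration to each worker process instead, via Celery's worker_process_init signal (the placement and handler name are assumptions, not taken from the setup above):

```
import os
from celery.signals import worker_process_init

@worker_process_init.connect
def init_telemetry(**kwargs):
    # Runs once in each forked worker process, after the fork,
    # so only the children register the tracer provider.
    if os.environ.get("APP_USE_TELEMETRY") == "true":
        from phoenix.otel import register
        register(project_name="zero", auto_instrument=True)
```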
Okay it doesn't look like there's a general problem with duplicate calls on those versions, I'm seeing the correct traces. Going to try and match your setup more closely. Are you using the InferenceClientModel or InferenceClient object within smolagents, and which model(s) are you using?
```
model = LiteLLMModel("gemini-2.0-flash")
model(messages)
```
Ah okay, think I found it - can you upgrade to openinference-instrumentation-smolagents==0.1.13? We added better support for the generate function and base model invocation approaches. I'm seeing that resolve the repeat issue - though I only saw it repeat a total of 3 calls.
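A quick sanity check after upgrading, to confirm the worker process actually imports the new version (standard-library only):

```
from importlib.metadata import version

# Should print 0.1.13 once the upgrade has taken effect in this environment.
print(version("openinference-instrumentation-smolagents"))
```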
Things got better (we also have a nice call structure now), however it's still repeated twice per call sometimes (the second ZeroCodeAgent.run, for example), and the total token count doesn't seem to add up as one would expect?
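For the token totals, one way to see where the numbers diverge is to sum the per-span counts yourself. A sketch, assuming the spans dataframe flattens the OpenInference llm.token_count.* attributes into columns as shown (column names may differ by version):

```
import phoenix as px

spans = px.Client().get_spans_dataframe()
llm_spans = spans[spans["span_kind"] == "LLM"]
cols = [
    "attributes.llm.token_count.prompt",
    "attributes.llm.token_count.completion",
    "attributes.llm.token_count.total",
]
# Compare these sums against the totals Phoenix shows at the trace level.
print(llm_spans[cols].sum())
```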
