Things got better (we also have a nice call structure now); however, it's still repeated twice per call sometimes (the second ZeroCodeAgent.run, for example), and the total number of tokens doesn't seem to add up as one would expect?
Let me know if there is any other info I can share (we are mostly concerned about the fact that the innermost call takes half the time of the outermost call, so maybe we are calling things repeatedly without realising it).
(The worker is a Celery worker, in case that's helpful.)
Then .query spawns some thread executors that run the smolagents (however, the pic I posted above should be executing directly in the .query thread, if I recall correctly).
if os.environ.get("APP_USE_TELEMETRY") == "true":
    from phoenix.otel import register
    register(project_name="zero", auto_instrument=True)
worker.run()
...
# in worker.run()
from openinference.instrumentation import using_session
with using_session(telemetry_session_id or f"{workspace_id}-{uuid()}"):
    output = engine.query(task)
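A side note on the thread executors mentioned above, in case it's relevant: assuming `using_session` keeps the session id in context-local state (contextvars-style, which is an assumption here and not something checked against the library source), work submitted to a `ThreadPoolExecutor` would not necessarily see it unless the caller's context is copied over explicitly. The sketch below uses only the standard library; `session_id`, `read_session` and `run_in_threads` are hypothetical stand-ins, not part of Phoenix, OpenInference or smolagents.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the context-local value a session helper might set.
session_id = contextvars.ContextVar("session_id", default=None)


def read_session():
    # Runs inside a worker thread and reports the session id visible there.
    return session_id.get()


def run_in_threads(value):
    # Set the value in the calling thread, similar to entering a `with` block.
    token = session_id.set(value)
    try:
        with ThreadPoolExecutor(max_workers=1) as pool:
            # Plain submit: the task runs in whatever context the worker thread has.
            plain = pool.submit(read_session).result()
            # Explicit propagation: copy the caller's context and run the task in it.
            ctx = contextvars.copy_context()
            propagated = pool.submit(ctx.run, read_session).result()
    finally:
        session_id.reset(token)
    return plain, propagated


plain, propagated = run_in_threads("workspace-1234")
print("plain submit sees:", plain)
print("copied context sees:", propagated)
```

If the two printed values differ in a given setup, spans created inside the executor threads may end up outside the session (or even outside the parent trace), which could be one thing to rule out when the call tree or token totals look off.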