Hi - I'm having difficulty using the new Save/Load TraceDataset feature. I started with the GitHub template:
dataset = TraceDataset(...)
dataset_id = dataset.save()
loaded_dataset = TraceDataset.load(dataset_id)
But I'm having trouble properly specifying what should go inside the TraceDataset() parentheses, since I can grab all spans but I can't seem to format them properly to pass as a TraceDataset parameter:
from phoenix.trace import TraceDataset
allspans=px.active_session().get_spans_dataframe()
dataset = TraceDataset.from_spans(allspans)  # Generates an error
What's the proper way to do this, for both loading and saving? Thanks, David
Also curious, do you hit the same issue if you use px.active_session().get_trace_dataset()?
Hi Xander -- Here is the traceback:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[56], line 4
2 allspans=px.active_session().get_spans_dataframe()
3 #allspans=px.active_session().get_trace_dataset()
----> 4 dataset = TraceDataset.from_spans(allspans)
File ~\AppData\Local\anaconda3\envs\testdir\Lib\site-packages\phoenix\trace\trace_dataset.py:139, in TraceDataset.from_spans(cls, spans)
128 @classmethod
129 def from_spans(cls, spans: List[Span]) -> "TraceDataset":
130 """Creates a TraceDataset from a list of spans.
131
132 Args:
(...)
136 TraceDataset: A TraceDataset containing the spans.
137 """
138 return cls(
--> 139 pd.json_normalize(
140 (json.loads(span_to_json(span)) for span in spans), # type: ignore
141 max_level=1,
142 )
143 )
File ~\AppData\Local\anaconda3\envs\testdir\Lib\site-packages\pandas\io\json\_normalize.py:461, in _json_normalize(data, record_path, meta, meta_prefix, record_prefix, errors, sep, max_level)
458 return DataFrame(_simple_json_normalize(data, sep=sep))
460 if record_path is None:
--> 461 if any([isinstance(x, dict) for x in y.values()] for y in data):
462 # naive normalization, this is idempotent for flat records
463 # and potentially will inflate the data considerably for
464 # deeply nested structures:
465 # {VeryLong: { b: 1,c:2}} -> {VeryLong.b:1 ,VeryLong.c:@}
466 #
467 # TODO: handle record value which are lists, at least error
468 # reasonably
469 data = nested_to_record(data, sep=sep, max_level=max_level)
470 return DataFrame(data)
File ~\AppData\Local\anaconda3\envs\testdir\Lib\site-packages\pandas\io\json\_normalize.py:461, in <genexpr>(.0)
458 return DataFrame(_simple_json_normalize(data, sep=sep))
460 if record_path is None:
--> 461 if any([isinstance(x, dict) for x in y.values()] for y in data):
462 # naive normalization, this is idempotent for flat records
463 # and potentially will inflate the data considerably for
464 # deeply nested structures:
465 # {VeryLong: { b: 1,c:2}} -> {VeryLong.b:1 ,VeryLong.c:@}
466 #
467 # TODO: handle record value which are lists, at least error
468 # reasonably
469 data = nested_to_record(data, sep=sep, max_level=max_level)
470 return DataFrame(data)
AttributeError: 'str' object has no attribute 'values'
If I try px.active_session().get_trace_dataset(), I get
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[57], line 3
1 from phoenix.trace import TraceDataset
2 #allspans=px.active_session().get_spans_dataframe()
----> 3 allspans=px.active_session().get_trace_dataset()
4 dataset = TraceDataset.from_spans(allspans)
AttributeError: 'ThreadSession' object has no attribute 'get_trace_dataset'
I'm not sure if I called that as you intended. Perhaps I'm missing an import? Regards, David
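A note on the first traceback: from_spans is annotated as taking List[Span], but get_spans_dataframe() returns a pandas DataFrame. Iterating a DataFrame yields its column labels (strings) rather than its rows, which would explain the "'str' object has no attribute 'values'" error. A minimal sketch, using a stand-in DataFrame whose column names are illustrative and not Phoenix's real schema:

```python
import pandas as pd

# Stand-in for px.active_session().get_spans_dataframe().
# The column names here are illustrative only.
spans_df = pd.DataFrame(
    {"name": ["query", "retrieve"], "span_kind": ["CHAIN", "RETRIEVER"]}
)

# Iterating a DataFrame walks its column labels, not its rows.
items = list(spans_df)
print(items)

# Each item is a plain str, and str has no .values() method,
# which is the attribute the traceback reports as missing.
has_values = [hasattr(item, "values") for item in items]
print(has_values)
```

Since the traceback shows from_spans handing a DataFrame to cls(...), passing the spans DataFrame straight to the constructor, as in TraceDataset(allspans), may be the more direct route. That is an inference from the traceback, not something confirmed in this thread.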
It is a new addition, I suspect you might be running a version of Phoenix that doesn't have it yet.
Hi Xander -- I had upgraded to 2.7.0 before my last email to you, and I just checked again. I still get the same issue. I started a new notebook (but maintained my active Phoenix session with 200 traces) and tried again, with the same results:
!pip show arize-phoenix
Name: arize-phoenix
Version: 2.7.0
Summary: ML Observability in your notebook
import phoenix as px
from phoenix.trace import TraceDataset
allspans=px.active_session().get_trace_dataset()
dataset = TraceDataset.from_spans(allspans)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[5], line 4
2 from phoenix.trace import TraceDataset
----> 3 allspans=px.active_session().get_trace_dataset()
4 dataset = TraceDataset.from_spans(allspans)
AttributeError: 'NoneType' object has no attribute 'get_trace_dataset'
Can you try this?
import phoenix as px
# run your app here to collect traces
dataset = px.active_session().get_trace_dataset()
Hi Xander -- When I tried your request, I found I couldn't generate any traces at all with 2.7.0 or with 2.6.0. However, when I downgraded back to 2.4.0, the traces returned. Each time after downgrading I exited my environment and relaunched. I think my initial problem may have been that my active session was originally generated in 2.4.0, but then I was trying to save in 2.6.0/2.7.0. However, I have yet to have success in generating any traces in 2.6.0/2.7.0 -- but traces work again immediately upon downgrading to 2.4.0, exiting the environment, re-entering, and running. Regards,
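The two get_trace_dataset errors earlier in the thread look like different problems: the 'NoneType' one suggests px.active_session() returned None (likely because the new notebook's kernel had no session of its own), while the 'ThreadSession' one suggests the session object came from a version without that method. A rough sketch of a guard that tells the two apart; get_trace_dataset_or_explain and installed_version are hypothetical helper names, not part of Phoenix:

```python
from importlib import metadata
from typing import Any, Optional


def installed_version(package: str = "arize-phoenix") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def get_trace_dataset_or_explain(session: Any) -> Any:
    """Hypothetical helper: call session.get_trace_dataset() with clearer errors."""
    if session is None:
        # Corresponds to the 'NoneType' error: nothing to call the method on.
        raise RuntimeError("No active session in this kernel; launch one first.")
    if not hasattr(session, "get_trace_dataset"):
        # Corresponds to the 'ThreadSession' error: the method is absent.
        raise RuntimeError(
            f"{type(session).__name__} has no get_trace_dataset(); "
            f"installed arize-phoenix version: {installed_version()}"
        )
    return session.get_trace_dataset()
```

In a notebook this would be called as get_trace_dataset_or_explain(px.active_session()). Restarting the kernel after an upgrade matters here, because an already-running kernel keeps the modules it imported before the upgrade.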
Hi Xander -- Yes, I'm using LlamaIndex. I've attached a test notebook with 2 pickled files used by it. When I run with version 2.4.0, I get the proper trace feed in Phoenix. Switching to 2.6.0 or 2.7.0 and then resetting the kernel and running, I don't get any trace feed. Reverting to 2.4.0 and resetting the kernel and running again gives me a proper trace feed. Thanks and regards, David
Thanks, will take a look!
Hi Xander -- Strange indeed. I use set_global_handler in my application just as in the example I sent you. I did a clean reboot and I even upgraded LlamaIndex to the latest version, and still no success. I can only get traces on 2.4, not 2.6 or 2.7. The reason I wanted to upgrade to 2.6 was to get the save/restore session functionality. Is there code I can run natively in 2.4.0 that will accomplish this? Regards, David
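One possible stopgap on 2.4.0, offered as a sketch and not as Phoenix's own save/load feature: get_spans_dataframe() did return a DataFrame earlier in this thread, so the spans could be persisted with plain pandas and reloaded later. The save_spans and load_spans names are hypothetical, and the stand-in DataFrame below only imitates a spans table:

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_spans(spans_df: pd.DataFrame, path: Path) -> Path:
    """Write the spans DataFrame to disk; pickle keeps nested dict/list cells intact."""
    spans_df.to_pickle(path)
    return path


def load_spans(path: Path) -> pd.DataFrame:
    """Read a spans DataFrame back; only unpickle files you created yourself."""
    return pd.read_pickle(path)


# Stand-in for px.active_session().get_spans_dataframe(); columns are illustrative.
spans_df = pd.DataFrame(
    {"name": ["query", "retrieve"], "attributes": [{"k": 1}, {"k": 2}]}
)

with tempfile.TemporaryDirectory() as tmp:
    restored = load_spans(save_spans(spans_df, Path(tmp) / "spans.pkl"))

print(restored)
```

The restored DataFrame could then be wrapped with TraceDataset(restored), since the traceback above shows the constructor receiving a DataFrame. Whether 2.4.0 accepts a reloaded DataFrame unchanged is an assumption that would need checking.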
If you have a few minutes, please book a time on my calendar and we'll help you resolve the issue. https://calendly.com/xander-arize/30min
