what’s the best way to save a dataset and load it back up from disk? I saved it like so:
dataset_path = "dataset.json"
dataset = client.get_dataset(name=dataset_name)
df = dataset.as_dataframe()
df.to_json(dataset_path, index=False)
but I can’t load it back up from disk:
-> dataset = Dataset.from_dict(dataset)
(Pdb) dataset = Dataset.from_dict(dataset)
*** KeyError: 'dataset_id'
(Pdb) dataset.keys()
dict_keys(['input', 'output', 'metadata'])
I get KeyError: 'dataset_id'
💡 Hint: Mention RunLLM in the thread for followups.
The client (/phoenix/session/client.py) does not have a save method. client.upload_dataset does not persist the dataset locally; it uploads it to the Phoenix server. My question is about persisting the dataset locally and loading it back up locally. The Dataset object (phoenix/experiments/types.py) does not have a .save method either. RunLLM
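For context on the KeyError above: pandas' to_json writes a DataFrame column by column by default ({column: {row_label: value}}), which matches the 'input', 'output', 'metadata' keys in the traceback and carries no 'dataset_id'. Below is a minimal stdlib-only sketch of reading such a file back into per-example records. The sample data and the columns_to_records helper are hypothetical and only for illustration; rebuilding an actual Phoenix Dataset from these records is not shown, since the thread confirms no local save/load method exists yet.

```python
import json
import os
import tempfile

# Hypothetical stand-in for what df.to_json(path) writes with pandas' default
# column orientation: {column: {row_label: value}}.
column_oriented = {
    "input": {"0": {"question": "What is 2 + 2?"}, "1": {"question": "Capital of France?"}},
    "output": {"0": {"answer": "4"}, "1": {"answer": "Paris"}},
    "metadata": {"0": {}, "1": {}},
}


def columns_to_records(columns):
    """Turn {column: {row_label: value}} into a list of per-example dicts."""
    row_labels = list(next(iter(columns.values())).keys())
    return [
        {name: values[label] for name, values in columns.items()}
        for label in row_labels
    ]


# Write the file to a temporary directory, then read it back.
path = os.path.join(tempfile.mkdtemp(), "dataset.json")
with open(path, "w") as f:
    json.dump(column_oriented, f)

with open(path) as f:
    loaded = json.load(f)

# One dict per example, each with 'input', 'output' and 'metadata' keys.
records = columns_to_records(loaded)
```

The loaded object is a plain dict of columns, so passing it straight to a constructor that expects dataset-level fields would plausibly fail the way the traceback shows.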
I didn’t try pickling but I certainly could. Thanks for the suggestion!
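A rough sketch of the pickling idea, using only the standard library. The object here is a hypothetical stand-in built with types.SimpleNamespace, not the Phoenix Dataset class, and its field names are made up for illustration; whether the real Dataset object pickles cleanly is an assumption that would need to be checked.

```python
import os
import pickle
import tempfile
from types import SimpleNamespace

# Hypothetical stand-in for a dataset object; the field names are
# illustrative and not taken from the Phoenix Dataset class.
original = SimpleNamespace(
    id="ds-1",
    version_id="v-1",
    examples={"ex-1": {"input": {"q": "hi"}, "output": {"a": "hello"}}},
)

path = os.path.join(tempfile.mkdtemp(), "dataset.pkl")

# Save the object to disk.
with open(path, "wb") as f:
    pickle.dump(original, f)

# Load it back; only unpickle files from a trusted source.
with open(path, "rb") as f:
    restored = pickle.load(f)
```

Pickle files are tied to the class definitions available at load time, so a file written with one library version may not load under another.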
Sounds good! 🙏🏻 it will be handy
hi Andrew S. this functionality is out for PR here: https://github.com/Arize-ai/phoenix/pull/8453
beautiful. I will use this for sure
