Error Loading Dataset Examples: TypeError Due to String Indexing
I'm getting an error where the examples in my dataset aren't loading properly. Here's what dataset.examples prints:
{'RGF0YXNldEV4YW1wbGU6Mw==': Example(
    id="RGF0YXNldEV4YW1wbGU6Mw==",
    input={
        "doc_path": "test_data/409A Valuation Report.pdf"
    },
    output={
        "expected_summary": "This is a 409A valuation report document that..."
    },
    metadata={
        "metadata": {
            "topic": "financial"
        }
    },
), 'RGF0YXNldEV4YW1wbGU6NA==': Example(
    id="RGF0YXNldEV4YW1wbGU6NA==",
    input={
        "doc_path": "test_data/409A Valuation Report.pdf"
    },
    output={
        "expected_summary": "This is a 409A valuation report document that..."
    },
    metadata={
        "metadata": {
            "topic": "financial"
        }
    },
)}
And here's how I initially created the dataset:
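import pandas as pd

# NOTE: the function name and signature are placeholders (assumed);
# phoenix_client, dataset_name and pdf_path come from the surrounding code.
def create_dataset(phoenix_client, dataset_name, pdf_path):
    # Two identical rows pointing at the same PDF.
    df = pd.DataFrame(
        [
            {
                "doc_path": pdf_path,
                "expected_summary": "This is a 409A valuation report document that provides a detailed analysis and valuation of company shares for tax and compliance purposes.",
                "metadata": {"topic": "financial"},
            },
            {
                "doc_path": pdf_path,
                "expected_summary": "This is a 409A valuation report document that provides a detailed analysis and valuation of company shares for tax and compliance purposes.",
                "metadata": {"topic": "financial"},
            },
        ]
    )
    # Map dataframe columns to the example's input, output and metadata fields.
    return phoenix_client.upload_dataset(
        dataframe=df,
        dataset_name=dataset_name,
        input_keys=["doc_path"],
        output_keys=["expected_summary"],
        metadata_keys=["metadata"],
    )
It errors out at the line with cache_key: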
cache_key = (example["id"], repetition_number)
TypeError: string indices must be integers, not 'str'
This happens because example is just the id string (one of the dict keys printed above), not an Example object:
example
'RGF0YXNldEV4YW1wbGU6Mw=='
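For what it's worth, the symptom looks like what you get when a dict is iterated directly: Python yields the keys, so each example is the id string and example["id"] raises the TypeError. Here is a minimal sketch that reproduces it without Phoenix. It assumes the failing loop iterates over dataset.examples itself, which I haven't confirmed; plain dicts stand in for the real Example objects, and both helper function names are made up for illustration.

```python
# Stand-in for dataset.examples: a dict keyed by example id.
# Plain dicts replace the real Example objects here (an assumption).
examples = {
    "RGF0YXNldEV4YW1wbGU6Mw==": {"id": "RGF0YXNldEV4YW1wbGU6Mw=="},
    "RGF0YXNldEV4YW1wbGU6NA==": {"id": "RGF0YXNldEV4YW1wbGU6NA=="},
}
repetition_number = 1


def cache_keys_iterating_dict(examples, repetition_number):
    # Iterating a dict directly yields its keys (strings),
    # so example["id"] indexes a str with a str and raises TypeError.
    return [(example["id"], repetition_number) for example in examples]


def cache_keys_iterating_values(examples, repetition_number):
    # Iterating .values() yields the example objects themselves.
    return [(example["id"], repetition_number) for example in examples.values()]


try:
    cache_keys_iterating_dict(examples, repetition_number)
    failed = False
except TypeError:
    failed = True

keys = cache_keys_iterating_values(examples, repetition_number)
```

If that is what's going on, iterating over the values (or over a list of examples) rather than the dict itself would be the thing to check, whether that loop lives in my code or in the library.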