Hi there, I'm getting phoenix.session.client.DatasetUploadError: Internal Server Error when trying to upload a new dataset to an Arize cloud platform instance via px.Client.upload_dataset(). Is there a known issue on the server side? Thank you
There is no explicit mention of a widespread or current known issue with DatasetUploadError: Internal Server Error when using px.Client.upload_dataset() on Arize cloud in the provided data sources. However, dataset uploads can fail due to invalid data (e.g., NaN values in your DataFrame), large dataset sizes causing timeouts, or server-side problems. For example, NaN values are not valid JSON and will cause upload/view errors, and large uploads may trigger timeouts even if the data appears in the UI [source 1] [source 2].

To troubleshoot, ensure your data contains only valid JSON values (replace NaN with null), try uploading smaller batches, and check for any server/client version mismatches. If the error persists, it may be a transient server issue or a bug not yet documented. Would you like more detail on debugging steps or error handling?

Sources:
- https://github.com/Arize-ai/phoenix/issues/7965
- https://github.com/Arize-ai/phoenix/issues/8482
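As a quick sanity check on the NaN point above, here is a minimal sketch (the record contents are made up for illustration): strict JSON rejects NaN, and replacing it with None serializes cleanly as JSON null.

```python
import json
import math

# A record with a NaN value, as might come from a pandas DataFrame row
record = {"question": "What is 2 + 2?", "score": float("nan")}

# Strict JSON (what most servers expect) rejects NaN
try:
    json.dumps(record, allow_nan=False)
    serializable = True
except ValueError:
    serializable = False

# Replace NaN with None (serialized as JSON null) before uploading
cleaned = {
    k: (None if isinstance(v, float) and math.isnan(v) else v)
    for k, v in record.items()
}

print(serializable)          # False: NaN is not valid JSON
print(json.dumps(cleaned))   # {"question": "What is 2 + 2?", "score": null}
```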
can you DM me your url? i can take a look at the server logs
ok i think this is a regression on our end
return orjson.loads(orjson.dumps(value)) if isinstance(value, dict) else {}
       ^^^^^^^^^^^^^^^^^^^
sqlalchemy.exc.StatementError: (builtins.TypeError) Type is not JSON serializable: numpy.ndarray

sorry about that
the workaround for the time being is to convert the dataframe to lists of dictionaries like this
inputs = df.loc[:, input_keys].to_dict('records')
outputs = df.loc[:, output_keys].to_dict('records') if output_keys else []
metadata = df.loc[:, metadata_keys].to_dict('records') if metadata_keys else []
dataset = client.upload_dataset(dataset_name=name, inputs=inputs, outputs=outputs, metadata=metadata)
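To put the workaround together, here is a runnable sketch of the conversion step (the DataFrame and key names are made up for illustration; `to_dict("records")` yields plain Python dicts with native scalars rather than numpy types, which is what sidesteps the serialization error):

```python
import pandas as pd

# Hypothetical example data; the column names stand in for your own schema
df = pd.DataFrame(
    {
        "question": ["What is 2 + 2?", "Capital of France?"],
        "answer": ["4", "Paris"],
        "category": ["math", "geo"],
    }
)
input_keys = ["question"]
output_keys = ["answer"]
metadata_keys = ["category"]

# to_dict('records') produces one plain dict per row
inputs = df.loc[:, input_keys].to_dict("records")
outputs = df.loc[:, output_keys].to_dict("records") if output_keys else []
metadata = df.loc[:, metadata_keys].to_dict("records") if metadata_keys else []

print(inputs)  # [{'question': 'What is 2 + 2?'}, {'question': 'Capital of France?'}]
```

These lists can then be passed to client.upload_dataset(...) as in the snippet above. Note that if an individual cell itself holds a numpy array, you would still need to convert it (e.g., with .tolist()) before uploading.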