Hi friends, is there a way to filter failed examples from a specific experiment, rerun them, and add the results to the same experiment? I have a large dataset with 10k+ examples and LLM calls fail sporadically. It would be great to iteratively rerun just the failed examples until every example has error is None.
Yes, please show me how to programmatically filter and rerun failed examples. Understood that this would create a different experiment. But is there a workaround for me to "stitch" together two experiment results depending on the error/success status of the individual examples?
hi Jo P., currently we don't have a great way to stitch together experiments, though I understand why you'd want this functionality. Could you give me a little more information about how the failures are happening? Experiments do have some built-in retry machinery that should alleviate most sporadic issues
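In the meantime, here is a rough sketch of what client-side "stitching" could look like. It deliberately does not call any Phoenix API: the record shape (example id mapped to an (output, error) pair), the function names rerun_until_clean and stitch, and the max_rounds parameter are all assumptions made up for illustration. The merged results live in a plain dict, not in the Phoenix experiment itself.

```python
from typing import Any, Callable, Dict, Hashable, Optional, Tuple

# Hypothetical record shape: (output, error). error is None on success.
Result = Tuple[Any, Optional[str]]


def rerun_until_clean(
    examples: Dict[Hashable, Any],
    task: Callable[[Any], Any],
    max_rounds: int = 5,
) -> Dict[Hashable, Result]:
    """Run task on every example, then rerun only the failures.

    Stops when no example has an error or after max_rounds passes.
    """
    results: Dict[Hashable, Result] = {}
    pending = list(examples)
    for _ in range(max_rounds):
        if not pending:
            break
        for example_id in pending:
            try:
                results[example_id] = (task(examples[example_id]), None)
            except Exception as exc:  # record the failure instead of aborting
                results[example_id] = (None, repr(exc))
        # Keep only the examples that still carry an error for the next pass.
        pending = [eid for eid in pending if results[eid][1] is not None]
    return results


def stitch(
    base: Dict[Hashable, Result],
    rerun: Dict[Hashable, Result],
) -> Dict[Hashable, Result]:
    """Merge two result sets, letting successful reruns replace earlier entries."""
    merged = dict(base)
    for example_id, (output, error) in rerun.items():
        if error is None or example_id not in merged:
            merged[example_id] = (output, error)
    return merged
```

If you already have two sets of results exported from separate experiments, stitch(first, second) keeps every success from the first run and fills in the failures with successes from the second. Anything that failed in both stays marked as failed, so you can feed it into another pass.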
About the built-in retry mechanism, how many times does it retry?
ah, that's super reasonable. by default it's set to requeue up to 10 times IIRC, but that's because occasionally requests will time out after exceeding our maximum time waiting for a response
Got it. Should I put up a feature request for this?
Are there any other workarounds you would recommend in the meantime?
yes, please file an issue and we'll try our best to prioritize it
moving experiments over to the thin phoenix client is top of mind for us so hopefully we can get to this soon
By "moving experiments over to the thin phoenix client", do you mean exposing more methods / attributes to the experiment returned by:
experiment = px.Client().get_experiment(experiment_id=experiment_id)
If so, that'll be great! I naturally looked there but found nothing substantial or useful.
ah sorry for the confusion here! When you install phoenix you also install the arize-phoenix-client package, importable under phoenix.client. This is a newer version of our client with a minimal dependency footprint that we're hoping to bring to parity with all current phoenix features
the idea is we'll have a new interface for experiments under the phoenix.client package that should be (mostly) backwards compatible and lighter weight dependency-wise, while also offering some much-needed ergonomics updates
