Hello. In the prompt hub I was able to get f-string variable formatting to work, but not mustache. I'm using the following format API:
attribute_prompt = prompt_client.pull_prompt("Lead Attributes Summary")
formatted_prompt = attribute_prompt.format(attribute_json)
I toggle between the variable options in the prompt playground, saving a new prompt version each time. When I pull a prompt, its input_variable_format attribute is correctly set to f-string or mustache, but only f-string variables are actually filled in when calling format. I've referenced this documentation: https://arize.com/docs/ax/reference/prompt-hub-api#using-different-variable-formats
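Until mustache rendering works through format, a naive substitution over the raw template text is one possible workaround. This is only a sketch: it assumes you can get the template string off the pulled prompt object (the attribute name varies by SDK version), and it handles simple {{variable}} placeholders only, not sections or escaping.

```python
import re

def render_mustache(template: str, variables: dict) -> str:
    """Naive mustache-style substitution: replaces {{ var }} with its value.

    Workaround sketch for prompts whose input_variable_format is 'mustache'
    but whose .format() call leaves placeholders unfilled. Does not handle
    sections ({{#...}}), partials, or HTML escaping. Unknown variables are
    left untouched so the gap stays visible.
    """
    def repl(match: re.Match) -> str:
        key = match.group(1)
        return str(variables.get(key, match.group(0)))
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", repl, template)

# Inline template for illustration; in practice this string would come
# from the pulled prompt object.
template = "Summarize the lead: {{ attribute_json }}"
print(render_mustache(template, {"attribute_json": '{"industry": "SaaS"}'}))
# -> Summarize the lead: {"industry": "SaaS"}
```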
Additionally, I'm finding the evaluator examples in this notebook to be obsolete: https://colab.research.google.com/github/Arize-ai/tutorials/blob/main/python/llm/experiments/summarization-experiment.ipynb#scrollTo=DKPQJjrIhNuQ The experiment result upload system expects eval.<evaluator>.label and eval.<evaluator>.explanation to be bool or str. The default is NoneType, which causes the result uploader to fail. I worked around the issue by returning the EvaluationResult type with string values.
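For anyone hitting the same uploader failure, the workaround looks roughly like the sketch below. The dataclass here is a stand-in for the SDK's EvaluationResult (the real import path varies across SDK versions), and the evaluator itself is hypothetical; the point is that label and explanation are always populated with strings, never left as None.

```python
from dataclasses import dataclass

# Stand-in for the SDK's EvaluationResult type (the real class ships with
# the Arize SDK; import path differs between versions).
@dataclass
class EvaluationResult:
    score: float
    label: str        # must be str/bool, not None, or the uploader fails
    explanation: str  # likewise: always fill this with a string

def summary_quality_evaluator(output: str) -> EvaluationResult:
    """Hypothetical evaluator that returns string-typed label/explanation
    so the experiment result uploader accepts the row."""
    word_count = len(output.split())
    passed = word_count <= 50
    return EvaluationResult(
        score=1.0 if passed else 0.0,
        label="concise" if passed else "verbose",
        explanation=f"Summary has {word_count} words (limit 50).",
    )

result = summary_quality_evaluator("A short summary of the lead.")
print(result.label)  # -> concise
```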
Hello 🔒[private user] Thanks for the response! I have a follow-up question regarding Custom Code Evaluators. I see the documentation for creating them in the UI: https://arize.com/docs/ax/evaluate/code-evaluations#custom-code-evaluators But Custom Code Evaluators on my end still says Coming Soon. Is it possible to add custom code evaluators via the API?
When using tracers in production, I'm passing a prompt from my Prompt Hub, is there a way to link the prompt I used to the trace?
I would say the best way to go about this is to add metadata or attributes to your trace that include the prompt name/version/tag. For example: current_span.set_attribute("prompt_name", prompt.name) Let me know if that helps clarify. Here is the documentation to reference: https://arize.com/docs/ax/observe/tracing/add-metadata/add-attributes-metadata-and-tags
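Putting that together, a minimal sketch of tagging a span with prompt provenance might look like this. A fake span class stands in for the real one so the snippet is self-contained; with OpenTelemetry you would call set_attribute with the same keys on trace.get_current_span(). The attribute keys themselves are just an illustrative convention, not reserved names.

```python
class FakeSpan:
    """Minimal stand-in for an OpenTelemetry span, used only so this
    sketch runs standalone. A real span exposes the same set_attribute."""
    def __init__(self):
        self.attributes = {}

    def set_attribute(self, key: str, value: str) -> None:
        self.attributes[key] = value

def tag_span_with_prompt(span, name: str, version: str, tag: str) -> None:
    # Key names are illustrative; any consistent convention you can
    # filter on later in the UI will do.
    span.set_attribute("prompt.name", name)
    span.set_attribute("prompt.version", version)
    span.set_attribute("prompt.tag", tag)

span = FakeSpan()
tag_span_with_prompt(span, "Lead Attributes Summary", "v3", "production")
print(span.attributes["prompt.name"])  # -> Lead Attributes Summary
```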
Also, Custom Code Evaluators do say Coming Soon. There isn't a way to do this via the API yet, but I can help find a workaround for now. Can you give me more details on what you are trying to do? You can leverage the starter code for the existing code evals and the Test in Code button to build/run the evals. Running a Phoenix custom eval may work too, depending on your use case. Then, log the results back to Arize using the Python SDK.
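To make the workaround concrete, a predefined eval you run locally might look like the sketch below. The evaluator name and citation check are hypothetical, and the actual SDK logging call is omitted (check the SDK docs for the exact client method in your version); the sketch just shows producing string-typed rows keyed the way the result uploader expects.

```python
def contains_citation(output: str) -> dict:
    """Hypothetical code eval: checks whether the model output cites a
    source, returning string label/explanation under the
    eval.<evaluator>.<field> keys the uploader expects."""
    passed = "[source:" in output
    return {
        "eval.contains_citation.label": "pass" if passed else "fail",
        "eval.contains_citation.explanation": (
            "Found a [source: ...] marker." if passed
            else "No citation marker found."
        ),
    }

outputs = ["Revenue grew 12% [source: 10-K].", "Revenue grew 12%."]
rows = [contains_citation(o) for o in outputs]
print([r["eval.contains_citation.label"] for r in rows])  # -> ['pass', 'fail']
# Next step (not shown): attach these rows to your experiment results and
# log them back to Arize with the Python SDK.
```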
🔒[private user] My end goal is to enable dashboard users to run experiments with evaluators that I have predefined. Similar to how in the API I can define a task and evaluator for an experiment, I'd like these tasks or evaluators to be available for individuals who only use the dashboard.
Also, when attempting to use an existing evaluator, the only span attributes I can select from are my input columns. I want to run evals on outputs, but the output column isn't selectable from the provided dropdown.
Another question... When adding a dataset with an id column, why is the column forcibly renamed to Example ID? This causes mismatches when passing the dataset to a prompt in the platform versus in production. Is there a way I can prevent this unexpected mutation?
