Is there a way to get a dataset ID?
Yes, you can retrieve a dataset ID using the Phoenix API. You can list all datasets and their IDs by making a GET request to the `/v1/datasets` endpoint; the response contains each dataset along with its ID. Here's a TypeScript snippet for listing datasets:

```typescript
import { createClient } from "@arizeai/phoenix-client";

const phoenix = createClient();

phoenix.GET("/v1/datasets", { params: { query: { limit: 100 } } })
  .then(({ data }) => data?.data ?? [])
  .then((datasets) => {
    datasets.forEach((dataset, index) => {
      console.log(`${index + 1}. ${dataset.name} (${dataset.id})`);
    });
  });
```

Would you like more detail on how to use this in a specific context or language?

Sources:
- list_datasets.ts
- phoenix-experiment-runner/index.ts
I uploaded a dataset as a df from a Python notebook.
I am wondering if I can get the ID directly from the Arize side.
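If the dataset was uploaded under a known name, one way to recover its ID is a name-based lookup over the `/v1/datasets` listing shown above. The sketch below is illustrative: the `DatasetSummary` shape only includes the fields used here (the real payload has more), and the sample IDs are placeholders, not real Phoenix IDs.

```typescript
// Minimal shape of a dataset record from GET /v1/datasets
// (only the fields used here; the real payload has more).
interface DatasetSummary {
  id: string;
  name: string;
}

// Pure helper: look up a dataset's ID in a listing by its name.
// Returns undefined if no dataset with that name exists.
function findDatasetId(
  datasets: DatasetSummary[],
  name: string
): string | undefined {
  return datasets.find((d) => d.name === name)?.id;
}

// Example with a hard-coded listing standing in for the API response:
const listing: DatasetSummary[] = [
  { id: "example-id-1", name: "batch-1" },
  { id: "example-id-2", name: "batch-2" },
];
console.log(findDatasetId(listing, "batch-2")); // prints "example-id-2"
```

In practice you would feed `findDatasetId` the `data?.data ?? []` array from the `phoenix.GET("/v1/datasets", ...)` call in the snippet above instead of a hard-coded listing.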
This is amazing, John G. I am pretty new here and wanted to follow up on the matter. I am loading datasets batch by batch and want to get eval metrics for each, then do some aggregation to get a final result on the large dataset through dataset IDs. Do you think this is possible with current Arize capabilities? If so, where can I find docs/resources about it? Thank you again!
Hey Nico, let me make sure I'm understanding correctly. Right now, you're loading a series of datasets into Arize Phoenix, and you want to evaluate each. Then once that's done, do you want to combine all the datasets into one and rerun the evals on the large dataset, or do you just want to sum up the metrics from your smaller datasets? Either way, it's totally doable in Phoenix. I'd suggest following these docs, which show how you can upload a dataset and run an experiment (eval) on that dataset. Alternatively, this guide shows a similar flow for a specific use case.
Yes, the latter is the case. I am batching the larger dataset and trying to aggregate the results to get a final eval over the whole dataset. The resources look good to me, but I am wondering how I can fetch the evals directly by dataset ID or experiment ID. If I could, I could do the aggregation myself by defining a few functions.
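Once you can pull a per-batch summary (mean score and example count) for each experiment, the aggregation step itself is just a size-weighted average, so batches of different sizes contribute proportionally. A sketch of that helper; the `BatchEval` shape and the sample numbers are assumptions for illustration, not a Phoenix API type:

```typescript
// Per-batch eval summary: how many examples were scored and the mean score.
// This shape is hypothetical; derive it from however you fetch your results.
interface BatchEval {
  datasetId: string;
  exampleCount: number;
  meanScore: number;
}

// Combine per-batch means into one overall mean, weighted by batch size.
function aggregateEvals(batches: BatchEval[]): number {
  const total = batches.reduce((n, b) => n + b.exampleCount, 0);
  if (total === 0) return NaN; // no examples scored
  return batches.reduce((s, b) => s + b.meanScore * b.exampleCount, 0) / total;
}

// Example: two batches of different sizes.
const batches: BatchEval[] = [
  { datasetId: "ds-batch-1", exampleCount: 100, meanScore: 0.9 },
  { datasetId: "ds-batch-2", exampleCount: 50, meanScore: 0.6 },
];
console.log(aggregateEvals(batches)); // prints 0.8
```

The weighting matters here: a plain average of the two batch means would give 0.75 and overweight the smaller batch.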
I see. Would this also work with custom evals?
