Questions on Query Behavior and Evaluation Metrics in Arize

·Mar 20, 2024 12:07 PM

Hello Arize team, I have two questions.

1 - Did something change with explode in span queries? When I run this query, I get output:

query = SpanQuery().where(
    "span_kind" == "RERANKER",
).select(
    # input="reranker.query",
    model = "reranker.model_name",
)
reranked_docs_df = px.active_session().query_spans(query)
reranked_docs_df

However, if I run this command, I don't get anything back:

query = SpanQuery().where(
    "span_kind" == "RERANKER",
).select(
    # input="reranker.query",
    model = "reranker.model_name",
).explode(
    "reranker.output_documents",
    reference = "reranker.document_content",
)
reranked_docs_df = px.active_session().query_spans(query)
reranked_docs_df

If I move the reference into the select instead, I do get the list of output documents:

query = SpanQuery().where(
    "span_kind" == "RERANKER",
).select(
    # input="reranker.query",
    model = "reranker.model_name",
    reference = "reranker.output_documents"
)
reranked_docs_df = px.active_session().query_spans(query)
reranked_docs_df

2 - What is the suggested way to run evaluations when query transformation is in place (the same question asked multiple times in different ways), as in the attached picture? I want to calculate DCG@5, Precision@5, and hit rate for the retrieval step, but I am trying to find the best way to capture the retrieval evaluation while keeping cost in check, since I am getting hundreds of documents per question.
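For the metrics named in question 2, here is a minimal plain-Python sketch, assuming each retrieved document already has a binary relevance label (1 relevant, 0 not) in rank order. The function names and the judge callback are hypothetical and not part of Phoenix. The cache illustrates one possible way to limit cost, not a confirmed recommendation: when several rewrites of the same question return the same document, it is judged once against the original question.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def dcg_at_k(relevance: Sequence[float], k: int = 5) -> float:
    """Discounted cumulative gain over the top-k ranks, using a log2 discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevance[:k]))


def precision_at_k(relevance: Sequence[float], k: int = 5) -> float:
    """Share of the top-k slots that hold a relevant document (always divides by k)."""
    return sum(1 for rel in relevance[:k] if rel > 0) / k


def hit_at_k(relevance: Sequence[float], k: int = 5) -> float:
    """1.0 if at least one of the top-k documents is relevant, otherwise 0.0."""
    return 1.0 if any(rel > 0 for rel in relevance[:k]) else 0.0


def label_with_cache(
    question: str,
    documents: Sequence[str],
    judge: Callable[[str, str], int],  # hypothetical relevance judge, e.g. an LLM call
    cache: Dict[Tuple[str, str], int],
) -> List[int]:
    """Label documents in rank order, judging each (question, document) pair once."""
    labels = []
    for doc in documents:
        key = (question, doc)
        if key not in cache:
            cache[key] = judge(question, doc)
        labels.append(cache[key])
    return labels


# Made-up data: two rewrites of one question with overlapping results.
calls: List[str] = []


def fake_judge(question: str, doc: str) -> int:
    """Stand-in for an expensive judge; records every document it is asked about."""
    calls.append(doc)
    return 1 if "rerank" in doc else 0


cache: Dict[Tuple[str, str], int] = {}
original_question = "what is a reranker?"
variant_a = ["how rerankers work", "pizza recipe", "rerank tuning"]
variant_b = ["rerank tuning", "how rerankers work", "weather"]

labels_a = label_with_cache(original_question, variant_a, fake_judge, cache)
labels_b = label_with_cache(original_question, variant_b, fake_judge, cache)

for name, labels in (("variant_a", labels_a), ("variant_b", labels_b)):
    print(name, dcg_at_k(labels), precision_at_k(labels), hit_at_k(labels))
print("judge calls:", len(calls))
```

In this sample the two variants return six documents in total, but only the four distinct ones reach the judge.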

Phoenix Support

Teodor C.

class RerankerAttributes:
    """
    Attributes for a reranker
    """

    RERANKER_INPUT_DOCUMENTS = "reranker.input_documents"
    """
    List of documents as input to the reranker
    """
    RERANKER_OUTPUT_DOCUMENTS = "reranker.output_documents"
    """
    List of documents as output from the reranker
    """
    RERANKER_QUERY = "reranker.query"
    """
    Query string for the reranker
    """
    RERANKER_MODEL_NAME = "reranker.model_name"
    """
    Model name of the reranker
    """
    RERANKER_TOP_K = "reranker.top_k"
    """
    Top K parameter of the reranker
    """

query = SpanQuery().where(
    # The filter must be a single string expression; comparing two Python
    # strings here would pass False and no filter would be applied.
    "span_kind == 'RERANKER'",
).select(
    # input="reranker.query",
    model="reranker.model_name",
).explode(
    "reranker.output_documents",
    reference="document.content",
)
reranked_docs_df = px.active_session().query_spans(query)
reranked_docs_df

query = SpanQuery().where(
    "span_kind == 'RERANKER'",
).select(
    reference="reranker.output_documents",
)
reranked_docs_df = px.Client().query_spans(query)
# Inspect the first document of the first span: its type and its keys
print(type(reranked_docs_df.reference[0][0]))
print(reranked_docs_df.reference[0][0].keys())

query = SpanQuery().where(
    "span_kind == 'RERANKER'",
).explode(
    "reranker.output_documents",
    reference="document.content",
)
reranked_docs_df = px.active_session().query_spans(query)
reranked_docs_df
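To make the role of the document key in these explode queries concrete, here is a rough plain-Python sketch of what an explode step does conceptually: one output row per document, with each requested column read from a key of the document dictionary. This is an illustration only, not Phoenix's implementation. The sample span, the document keys, and the helper name explode_documents are assumptions made up for the example.

```python
from typing import Any, Dict, List

# Made-up span shaped like a reranker span whose output documents are dicts.
sample_spans: List[Dict[str, Any]] = [
    {
        "context.span_id": "span-1",
        "reranker.output_documents": [
            {"document.content": "first passage", "document.score": 0.9},
            {"document.content": "second passage", "document.score": 0.4},
        ],
    }
]


def explode_documents(
    spans: List[Dict[str, Any]], list_key: str, **columns: str
) -> List[Dict[str, Any]]:
    """Hypothetical helper: emit one row per document found under list_key.

    Each keyword argument maps an output column name to a key looked up
    on the individual document dictionary, not on the span.
    """
    rows = []
    for span in spans:
        for position, doc in enumerate(span.get(list_key) or []):
            row: Dict[str, Any] = {
                "context.span_id": span["context.span_id"],
                "document_position": position,
            }
            for column, doc_key in columns.items():
                row[column] = doc.get(doc_key)  # None when the key is absent
            rows.append(row)
    return rows


# The key exists on each document, so every row carries its content.
good = explode_documents(
    sample_spans, "reranker.output_documents", reference="document.content"
)

# The key does not exist on the documents, so no content is found.
bad = explode_documents(
    sample_spans, "reranker.output_documents", reference="reranker.document_content"
)

print(good)
print(bad)
```

In this sample, the lookup succeeds only when the name passed to explode matches a key on the document dictionaries themselves, which is what printing the keys of the first document helps verify.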
