No error messages, just the logs I shared above.
Yes, I wonder why the evaluations are not getting logged 🤔
But my traces are being sent via this endpoint, so why aren't the evaluations being sent to the same one?
If we do not provide any endpoint inside px.Client(), then by default it takes it from the environment variable, right?
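For reference, here is a minimal sketch of that fallback behaviour as I understand it (the helper name `resolve_collector_endpoint` is hypothetical — it just mirrors the assumed lookup order: explicit argument, then the `PHOENIX_COLLECTOR_ENDPOINT` environment variable, then localhost):

```python
import os

def resolve_collector_endpoint(default="http://127.0.0.1:6006"):
    # Hypothetical helper mirroring the assumed lookup order used by
    # px.Client(): the PHOENIX_COLLECTOR_ENDPOINT environment variable
    # if set, otherwise a localhost default.
    return os.environ.get("PHOENIX_COLLECTOR_ENDPOINT", default)

os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = "http://arize-ui.default.svc.cluster.local:80"
print(resolve_collector_endpoint())  # → http://arize-ui.default.svc.cluster.local:80
```

So if the environment variable is set in the deployment, constructing `px.Client()` with no arguments should pick it up.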
Actually, I have set the endpoint like this in the Kubernetes deployment:
- name: PHOENIX_COLLECTOR_ENDPOINT
value: http://arize-ui.default.svc.cluster.local:80/v1/traces
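One thing worth checking (an assumption on my part, not a confirmed diagnosis): `PHOENIX_COLLECTOR_ENDPOINT` is generally expected to be the *base* URL of the collector, with the client appending its own paths, so including the `/v1/traces` suffix may break `px.Client()`'s REST calls (evaluation upload) even while OTLP trace export happens to work. The base form would look like:

```yaml
- name: PHOENIX_COLLECTOR_ENDPOINT
  value: http://arize-ui.default.svc.cluster.local:80
```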
The traces are being sent to the UI, but the evaluations are not. We have used px.Client() like this:
logger.info("Getting input, output and reference from traces ...")
test_spans = px.Client().get_spans_dataframe()
logger.info(f"############# {test_spans}")
input_output_df = get_qa_with_reference(px.Client())
logger.info(f"####### {input_output_df}")

This is the log which I am getting now:
The evaluation based on dataset:100001. 5 questions to run
2024-03-05T18:59:57.761503+0000 INFO [job_builder] - Getting input, output and reference from traces ...
2024-03-05T18:59:58.119939+0000 INFO [job_builder] - ############# name ... attributes.status
context.span_id ...
b7ff27afb7de0567 AzureChatOpenAI ... None
12edcc7d11ab52bb LLMChain ... None
ad29ee6a9b54714d StuffDocumentsChain ... None
163cd80dbe74b0db ConversationalRetrievalWithScoreChain ... None
ac51b20dccd0fb28 AzureChatOpenAI ... None
... ... ... ...
4025ad8114ed6e9c get_answer ... OK
e2704ad13a15b2a6 get_answer ... OK
345b2087aa39b26b get_answer ... OK
23fdd79e69456cf2 get_answer ... OK
2154118ee030ff6a get_answer ... OK
[527 rows x 29 columns]
2024-03-05T18:59:58.218531+0000 INFO [job_builder] - ####### Empty DataFrame
Columns: [input, output]
Index: []
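When get_qa_with_reference comes back empty even though get_spans_dataframe shows rows, a quick check is to look at which span kinds are present, since (as far as I understand it) the helper joins retriever and LLM spans by their input/output attributes. A sketch of that check — the toy dataframe and the `span_kind` column name are assumptions based on the usual OpenInference attributes; in the real run, inspect `test_spans` the same way:

```python
import pandas as pd

# Toy stand-in for px.Client().get_spans_dataframe(); with the real
# dataframe, run the same value_counts() on the span-kind column.
test_spans = pd.DataFrame({
    "name": ["AzureChatOpenAI", "get_answer"],
    "span_kind": ["LLM", "CHAIN"],  # a RETRIEVER kind would also be expected
})
print(test_spans["span_kind"].value_counts())
# If no RETRIEVER/LLM spans with input and output values show up,
# get_qa_with_reference has nothing to join and returns an empty DataFrame.
```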
2024-03-05T18:59:58.305156+0000 INFO [job_builder] - Running evaluations: dict_values(['Correctness', 'Hallucination', 'Toxicity', 'Groundtruth'])
2024-03-05T18:59:58+0000 WARNING [executor] - 🐌!! If running llm_classify inside a notebook, patching the event loop with nest_asyncio will allow asynchronous eval submission, and is significantly faster. To patch the event loop, run `nest_asyncio.apply()`.
run_evals | | 0/0 (0.0%) | ⏳ 00:00<? | ?it/s
run_evals | | 0/0 (0.0%) | ⏳ 00:00<? | ?it/s
2024-03-05T18:59:58.622718+0000 INFO [job_builder] - Log evaluation results to UI for: dict_values(['Correctness', 'Hallucination', 'Toxicity', 'Groundtruth'])

This is my main code where I have implemented tracing:
def submit(self):
    init_logger()
    # to do: change to support local server and remote server based on the server set
    os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = "https://arize.zscaler.site"
    session = px.launch_app()
    LangChainInstrumentor().instrument()
    ai_dresult = {}
    question_num = 0
    for question, human_answer in self.question_answer_pool.items():
        ai_answer = self.chat_app_run(question)
        ai_dresult[question] = ai_answer
        question_num += 1
    logger.info(f"The evaluation based on dataset:{self.dataset_id}. {question_num} questions to run")
    # log the traces
    logger.info("Getting input, output and reference from traces ...")
    input_output_df = get_qa_with_reference(px.Client())
    input_output_df["correct_answer"] = input_output_df["input"].apply(
        lambda x: self.question_answer_pool[x])
    input_output_df["ai_answer"] = input_output_df["input"].apply(
        lambda x: ai_dresult[x])
    retrieved_documents_df = get_retrieved_documents(px.Client())
    if self.evaluators_qa_with_reference:
        evaluations_list = self.evaluators_qa_with_reference.values()
        logger.info(f"Running evaluations: {evaluations_list}")
        names = list(self.evaluators_qa_with_reference.values())
        df_results = run_evals(
            dataframe=input_output_df,
            evaluators=list(self.evaluators_qa_with_reference.keys()),
            provide_explanation=True,
        )
        logger.info(f"Log evaluation results to UI for: {evaluations_list}")
        for index, df_result in enumerate(df_results):
            px.Client().log_evaluations(
                SpanEvaluations(eval_name=names[index], dataframe=df_result)
            )
            self.dashboard_data[names[index]] = df_result.mean(numeric_only=True)["score"]
    if self.evaluators_retrieved_documents:
        evaluations_retrieved = self.evaluators_retrieved_documents.values()
        logger.info(f"Running evaluations: {evaluations_retrieved}")
        relevance_eval_df = run_evals(
            dataframe=retrieved_documents_df,
            evaluators=list(self.evaluators_retrieved_documents.keys()),
            provide_explanation=True,
        )[0]
        logger.info(f"Log evaluation results to UI: {evaluations_retrieved}")
        px.Client().log_evaluations(
            DocumentEvaluations(eval_name="Relevance", dataframe=relevance_eval_df),
        )
        self.dashboard_data["Relevance"] = relevance_eval_df.mean(numeric_only=True)["score"]
    if self.run_outputtone:
        logger.info("Run evaluation: outputtone")
        output_tone_df = run_output_tone_evaluation(input_output_df)
        logger.debug(f"output_tone_df:{output_tone_df}")
        logger.info("Log evaluation results to UI: outputtone")
        px.Client().log_evaluations(
            SpanEvaluations(eval_name="output_tone", dataframe=output_tone_df))
    # to do: implement the function
    self.dashboard_data["timestamp"] = datetime.now().timestamp()
    if self.dataset_id > BENCHMARK_DATASET_ID_MIN and self.project_name != PROJECT_NAMES['RANDOM']:
        logger.info("Log evaluations scores to BigQuery")
        log_scores_to_bigQuery(self.dashboard_data)
    # to do: check condition, only do this for running on local trace server
    import time
    time.sleep(self.job_save_seconds_for_local_run)
This is my usage_example.py:

from llamaas.qa.chatbot_qa.job import JobBuilder
from llamaas.qa.chatbot_qa.utils import load_from_json
from llamaas.model_connector.azure_resources import DevGPT4_32K
from llamaas.qa.chatbot_qa.constants import *
from llamaas.qa.eval_utils.eval_models import get_azure_openai_model
from llamaas.orchestrator.doc_qa.scoped_doc_qa_app import ScopedDocQaApp
from llamaas.qa.chatbot_qa.server import TraceServer

# [required] the dataset used for evaluation; format follows ./questions_answers_template.json
json_questions_answers = load_from_json("./questions_answers_template.json")
# [required] the instance of the app under evaluation
chat_app = ScopedDocQaApp(
    vector_store_path="/Users/priya/datasci/All")
# [required] the method name to get an answer; the evaluation job will pass the question as a parameter
run_method = "get_answer"
# [optional] the Azure model used for evaluation; if not set, DevGPT4_32K is used by default (set in set_tasks)
eval_model = get_azure_openai_model(DevGPT4_32K)

job = (JobBuilder()
       # [required] the question_answers dataset, same format as questions_answers_template.json
       .set_question_answer_pool(json_questions_answers)
       # [optional] trace server IP; for a local run set "127.0.0.1" (also the default if not set)
       .set_server(TraceServer("127.0.0.1"))
       # [required] the model used for evaluation, and the evaluation tasks (True: run the evaluation; False (default): do not run)
       .set_tasks(
           eval_model=eval_model,
           correctness=True,
           hallucination=True,
           toxicity=True,
           groundtruth=True,
           relevance=True,
           outputtone=True)
       # [required] the chat_app to be evaluated and how to run it; the system will pass "question" as a parameter
       .set_chat_app_run(chat_app, run_method)
       # [required] the model used in chat_app (str), used when logging to Metric_store for model comparison
       .set_chat_app_model("gpt-4")
       # [required] project name, used for the Metric_store table; the name should be listed in constants.PROJECT_NAMES — for a new project, add the name to constants.PROJECT_NAMES first
       .set_project_name(PROJECT_NAMES["DOC_GPT"])
       # [optional] for a local run, how long (in seconds) to keep the job alive for checking results; default: 0
       .set_job_save_seconds_for_local_run(1000)
       .build()
       )
job.submit()

This is the error I am facing; I tried adding and removing the port.
No.
If I set the remote URL, the traces are not sent from any of the places; but it works fine with localhost.
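To separate a networking problem from a configuration problem, it may help to validate the URL shape before involving Phoenix at all. `check_endpoint` below is a hypothetical helper (once the URL looks right, a plain `curl` against it from inside the pod would confirm reachability):

```python
from urllib.parse import urlparse

def check_endpoint(url):
    # Hypothetical sanity check: the collector endpoint is assumed to be a
    # bare base URL (scheme + host + optional port), with no /v1/traces path.
    parsed = urlparse(url)
    problems = []
    if parsed.scheme not in ("http", "https"):
        problems.append("missing or unexpected scheme")
    if parsed.path not in ("", "/"):
        problems.append(f"unexpected path {parsed.path!r}")
    return problems

print(check_endpoint("https://arize.zscaler.site"))                              # no problems
print(check_endpoint("http://arize-ui.default.svc.cluster.local:80/v1/traces"))  # flags the path
```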
