Hi everyone. I recently moved to a self-hosted instance of Phoenix and am wondering how I can move my previous data (user conversations with an LLM) into Traces for annotation. I see that I can upload it as a dataset and edit the rows individually later if I'm not satisfied with a response, but I would prefer to upload it to Traces and have multiple team members leave their annotations on it instead. Ideally I'd store each exchange as a ChatCompletion object so the inputs and outputs are nicely formatted. This feels like a very common use case for anyone moving to Phoenix, but I couldn't find any resource for it. Any guidance on what can be done here?
You can import your previous LLM conversation data into Phoenix as traces by converting your data (e.g., ChatCompletion objects) into a DataFrame that matches the OpenInference trace schema, then uploading it to your self-hosted Phoenix instance with px.TraceDataset and px.Client().log_traces(). This allows you and your team to annotate traces collaboratively in the Phoenix UI, rather than editing datasets row by row. See the importing existing traces guide and the logging traces tutorial for step-by-step instructions and code examples.
If your data is not already in the OpenInference format, you will need to map your ChatCompletion fields (inputs, outputs, metadata) to the columns expected by Phoenix traces. Once uploaded, traces can be annotated by multiple users in the UI. Would you like more detail or a code example for this process?
Sources:
- Importing Existing Traces
- Logging Traces to Phoenix Tutorial
💡 Hint: Mention RunLLM in the thread for followups.
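To make the flow above concrete, here is a minimal sketch of building one flattened span row and logging it. The column names follow the OpenInference span schema referenced in the guides; the use of timezone-aware datetime objects for the time columns, and the endpoint/project names, are assumptions for illustration, not confirmed requirements.

```python
import uuid
from datetime import datetime, timezone

import pandas as pd

# One flattened span row in roughly the shape Phoenix's TraceDataset expects.
# Assumption: start/end times are timezone-aware datetimes rather than strings.
span = {
    "name": "ChatCompletion",
    "span_kind": "LLM",
    "parent_id": None,
    "start_time": datetime(2023, 1, 1, 0, 0, 0, tzinfo=timezone.utc),
    "end_time": datetime(2023, 1, 1, 0, 0, 1, tzinfo=timezone.utc),
    "status_code": "OK",
    "status_message": "",
    "context.span_id": uuid.uuid4().hex,
    "context.trace_id": uuid.uuid4().hex,
    "attributes.openinference.span.kind": "LLM",
    "attributes.input.value": "What's the capital of France?",
    "attributes.output.value": "The capital of France is Paris.",
}
spans_df = pd.DataFrame([span])

# Then, against a running Phoenix instance (hypothetical endpoint/project):
# import phoenix as px
# px.Client(endpoint="http://localhost:6006").log_traces(
#     trace_dataset=px.TraceDataset(spans_df), project_name="imported-chats"
# )
print(spans_df.shape)
```

The actual upload call is commented out so the sketch runs without a Phoenix server; swap in your own endpoint and project name.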
Hey John G., thanks for your comment. Here is the dataset format I'm working with:
chat_history = [
    {
        "role": "user",
        "content": "What's the capital of France?",
        "created_at": "2023-01-01T00:00:00Z",
    },
    {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "created_at": "2023-01-01T00:00:01Z",
    },
    {
        "role": "user",
        "content": "What's the population?",
        "created_at": "2023-01-01T00:00:02Z",
    },
    {
        "role": "assistant",
        "content": "About 2.1 million as of 2023.",
        "created_at": "2023-01-01T00:00:03Z",
    },
]

Here's a script I wrote to attempt the conversion:
import uuid
from datetime import datetime

import pandas as pd
import phoenix as px
# (plus the Span/SpanContext/StatusCode imports from your Phoenix version)


def make_span(
    name, kind, start, end, span_id, trace_id, parent_id=None, attributes=None
):
    return Span(
        name=name,
        context=SpanContext(trace_id=trace_id, span_id=span_id),
        span_kind=kind,
        start_time=start,
        end_time=end,
        parent_id=parent_id,  # was hard-coded to None, silently dropping the argument
        status_code=StatusCode.OK,
        status_message=None,
        attributes=attributes or {},
        events=[],  # optional structured log messages like function calls, exceptions, etc.
        conversation=None,  # or: SpanConversationAttributes(conversation_id=some_uuid)
    )
def chat_to_trace(chat_history):
    spans = []
    for i in range(0, len(chat_history), 2):
        user_msg = chat_history[i]
        assistant_msg = (
            chat_history[i + 1] if i + 1 < len(chat_history) else {"content": ""}
        )
        span_id = str(uuid.uuid4())
        trace_id = str(uuid.uuid4())
        llm_model_name = "gpt-4"
        llm_provider = "openai"
        # Read timestamps from the chat_history created_at fields; fall back to
        # the user timestamp when the assistant message doesn't carry one
        user_created_at = user_msg.get("created_at")
        assistant_created_at = assistant_msg.get("created_at") or user_created_at
        start = datetime.fromisoformat(user_created_at.replace("Z", "+00:00")).replace(
            tzinfo=None
        )
        end = datetime.fromisoformat(
            assistant_created_at.replace("Z", "+00:00")
        ).replace(tzinfo=None)
        span = {
            "name": "LLM Response",
            "span_kind": "LLM",
            "parent_id": None,
            "start_time": start.isoformat() + "Z",
            "end_time": end.isoformat() + "Z",
            "status_code": "OK",
            "status_message": "",
            "events": [],
            "context.span_id": span_id,
            "context.trace_id": trace_id,
            # Core attributes
            "attributes.input.value": user_msg["content"],
            "attributes.output.value": assistant_msg["content"],
            "attributes.input.mime_type": "text/plain",
            "attributes.output.mime_type": "text/plain",
            "attributes.openinference.span.kind": "llm",
            # Optional LLM-specific attributes
            "attributes.llm.model_name": llm_model_name,
            "attributes.llm.provider": llm_provider,
            "attributes.llm.input_messages": [user_msg],
            "attributes.llm.output_messages": [assistant_msg],
            "attributes.llm.token_count.prompt": 10,
            "attributes.llm.token_count.completion": 15,
            "attributes.llm.token_count.total": 25,
            # Mocked or empty values for now
            "attributes.llm.invocation_parameters": {},
            "attributes.llm.system": "",
            "attributes.llm.tools": [],
            "attributes.session.id": str(uuid.uuid4()),
            "attributes.user.id": "anonymous",
            "attributes.metadata": {},
            # Deep token details (placeholders)
            "attributes.llm.token_count.prompt_details.audio": 0,
            "attributes.llm.token_count.prompt_details.cache_read": 0,
            "attributes.llm.token_count.completion_details.reasoning": 10,
            "attributes.llm.token_count.completion_details.audio": 0,
        }
        spans.append(span)
    return spans


spans = chat_to_trace(chat_history)
spans_df = pd.DataFrame(spans)

When I attempt to log this, I don't see any errors, but the traces don't appear in Phoenix either:
px.Client().log_traces(
    trace_dataset=px.TraceDataset(spans_df), project_name="my_project_name"
)

Could you share why that might be happening? If you already have a conversion script, that would be super helpful.
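One thing I suspect might matter: the script stores start_time/end_time as "Z"-suffixed strings, while the importing-traces examples work with timezone-aware datetime columns (and attributes.llm.input_messages may also need the flattened message.role/message.content key format). Here is a small sketch of the timestamp normalization I'm considering; treating string timestamps as the cause of the silent drop is an assumption, not a confirmed diagnosis.

```python
import pandas as pd


def normalize_span_times(spans_df: pd.DataFrame) -> pd.DataFrame:
    """Coerce ISO-8601 string timestamps into timezone-aware datetimes.

    Assumption: TraceDataset expects datetime columns rather than strings,
    and silently dropped spans are one plausible symptom when they aren't.
    """
    out = spans_df.copy()
    for col in ("start_time", "end_time"):
        # utc=True parses the trailing "Z" and yields tz-aware timestamps
        out[col] = pd.to_datetime(out[col], utc=True)
    return out


spans_df = pd.DataFrame(
    [{"start_time": "2023-01-01T00:00:00Z", "end_time": "2023-01-01T00:00:01Z"}]
)
fixed = normalize_span_times(spans_df)
print(fixed.dtypes)
```

Running this over the frame before px.TraceDataset(spans_df) would rule the timestamp dtype in or out as the culprit.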
