Hi, I've run into an issue when loading saved data with the Phoenix client. Here are the steps I followed:
Spin up the server using Docker.
Add a few dummy entries with tracing.
Save the dummy spans as a dataframe (CSV) and as the default Parquet.
When I load the data and create a TraceDataset object, it works perfectly; the issue is that when I use the phoenix_client.log_traces method, it throws a TypeError and a ValueError every time I load. Any help is much appreciated, thanks for considering.
Hi Anuraag T., thank you for trying out Phoenix! Is there any chance you can share a code snippet so we can help diagnose what's happening?
Sure Dustin N., here is the code snippet I used in a Jupyter notebook. My Phoenix server is hosted on Docker.
import phoenix as px
import pandas as pd
pxc = px.Client(endpoint="http://127.0.0.1:6006")
# save traces as dataframe.
pxc.get_spans_dataframe(project_name="dummyDemo").to_csv("./demo1.csv", index=False)
# load the dataframe
df2 = pd.read_csv("./demo1.csv")
# create trace dataset from dataframe
my_traces2 = px.TraceDataset(dataframe=df2)
# log the traces to default project
# this line throws ValueError
pxc.log_traces(
trace_dataset=my_traces2,
project_name="default",
)
Any chance you can show the TypeError you're getting? Also, just to verify, what version of Phoenix are you running?
Sure, here is the traceback of the error. I tried both version 4.5.0 and the latest, 4.7.1.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In [61], line 1
----> 1 pxc.log_traces(
2 trace_dataset=my_traces2,
3 project_name="default",
4 )
6 # session = px.launch_app(trace=px.TraceDataset.load(trace_id, "./"))
File e:\Projects\py3\env\lib\site-packages\phoenix\session\client.py:275, in Client.log_traces(self, trace_dataset, project_name)
273 project_name = project_name or get_env_project_name()
274 spans = trace_dataset.to_spans()
--> 275 otlp_spans = [
276 ExportTraceServiceRequest(
277 resource_spans=[
278 ResourceSpans(
279 resource=Resource(
280 attributes=[
281 KeyValue(
282 key="openinference.project.name",
283 value=AnyValue(string_value=project_name),
284 )
285 ]
286 ),
287 scope_spans=[ScopeSpans(spans=[encode_span_to_otlp(span)])],
288 )
289 ],
290 )
291 for span in spans
292 ]
293 for otlp_span in otlp_spans:
294 serialized = otlp_span.SerializeToString()
File e:\Projects\py3\env\lib\site-packages\phoenix\session\client.py:275, in <listcomp>(.0)
273 project_name = project_name or get_env_project_name()
274 spans = trace_dataset.to_spans()
--> 275 otlp_spans = [
276 ExportTraceServiceRequest(
277 resource_spans=[
278 ResourceSpans(
279 resource=Resource(
280 attributes=[
281 KeyValue(
282 key="openinference.project.name",
283 value=AnyValue(string_value=project_name),
284 )
285 ]
286 ),
287 scope_spans=[ScopeSpans(spans=[encode_span_to_otlp(span)])],
288 )
289 ],
290 )
291 for span in spans
292 ]
293 for otlp_span in otlp_spans:
294 serialized = otlp_span.SerializeToString()
File e:\Projects\py3\env\lib\site-packages\phoenix\trace\trace_dataset.py:185, in TraceDataset.to_spans(self)
183 if end_time is pd.NaT:
184 end_time = None
--> 185 yield json_to_span(
186 {
187 "name": row["name"],
188 "context": context,
189 "span_kind": row["span_kind"],
190 "parent_id": row.get("parent_id"),
191 "start_time": cast(datetime, row["start_time"]).isoformat(),
192 "end_time": end_time.isoformat() if end_time else None,
193 "status_code": row["status_code"],
194 "status_message": row.get("status_message") or "",
195 "attributes": attributes,
196 "events": row.get("events") or [],
197 "conversation": row.get("conversation"),
198 }
199 )
File e:\Projects\py3\env\lib\site-packages\phoenix\trace\span_json_decoder.py:72, in json_to_span(data)
70 data["span_kind"] = SpanKind(data["span_kind"])
71 data["status_code"] = SpanStatusCode(data["status_code"])
---> 72 data["events"] = [
73 SpanException(
74 message=(event.get("attributes") or {}).get(EXCEPTION_MESSAGE) or "",
75 timestamp=datetime.fromisoformat(event["timestamp"]),
76 )
77 if event["name"] == "exception"
78 else SpanEvent(
79 name=event["name"],
80 attributes=event.get("attributes") or {},
81 timestamp=datetime.fromisoformat(event["timestamp"]),
82 )
83 for event in data["events"]
84 ]
85 data["conversation"] = (
86 SpanConversationAttributes(**data["conversation"])
87 if data["conversation"] is not None
88 else None
89 )
90 return Span(**data)
File e:\Projects\py3\env\lib\site-packages\phoenix\trace\span_json_decoder.py:77, in <listcomp>(.0)
70 data["span_kind"] = SpanKind(data["span_kind"])
71 data["status_code"] = SpanStatusCode(data["status_code"])
72 data["events"] = [
73 SpanException(
74 message=(event.get("attributes") or {}).get(EXCEPTION_MESSAGE) or "",
75 timestamp=datetime.fromisoformat(event["timestamp"]),
76 )
---> 77 if event["name"] == "exception"
78 else SpanEvent(
79 name=event["name"],
80 attributes=event.get("attributes") or {},
81 timestamp=datetime.fromisoformat(event["timestamp"]),
82 )
83 for event in data["events"]
84 ]
85 data["conversation"] = (
86 SpanConversationAttributes(**data["conversation"])
87 if data["conversation"] is not None
88 else None
89 )
90 return Span(**data)
TypeError: string indices must be integers
I see, thanks for the report, this definitely seems like a bug.
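For anyone hitting the same error: the traceback above points at the spans dataframe's object columns. A minimal sketch of what likely goes wrong (the "events" column name is taken from the traceback; this is a hypothetical reproduction, not Phoenix's own code): a CSV round trip flattens a list-of-dicts column into its string repr, so iterating over it yields single characters, and `event["name"]` raises the TypeError.

```python
import io
import pandas as pd

# In memory, "events" holds a list of dicts per row (as in a spans dataframe).
df = pd.DataFrame(
    {"events": [[{"name": "exception", "timestamp": "2024-01-01T00:00:00"}]]}
)

# Round-trip through CSV, as in the snippet above.
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)
df2 = pd.read_csv(buf)

# After reading back, the cell is a string, not a list.
events = df2.loc[0, "events"]
print(type(events).__name__)  # str

# Iterating a string yields characters, so event["name"] fails with
# "string indices must be integers" inside the span decoder.
try:
    [event["name"] for event in events]
except TypeError as exc:
    print("TypeError:", exc)
```

This also explains why the Parquet save behaves differently: Parquet preserves nested types, while CSV does not.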
A couple of things: 1. The log_traces method was intended as a stopgap before we supported persistence for Phoenix; is it sufficient for your use case to simply log the traces to Phoenix once and rely on persistence to keep them around? 2. If this code path is necessary, would you mind filing a bug report? We can prioritize the issue and fix it as soon as we can.
In my case, I'm running the server in Docker, so I can't use px.launch_app to re-run the server; a working log_traces is crucial for me. Can you tell me the steps to file the bug report?
Hi Anuraag T.! You can file issues on the GitHub repo: https://github.com/Arize-ai/phoenix/issues
Even if you are using Docker, you can configure Phoenix to use either a SQLite or Postgres database with environment variables: https://docs.arize.com/phoenix/deployment/docker
In the example docker compose file we point to a Postgres database, so data is persisted between Phoenix sessions.
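For readers following along, a minimal compose sketch along those lines (service names, image tag, and the Postgres credentials here are placeholders, not the exact file from the docs; `PHOENIX_SQL_DATABASE_URL` is the environment variable Phoenix reads for its database connection):

```yaml
services:
  phoenix:
    image: arizephoenix/phoenix:latest
    ports:
      - "6006:6006"
    environment:
      # Point Phoenix at the Postgres service below so traces persist
      # across container restarts.
      - PHOENIX_SQL_DATABASE_URL=postgresql://postgres:postgres@db:5432/postgres
    depends_on:
      - db
  db:
    image: postgres
    environment:
      - POSTGRES_PASSWORD=postgres
```

Omitting `PHOENIX_SQL_DATABASE_URL` falls back to SQLite, which is the setup described below in this thread.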
Yes, I'm using the Docker deployment with SQLite as the persistence DB.
Is there a reason you aren't seeing traces persisted between deployments?
No, I'm seeing the traces on the deployment; it's just that I can't log traces to different projects.
Traces can be logged to different projects either by setting the resource or by using the using_project context manager. The latter is mainly intended for use in notebooks, but if you aren't doing any other instrumentation, you can try it and see if it works.
Sure, will do. Thanks.
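As an aside, until the CSV code path is fixed, one workaround sketch is to re-parse the stringified object columns before constructing the TraceDataset. This is a hypothetical helper, not part of the Phoenix API; the column name "events" is assumed from the traceback, and `ast.literal_eval` works here because pandas writes the Python repr of the list into the CSV cell.

```python
import ast
import pandas as pd

def restore_object_columns(df, columns=("events",)):
    """Parse columns whose cells became string reprs after a CSV round trip."""
    for col in columns:
        if col in df.columns:
            df[col] = df[col].apply(
                lambda v: ast.literal_eval(v) if isinstance(v, str) else v
            )
    return df

# Simulate a cell as it comes back from pd.read_csv: a stringified list.
df = pd.DataFrame(
    {"events": ["[{'name': 'exception', 'timestamp': '2024-01-01T00:00:00'}]"]}
)
df = restore_object_columns(df)
print(df.loc[0, "events"][0]["name"])  # prints: exception
```

With the columns restored to real lists of dicts, `px.TraceDataset(dataframe=df)` followed by `log_traces` should no longer hit the string-iteration TypeError, though saving to Parquet instead of CSV avoids the problem entirely.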
