I noticed that https://github.com/Arize-ai/openinference/blob/main/spec/semantic_conventions.md defines both input.value and llm.input_messages, and a real example shows the input messages duplicated across these two fields. Is that expected? I see the same thing for output.value and llm.output_messages.
Yes, this is expected. The OpenInference semantic conventions specify both input.value (the raw string or JSON payload of the input) and llm.input_messages (a flattened, indexed list of chat messages) for LLM spans. This gives consumers both the full machine-readable input and a structured, human-readable breakdown; the same applies to output.value and llm.output_messages on the output side. The duplication ensures compatibility with a range of tools and UIs, and is reflected in both the spec and real-world traces (semantic conventions, example spans, llm_spans.md); a concrete sketch follows the sources below.

Would you like more detail on how these fields are populated or used in Phoenix and OpenInference?

Sources:
- llm_spans.md
- semantic conventions
- example spans
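
To make the duplication concrete, here is a minimal sketch of how an instrumentor might set both attribute families on the same span. The attribute names follow the semantic conventions; the tracer setup and message payload are illustrative assumptions, not Phoenix's actual instrumentation code.

```python
# Minimal sketch (not the actual Phoenix/OpenInference instrumentor):
# one LLM span carrying both input.value and llm.input_messages.
import json

from opentelemetry import trace

tracer = trace.get_tracer("example")  # no-op tracer unless an SDK is configured

messages = [{"role": "user", "content": "What is OpenInference?"}]

with tracer.start_as_current_span("llm_call") as span:
    span.set_attribute("openinference.span.kind", "LLM")
    # input.value: the raw payload, serialized as-is.
    span.set_attribute("input.value", json.dumps({"messages": messages}))
    span.set_attribute("input.mime_type", "application/json")
    # llm.input_messages: the same messages, flattened into indexed attributes.
    for i, msg in enumerate(messages):
        span.set_attribute(f"llm.input_messages.{i}.message.role", msg["role"])
        span.set_attribute(f"llm.input_messages.{i}.message.content", msg["content"])
```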
This is because input.value and output.value hold raw data structures that can vary from vendor to vendor, while llm.input_messages and llm.output_messages are extracted objects that follow a consistent structure. A sketch of that extraction is below.
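
As a rough illustration of that point, here is a hypothetical extractor that normalizes two vendor-specific response shapes into the consistent message structure used by llm.output_messages. The raw payload shapes are approximations of OpenAI- and Anthropic-style responses, and extract_output_messages is an invented name, not part of any OpenInference SDK.

```python
# Hypothetical helper (not part of the OpenInference SDK): normalizes
# vendor-specific raw payloads into the consistent message shape behind
# llm.output_messages, while output.value would keep the raw form.

def extract_output_messages(raw: dict) -> list[dict]:
    # OpenAI-style: {"choices": [{"message": {"role": ..., "content": ...}}]}
    if "choices" in raw:
        return [choice["message"] for choice in raw["choices"]]
    # Anthropic-style: {"role": "assistant", "content": [{"type": "text", "text": ...}]}
    if isinstance(raw.get("content"), list):
        text = "".join(b.get("text", "") for b in raw["content"] if b.get("type") == "text")
        return [{"role": raw.get("role", "assistant"), "content": text}]
    return []


openai_raw = {"choices": [{"message": {"role": "assistant", "content": "Hi!"}}]}
anthropic_raw = {"role": "assistant", "content": [{"type": "text", "text": "Hi!"}]}

# The raw payloads differ, but the extracted messages share one structure.
assert extract_output_messages(openai_raw) == extract_output_messages(anthropic_raw)
```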