Thanks
Getting this error
Convergence: The evaluation label is NOT_PARSABLE for 1 spans, which may be due to one or more of the following issues: 1. "Enable Function Calling" is disabled in the UI, so labels are not extracted correctly and snapped to rails. Enable Function Calling to resolve this. 2. The max tokens setting is too low, cutting off the LLM's output during the explanation before generating the label. Increase max tokens or toggle off explanations on the task to fix this. 3. Both rails appear in the explanation, confusing the parsing logic. Update the prompt to encourage the LLM to mention only one rail. For spans with ids: fb68b3d56b082ea1

# Google Sheets Copilot Convergence Evaluation Judge
You are an expert evaluator specializing in analyzing whether a Google Sheets AI copilot is making effective progress toward solving user requests without getting stuck in loops.
## Task
Evaluate whether the AI assistant's multi-step process demonstrates good convergence toward the goal or shows signs of being stuck in unproductive loops.
## Evaluation Criteria
### **Converging (PASS)**
- Each iteration brings the system measurably closer to solving the user's spreadsheet task
- New information, formulas, or data is obtained in each step
- System builds logically on previous findings
- Different approaches are tried when initial attempts don't work
- Clear progression from problem identification → data gathering → solution implementation
### **Stuck/Looping (FAIL)**
- Identical or near-identical actions repeated without variation
- Same Google Sheets API calls made multiple times with same parameters
- Oscillating between 2-3 actions without making progress
- No new information gathered for 3+ consecutive iterations
- Circular reasoning that revisits the same logical state
- Making the same error repeatedly without adaptation
- Repeating failed approaches without learning from failures
## Evaluation Process
Analyze the conversation step-by-step:
1. **Action Uniqueness**: Are the system's actions varied and building on each other?
2. **Information Gathering**: Is new data/context acquired in each iteration?
3. **Goal Proximity**: Is each step measurably closer to the final spreadsheet solution?
4. **Loop Detection**: Check for repeated tool calls, identical reasoning patterns, circular logic, or repeated error patterns
5. **Error Adaptation**: When errors occur, does the system learn and try different approaches or repeat the same failed method?
## Examples
### Example 1: PASS - Good Convergence
**User Request**: "Calculate the total sales for Q3 and create a summary"
**AI Process**:
- Step 1: Identifies Q3 date range (July-September)
- Step 2: Locates sales data in columns B-D
- Step 3: Creates SUMIFS formula for Q3 filtering
- Step 4: Implements formula and validates results
- Step 5: Creates summary table with formatted output
**Analysis**: Each step builds logically, new information is gathered, clear progression toward goal.
**LABEL**: PASS
### Example 2: FAIL - Stuck in Loop
**User Request**: "Find the highest value in column C"
**AI Process**:
- Step 1: Calls sheets.get(range="C:C")
- Step 2: Calls sheets.get(range="C:C") [identical]
- Step 3: Suggests MAX(C:C) formula
- Step 4: Calls sheets.get(range="C:C") [identical again]
- Step 5: Suggests MAX(C:C) formula [repeated reasoning]
**Analysis**: Identical API calls repeated, no progress between iterations 2-5, circular reasoning pattern.
**LABEL**: FAIL
### Example 3: FAIL - Error Loop Despite Final Success
**User Request**: "Create a filter formula in L5:R8"
**AI Process**:
- Step 1: Attempts FILTER formula → Error: "overwrite data in M5"
- Step 2: Clears range L5:R8 → Clear succeeds
- Step 3: Reapplies same FILTER formula → Same error: "overwrite data in M5"
- Step 4: Clears larger range L5:R21 → Clear succeeds
- Step 5: Reapplies identical FILTER formula → Same error again
- Step 6: Final attempt with different approach → Success
**Analysis**: Steps 1-5 show a clear error loop - same formula causing same error, only cleared different ranges but never addressed root cause. Despite eventual success, the repetitive error pattern indicates poor convergence.
**LABEL**: FAIL
### Example 4: PASS - Error Recovery with Adaptation
**User Request**: "Find word pairs with S/vowel replacements"
**AI Process**:
- Step 1: Uses Apps Script with `Set` → Error: "Set is not defined"
- Step 2: Recognizes ES5 limitation, rewrites using object instead → Success
- Step 3: Completes analysis and reports results
**Analysis**: When error occurred, system immediately identified root cause and adapted approach. No repetition of failed method.
**LABEL**: PASS
## Your Task
Analyze the following Google Sheets copilot interaction:
**AI Process Steps**:
{attributes.interactionHistory}
**Final Output**:
{attributes.output.value}
## Step-by-Step Analysis
1. **Action Pattern Analysis**: [Examine if actions are unique and purposeful]
2. **Information Progress**: [Check if new data/insights are gained each step]
3. **Goal Movement**: [Assess if system is getting closer to solving the user's request]
4. **Loop Detection**: [Identify any repeated actions or circular reasoning]
5. **Adaptation Capability**: [Note if system tries different approaches when stuck]
**LABEL**: [PASS/FAIL]
**Reasoning**: [Detailed explanation of convergence behavior and any loop patterns detected]
Make sure you include the word LABEL in your response before the classification
## Output Format
LABEL: [PASS/FAIL]
Reasoning: [Detailed step-by-step analysis explaining your evaluation]

Hi, even though I have the exact same attribute name in my prompt, and it's even showing up while testing, when I run the eval I'm getting this error:
TaskRuntimeError: Approximately 0 spans were skipped due to error: rpc error: code = Unknown desc = Failed to get arrow records for model: Task failed due to invalid template variable attributes.interactionHistory

Hey folks, I don't understand why there is no support for filtering on custom attributes in the Arize UI, especially when they're nested. It feels like a very basic requirement. Am I using it wrong? For example, here I want to filter on something inside metadata > event > eventName. The reason we're sending everything inside metadata is the same: we thought filtering on custom attributes was supported inside metadata.
{
"event": {
"flow_id": "string",
"index": "number",
"timestamp": "string (ISO 8601 datetime)",
"type": "string"
},
"flow": {
"id": "string"
},
"input": {
"value": "string"
},
"metadata": {
"userPrompt": "string",
"userIntent": "string",
"event": {
"userId": "string",
"chatId": "string",
"eventName": "string",
"eventType": "string",
"timestamp": "string (ISO 8601 datetime)",
"duration": "number",
"eventAttributes": {
"userPrompt": "string",
"context": "string"
},
"isOfflineEval": "boolean"
}
},
"openinference": {
"span": {
"kind": "string"
}
},
"retrieval": {
"documents": [
{
"document.content": "string (JSON-encoded content)",
"document.id": "string"
}
]
},
"session": {
"id": "string"
},
"user": {
"id": "string"
}
}

I understand that, but I want to filter based on the position of the LLM call, because the LLM calls cannot all be uniquely filtered otherwise. For example, here I want to filter out the last LLM call. Any idea on how I can do that?
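One workaround, if the spans can be exported to a pandas DataFrame, is to select by span kind and position in code rather than in the UI. The flattened column names below (`context.trace_id`, `attributes.openinference.span.kind`, `start_time`) are assumptions based on the schema shown above, not a confirmed export layout:

```python
import pandas as pd

# Toy stand-in for an exported span DataFrame; real exports would have
# many more columns, but the idea is the same.
spans = pd.DataFrame(
    {
        "context.trace_id": ["t1", "t1", "t1", "t2", "t2"],
        "attributes.openinference.span.kind": [
            "CHAIN", "LLM", "LLM", "LLM", "RETRIEVER",
        ],
        "start_time": pd.to_datetime(
            ["2024-01-01 10:00", "2024-01-01 10:01", "2024-01-01 10:02",
             "2024-01-01 10:00", "2024-01-01 10:01"]
        ),
    }
)

# Keep only LLM spans, then take the chronologically last one per trace.
llm_spans = spans[spans["attributes.openinference.span.kind"] == "LLM"]
last_llm = llm_spans.sort_values("start_time").groupby("context.trace_id").tail(1)

# The first LLM call per trace works the same way with head(1).
first_llm = llm_spans.sort_values("start_time").groupby("context.trace_id").head(1)
```

The resulting subset could then be used as the dataset an eval runs over, sidestepping the UI filter entirely.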
Hi, now even though "label" is coming up, it's not getting parsed.
Hello, I want to set up an eval in the Arize UI on the first LLM call span in the trace. How do I do it?
worked, thanks
Amazing, thanks will try this out
