Yes, that is what I did to "fix" it. But it is still not 100% reliable. I do not understand why it is necessary to make a tool call in the first place.
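As far as I can tell from the traceback further down, generate_classification enforces the label schema through a forced OpenAI-style tool call (the adapter method is literally _generate_with_tool_calling), so a model that answers in plain text instead of emitting tool_calls trips the ValueError. A quick way to check whether the local endpoint honors a forced tool call at all is to hit it directly with the openai SDK. This is a minimal sketch under assumptions: an OpenAI-compatible server at OPENAI_BASE_URL, and a made-up record_label tool that only stands in for whatever phoenix-evals sends internally.

import os

from openai import OpenAI  # official openai SDK; works against any OpenAI-compatible server

client = OpenAI(
    api_key=os.getenv("OPENAI_KEY_JUDGE", "open"),
    base_url=os.getenv("OPENAI_BASE_URL"),
)

# Hypothetical tool mirroring the classification schema; the name and fields are
# illustrative only, not what phoenix-evals actually sends.
tools = [{
    "type": "function",
    "function": {
        "name": "record_label",
        "description": "Record the classification label.",
        "parameters": {
            "type": "object",
            "properties": {
                "label": {"type": "string", "enum": ["correct", "not_correct"]},
            },
            "required": ["label"],
        },
    },
}]

response = client.chat.completions.create(
    model=os.getenv("MODEL_NAME_JUDGE"),
    messages=[{"role": "user", "content": "Test prompt."}],
    tools=tools,
    # Force the tool call; if the server ignores this, you get the same
    # "No tool calls in response" failure mode as in the traceback below.
    tool_choice={"type": "function", "function": {"name": "record_label"}},
)

print(response.choices[0].message.tool_calls)

If tool_calls comes back as None here even with tool_choice forced, the issue is in the serving stack (e.g. the server's tool parser or chat template for Qwen3) rather than in phoenix-evals itself.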
I am getting a strange error when using a local open-source model (in this case Qwen3-30B) for evals with the new Evals API. Everything works normally with OpenAI/Azure models. The model I am using here is fairly capable and has no problem doing tool calling in other contexts. Apparently there need to be tool calls in the response, otherwise it does not work. Is this connected to generating formatted output? The error occurs consistently, but I can reduce its occurrence to roughly 10% if I add something like "Call any tool" to the prompt. Since that is not exactly a clean solution, I wanted to ask whether anyone is aware of this issue. This is my setup, roughly:
import os

# Import path assumes phoenix-evals 2.x; adjust if your version exports LLM elsewhere.
from phoenix.evals.llm import LLM

model_name = os.getenv("MODEL_NAME_JUDGE")
api_key = os.getenv("OPENAI_KEY_JUDGE", "open")
base_url = os.getenv("OPENAI_BASE_URL")

eval_model = LLM(api_key=api_key, base_url=base_url, model=model_name, provider="openai")

print(eval_model.generate_classification(
    prompt="Test prompt.",
    labels=["correct", "not_correct"],
    include_explanation=False,
))
File "C:\Users\bizis01\Desktop\request-to-mdx\.venv\Lib\site-packages\phoenix\experiments\functions.py", line 790, in async_evaluate_run
result = await evaluator.async_evaluate(...)
File "C:\Users\bizis01\Desktop\request-to-mdx\.venv\Lib\site-packages\phoenix\experiments\evaluators\base.py", line 77, in async_evaluate
return self.evaluate(output=output, **kwargs)
File "C:\Users\bizis01\Desktop\request-to-mdx\.venv\Lib\site-packages\phoenix\experiments\evaluators\utils.py", line 215, in evaluate
result = func(*bound_signature.args, **bound_signature.kwargs)
File "C:\Users\bizis01\Desktop\request-to-mdx\src\text_to_mdx\experimentation\tests\correctness.py", line 48, in correctness
response = eval_model.generate_classification(
prompt="Test prompt.",
labels=["correct", "not_correct"],
include_explanation=False
)
File "C:\Users\bizis01\Desktop\request-to-mdx\.venv\Lib\site-packages\phoenix\evals\llm\wrapper.py", line 288, in generate_classification
result = self.generate_object(prompt, schema, **kwargs)
File "C:\Users\bizis01\Desktop\request-to-mdx\.venv\Lib\site-packages\phoenix\evals\tracing.py", line 153, in _wrapper_sync
result = func(*args, **kwargs)
File "C:\Users\bizis01\Desktop\request-to-mdx\.venv\Lib\site-packages\phoenix\evals\llm\wrapper.py", line 246, in generate_object
return rate_limited_generate(prompt, schema, **kwargs)
File "C:\Users\bizis01\Desktop\request-to-mdx\.venv\Lib\site-packages\phoenix\evals\rate_limiters.py", line 218, in wrapper
return fn(*args, **kwargs)
File "C:\Users\bizis01\Desktop\request-to-mdx\.venv\Lib\site-packages\phoenix\evals\llm\adapters\openai\adapter.py", line 163, in generate_object
return self._generate_with_tool_calling(prompt, schema, **kwargs)
File "C:\Users\bizis01\Desktop\request-to-mdx\.venv\Lib\site-packages\phoenix\evals\llm\adapters\openai\adapter.py", line 264, in _generate_with_tool_calling
raise ValueError("No tool calls in response")
ValueError: No tool calls in response
The above exception was the direct cause of the following exception:
RuntimeError: evaluator failed for example id 'RGF0YXNldEV4YW1wbGU6OQ==', repetition 1

I’m running into a 404 error when using the new evals library; the connection to the endpoint fails:
eval_model = LLM(
model=model_name,
provider="azure",
client="openai",
api_version=api_version,
api_key=api_key,
base_url=azure_base_url
)
eval_model.generate_classification(...)

I’m using the exact same parameters that work with OpenAIModel in the legacy evals library. Could this be related to using chat-completions vs. responses?
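Not an answer to the chat-completions vs. responses question, but one way to rule the Phoenix wrapper in or out is to hit the Azure deployment directly with the openai SDK using the very same values. A minimal sketch, assuming azure_base_url is the bare resource endpoint (https://<resource>.openai.azure.com) and model_name is the deployment name:

from openai import AzureOpenAI  # official openai SDK

client = AzureOpenAI(
    api_key=api_key,
    api_version=api_version,        # same value passed to LLM(...) above
    azure_endpoint=azure_base_url,  # bare resource endpoint, no /openai/deployments/... suffix
)

# For Azure, "model" must be the deployment name, not the base model name.
response = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)

If this direct call also 404s, the usual suspects are a base_url that already includes /openai/deployments/<name> (so the path gets doubled), a deployment name that doesn't match, or an api-version the resource doesn't accept; if it succeeds, the problem is more likely in how the new evals library builds the Azure URL.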
RunLLM, the issue and PR you shared only talk about the reasoning models (o1, o3, o4). I have installed the latest version, and there is still no support for gpt-5 models.
Hi everyone, I’m running an eval_model with llm_classify against the Azure OpenAI API, and I’m getting this error when using gpt-5-mini:
Unsupported parameter: 'max_tokens' is not supported with this model.
Use 'max_completion_tokens' instead.

Digging deeper, I found this method in the codebase (OpenAIModel):
def _get_token_param_str(is_azure: bool, model: str) -> str:
    """
    Get the token parameter string for the given model.
    OpenAI o1 and o3 models made a switch to use
    max_completion_tokens and now all the models support it.
    However, Azure OpenAI models currently do not support
    max_completion_tokens unless it's an o1 or o3 model.
    """
    azure_reasoning_models = ("o1", "o3", "o4")
    if is_azure and not model.startswith(azure_reasoning_models):
        return "max_tokens"
    return "max_completion_tokens"

As you can see, on Azure it only uses max_completion_tokens for the reasoning models (o1, o3, o4). I couldn’t find anything in the Azure docs confirming this, but it seems like the Azure endpoint now requires max_completion_tokens for the GPT-5 family as well. Question: has anyone else experienced this or found updated documentation confirming that Azure GPT-5 models now require max_completion_tokens instead of max_tokens?
