Error with Azure OpenAI API: max_tokens vs max_completion_tokens
Hi everyone, I'm running an eval_model with llm_classify against the Azure OpenAI API, and I'm getting this error when using gpt-5-mini:

```
Unsupported parameter: 'max_tokens' is not supported with this model.
Use 'max_completion_tokens' instead.
```

Digging deeper, I found this method in the codebase (OpenAIModel):
```python
def _get_token_param_str(is_azure: bool, model: str) -> str:
    """
    Get the token parameter string for the given model.
    OpenAI o1 and o3 models made a switch to use
    max_completion_tokens and now all the models support it.
    However, Azure OpenAI models currently do not support
    max_completion_tokens unless it's an o1 or o3 model.
    """
    azure_reasoning_models = ("o1", "o3", "o4")
    if is_azure and not model.startswith(azure_reasoning_models):
        return "max_tokens"
    return "max_completion_tokens"
```

As you can see, it only routes the reasoning models (o1, o3, o4) to max_completion_tokens on Azure. I couldn't find anything in the Azure docs confirming this, but it looks like the Azure endpoint may now require max_completion_tokens for the GPT-5 family as well.

Question: Has anyone else run into this, or found updated documentation confirming that Azure GPT-5 models now require max_completion_tokens instead of max_tokens?
