Below are the models that support structured generation in smolagents:

| Model | Support | Parameter name |
|---|---|---|
| `VLLMModel` | ✅ | `guided_json` (also has `grammar`, `regex`, `choice`, etc.) |
| `MLXModel` | ❌ (mlx-lm does not have native support yet; use vLLM) | - |
| `TransformersModel` | ❌ (Transformers will probably never have support; use vLLM) | - |
| `LiteLLMModel` | ✅ | `response_format` (OpenAI-compatible) |
| `InferenceClientModel` | ✅ | `response_format` (will be OpenAI-compatible) |
| `OpenAIServerModel` | ✅ | `response_format` |
| `AzureOpenAIServerModel` | ✅ | `response_format` |
| `AmazonBedrockServerModel` | ❌ (does not have support) | - |
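For the vLLM row above, the value you pass as `guided_json` is a plain JSON Schema dict. The sketch below builds one; the `city`/`population` fields are a hypothetical example, and passing it via `extra_body={"guided_json": ...}` assumes you are talking to a vLLM OpenAI-compatible server.

```python
import json

# Hypothetical JSON Schema to constrain the model's output shape.
schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "population": {"type": "integer"},
    },
    "required": ["city", "population"],
}

# Against a vLLM OpenAI-compatible server, guided decoding parameters
# are typically sent through the request's extra_body.
extra_body = {"guided_json": schema}

print(json.dumps(extra_body, indent=2))
```

vLLM's other guided-decoding options (`grammar`, `regex`, `choice`) are passed the same way, each constraining the output with a different formalism.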

Under the hood, the providers behind `InferenceClientModel` also use the `response_format` parameter.
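For the `response_format`-based models, the payload follows the OpenAI `json_schema` envelope. Here is a minimal sketch of building one; the `person` schema and the helper function are hypothetical examples, not part of smolagents.

```python
def build_response_format(name: str, schema: dict) -> dict:
    """Wrap a JSON Schema in the OpenAI-style `json_schema` response_format envelope."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "schema": schema,
            # Ask the server to enforce the schema exactly.
            "strict": True,
        },
    }

# Hypothetical schema for the structure we want the model to emit.
person_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
    "additionalProperties": False,
}

response_format = build_response_format("person", person_schema)
```

The resulting dict is what you would pass as the `response_format` argument to one of the OpenAI-compatible models in the table above.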

Here is a table of inference providers available through Hugging Face that support enforced structured generation:

| Inference Provider | Support for structured outputs |
|---|---|
| HF Inference Endpoints | ✅ |
| Cohere | ✅ |
| Nebius | ✅ |
| Fireworks | ✅ |
| Sambanova | ✅ |
| Hyperbolic | ✅ |
| Cerebras | ✅ |
| Novita | ✅ |
| Together | ✅ |