Below are the models which support structured generation in smolagents:
| Model | Support | Parameter name |
|---|---|---|
| VLLMModel | ✅ | guided_json (also supports grammar, regex, choice, etc.) |
| MLXModel | ❌ (mlx-lm does not have native support yet, use VLLM) | - |
| TransformersModel | ❌ (Transformers will probably never have support, use VLLM) | - |
| LiteLLMModel | ✅ | response_format (OpenAI compatible) |
| InferenceClientModel | ✅ | response_format (Will be OpenAI compatible.) |
| OpenAIServerModel | ✅ | response_format |
| AzureOpenAIServerModel | ✅ | response_format |
| AmazonBedrockServerModel | ❌ (does not have support) | - |
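For the OpenAI-compatible models above, `response_format` takes a JSON-schema payload. As a sketch only (the exact smolagents call signature is omitted; the `city_info` schema name and fields are made-up examples), this builds such a payload and checks a sample reply against its required keys using only the standard library:

```python
import json

# Assumed OpenAI-compatible response_format shape:
# {"type": "json_schema", "json_schema": {...}}.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
        },
    },
}

# With enforced structured generation, the provider guarantees the reply
# parses as JSON and contains the required keys.
reply = '{"city": "Paris", "population": 2102650}'
parsed = json.loads(reply)
required = response_format["json_schema"]["schema"]["required"]
assert all(key in parsed for key in required)
```

The same dictionary shape is what the `response_format` parameter of the OpenAI-compatible models expects, so it can be reused across LiteLLMModel, OpenAIServerModel, and AzureOpenAIServerModel.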
Under the hood, InferenceClientModel delegates to Inference Providers, which also use the response_format parameter. The table below lists the Inference Providers available through Hugging Face and whether they support enforced structured generation:
| Inference Provider | Support for structured outputs |
|---|---|
| HF Inference Endpoints | ✅ |
| Cohere | ❌ |
| Nebius | ✅ |
| Fireworks | ✅ |
| Sambanova | ❌ |
| Hyperbolic | ❌ |
| Cerebras | ✅ |
| Novita | ❌ |
| Together | ❌ |