Below are the models which support structured generation in smolagents:
Model | Support | Parameter name + link to docs |
---|---|---|
VLLMModel | ✅ | guided_json (also has grammar, regex, choice etc) |
MLXModel | ❌ (mlx-lm does not have native support yet, use VLLM) | - |
TransformersModel | ❌ (Transformers will probably never have support, use VLLM) | - |
LiteLLMModel | ✅ | response_format (OpenAI compatible) |
InferenceClientModel | ✅ | response_format (OpenAI compatible) |
OpenAIServerModel | ✅ | response_format |
AzureOpenAIServerModel | ✅ | response_format |
AmazonBedrockServerModel | ❌ (does not have support) | - |
Under the hood, InferenceClientModel also passes the response_format parameter to these providers.
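For the models that accept `response_format`, the value follows the OpenAI `json_schema` convention. Below is a minimal sketch of such a payload; the schema itself (field names like `answer` and `confidence`) is an illustrative assumption, and the commented usage assumes a valid API key for the chosen model class.

```python
# Hedged sketch: an OpenAI-style "json_schema" response_format payload,
# as accepted by the models marked ✅ in the table above.
# The schema fields below are illustrative, not part of any API.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "answer",  # arbitrary schema name
        "schema": {
            "type": "object",
            "properties": {
                "answer": {"type": "string"},
                "confidence": {"type": "number"},
            },
            "required": ["answer", "confidence"],
            "additionalProperties": False,
        },
        "strict": True,  # ask the backend to enforce the schema strictly
    },
}

# Usage sketch (requires an API key; parameter name is from the table above):
# from smolagents import OpenAIServerModel
# model = OpenAIServerModel(model_id="gpt-4o-mini")
# response = model(messages, response_format=response_format)
```

For VLLMModel the equivalent schema would be passed via `guided_json` instead, per the first table.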
Here is a table of Inference providers available through Hugging Face which use enforced structured generation:
Inference Provider | Support for structured outputs |
---|---|
HF Inference Endpoints | ✅ |
Cohere | ❌ |
Nebius | ✅ |
Fireworks | ✅ |
Sambanova | ❌ |
Hyperbolic | ❌ |
Cerebras | ✅ |
Novita | ❌ |
Together | ❌ |
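Since only some providers enforce structured outputs, it can help to filter for a supported one before constructing InferenceClientModel. The sketch below encodes the table above; the provider identifier strings are assumptions and should be checked against the Hugging Face provider list.

```python
# Hedged sketch: pick an Inference provider that supports structured outputs,
# based on the table above. Provider identifier strings are assumptions.
SUPPORTS_STRUCTURED_OUTPUTS = {"hf-inference", "nebius", "fireworks-ai", "cerebras"}


def pick_provider(preferred: list[str]) -> str:
    """Return the first preferred provider known to enforce structured outputs."""
    for provider in preferred:
        if provider in SUPPORTS_STRUCTURED_OUTPUTS:
            return provider
    raise ValueError("None of the preferred providers support structured outputs")


# Usage sketch (requires an HF token):
# from smolagents import InferenceClientModel
# model = InferenceClientModel(
#     model_id="Qwen/Qwen2.5-72B-Instruct",
#     provider=pick_provider(["together", "nebius"]),  # falls back to nebius
# )
```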