
Setup
Getting started with Model Router is simple. Generate an API key and drop it into your favorite framework.
Generate API key
API keys for Model Router are generated within your workspace. Generate a key by logging into the console and navigating to Model router → API keys.
Connect via framework
Model Router integrates easily into the most popular frameworks:
- OpenAI SDK
- Vercel AI SDK
- Modus
Model Router is a drop-in replacement for OpenAI’s API.
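For example, a minimal connection sketch with the official OpenAI TypeScript SDK might look like the following. The base URL and the `HYPERMODE_API_KEY` environment variable are placeholder assumptions; copy the actual endpoint and key from your workspace console.

```typescript
import OpenAI from "openai";

// Placeholder values: copy the real Model Router endpoint and API key
// from your Hypermode workspace console.
const client = new OpenAI({
  baseURL: "https://<your-model-router-endpoint>/v1", // assumed OpenAI-compatible base URL
  apiKey: process.env.HYPERMODE_API_KEY,              // assumed environment variable name
});

async function main() {
  const completion = await client.chat.completions.create({
    model: "gpt-4.1", // any generation slug from the tables below
    messages: [{ role: "user", content: "Summarize what a model router does." }],
  });
  console.log(completion.choices[0].message.content);
}

main();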
Connect directly via API
You can also access the API directly:
- Generation
- Embedding
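As a sketch of a direct generation request, assuming Model Router exposes the OpenAI-compatible `/chat/completions` path (the base URL and environment variable below are placeholders for your workspace values):

```typescript
// Assumed placeholders: the base URL and env var name come from your workspace console.
const BASE_URL = "https://<your-model-router-endpoint>/v1";

async function generate(prompt: string): Promise<string> {
  const res = await fetch(`${BASE_URL}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.HYPERMODE_API_KEY}`,
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-20250514", // any generation slug from the table below
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok) throw new Error(`Model Router request failed: ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}

generate("Write a haiku about routers.").then(console.log);
```

Embedding requests follow the same pattern; a sketch appears in the Embedding section below.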
Available models
Hypermode provides a variety of the most popular open source and commercial models.
We’re constantly evaluating model usage to determine which new models to add to our catalog. Interested in using a model not listed here? Let us know at help@hypermode.com.
Model introspection
The full list of models is available via the API.
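For example, a hedged sketch that lists model IDs, assuming the standard OpenAI-compatible `/models` endpoint and response shape (base URL and environment variable are placeholders):

```typescript
// Assumes an OpenAI-compatible GET /models endpoint; the base URL and
// HYPERMODE_API_KEY env var are placeholders from your workspace console.
const BASE_URL = "https://<your-model-router-endpoint>/v1";

async function listModels() {
  const res = await fetch(`${BASE_URL}/models`, {
    headers: { Authorization: `Bearer ${process.env.HYPERMODE_API_KEY}` },
  });
  const body = await res.json();
  // OpenAI-compatible responses return { data: [{ id, ... }, ...] }
  for (const model of body.data) {
    console.log(model.id);
  }
}

listModels();
```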
Generation
Large language models provide text generation and reasoning capabilities.
| Provider | Model | Slug |
|---|---|---|
| Anthropic | Claude 4 Sonnet | claude-sonnet-4-20250514 |
| Anthropic | Claude 4 Opus | claude-opus-4-20250514 |
| Anthropic | Claude 3.7 Sonnet (latest) | claude-3-7-sonnet-latest |
| Anthropic | Claude 3.7 Sonnet | claude-3-7-sonnet-20250219 |
| Anthropic | Claude 3.5 Sonnet (latest) | claude-3-5-sonnet-latest |
| Anthropic | Claude 3.5 Sonnet | claude-3-5-sonnet-20241022 |
| Anthropic | Claude 3.5 Sonnet | claude-3-5-sonnet-20240620 |
| Anthropic | Claude 3.5 Haiku (latest) | claude-3-5-haiku-latest |
| Anthropic | Claude 3.5 Haiku | claude-3-5-haiku-20241022 |
| Anthropic | Claude 3 Opus (latest) | claude-3-opus-latest |
| Anthropic | Claude 3 Opus | claude-3-opus-20240229 |
| Anthropic | Claude 3 Sonnet | claude-3-sonnet-20240229 |
| Anthropic | Claude 3 Haiku | claude-3-haiku-20240307 |
| DeepSeek | DeepSeek-R1-Distill-Llama | deepseek-ai/deepseek-r1-distill-llama-70b |
| Google | Gemini 2.5 Pro | gemini-2.5-pro-exp-03-25 |
| Google | Gemini 2.5 Pro Preview | gemini-2.5-pro-preview-05-06 |
| Google | Gemini 2.5 Flash Preview | gemini-2.5-flash-preview-04-17 |
| Google | Gemini 2.0 Flash Lite | gemini-2.0-flash-lite |
| Google | Gemini 2.0 Flash Image Generation | gemini-2.0-flash-exp-image-generation |
| Google | Gemini 2.0 Flash Live | gemini-2.0-flash-live-001 |
| Google | Gemini 2.0 Flash (latest) | gemini-2.0-flash |
| Google | Gemini 2.0 Flash | gemini-2.0-flash-001 |
| Google | Gemini 1.5 Pro (latest) | gemini-1.5-pro-latest |
| Google | Gemini 1.5 Pro | gemini-1.5-pro |
| Google | Gemini 1.5 Pro | gemini-1.5-pro-002 |
| Google | Gemini 1.5 Pro | gemini-1.5-pro-001 |
| Google | Gemini 1.5 Flash (latest) | gemini-1.5-flash-latest |
| Google | Gemini 1.5 Flash | gemini-1.5-flash |
| Google | Gemini 1.5 Flash | gemini-1.5-flash-002 |
| Google | Gemini 1.5 Flash | gemini-1.5-flash-001 |
| Google | Gemini 1.5 Flash 8B (latest) | gemini-1.5-flash-8b-latest |
| Google | Gemini 1.5 Flash 8B | gemini-1.5-flash-8b |
| Google | Gemini 1.5 Flash 8B | gemini-1.5-flash-8b-exp-0924 |
| Google | Gemini 1.5 Flash 8B | gemini-1.5-flash-8b-exp-0827 |
| Google | Gemini 1.5 Flash 8B | gemini-1.5-flash-8b-001 |
| Meta | Llama 4 Scout | meta-llama/llama-4-scout-17b-16e-instruct |
| Meta | Llama 3.3 | meta-llama/llama-3.3-70b-versatile |
| OpenAI | O3 (latest) | o3 |
| OpenAI | O3 | o3-2025-04-16 |
| OpenAI | O4 Mini (latest) | o4-mini |
| OpenAI | O4 Mini | o4-mini-2025-04-16 |
| OpenAI | GPT 4.5 Preview (latest) | gpt-4.5-preview |
| OpenAI | GPT 4.5 Preview | gpt-4.5-preview-2025-02-27 |
| OpenAI | O3 Mini (latest) | o3-mini |
| OpenAI | O3 Mini | o3-mini-2025-01-31 |
| OpenAI | O1 (latest) | o1 |
| OpenAI | O1 | o1-2024-12-17 |
| OpenAI | O1 Preview (latest) | o1-preview |
| OpenAI | O1 Preview | o1-preview-2024-09-12 |
| OpenAI | O1 Mini (latest) | o1-mini |
| OpenAI | O1 Mini | o1-mini-2024-09-12 |
| OpenAI | GPT 4.1 (latest) | gpt-4.1 |
| OpenAI | GPT 4.1 | gpt-4.1-2025-04-14 |
| OpenAI | GPT 4.1 Mini (latest) | gpt-4.1-mini |
| OpenAI | GPT 4.1 Mini | gpt-4.1-mini-2025-04-14 |
| OpenAI | GPT 4.1 Nano (latest) | gpt-4.1-nano |
| OpenAI | GPT 4.1 Nano | gpt-4.1-nano-2025-04-14 |
| OpenAI | GPT-4o Mini Search Preview (latest) | gpt-4o-mini-search-preview |
| OpenAI | GPT-4o Mini Search Preview | gpt-4o-mini-search-preview-2025-03-11 |
| OpenAI | GPT 4o (latest) | gpt-4o |
| OpenAI | GPT 4o | gpt-4o-2024-11-20 |
| OpenAI | GPT 4o | gpt-4o-2024-08-06 |
| OpenAI | GPT 4o | gpt-4o-2024-05-13 |
| OpenAI | GPT 4o Mini (latest) | gpt-4o-mini |
| OpenAI | GPT 4o Mini | gpt-4o-mini-2024-07-18 |
| OpenAI | GPT 4o Audio Preview (latest) | gpt-4o-audio-preview |
| OpenAI | GPT 4o Audio Preview | gpt-4o-audio-preview-2024-12-17 |
| OpenAI | GPT 4o Audio Preview | gpt-4o-audio-preview-2024-10-01 |
| OpenAI | GPT 4o Search Preview (latest) | gpt-4o-search-preview |
| OpenAI | GPT 4o Search Preview | gpt-4o-search-preview-2025-03-11 |
| OpenAI | ChatGPT 4o | chatgpt-4o-latest |
| OpenAI | GPT 4 (latest) | gpt-4 |
| OpenAI | GPT 4 | gpt-4-0613 |
| OpenAI | GPT 4 Turbo | gpt-4-turbo-2024-04-09 |
| OpenAI | GPT 4 Turbo Preview | gpt-4-turbo-preview |
| OpenAI | GPT 4 Preview (latest) | gpt-4-1106-preview |
| OpenAI | GPT 4 Preview | gpt-4-0125-preview |
| OpenAI | GPT 3.5 Turbo (latest) | gpt-3.5-turbo |
| OpenAI | GPT 3.5 Turbo | gpt-3.5-turbo-1106 |
| OpenAI | GPT 3.5 Turbo | gpt-3.5-turbo-0125 |
| Mistral | Mistral Large (coming soon) | mistral-large-latest |
| Mistral | Pixtral Large (coming soon) | pixtral-large-latest |
| Mistral | Mistral Medium (coming soon) | mistral-medium-latest |
| Mistral | Mistral Moderation (coming soon) | mistral-moderation-latest |
| Mistral | Ministral 3B (coming soon) | ministral-3b-latest |
| Mistral | Ministral 8B (coming soon) | ministral-8b-latest |
| Mistral | Open Mistral Nemo (coming soon) | open-mistral-nemo |
| Mistral | Mistral Small (coming soon) | mistral-small-latest |
| Mistral | Mistral Saba (coming soon) | mistral-saba-latest |
| Mistral | Codestral (coming soon) | codestral-latest |
| xAI | Grok 3 Beta (coming soon) | grok-3-beta |
| xAI | Grok 3 Fast Beta (coming soon) | grok-3-fast-beta |
| xAI | Grok 3 Mini Beta (coming soon) | grok-3-mini-beta |
| xAI | Grok 3 Mini Fast Beta (coming soon) | grok-3-mini-fast-beta |
Embedding
Embedding models provide vector representations of text for similarity matching and other applications.
| Provider | Model | Slug |
|---|---|---|
| Nomic AI | Embed Text V1.5 | nomic-ai/nomic-embed-text-v1.5 |
| OpenAI | Embedding 3 Large | text-embedding-3-large |
| OpenAI | Embedding 3 Small | text-embedding-3-small |
| OpenAI | ADA Embedding | text-embedding-ada-002 |
| Hugging Face | MiniLM-L6-v2 (coming soon) | sentence-transformers/all-MiniLM-L6-v2 |
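As with generation, embedding calls go through the same OpenAI-compatible interface. A minimal sketch using the OpenAI SDK follows; the endpoint and environment variable are placeholder assumptions from your workspace console.

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://<your-model-router-endpoint>/v1", // placeholder workspace endpoint
  apiKey: process.env.HYPERMODE_API_KEY,              // assumed env var name
});

async function embed(text: string): Promise<number[]> {
  const response = await client.embeddings.create({
    model: "text-embedding-3-small", // any embedding slug from the table above
    input: text,
  });
  return response.data[0].embedding; // vector for similarity search
}

embed("model routing").then((v) => console.log(`dimensions: ${v.length}`));
```

Store the returned vectors in the index of your choice and compare them with cosine similarity for semantic search.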
Choosing the right model
Choosing the right model is essential to building effective agents. This section helps you evaluate trade-offs, pick the right model for your use case, and iterate quickly.
Key considerations
- Accuracy and output quality: Advanced logic, mathematical problem-solving, and multi-step analysis may require high-capability models.
- Domain expertise: Performance varies by domain (for example, creative writing, code, scientific analysis). Review model benchmarks or test with your own examples.
- Context window: Long documents, extensive conversations, or large codebases require models with longer context windows.
- Embeddings: For semantic search or similarity, consider embedding models. These aren’t for text generation.
- Latency: Real-time apps may need low-latency responses. Smaller models (or “Mini,” “Nano,” and “Flash” variants) typically respond faster than larger models.
Models by task / use case at a glance
| Task / use case | Example models | Key strengths | Considerations |
|---|---|---|---|
| General-purpose conversation | Claude 4 Sonnet, GPT-4.1, Gemini Pro | Balanced, reliable, creative | May not handle edge cases as well |
| Complex reasoning and research | Claude 4 Opus, O3, Gemini 2.5 Pro | Highest accuracy, multi-step analysis | Higher cost; use when quality is critical |
| Creative writing and content | Claude 4 Opus, GPT-4.1, Gemini 2.5 Pro | High-quality output, creativity, style control | High cost for premium content |
| Document analysis and summarization | Claude 4 Opus, Gemini 2.5 Pro, Llama 3.3 | Handles long inputs, comprehension | Higher cost, slower |
| Real-time apps | Claude 3.5 Haiku, GPT-4o Mini, Gemini 1.5 Flash 8B | Low latency, high throughput | Less nuanced, shorter context |
| Semantic search and embeddings | OpenAI Embedding 3, Nomic AI, Hugging Face | Vector search, similarity, retrieval | Not for text generation |
| Custom model training & experimentation | Llama 4 Scout, Llama 3.3, DeepSeek, Mistral | Open source, customizable | Requires setup, variable performance |
Hypermode provides access to the most popular open source and commercial
models through Model Router. We’re constantly evaluating model usage and
adding new models to our catalog based on demand.
Get started
You can change models at any time in your agent settings. Start with a general-purpose model, then iterate and optimize as you learn more about your agent’s needs.
- Create an agent with GPT-4.1 (default).
- Define clear instructions and connections for the agent’s role.
- Test with real examples from your workflow.
- Refine and iterate based on results.
- Evaluate alternatives once you understand patterns and outcomes.
Value first, optimize second. Clarify the task requirements before tuning
for specialized capabilities or cost.
Comparison of select large language models
| Model | Best for | Considerations | Context window+ | Speed | Cost++ |
|---|---|---|---|---|---|
| Claude 4 Opus | Complex reasoning, long docs | Higher cost, slower than lighter models | Very long (200K+) | Moderate | $$$$ |
| Claude 4 Sonnet | General-purpose, balanced workloads | Less capable than Opus for edge cases | Long (100K+) | Fast | $$$ |
| GPT-4.1 | Most tasks, nuanced output | Higher cost, moderate speed | Long (128K) | Moderate | $$$ |
| GPT-4.1 Mini | High-volume, cost-sensitive | Less nuanced, shorter context | Medium (32K-64K) | Very Fast | $$ |
| O3 | General chat, broad compatibility | May lack latest features/capabilities | Medium (32K-64K) | Fast | $$ |
| Gemini 2.5 Pro | Up-to-date info | Limited access, higher cost | Long (128K+) | Moderate | $$$ |
| Gemini 2.5 Flash | Real-time, rapid responses | Shorter context, less nuanced | Medium (32K-64K) | Very Fast | $$ |
| Llama 4 Scout | Privacy, customization, open source | Variable performance | Medium-Long (varies) | Fast | $ |