# AI Model Selection

## 🧠 **Choose Your AI Engine**

### **Provider Connection Requirements**

> **⚠️ Important**: To use models from any provider (except AgenticFlow), you must first add a connection for that provider in your project's **Connection Settings**.
>
> * **AgenticFlow Provider**: Uses your AgenticFlow credits directly, so no connection setup is required
> * **All Other Providers**: Require a valid connection configured in Connection Settings
> * **Automatic Selection**: The server automatically uses the first available connection for the selected model's provider

**How to add a provider connection:**

1. Navigate to your project's **Connection Settings**
2. Add a new connection for your desired provider (OpenAI, Anthropic, Google, etc.)
3. Configure the required API keys and credentials
4. Once connected, you can select models from that provider in your agent configuration
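
The selection rule described above (AgenticFlow needs no connection; every other provider uses the first available connection) can be sketched as follows. This is a minimal illustration, not the server's actual code: the `Connection` class, the `resolve_connection` function, and the lowercase provider strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Connection:
    provider: str  # hypothetical provider key, e.g. "openai"
    name: str      # hypothetical label for the connection


def resolve_connection(model_provider: str,
                       connections: list[Connection]) -> Optional[Connection]:
    """Return the connection a request would use, or None for AgenticFlow."""
    if model_provider == "agenticflow":
        # AgenticFlow models draw on credits, so no connection is involved.
        return None
    for conn in connections:
        if conn.provider == model_provider:
            return conn  # the first available connection for the provider wins
    raise LookupError(f"No connection configured for provider {model_provider!r}")


connections = [
    Connection("openai", "team-openai"),
    Connection("anthropic", "anthropic-main"),
    Connection("openai", "backup-openai"),
]

print(resolve_connection("openai", connections).name)  # prints team-openai
```

If you keep several connections for one provider, their order matters under this rule, so the one you want used should come first.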

### **Quick Provider Overview**

| Provider         | Models | Best For                                                      | Cost Range (per 1M tokens) |
| ---------------- | ------ | ------------------------------------------------------------- | -------------------------- |
| **PixelML**      | 50+    | Unified access to latest models (GPT-5, Claude 4.5, Gemini 3) | $0.05-15 input             |
| **OpenAI**       | 17     | Reasoning (O1/O3), Coding (GPT-5.1 Codex), General use        | $0.08-15 input             |
| **Anthropic**    | 10     | Analysis, structured outputs, long context                    | $1-15 input                |
| **Google GenAI** | 8      | Vision, video, audio, 1M+ token context                       | $0.075-2.5 input           |
| **Groq**         | 14     | Ultra-fast inference (< 1 sec), open-source models            | $0.03-1 input              |
| **DeepSeek**     | 2      | Cost-efficient reasoning and chat (V4 lineup)                 | $0.14-0.435 input          |
| **AgenticFlow**  | 11     | Budget-friendly, 50K input token limit                        | $0.05-0.2 input            |

***

## 📊 **Complete Model Comparison Table**

### **OpenAI Models**

| Model                  | Input Cost | Output Cost | Context | Features                     | Best For                              |
| ---------------------- | ---------- | ----------- | ------- | ---------------------------- | ------------------------------------- |
| **GPT-5 Pro**          | $15.00     | $120.00     | 400K    | Vision, Tools, Structured    | Premium reasoning, critical decisions |
| **GPT-5.1**            | $1.25      | $10.00      | 400K    | Vision, Tools, Structured    | Advanced general use                  |
| **GPT-5.1 Codex**      | $1.25      | $10.00      | 400K    | Tools, Structured            | Code generation & debugging           |
| **GPT-5.1 Codex Mini** | $1.50      | $6.00       | 400K    | Tools, Structured            | Efficient coding tasks                |
| **GPT-5**              | $1.25      | $10.00      | 272K    | Vision, Tools, Structured    | General premium use                   |
| **GPT-5 Mini**         | $0.25      | $2.00       | 272K    | Vision, Tools, Structured    | Cost-effective quality                |
| **GPT-5 Nano**         | $0.05      | $0.40       | 272K    | Vision, Tools, Structured    | High-volume budget tasks              |
| **GPT-4.1**            | $2.00      | $8.00       | 1M      | Tools, Structured            | Large context needs                   |
| **GPT-4.1 Mini**       | $0.40      | $1.60       | 1M      | Tools, Structured            | Cost-effective large context          |
| **GPT-4.1 Nano**       | $0.08      | $0.32       | 1M      | Tools, Structured            | Ultra-budget large context            |
| **GPT-4o**             | $0.50      | $1.50       | 128K    | Tools, Structured            | Standard general use                  |
| **O3**                 | $1.10      | $4.40       | 128K    | Reasoning, Tools, Structured | Complex reasoning                     |
| **O3 Mini**            | $0.60      | $2.40       | 200K    | Reasoning, Tools, Structured | Efficient reasoning                   |
| **O1**                 | $1.10      | $4.40       | 200K    | Reasoning, Tools, Structured | Advanced reasoning                    |
| **O1 Mini**            | $0.40      | $1.60       | 128K    | Tools, Structured            | Budget reasoning                      |
| **GPT OSS 120B**       | $0.15      | $0.75       | 131K    | Tools, Structured            | Open-source compatible                |
| **GPT OSS 20B**        | $0.10      | $0.50       | 131K    | Tools, Structured            | Efficient open-source                 |

### **Anthropic (Claude) Models**

| Model                          | Input Cost | Output Cost | Context | Features                             | Best For                     |
| ------------------------------ | ---------- | ----------- | ------- | ------------------------------------ | ---------------------------- |
| **Claude 4.5 Opus**            | $5.00      | $25.00      | 200K    | Vision, Reasoning, Tools, Structured | Premium analysis & reasoning |
| **Claude 4.5 Sonnet**          | $3.00      | $15.00      | 200K    | Tools, Structured                    | Balanced quality & cost      |
| **Claude 4.5 Haiku**           | $1.00      | $5.00       | 200K    | Tools, Structured                    | Fast, cost-effective         |
| **Claude 4 Opus**              | $15.00     | $75.00      | 50K     | Tools, Structured                    | Highest quality reasoning    |
| **Claude 4 Sonnet**            | $3.00      | $15.00      | 50K     | Tools, Structured                    | Quality analysis             |
| **Claude 3.7 Sonnet**          | $3.00      | $15.00      | 50K     | Tools, Structured                    | Advanced analysis            |
| **Claude 3.7 Sonnet Thinking** | $3.00      | $15.00      | 200K    | Reasoning, Structured                | Thinking mode enabled        |
| **Claude 3.5 Opus**            | $15.00     | $75.00      | 50K     | Tools, Structured                    | Legacy premium               |
| **Claude 3.5 Sonnet**          | $3.00      | $15.00      | 50K     | Tools, Structured                    | Legacy balanced              |
| **Claude 3.5 Haiku**           | $3.00      | $4.00       | 50K     | Tools, Structured                    | Legacy fast                  |

### **Google Gemini Models**

| Model                     | Input Cost | Output Cost | Context | Features                                | Best For                   |
| ------------------------- | ---------- | ----------- | ------- | --------------------------------------- | -------------------------- |
| **Gemini 3 Pro Preview**  | $2.00      | $12.00      | 1M      | Vision, Tools, Structured               | Latest Gemini capabilities |
| **Gemini 2.5 Pro**        | $2.50      | $15.00      | 1M      | Vision, Video, Tools, Structured        | Multi-modal premium        |
| **Gemini 2.5 Flash**      | $0.15      | $0.60       | 1M      | Tools, Structured                       | Fast, cost-effective       |
| **Gemini 2.5 Flash Lite** | $0.075     | $0.30       | 1M      | Vision, Audio, Video, Tools, Structured | Ultra-budget multi-modal   |
| **Gemini 2.0 Flash**      | $0.10      | $0.40       | 1M      | Vision, Tools, Structured               | Balanced speed & cost      |
| **Gemini 2.0 Flash Lite** | $0.075     | $0.30       | 1M      | Vision, Tools, Structured               | Budget vision support      |
| **Gemini 1.5 Pro**        | $2.50      | $10.00      | 2M      | Tools, Structured                       | Largest context window     |
| **Gemini 1.5 Flash**      | $0.15      | $0.60       | 1M      | Tools, Structured                       | Fast & efficient           |

### **DeepSeek Models**

| Model                 | Input Cost | Output Cost | Context | Features                     | Best For                                             |
| --------------------- | ---------- | ----------- | ------- | ---------------------------- | ---------------------------------------------------- |
| **DeepSeek V4 Flash** | $0.14      | $0.28       | 1M      | Tools, Structured            | **Default** — chat, reasoning, coding (peak quality) |
| **DeepSeek V4 Pro**   | $0.435     | $0.87       | 1M      | Reasoning, Tools, Structured | Frontier-tier, specialized workloads only            |

> **⚠️ Legacy retirement (31 May 2026 PST):** `deepseek-chat`, `deepseek-reasoner`, and `DeepSeek V3.2 / V3.2 Speciale / V3.2 Exp` become inaccessible after this date. Migrate **all of them to DeepSeek V4 Flash**: its quality covers both the chat and reasoner slots, so there is no need to split them. Use V4 Pro only when a specialized frontier-tier workload genuinely calls for the larger model. The base URL and API key stay the same; only the model field changes.
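
Because only the model field changes, the migration can be a one-field rewrite. The sketch below assumes a plain dictionary config; `deepseek-v4-flash` is a placeholder, since this page does not state the actual model identifier for DeepSeek V4 Flash, and `migrate_model_field` is a hypothetical helper name.

```python
# Legacy identifiers named in the retirement notice.
LEGACY_DEEPSEEK_MODELS = {"deepseek-chat", "deepseek-reasoner"}

# Placeholder only: substitute the real DeepSeek V4 Flash model id.
V4_FLASH_MODEL_ID = "deepseek-v4-flash"


def migrate_model_field(config: dict) -> dict:
    """Return a copy of config with a legacy DeepSeek model swapped for V4 Flash.

    Every other field (base URL, API key, temperature, ...) is left untouched.
    """
    if config.get("model") in LEGACY_DEEPSEEK_MODELS:
        return {**config, "model": V4_FLASH_MODEL_ID}
    return dict(config)


old = {"model": "deepseek-reasoner", "base_url": "https://example.invalid", "temperature": 0.1}
new = migrate_model_field(old)
print(new["model"])  # prints deepseek-v4-flash
```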

### **Groq Models (Ultra-Fast Inference)**

| Model                             | Input Cost | Output Cost | Context | Features          | Best For                   |
| --------------------------------- | ---------- | ----------- | ------- | ----------------- | -------------------------- |
| **Llama 3.3 70B Versatile**       | $0.59      | $0.79       | 131K    | Tools             | Fast large model           |
| **DeepSeek R1 Distill Llama 70B** | $0.75      | $0.99       | 131K    | Tools             | Fast reasoning             |
| **Llama 3.1 8B Instant**          | $0.05      | $0.08       | 131K    | Tools             | Ultra-fast, ultra-cheap    |
| **Llama 4 Maverick 17B 128E**     | $0.24      | $0.24       | 131K    | Tools             | Fast balanced model        |
| **Llama 4 Scout 17B 16E**         | $0.11      | $0.34       | 131K    | Tools             | Fast efficient model       |
| **Kimi K2 Instruct**              | $1.00      | $3.00       | 131K    | Tools, Structured | Fast advanced chat         |
| **Qwen 3 32B**                    | $0.29      | $0.54       | 131K    | Tools             | Fast Chinese + English     |
| **Mistral Saba 24B**              | $0.79      | $0.79       | 32K     | Tools             | Fast European model        |
| **Gemma 2 9B**                    | $0.20      | $0.20       | 8K      | Tools             | Fast small model           |
| **GPT OSS 120B**                  | $0.15      | $0.75       | 131K    | Tools             | Fast open-source large     |
| **GPT OSS 20B**                   | $0.10      | $0.40       | 131K    | Tools             | Fast open-source small     |
| **Llama Guard 4 12B**             | $0.20      | $0.20       | 131K    | Streaming         | Safety moderation          |
| **Llama Prompt Guard 2 86M**      | $0.04      | $0.04       | 512     | Streaming         | Prompt injection detection |
| **Llama Prompt Guard 2 22M**      | $0.03      | $0.03       | 512     | Streaming         | Fast prompt filtering      |

### **Additional Models (PixelML)**

| Model                | Input Cost | Output Cost | Context | Features                     | Best For                      |
| -------------------- | ---------- | ----------- | ------- | ---------------------------- | ----------------------------- |
| **Grok 4**           | $3.00      | $15.00      | 64K     | Tools, Structured            | xAI model with real-time data |
| **Kimi K2**          | $0.57      | $2.30       | 64K     | Tools, Structured            | Chinese language specialist   |
| **Kimi K2 Thinking** | $0.60      | $2.50       | 262K    | Reasoning, Tools, Structured | Long-context reasoning        |
| **GLM-4.5**          | $0.39      | $1.55       | 131K    | Tools, Structured            | Chinese bilingual model       |
| **GLM-4.5 Air**      | $0.14      | $0.86       | 131K    | Tools, Structured            | Efficient Chinese model       |

### **AgenticFlow Provider (Budget Models)**

All AgenticFlow models have a **50K input token limit** for cost control:

| Model                     | Input Cost | Output Cost | Context | Features                                | Best For                  |
| ------------------------- | ---------- | ----------- | ------- | --------------------------------------- | ------------------------- |
| **Gemini 2.5 Flash Lite** | $0.075     | $0.30       | 1M      | Vision, Audio, Video, Tools, Structured | Best budget multi-modal   |
| **Gemini 2.0 Flash Lite** | $0.07      | $0.30       | 50K     | Tools, Structured                       | Budget vision             |
| **Gemini 2.0 Flash**      | $0.10      | $0.40       | 50K     | Vision, Tools, Structured               | Budget balanced           |
| **Gemini 1.5 Flash**      | $0.07      | $0.30       | 50K     | Tools, Structured                       | Budget efficient          |
| **GPT-5 Nano**            | $0.05      | $0.40       | 50K     | Vision, Tools, Structured               | Budget OpenAI vision      |
| **GPT-4o Mini**           | $0.10      | $0.40       | 50K     | Tools, Structured                       | Budget OpenAI             |
| **GPT OSS 120B**          | $0.15      | $0.75       | 131K    | Tools, Structured                       | Budget large model        |
| **GPT OSS 20B**           | $0.10      | $0.50       | 131K    | Tools, Structured                       | Budget small model        |
| **DeepSeek V4 Flash**     | $0.14      | $0.28       | 50K     | Tools, Structured                       | Budget reasoning & coding |
| **Claude 3.5 Haiku**      | $1.00      | $5.00       | 50K     | Tools, Structured                       | Budget Claude             |
| **GLM-4.5 Air**           | $0.14      | $0.86       | 131K    | Tools, Structured                       | Budget Chinese            |

**Notes:**

* All costs are per million tokens
* Context shown is max input tokens
* Features: Tools = Function calling, Structured = JSON schema, Vision/Audio/Video = Multi-modal
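
Since all prices are per million tokens, the cost of a single request is the token count times the price, divided by one million, summed over input and output. A small sketch of that arithmetic, using the GPT-5 Mini row from the OpenAI table ($0.25 input, $2.00 output); the function name and the token counts are illustrative assumptions.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD of one request, given prices in USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# GPT-5 Mini: $0.25 per 1M input tokens, $2.00 per 1M output tokens.
cost = request_cost(input_tokens=12_000, output_tokens=800,
                    input_price=0.25, output_price=2.00)
print(f"${cost:.4f}")  # prints $0.0046
```

Output tokens are priced several times higher than input tokens for most models in the tables, so long responses can dominate the bill even when prompts are large.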

***

## 🎯 **Model Selection by Use Case**

### **General Business Use**

**Recommended:** GPT-4.1 ($2/M), Claude 4.5 Sonnet ($3/M), Claude 3.5 Sonnet ($3/M)

* Best overall performance for business tasks
* Excellent reasoning and communication
* Good balance of capability and cost

### **Customer Support (High-Volume)**

**Recommended:** Claude 4.5 Haiku ($1/M), GPT-4.1 Nano ($0.08/M), Gemini 2.5 Flash Lite ($0.075/M)

* Fast response times
* Cost-effective for high-volume
* Tool calling and structured output support

### **Content Creation**

**Recommended:** Claude 4.5 Opus ($5/M), GPT-5 ($1.25/M), GPT-4.1 ($2/M)

* Superior writing capabilities
* Vision support for image-based content
* Advanced formatting

### **Data Analysis**

**Recommended:** Claude 4.5 Sonnet ($3/M), Gemini 2.5 Pro ($2.5/M), DeepSeek V4 Flash ($0.14/M)

* Excellent analytical reasoning
* Strong structured data handling
* Large context windows (200K-1M+)

### **Technical Support & Coding**

**Recommended:** GPT-5.1 Codex ($1.25/M), Claude 4.5 Opus ($5/M), DeepSeek V4 Flash ($0.14/M)

* Code understanding and generation
* Complex problem-solving
* Reasoning features for debugging

### **Budget-Conscious / High-Volume**

**Recommended:** Gemini 2.5 Flash Lite ($0.075/M), GPT OSS 20B ($0.10/M), Llama 3.1 8B ($0.05/M)

* Ultra-low costs
* Still capable for most tasks
* Fast inference times

### **Advanced Reasoning**

**Recommended:** O1 ($1.10/M), O3 ($1.10/M), DeepSeek V4 Flash ($0.14/M), Claude 4.5 Opus ($5/M)

* Step-by-step reasoning
* Complex problem-solving
* Mathematical and logical tasks

### **Multi-Modal (Vision/Video/Audio)**

**Recommended:** Gemini 2.5 Flash Lite ($0.075/M), Gemini 2.5 Pro ($2.5/M), GPT-5 ($1.25/M)

* Image understanding
* Video analysis (Gemini 2.5 Pro and Gemini 2.5 Flash Lite)
* Audio processing (Gemini 2.5 Flash Lite)

***

## ⚙️ **Configuration Settings**

### **Temperature Control**

* **0.0-0.3 (Focused)**: Consistent, predictable responses for support, analysis, documentation
* **0.4-0.7 (Balanced)**: Natural variation for general communication
* **0.8-1.0 (Creative)**: High creativity for marketing, writing, brainstorming
* **Default**: 0.1 (backend standard)
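
One way to apply these bands in code is to name them and pick a value inside the chosen band. The sketch below is only an illustration: the band names mirror the list above, but choosing the midpoint of a band is an arbitrary convention for the example, and `pick_temperature` is a hypothetical helper.

```python
# Bands and default taken from the list above.
TEMPERATURE_BANDS = {
    "focused": (0.0, 0.3),
    "balanced": (0.4, 0.7),
    "creative": (0.8, 1.0),
}
DEFAULT_TEMPERATURE = 0.1


def pick_temperature(style=None) -> float:
    """Return the midpoint of the named band, or the default if no style is given."""
    if style is None:
        return DEFAULT_TEMPERATURE
    low, high = TEMPERATURE_BANDS[style]
    return round((low + high) / 2, 2)


for style in (None, "focused", "balanced", "creative"):
    print(style, pick_temperature(style))
```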

### **Token Limits**

* **Max Input Tokens**: Controls context window (50K-2M depending on model)
* **Max Output Tokens**: Controls response length (1K-128K depending on model)
* **AgenticFlow Provider**: Capped at 50K input for cost control
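
The interaction between a model's context window, a configured input limit, and the AgenticFlow cap can be sketched as taking the smallest applicable value. This is an assumption about how the limits combine, not a description of the backend; the function and provider strings are hypothetical.

```python
AGENTICFLOW_INPUT_CAP = 50_000  # 50K input cap stated above


def effective_max_input_tokens(provider: str, model_context: int,
                               requested=None) -> int:
    """Smallest of the model's context, the requested limit, and any provider cap."""
    limit = model_context if requested is None else min(requested, model_context)
    if provider == "agenticflow":
        limit = min(limit, AGENTICFLOW_INPUT_CAP)
    return limit


# A 1M-context model served through the AgenticFlow provider.
print(effective_max_input_tokens("agenticflow", 1_000_000))  # prints 50000
# The same context through a direct provider with a 200K configured limit.
print(effective_max_input_tokens("google", 1_000_000, requested=200_000))  # prints 200000
```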

### **Advanced Settings**

* **Streaming**: Enabled by default for all models
* **Tool Calling**: Function calling for external integrations
* **Structured Output**: JSON schema enforcement
* **Response Format**: Plain text, JSON, or custom structures

***

## 💰 **Cost Optimization Tips**

1. **Choose the Right Provider**
   * **PixelML**: Latest models, highest limits
   * **AgenticFlow**: Uses your AgenticFlow credits; no separate provider connection needed
   * **Direct Providers**: Full features, varying costs
2. **Token Management**
   * Set an appropriate `max_input_tokens` to control costs
   * Configure `max_output_tokens` based on actual needs
   * Use a lower temperature (0.1-0.3) for more focused, consistent responses
3. **Feature Selection**
   * Only use vision-enabled models when processing images
   * Use reasoning models only for complex logic
   * Choose standard models for routine tasks
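
The feature-selection tip amounts to filtering models by the capabilities a task actually needs and then taking the cheapest match. The sketch below uses a handful of rows copied from the tables on this page (input price per 1M tokens, context, features); the `Model` type, the `CATALOG` list, and `cheapest_model` are hypothetical names for illustration.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    input_price: float   # USD per 1M input tokens
    context: int         # max input tokens
    features: frozenset


# A few rows taken from the comparison tables above.
CATALOG = [
    Model("GPT-5 Nano", 0.05, 272_000, frozenset({"vision", "tools", "structured"})),
    Model("GPT-4.1 Nano", 0.08, 1_000_000, frozenset({"tools", "structured"})),
    Model("Gemini 2.5 Flash Lite", 0.075, 1_000_000,
          frozenset({"vision", "audio", "video", "tools", "structured"})),
    Model("O3 Mini", 0.60, 200_000, frozenset({"reasoning", "tools", "structured"})),
    Model("DeepSeek V4 Pro", 0.435, 1_000_000, frozenset({"reasoning", "tools", "structured"})),
    Model("Claude 4.5 Sonnet", 3.00, 200_000, frozenset({"tools", "structured"})),
]


def cheapest_model(required: set, min_context: int = 0) -> Optional[Model]:
    """Cheapest catalog model that has every required feature and enough context."""
    candidates = [m for m in CATALOG
                  if required <= m.features and m.context >= min_context]
    return min(candidates, key=lambda m: m.input_price, default=None)


print(cheapest_model({"tools"}).name)                          # prints GPT-5 Nano
print(cheapest_model({"vision"}, min_context=500_000).name)    # prints Gemini 2.5 Flash Lite
print(cheapest_model({"reasoning"}).name)                      # prints DeepSeek V4 Pro
```

This ranks by input price only; for output-heavy workloads you would also want to weigh the output price.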

***

## 📋 **Quick Reference**

### **Fastest Models (< 1 second)**

All Groq models, including Llama 3.1 8B, Llama 3.3 70B, the Llama 4 family, and Gemma 2 9B

### **Largest Context (1M+ tokens)**

GPT-4.1 family (1M), all Gemini models (1M-2M), DeepSeek V4 Flash and Pro (1M)

### **Best Value**

Gemini 2.5 Flash Lite ($0.075/M), GPT-4.1 Nano ($0.08/M), Llama 3.1 8B ($0.05/M)

### **Most Capable**

Claude 4.5 Opus, GPT-5 Pro, O1/O3, Claude 4 Opus

### **Best for Coding**

GPT-5.1 Codex, Claude 4.5 Opus, DeepSeek V4 Flash

### **Multi-Modal**

Gemini 2.5 family (vision + video + audio), GPT-5 family (vision), Claude 4.5 Opus (vision)

***

