GPT-5.4 Mini
Overview
GPT-5.4 Mini is OpenAI's high-performance compact model, released on March 17, 2026 as an efficient distillation of GPT-5.4. It improves significantly on GPT-5 Mini in coding, reasoning, multimodal understanding, and tool use, runs more than 2x faster than GPT-5 Mini, and costs roughly 1/6 as much as the standard model, which makes it well suited to high-volume workloads.
Key Features
- Near-Flagship Performance: Scores 54.38% on SWE-Bench Pro, about 3.3 percentage points below the standard model's 57.7%, at roughly 1/6 the cost.
- Strong Scientific Reasoning: Achieves 87.5% on GPQA Diamond, excelling at graduate-level scientific reasoning tasks.
- Full Tool Support: Supports tool use, web search, image analysis, and Native Computer Use, retaining the standard model's full set of tool capabilities.
- 2x+ Speed Improvement: Runs 2x+ faster than GPT-5 Mini, suitable for latency-sensitive workloads.
- 400K Context Window: Supports a 400,000-token context window with vision input, suitable for processing moderately long documents.
Best Use Cases
- Coding Assistants & Sub-Agents: Approaches flagship-level performance on coding benchmarks, delivering reliable code generation and repair at significantly lower cost.
- Real-Time AI Applications: The 2x+ speed improvement over GPT-5 Mini makes it well suited to chatbots, real-time translation, and interactive coding assistance.
- High-Throughput Data Processing: The combination of low cost and high performance suits large-scale document classification, content moderation, and data extraction pipelines.
- Desktop Automation Agents: Full Native Computer Use support for building moderately complex desktop automation workflows.
Capabilities and Limitations
| Capability | Detailed Description |
|---|---|
| Reasoning Ability | SWE-Bench Pro 54.38%, GPQA Diamond 87.5%; strong reasoning, but slightly behind the standard model on the most complex multi-step problems. |
| Creative Ability | Good text and code generation for most everyday creative tasks; less capable than the standard model on creative work that requires very deep reasoning. |
| Multimodal Ability | Supports text and image input with text output; significantly improved multimodal understanding and image analysis over GPT-5 Mini. |
| Response Speed | Fast: 2x+ faster than GPT-5 Mini, suitable for latency-sensitive scenarios. |
| Context Window | 400,000 tokens |
| Max Output | Not officially specified, estimated 16,000–32,000 tokens |
| Knowledge Cutoff | August 31, 2025 |
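
The context window figure above can be turned into a simple budget check before sending a long document. The following is a minimal sketch, not an official utility: the helper names are hypothetical, it assumes the reply shares the 400,000-token window with the prompt, and it reserves 32,000 tokens for output because that is the upper end of the estimated (not officially specified) range. Token counts are expected to come from whatever tokenizer your client uses.

```python
CONTEXT_WINDOW = 400_000      # tokens, from the Context Window row
ASSUMED_MAX_OUTPUT = 32_000   # assumption: upper end of the estimated output range


def remaining_prompt_budget(reserved_output_tokens: int = ASSUMED_MAX_OUTPUT) -> int:
    """Tokens left for the prompt after reserving room for the reply."""
    if not 0 <= reserved_output_tokens <= CONTEXT_WINDOW:
        raise ValueError("reserved_output_tokens must be between 0 and CONTEXT_WINDOW")
    return CONTEXT_WINDOW - reserved_output_tokens


def fits_in_context(prompt_tokens: int,
                    reserved_output_tokens: int = ASSUMED_MAX_OUTPUT) -> bool:
    """True if the prompt plus the reserved output fits in the window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return prompt_tokens <= remaining_prompt_budget(reserved_output_tokens)


if __name__ == "__main__":
    print(remaining_prompt_budget())   # 368000
    print(fits_in_context(350_000))    # True
    print(fits_in_context(380_000))    # False
```

If the actual output limit turns out to be lower, passing a smaller `reserved_output_tokens` frees up more room for the prompt.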
Credits and Pricing
| Model | Input (Credits/Token) | Output (Credits/Token) |
|---|---|---|
| GPT-5.4 Mini | 0.75 | 4.50 |
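
To see how the two rates combine for a single request, here is a small sketch of the arithmetic. It takes the table's unit (credits per token) at face value; if your plan bills per thousand or per million tokens, scale the result accordingly. The function name `estimate_credits` is hypothetical and is not part of any SDK.

```python
from decimal import Decimal

# Rates copied from the pricing table; the unit is assumed to be
# credits per token, as the column headers state.
INPUT_RATE = Decimal("0.75")
OUTPUT_RATE = Decimal("4.50")


def estimate_credits(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated credits for one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE


if __name__ == "__main__":
    # 1,000 input tokens and 200 output tokens:
    # 1000 * 0.75 + 200 * 4.50 = 750 + 900
    print(estimate_credits(1_000, 200))  # 1650.00
```

Because output tokens are priced at six times the input rate, trimming response length usually lowers the bill more than trimming the prompt by the same number of tokens.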