gpt-5-4-mini

Overview

GPT-5.4 Mini is OpenAI's high-performance compact model, released on March 17, 2026 as an efficient distillation of GPT-5.4. It significantly improves on GPT-5 Mini in coding, reasoning, multimodal understanding, and tool use, runs 2x+ faster than GPT-5 Mini, and costs roughly 1/6 as much as the standard model, which makes it well suited to high-volume workloads.

Key Features

Near-Flagship Performance: Scores 54.38% on SWE-Bench Pro, remarkably close to the standard model's 57.7%, at roughly 1/6 the cost.
Strong Scientific Reasoning: Achieves 87.5% on GPQA Diamond, excelling at graduate-level scientific reasoning tasks.
Full Tool Support: Supports tool use, web search, image analysis, and Native Computer Use, retaining the full tool capability set.
2x+ Speed Improvement: Runs 2x+ faster than GPT-5 Mini, suitable for latency-sensitive workloads.
400K Context Window: Supports a 400,000 token context window with vision input, suitable for medium-scale long document processing.

Best Use Cases

Coding Assistants & Sub-Agents: Approaches flagship-level performance on coding benchmarks, delivering reliable code generation and repair at significantly lower cost.
Real-Time AI Applications: The 2x+ speed improvement over GPT-5 Mini makes it well suited to chatbots, real-time translation, and interactive coding assistance.
High-Throughput Data Processing: The combination of low cost and high performance suits large-scale document classification, content moderation, and data extraction pipelines.
Desktop Automation Agents: Full Native Computer Use support for building moderately complex desktop automation workflows.

Capabilities and Limitations

Capability	Detailed Description
Reasoning Ability	SWE-Bench Pro 54.38%, GPQA Diamond 87.5%; strong reasoning, but slightly behind the standard model on the most complex multi-step problems.
Creative Ability	Good text and code generation for most everyday creative tasks; less capable than the standard model on creative work that requires very deep reasoning.
Multimodal Ability	Supports text and image input with text output; significantly improved multimodal understanding and image analysis over GPT-5 Mini.
Response Speed	Fast — 2x+ faster than GPT-5 Mini, suitable for latency-sensitive scenarios.
Context Window	400,000 tokens
Max Output	Not officially specified, estimated 16,000–32,000 tokens
Knowledge Cutoff	August 31, 2025

Credits Usage

Model	Input (Credits/Token)	Cache Write (Credits/Token)	Cache Read (Credits/Token)	Output (Credits/Token)	Web Search (Credits/Use)	Billing Notes
GPT-5.4 Mini	`0.75`	`0.75`	`0.075`	`4.50`	`10,000`	-
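
As a rough illustration of how the rates in the table combine, the following Python sketch estimates the Credits for a single request. The function name `estimate_credits` and its parameters are hypothetical, not part of any B.AI or OpenAI API. It also assumes that the four token categories are billed additively and that `input_tokens` excludes cached tokens; the documentation does not confirm either point, so treat the result as an estimate and rely on the final billing records.

```python
from decimal import Decimal

# Rates copied from the Credits Usage table above.
# Token rates are Credits per token; web search is Credits per use.
INPUT_RATE = Decimal("0.75")
CACHE_WRITE_RATE = Decimal("0.75")
CACHE_READ_RATE = Decimal("0.075")
OUTPUT_RATE = Decimal("4.50")
WEB_SEARCH_RATE = Decimal("10000")


def estimate_credits(
    input_tokens: int,
    output_tokens: int,
    cache_write_tokens: int = 0,
    cache_read_tokens: int = 0,
    web_searches: int = 0,
) -> Decimal:
    """Estimate reference Credits for one GPT-5.4 Mini request.

    Assumption (not confirmed by the documentation): each category is
    billed independently and the charges are simply summed.
    """
    return (
        INPUT_RATE * input_tokens
        + CACHE_WRITE_RATE * cache_write_tokens
        + CACHE_READ_RATE * cache_read_tokens
        + OUTPUT_RATE * output_tokens
        + WEB_SEARCH_RATE * web_searches
    )


if __name__ == "__main__":
    # 1,000 uncached input tokens and 200 output tokens, no web search.
    print(estimate_credits(input_tokens=1000, output_tokens=200))
```

Under these assumptions, a request with 1,000 input tokens and 200 output tokens comes to 1,000 × 0.75 + 200 × 4.50 = 1,650 Credits, and each web search adds 10,000 Credits. `Decimal` is used instead of `float` so that rates such as 0.075 do not introduce rounding error.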

Pricing note

Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records.
