Skip to main content

Qwen3.6-27B

Overview

Qwen3.6-27B FP8 is a dense 27-billion-parameter multimodal model developed by Alibaba's Qwen Team. It is optimized for agentic coding and reasoning, while remaining practical to deploy for production workloads. On B.AI, it is positioned for coding assistance, technical reasoning, multimodal understanding, and tool-assisted workflows. Specific input modalities, context limits, and tool capabilities may vary by B.AI model catalog and platform configuration.

Key Features

  • Hybrid Gated DeltaNet Architecture: Uses a hybrid attention design that combines efficient linear-attention-style layers with full self-attention layers, balancing inference efficiency with long-context performance.
  • Natively Multimodal: Supports text, image, and video inputs at the model level, subject to B.AI platform configuration and availability.
  • Hybrid Thinking Mode: Supports both thinking and non-thinking response modes where available, allowing different quality-speed tradeoffs per task.
  • Thinking Preservation: Designed to preserve reasoning context across multi-turn conversations, improving coherence in agentic coding workflows.
  • Multi-Token Prediction (MTP): Uses multi-token prediction training to improve inference throughput.

Best Use Cases

  • Agentic Coding: Well-suited for autonomous code generation, debugging, and multi-step software engineering workflows.
  • Complex Reasoning Tasks: Suitable for scientific, mathematical, engineering, and analytical problem-solving.
  • Multimodal Analysis: Can be used for document, screenshot, chart, diagram, image, and video understanding when the relevant input modality is enabled.
  • Production Workloads: A practical choice for workloads that need a balance of capability, latency, and usage cost.

Capabilities and Limitations

CapabilityDescription
ReasoningStrong technical and analytical reasoning for structured problem-solving tasks
CodingSuitable for code generation, debugging, refactoring, and agentic software workflows
Creative WritingGeneral-purpose text generation; primarily optimized for code and reasoning rather than creative output
MultimodalText, image, and video input at the model level; text output
Context WindowUp to 128k tokens, subject to platform configuration
Max OutputUp to 32,768 tokens, subject to platform configuration
Tool UseNative function calling and tool use support where enabled

Known Limitations

  • Specific capability availability may depend on the B.AI integration, provider support, plan settings, and rollout status.
  • Video input, tool use, long-context limits, and other advanced capabilities require compatible platform configuration.
  • Public evaluations, third-party comparisons, policy behavior, and implementation details may change over time, so they are not treated as fixed guarantees in this documentation.

Pricing

ModelInput (Credits/Token)Cache Write (Credits/Token)Cache Read (Credits/Token)Output (Credits/Token)Web Search (Credits/Use)Billing Notes
Qwen3.6-27B0.190.190.192.99-Cache reads and writes are billed at the same rate.
Pricing note

Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records.