Qwen 3.5 Launched: A Deep Dive into Alibaba's New Multimodal AI

Release Date: February 16, 2026 | Developer: Alibaba Cloud Qwen Team | License: Apache 2.0
Introduction: Another Milestone in AI
On February 16, 2026 (Chinese New Year's Day), Alibaba Cloud's Qwen team globally released the Qwen 3.5 series of large language models, marking another significant leap forward for the Qwen family. Arriving during the Spring Festival, the series was the Qwen team's generous gift to developers and enterprises worldwide.
The flagship model Qwen3.5-397B-A17B features 397 billion total parameters with 17 billion active parameters, utilizing a Mixture-of-Experts (MoE) architecture. It achieves breakthrough advances in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility, empowering developers and enterprises with unprecedented capabilities and efficiency.
Model Family Overview
Initial Release Models
| Model | Parameters | Architecture | Context Length | Features |
|---|---|---|---|---|
| Qwen3.5-397B-A17B | 397B/17B | MoE | 1,010,000 tokens | Flagship Version |
| Qwen3.5-Plus | - | Cloud-hosted | 1M default | Enterprise Features |
Note: Qwen3.5-Plus is the cloud-hosted version of Qwen3.5-397B-A17B, offering additional production features such as 1 million context length by default, official built-in tools, and adaptive tool usage.
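For orientation, a hypothetical sketch of calling Qwen3.5-Plus through Model Studio's OpenAI-compatible endpoint. The endpoint URL follows the convention of earlier Qwen releases on Model Studio, and the model id `qwen3.5-plus` is an assumption, not a confirmed identifier:

```python
import json
import os
import urllib.request

# Assumed model id and endpoint, following Model Studio's existing
# OpenAI-compatible convention; verify both against the official docs.
payload = {
    "model": "qwen3.5-plus",
    "messages": [{"role": "user", "content": "Summarize the key risks in this contract: ..."}],
}
request = urllib.request.Request(
    "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('DASHSCOPE_API_KEY', '')}",
        "Content-Type": "application/json",
    },
)
# Uncomment once an API key is configured:
# print(urllib.request.urlopen(request).read().decode("utf-8"))
```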
Detailed Technical Specifications
- Model Type: Causal Language Model with Vision Encoder
- Training Stage: Pre-training & Post-training
- Language Model Parameters:
  - Total Parameters: 397B
  - Active Parameters: 17B
  - Hidden Dimension: 4096
  - Token Embedding (vocabulary size): 248,320
  - Number of Layers: 60
- Mixture of Experts (MoE):
  - Total Experts: 512
  - Active Experts: 10 (routed) + 1 (shared)
  - Expert Intermediate Dimension: 1024
- Context Length:
  - Native Support: 262,144 tokens
  - Maximum Extensible: 1,010,000 tokens
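To make the sparsity concrete, here is some back-of-the-envelope arithmetic using only the numbers quoted in the spec sheet above (illustrative, not an official parameter accounting):

```python
# Fraction of weights touched per token in the MoE model.
total_params = 397e9
active_params = 17e9
active_fraction = active_params / total_params
print(f"Weights active per token: {active_fraction:.1%}")  # roughly 4.3%

# Of the 512 routed experts, only 10 fire per layer (plus 1 always-on shared expert).
routed_active, routed_total = 10, 512
expert_fraction = routed_active / routed_total
print(f"Routed experts active per layer: {routed_active}/{routed_total} ({expert_fraction:.1%})")
```

So each token pays the compute cost of a ~17B-parameter model while drawing on a 397B-parameter pool.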
Five Core Innovations
1. Unified Vision-Language Foundation
Qwen 3.5 employs an early-fusion training strategy and is trained on trillions of multimodal tokens:
- Achieves cross-generational parity with Qwen3 across reasoning, coding, agent, and visual understanding benchmarks
- Comprehensively surpasses the Qwen3-VL models
- Native support for a unified image understanding and generation architecture
2. Efficient Hybrid Architecture
Innovatively combines two advanced technologies:
- Gated Delta Networks: Enhances long sequence modeling capabilities
- Sparse Mixture-of-Experts: Activates only partial experts for efficiency
Advantages:
- High-throughput inference
- Minimized latency and cost overhead
- 15 × (3 × Gated DeltaNet → MoE) layout within the 60-layer architecture
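One plausible reading of that notation: fifteen repeating blocks, each made of three Gated DeltaNet layers followed by one MoE layer, yielding the 60 layers quoted in the spec sheet. The sketch below is illustrative only; the real layer wiring is not detailed in this post.

```python
def build_layout(num_blocks: int = 15, deltanet_per_block: int = 3) -> list[str]:
    """Expand the '15 x (3 x Gated DeltaNet -> MoE)' pattern into a flat layer list."""
    layout: list[str] = []
    for _ in range(num_blocks):
        layout += ["gated_deltanet"] * deltanet_per_block  # linear-attention layers
        layout.append("moe")                               # full block with MoE FFN
    return layout

layout = build_layout()
print(len(layout))                     # 60
print(layout.count("gated_deltanet"))  # 45
print(layout.count("moe"))             # 15
```

Under this reading, 45 of the 60 layers use the cheap linear-attention path, which is where the high-throughput, long-context advantage comes from.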
3. Scalable RL Generalization
- Scales reinforcement learning across million-agent environments
- Progressive complex task distribution training
- Robust real-world adaptability
- Supports asynchronous RL frameworks and large-scale agent scaffolding
4. Global Linguistic Coverage
Significant language support improvements:
- Supports 201 languages and dialects (a major increase from Qwen3's 29)
- Includes nuanced cultural and regional understanding
- True global deployment capabilities
5. Next-Generation Training Infrastructure
- Near 100% multimodal training efficiency (compared to text-only training)
- Supports large-scale environment orchestration
- Asynchronous RL framework support
- Advanced multimodal token processing capabilities
Performance
According to officially released benchmark charts, Qwen3.5-397B-A17B demonstrates excellent performance across multiple tests:

Evaluation Dimensions
- Reasoning Ability: Complex logic and multi-step reasoning
- Coding Ability: Code generation, understanding, and debugging
- Agent Capability: Tool usage and task execution
- Visual Understanding: Image analysis and visual reasoning
- Multilingual Ability: Cross-language understanding and generation
Deployment and Usage
Online Experience
- Qwen Chat: https://chat.qwen.ai - Official web interface
- Qwen3.5-Plus API: https://modelstudio.alibabacloud.com/ - Alibaba Cloud Model Studio
Local Deployment
Model Download
```bash
# Hugging Face
huggingface-cli download Qwen/Qwen3.5-397B-A17B

# ModelScope (China mirror)
modelscope download --model Qwen/Qwen3.5-397B-A17B
```
Using Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3.5-397B-A17B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Inference
prompt = "Explain the principles of quantum computing"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.batch_decode(
    outputs[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)[0]
print(response)
```
Using vLLM
```python
from vllm import LLM, SamplingParams

# A model this size requires multi-GPU tensor parallelism (e.g. 8 GPUs)
llm = LLM(model="Qwen/Qwen3.5-397B-A17B", tensor_parallel_size=8)
sampling_params = SamplingParams(temperature=0.7, top_p=0.8)
outputs = llm.generate(
    ["What are the future trends of artificial intelligence?"], sampling_params
)
print(outputs[0].outputs[0].text)
```
Using SGLang
```bash
# Start an OpenAI-compatible server with 8-way tensor parallelism
python -m sglang.launch_server \
    --model-path Qwen/Qwen3.5-397B-A17B \
    --tp 8
```
Model Repository Links
| Platform | Link |
|---|---|
| Hugging Face | https://huggingface.co/Qwen/Qwen3.5-397B-A17B |
| ModelScope | https://www.modelscope.cn/organization/Qwen |
| GitHub | https://github.com/QwenLM/Qwen3.5 |
Version Evolution Comparison
| Feature | Qwen3 | Qwen 3.5 |
|---|---|---|
| Largest Open Model | 235B-A22B | 397B-A17B (MoE) |
| Context Length | 128K | 1,010,000 tokens |
| Language Support | 29 | 201+ |
| Multimodal Training | Late Fusion | Early Fusion |
| Architecture | MoE | Gated Delta + MoE |
| RL Scale | Thousand agents | Million agents |
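Earlier Qwen generations bridge the gap between the native context window and the extended maximum with YaRN rope scaling. Assuming Qwen 3.5 follows the same convention, extending the native 262,144-token window toward roughly 1,010,000 tokens would correspond to a scaling factor of about 4 in `config.json`. This is a sketch of that convention, not a confirmed configuration for this release:

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```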
Application Scenarios
1. Enterprise AI Assistant
- Ultra-long context document analysis (supports million tokens)
- Multilingual customer service and translation
- Complex business process automation
2. Code Development
- Large-scale code base understanding and refactoring
- Cross-language programming assistance
- Automated code review
3. Multimodal Creation
- Image understanding and description
- Visual question answering systems
- Multimedia content analysis
4. Global Applications
- Localized content generation
- Cross-cultural communication assistants
- Multilingual knowledge base construction
Future Outlook
The release of Qwen 3.5 marks Alibaba's continued leadership in the AI field. As more model sizes are released, we can expect:
- Smaller Model Sizes: Suitable for edge device deployment
- More Specialized Versions: Domain-optimized models
- Stronger Multimodal Capabilities: Support for video, audio, and more modalities
- More Complete Tool Ecosystem: Agent frameworks and development tools
Summary
Qwen 3.5 represents one of the highest levels of open-source large language models currently available. Its innovative unified vision-language architecture, efficient Gated Delta + MoE hybrid architecture, million-agent RL training, and global coverage of 201+ languages make it one of the most noteworthy AI models of 2026.
Whether you're an enterprise user pursuing extreme performance or a developer seeking local deployment, Qwen 3.5 provides powerful capabilities and flexible deployment options.
Try it now: Visit chat.qwen.ai or download the model for local deployment to experience the power of Qwen 3.5!
References
- GitHub Repository: https://github.com/QwenLM/Qwen3.5
- Hugging Face: https://huggingface.co/Qwen/Qwen3.5-397B-A17B
- ModelScope: https://www.modelscope.cn/organization/Qwen
- Official Blog: https://qwen.ai/blog?id=qwen3.5
- Qwen Chat: https://chat.qwen.ai
- Alibaba Cloud Model Studio: https://modelstudio.alibabacloud.com/
- Wikipedia: https://en.wikipedia.org/wiki/Qwen
- Technical Documentation: https://www.alibabacloud.com/help/en/model-studio/text-generation


