Qwen 3.5 Launched: A Deep Dive into Alibaba's New Multimodal AI

Release Date: February 16, 2026 | Developer: Alibaba Cloud Qwen Team | License: Apache 2.0
Introduction: Another Milestone in AI
On February 16, 2026 (Chinese New Year's Day), Alibaba Cloud's Qwen team globally released the Qwen 3.5 series of large language models, marking another significant leap forward for the Qwen family. Arriving during the Spring Festival, the series was the Qwen team's generous gift to developers and enterprises worldwide.
The flagship model Qwen3.5-397B-A17B features 397 billion total parameters with 17 billion active parameters, utilizing a Mixture-of-Experts (MoE) architecture. It achieves breakthrough advances in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility, empowering developers and enterprises with unprecedented capabilities and efficiency.
Model Family Overview
Initial Release Models
| Model | Parameters | Architecture | Context Length | Features |
|---|---|---|---|---|
| Qwen3.5-397B-A17B | 397B/17B | MoE | 1,010,000 tokens | Flagship Version |
| Qwen3.5-Plus | - | Cloud-hosted | 1M default | Enterprise Features |
Note: Qwen3.5-Plus is the cloud-hosted version of Qwen3.5-397B-A17B, offering additional production features such as 1 million context length by default, official built-in tools, and adaptive tool usage.
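For orientation, a hypothetical sketch of calling Qwen3.5-Plus through Model Studio's OpenAI-compatible endpoint. The endpoint URL follows the convention of earlier Qwen releases on Model Studio, and the model id `qwen3.5-plus` is an assumption, not a confirmed identifier:

```python
import json
import os
import urllib.request

# Assumed model id and endpoint, following Model Studio's existing
# OpenAI-compatible convention; verify both against the official docs.
payload = {
    "model": "qwen3.5-plus",
    "messages": [{"role": "user", "content": "Summarize the key risks in this contract: ..."}],
}
request = urllib.request.Request(
    "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('DASHSCOPE_API_KEY', '')}",
        "Content-Type": "application/json",
    },
)
# Uncomment once an API key is configured:
# print(urllib.request.urlopen(request).read().decode("utf-8"))
```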
Detailed Technical Specifications
- Model Type: Causal Language Model with Vision Encoder
- Training Stage: Pre-training & Post-training
- Language Model Parameters:
  - Total Parameters: 397B
  - Active Parameters: 17B
  - Hidden Dimension: 4096
  - Token Embedding (vocabulary size): 248,320
  - Number of Layers: 60
- Mixture of Experts (MoE):
  - Total Experts: 512
  - Active Experts: 10 (routed) + 1 (shared)
  - Expert Intermediate Dimension: 1024
- Context Length:
  - Native Support: 262,144 tokens
  - Maximum Extensible: 1,010,000 tokens
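To make the sparsity concrete, here is some back-of-the-envelope arithmetic using only the numbers quoted in the spec sheet above (illustrative, not an official parameter accounting):

```python
# Fraction of weights touched per token in the MoE model.
total_params = 397e9
active_params = 17e9
active_fraction = active_params / total_params
print(f"Weights active per token: {active_fraction:.1%}")  # roughly 4.3%

# Of the 512 routed experts, only 10 fire per layer (plus 1 always-on shared expert).
routed_active, routed_total = 10, 512
expert_fraction = routed_active / routed_total
print(f"Routed experts active per layer: {routed_active}/{routed_total} ({expert_fraction:.1%})")
```

So each token pays the compute cost of a ~17B-parameter model while drawing on a 397B-parameter pool.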
Five Core Innovations
1. Unified Vision-Language Foundation
Qwen 3.5 employs an early-fusion training strategy and is trained on trillions of multimodal tokens:
- Achieves cross-generational parity with Qwen3 across reasoning, coding, agent, and visual understanding benchmarks
- Comprehensively surpasses the Qwen3-VL models
- Native support for a unified image understanding and generation architecture
2. Efficient Hybrid Architecture
Innovatively combines two advanced technologies:
- Gated Delta Networks: Enhances long sequence modeling capabilities
- Sparse Mixture-of-Experts: Activates only partial experts for efficiency
Advantages:
- High-throughput inference
- Minimized latency and cost overhead
- 15 × (3 × Gated DeltaNet → MoE) layout within the 60-layer architecture
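One plausible reading of that notation: fifteen repeating blocks, each made of three Gated DeltaNet layers followed by one MoE layer, yielding the 60 layers quoted in the spec sheet. The sketch below is illustrative only; the real layer wiring is not detailed in this post.

```python
def build_layout(num_blocks: int = 15, deltanet_per_block: int = 3) -> list[str]:
    """Expand the '15 x (3 x Gated DeltaNet -> MoE)' pattern into a flat layer list."""
    layout: list[str] = []
    for _ in range(num_blocks):
        layout += ["gated_deltanet"] * deltanet_per_block  # linear-attention layers
        layout.append("moe")                               # full block with MoE FFN
    return layout

layout = build_layout()
print(len(layout))                     # 60
print(layout.count("gated_deltanet"))  # 45
print(layout.count("moe"))             # 15
```

Under this reading, 45 of the 60 layers use the cheap linear-attention path, which is where the high-throughput, long-context advantage comes from.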
3. Scalable RL Generalization
- Scales reinforcement learning across million-agent environments
- Progressive complex task distribution training
- Robust real-world adaptability
- Supports asynchronous RL frameworks and large-scale agent scaffolding
4. Global Linguistic Coverage
Significant language support improvements:
- Supports 201 languages and dialects (a major increase from Qwen3's 29)
- Includes nuanced cultural and regional understanding
- True global deployment capabilities
5. Next-Generation Training Infrastructure
- Near 100% multimodal training efficiency (compared to text-only training)
- Supports large-scale environment orchestration
- Asynchronous RL framework support
- Advanced multimodal token processing capabilities
Performance
According to officially released benchmark charts, Qwen3.5-397B-A17B demonstrates excellent performance across multiple tests:

Evaluation Dimensions
- Reasoning Ability: Complex logic and multi-step reasoning
- Coding Ability: Code generation, understanding, and debugging
- Agent Capability: Tool usage and task execution
- Visual Understanding: Image analysis and visual reasoning
- Multilingual Ability: Cross-language understanding and generation
Deployment and Usage
Online Experience
- Qwen Chat: https://chat.qwen.ai - Official web interface
- Qwen3.5-Plus API: https://modelstudio.alibabacloud.com/ - Alibaba Cloud Model Studio
Local Deployment
Model Download
```bash
# Hugging Face
huggingface-cli download Qwen/Qwen3.5-397B-A17B

# ModelScope (China mirror)
modelscope download --model Qwen/Qwen3.5-397B-A17B
```
Using Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3.5-397B-A17B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Inference
prompt = "Explain the principles of quantum computing"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.batch_decode(
    outputs[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)[0]
print(response)
```
Using vLLM
```python
from vllm import LLM, SamplingParams

# A model this size requires multi-GPU tensor parallelism (e.g. 8 GPUs)
llm = LLM(model="Qwen/Qwen3.5-397B-A17B", tensor_parallel_size=8)
sampling_params = SamplingParams(temperature=0.7, top_p=0.8)
outputs = llm.generate(
    ["What are the future trends of artificial intelligence?"], sampling_params
)
print(outputs[0].outputs[0].text)
```
Using SGLang
```bash
# Start an OpenAI-compatible server with 8-way tensor parallelism
python -m sglang.launch_server \
    --model-path Qwen/Qwen3.5-397B-A17B \
    --tp 8
```
Model Repository Links
| Platform | Link |
|---|---|
| Hugging Face | https://huggingface.co/Qwen/Qwen3.5-397B-A17B |
| ModelScope | https://www.modelscope.cn/organization/Qwen |
| GitHub | https://github.com/QwenLM/Qwen3.5 |
Version Evolution Comparison
| Feature | Qwen3 | Qwen 3.5 |
|---|---|---|
| Largest Open Model | 235B-A22B | 397B-A17B (MoE) |
| Context Length | 128K | 1,010,000 tokens |
| Language Support | 29 | 201+ |
| Multimodal Training | Late Fusion | Early Fusion |
| Architecture | MoE | Gated Delta + MoE |
| RL Scale | Thousand agents | Million agents |
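Earlier Qwen generations bridge the gap between the native context window and the extended maximum with YaRN rope scaling. Assuming Qwen 3.5 follows the same convention, extending the native 262,144-token window toward roughly 1,010,000 tokens would correspond to a scaling factor of about 4 in `config.json`. This is a sketch of that convention, not a confirmed configuration for this release:

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```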
Application Scenarios
1. Enterprise AI Assistant
- Ultra-long context document analysis (supports million tokens)
- Multilingual customer service and translation
- Complex business process automation
2. Code Development
- Large-scale code base understanding and refactoring
- Cross-language programming assistance
- Automated code review
3. Multimodal Creation
- Image understanding and description
- Visual question answering systems
- Multimedia content analysis
4. Global Applications
- Localized content generation
- Cross-cultural communication assistants
- Multilingual knowledge base construction
Future Outlook
The release of Qwen 3.5 marks Alibaba's continued leadership in the AI field. As more model sizes are released, we can expect:
- Smaller Model Sizes: Suitable for edge device deployment
- More Specialized Versions: Domain-optimized models
- Stronger Multimodal Capabilities: Support for video, audio, and more modalities
- More Complete Tool Ecosystem: Agent frameworks and development tools
Summary
Qwen 3.5 represents one of the highest levels of open-source large language models currently available. Its innovative unified vision-language architecture, efficient Gated Delta + MoE hybrid architecture, million-agent RL training, and global coverage of 201+ languages make it one of the most noteworthy AI models of 2026.
Whether you're an enterprise user pursuing extreme performance or a developer seeking local deployment, Qwen 3.5 provides powerful capabilities and flexible deployment options.
Try it now: Visit chat.qwen.ai or download the model for local deployment to experience the power of Qwen 3.5!
References
- GitHub Repository: https://github.com/QwenLM/Qwen3.5
- Hugging Face: https://huggingface.co/Qwen/Qwen3.5-397B-A17B
- ModelScope: https://www.modelscope.cn/organization/Qwen
- Official Blog: https://qwen.ai/blog?id=qwen3.5
- Qwen Chat: https://chat.qwen.ai
- Alibaba Cloud Model Studio: https://modelstudio.alibabacloud.com/
- Wikipedia: https://en.wikipedia.org/wiki/Qwen
- Technical Documentation: https://www.alibabacloud.com/help/en/model-studio/text-generation


