Alibaba's Qwen2.5-Max Enters Global Top 10
Published: February 12, 2025 16:33
On February 4, the latest rankings from Chatbot Arena, a widely followed evaluation platform for large AI models, showed that Alibaba's Qwen2.5-Max had entered the global top ten for the first time, surpassing the recently popular DeepSeek-V3 and ranking ahead of top proprietary models such as O1-Mini and Claude-3.5-Sonnet.
Specifically, Qwen2.5-Max ranks first in mathematics and programming, and second in handling hard prompts. The official evaluation from Chatbot Arena praises Qwen2.5-Max for its strong performance across multiple domains, particularly in specialized technical areas like programming, mathematics, and hard prompts.
Qwen2.5-Max, the latest version, uses a mixture-of-experts (MoE) architecture and was pre-trained on more than 20 trillion tokens. It was further optimized with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), and it performs strongly in knowledge, programming, general capability, and human alignment.
Both the language models and the multimodal models in the Qwen family are pre-trained on large-scale multilingual and multimodal data and then fine-tuned on high-quality datasets to align more closely with human preferences. Their capabilities include natural language understanding, text generation, visual understanding, audio processing, tool use, role-playing, and acting as interactive AI agents.
Key features of Qwen2.5 include:
- Easy-to-use, dense, decoder-only language models, available in 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameter sizes, each with base and instruction-tuned variants ("B" stands for billion, so 72B means 72 billion parameters).
- Pre-trained on the latest large-scale dataset, comprising up to 18 trillion tokens.
- Significant improvements in instruction following, long-text generation (over 8K tokens), structured data comprehension (e.g., tables), and the generation of structured outputs, especially JSON.
- Greater robustness to diverse system prompts, which improves role-playing and condition-setting for chatbots.
- Supports a context length of up to 128K tokens and generates up to 8K tokens of text.
- Supports over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
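The context and generation limits in the list above can be turned into a simple budget check. The following is a minimal sketch, not Qwen's own API: it assumes "128K" means 131,072 tokens, "8K" means 8,192 tokens, and that generated tokens count toward the context window. The function name `generation_budget` is hypothetical.

```python
# Assumptions (not stated in the article): 128K = 131,072 tokens,
# 8K = 8,192 tokens, and output tokens share the context window.
CONTEXT_LIMIT = 128 * 1024
MAX_GENERATION = 8 * 1024


def generation_budget(prompt_tokens: int) -> int:
    """Return how many tokens could still be generated for a prompt of this size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_LIMIT - prompt_tokens
    # Never exceed the per-response cap, and never go below zero.
    return max(0, min(MAX_GENERATION, remaining))


if __name__ == "__main__":
    for size in (1_000, 130_000, 200_000):
        print(size, generation_budget(size))
```

Under these assumptions, a short prompt leaves the full 8K generation cap available, while a prompt close to the context limit leaves only whatever space remains.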
Over the past year, China's domestic large model industry has also gone through several waves of price cuts. For instance, Alibaba Cloud’s Tongyi Qianwen visual understanding models were reduced in price by more than 80% across the entire line, to as low as 0.0015 yuan per thousand tokens. ByteDance’s Doubao visual understanding model charges just 0.003 yuan per thousand tokens, 85% below the prevailing industry price. Baidu has made two of its main Wenxin Yiyan models, ERNIE Speed and ERNIE Lite, free to use.
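To put the per-thousand-token pricing in perspective, here is a small worked calculation. It is an illustrative sketch only: it uses the 0.0015 yuan figure quoted for Alibaba Cloud's model, and it assumes, for the second calculation, that a flat 80% cut applied to that same model, which the article does not state explicitly. The function names are hypothetical.

```python
def cost_yuan(tokens: int, price_per_thousand: float) -> float:
    """Cost in yuan of processing `tokens` tokens at a per-thousand-token price."""
    return tokens / 1000 * price_per_thousand


def price_before_cut(price_after: float, reduction: float) -> float:
    """Price implied before a fractional reduction (0.80 means an 80% cut)."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return price_after / (1 - reduction)


if __name__ == "__main__":
    qwen_vl_price = 0.0015  # yuan per thousand tokens, as quoted in the article

    # Cost of one million tokens at the quoted price.
    print(round(cost_yuan(1_000_000, qwen_vl_price), 4))

    # Assumption: an exactly 80% cut on this model. A cut of "more than 80%"
    # would imply an even higher starting price.
    print(round(price_before_cut(qwen_vl_price, 0.80), 4))
```

At the quoted rate, a million tokens costs on the order of one to two yuan, which is why such cuts are described as a price war.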
The rise of domestic models in China has made it clear that OpenAI is no longer the sole dominant force in the large model field. The technological capabilities of these models can now rival, and in some cases exceed, those of mainstream international models. As Chatbot Arena noted: “Chinese large models, represented by Qwen2.5-Max, are catching up fast.” After the launch of O3-Mini, OpenAI CEO Sam Altman acknowledged the impact of China’s AI rise, saying that it had narrowed OpenAI’s technological lead.