China's First Official Large Model Evaluation Results Released! Alibaba Cloud's Tongyi Qianwen Among First Batch to Pass

baoshi.rao

The results of China's first official large model standards compliance evaluation have been announced.

Alibaba Cloud's Tongyi Qianwen is one of the four domestic large models that passed the evaluation, meeting national standards in dimensions such as generality and intelligence.

Among the first batch of large models to pass the evaluation, Tongyi Qianwen is reportedly the only open-source model. It has a broad base of developers and enterprise customers worldwide, and its performance and security have been extensively validated in public use.

After its open-source release on December 1, Tongyi Qianwen 72B achieved the best results among open-source models on 10 authoritative benchmarks and topped the authoritative Hugging Face leaderboard overseas, surpassing Llama 2.

It subsequently topped the OpenCompass ranking run by Shanghai AI Lab in China, and is now widely regarded in the industry as the best-performing open-source large model.

The Tongyi Qianwen app is currently available for download from the major Apple and Android app stores. It offers dozens of practical features, including text dialogue, voice dialogue, literary analysis, foreign language and classical Chinese translation, PPT outline assistance, and Xiaohongshu copywriting.

It is reported that the 'Large Model Standards Compliance Evaluation' was initiated by the China Electronics Standardization Institute, with the goal of establishing a national directory for large model standards compliance to guide the healthy and orderly development of the artificial intelligence industry.

The evaluation drew on input from dozens of leading academic and industry institutions and covers 38 specific dimensions that assess the generality and intelligence of language models. It is an authoritative evaluation based on official large model testing benchmarks.