Alibaba's Tongyi Qianwen Open-Sources Qwen1.5-MoE-A2.7B Model

baoshi.rao wrote:

The Tongyi Qianwen team has introduced the first MoE model in the Qwen series, Qwen1.5-MoE-A2.7B. The model has only 2.7 billion activated parameters yet performs comparably to today's most advanced 7-billion-parameter models. Compared with Qwen1.5-7B, Qwen1.5-MoE-A2.7B has only 2.0 billion non-embedding parameters, roughly one third of the original model, while training costs are reduced by 75% and inference speed is increased by about 1.74 times.

Qwen1.5-MoE employs a specially designed MoE architecture. Unlike traditional MoE approaches, it uses 64 fine-grained experts, a design also explored in models such as DeepSeek-MoE and DBRX, together with a new routing mechanism. The fine-grained design yields more, smaller experts without increasing the total parameter count. As a result, Qwen1.5-MoE delivers strong advantages in training cost and inference efficiency while approaching the performance of state-of-the-art 7B models.
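
To make the fine-grained expert idea concrete, below is a minimal, self-contained PyTorch sketch of a top-k routed MoE layer built from many small experts. The hidden sizes, expert count, and top-k value are illustrative assumptions only and do not reproduce Qwen1.5-MoE's actual configuration.

```python
# Toy fine-grained MoE layer: many narrow experts, top-k routing per token.
# All sizes here are illustrative assumptions, not Qwen1.5-MoE's real hyperparameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FineGrainedMoE(nn.Module):
    def __init__(self, d_model=1024, d_expert=256, num_experts=64, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)  # per-token routing logits
        # Many narrow experts instead of a few wide ones ("fine-grained" experts).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_expert), nn.SiLU(), nn.Linear(d_expert, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)              # (num_tokens, num_experts)
        weights, idx = probs.topk(self.top_k, dim=-1)          # keep the top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize the selected weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in idx[:, k].unique():                       # process tokens expert by expert
                mask = idx[:, k] == e
                out[mask] += weights[mask, k].unsqueeze(-1) * self.experts[int(e)](x[mask])
        return out

layer = FineGrainedMoE()
print(layer(torch.randn(8, 1024)).shape)  # torch.Size([8, 1024])
```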

Qwen1.5-MoE-A2.7B has 14.3 billion parameters in total, of which only 2.7 billion are activated at inference time (2.0 billion non-embedding), and training costs are reduced by 75%. In tests on a single NVIDIA A100-80G GPU, its inference speed was roughly 1.74 times that of Qwen1.5-7B. The model has been open-sourced in the ModelScope community and is available for direct download and use.
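
As a usage illustration, the following sketch downloads the chat variant from ModelScope and runs it with Hugging Face Transformers. It assumes a recent Transformers release with Qwen1.5-MoE support, the ModelScope model ID qwen/Qwen1.5-MoE-A2.7B-Chat, and a GPU with enough memory; adjust these to your environment.

```python
# Minimal inference sketch: fetch Qwen1.5-MoE-A2.7B-Chat from ModelScope, generate with Transformers.
# The model ID, dtype, and device settings are assumptions; adapt them to your setup.
import torch
from modelscope import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = snapshot_download("qwen/Qwen1.5-MoE-A2.7B-Chat")  # local path to the downloaded weights
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Briefly explain what a mixture-of-experts model is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```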

Beyond performance and efficiency, the team will continue to extend support for Qwen1.5-MoE in third-party frameworks, including llama.cpp and MLX. Overall, Qwen1.5-MoE offers notable advantages in performance, training efficiency, and inference speed, making it a strong option for cost-efficient training and inference.
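
For local inference through llama.cpp, the Python bindings could be used along the lines below; this assumes a GGUF conversion of the model is already available on disk (the file name is a placeholder, not a published artifact).

```python
# Hedged sketch: run a (hypothetical) GGUF conversion of Qwen1.5-MoE-A2.7B-Chat via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen1.5-moe-a2.7b-chat-q4_k_m.gguf",  # placeholder path to a local GGUF file
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers to GPU if llama.cpp was built with GPU support
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the advantages of mixture-of-experts models."}],
    max_tokens=200,
)
print(result["choices"][0]["message"]["content"])
```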

Qwen1.5-MoE experience link:

https://modelscope.cn/studios/qwen/qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4-demo
