Skip to content
  • Categories
  • Newsletter
  • Recent
  • AI Insights
  • Tags
  • Popular
  • World
  • Groups
Skins
  • Light
  • Brite
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
  1. Home
  2. AI Insights
  3. Alibaba Cloud Releases Multimodal Large Model Qwen-VL-Max Version
uSpeedo.ai - AI marketing assistant
Try uSpeedo.ai — Boost your marketing

Alibaba Cloud Releases Multimodal Large Model Qwen-VL-Max Version

Scheduled Pinned Locked Moved AI Insights
ai-articles
1 Posts 1 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • baoshi.raoB Offline
    baoshi.raoB Offline
    baoshi.rao
    wrote last edited by
    #1

    Alibaba Cloud has announced its latest research achievements in multimodal large models, releasing the Max version subsequent to the Plus edition.

    The Qwen-VL-Max model exhibits exceptional capabilities in visual reasoning, enabling it to comprehend and analyze complex image information, including tasks such as person recognition, question answering, creative generation, and code writing. Additionally, the model features visual positioning functionality, allowing it to conduct Q&A based on specified areas within an image. In terms of fundamental capabilities, Qwen-VL-Max can accurately describe and recognize image information, as well as perform information reasoning and extended creation based on images. This feature has enabled the model to perform exceptionally well in multiple authoritative evaluations, with overall performance comparable to GPT-4V and Gemini Ultra.

    In tasks such as document analysis (DocVQA) and Chinese image-related tasks (MM-Bench-CN), Qwen-VL-Max has surpassed GPT-4V, achieving world-leading levels.

    Additionally, Qwen-VL-Max has made significant progress in image-text processing, with notably improved Chinese and English text recognition capabilities. The model supports high-definition resolution images exceeding one million pixels and images with extreme aspect ratios. It can not only fully reproduce dense text but also extract information from tables and documents. Currently, Qwen-VL-Plus and Qwen-VL-Max are available for free for a limited time. Users can experience the capabilities of the Max version model directly on the Tongyi Qianwen official website or the Tongyi Qianwen app, or they can call the model API through the Alibaba Cloud Lingji Platform (DashScope).

    1 Reply Last reply
    0
    Reply
    • Reply as topic
    Log in to reply
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes


    • Login

    • Don't have an account? Register

    • Login or register to search.
    • First post
      Last post
    0
    • Categories
    • Newsletter
    • Recent
    • AI Insights
    • Tags
    • Popular
    • World
    • Groups