Runs on Mobile Too! FaceWall Intelligence Launches MiniCPM-V4.5: 410M Parameters Outperform GPT-4.1-mini

baoshi.rao

FaceWall Intelligence, in collaboration with Tsinghua University's NLP Lab, officially released its latest edge-side multimodal large model MiniCPM-V4.5, marking a new milestone in edge AI technology.

As the newest addition to the MiniCPM series, this model redefines industry expectations for edge-side multimodal models with its exceptional performance, efficient deployment capabilities, and broad application scenarios. Below, AIbase provides a detailed analysis of this groundbreaking technology.

Technical Breakthrough: Fewer Parameters, Stronger Performance

MiniCPM-V4.5 is built on the SigLIP2-400M vision module and MiniCPM4-3B language model, with a total of just 410 million parameters, yet it delivers stunning performance across multiple benchmarks. According to official data, MiniCPM-V4.5 achieved an average score of 69.0 in OpenCompass comprehensive evaluations, surpassing GPT-4.1-mini (20250414 version, 64.5 points) and Qwen2.5-VL-3B-Instruct (64.5 points), making it the new performance benchmark for edge-side multimodal models. Compared to its predecessor, MiniCPM-V2.6 (810M parameters, 65.2 points), the new model significantly reduces parameter count while improving performance, showcasing FaceWall Intelligence's deep expertise in model compression and optimization.

Multimodal Capabilities Upgrade: Vision, Text, and Video Mastery

MiniCPM-V4.5 supports single-image, multi-image, and video understanding, excelling in high-resolution image processing, OCR (optical character recognition), and multilingual support.

Vision Capabilities: The model can process images up to 1.8 million pixels (1344x1344) with arbitrary aspect ratios, and its OCR performance on OCRBench outperforms mainstream proprietary models like GPT-4o and Gemini1.5Pro.
Multi-Image and Video Understanding: On benchmarks like Mantis-Eval, BLINK, and Video-MME, MiniCPM-V4.5 demonstrates leading capabilities in multi-image reasoning and spatiotemporal video information processing, making it ideal for complex content analysis.
Multilingual Support: Inheriting the MiniCPM series' multilingual strengths, the model supports over 30 languages, including English, Chinese, German, French, Italian, and Korean, providing seamless multimodal interaction for global users.

Efficient Deployment: Optimized for Edge Devices

MiniCPM-V4.5 sets a new standard for efficiency. Thanks to its high token density (processing 1.8M-pixel images requires only 640 visual tokens, 75% fewer than most models), the model achieves significant optimizations in inference speed, first-token latency, memory usage, and power consumption. Tests show that MiniCPM-V4.5 achieves first-token latency under 2 seconds and a decoding speed exceeding 17 tokens/s on the iPhone 16 Pro Max, with no noticeable overheating. This makes the model easy to deploy on smartphones, tablets, and other edge devices, meeting the needs of mobile, offline, and privacy-sensitive scenarios.

Additionally, MiniCPM-V4.5 supports multiple deployment methods, including llama.cpp, Ollama, vLLM, and SGLang, and offers iOS app support, significantly lowering the barrier for developers.

Open Ecosystem: Driving Academic and Commercial Innovation

Staying true to its open-source tradition, FaceWall Intelligence releases MiniCPM-V4.5 under the Apache 2.0 license, making it fully open-source for academic researchers and free for commercial users after simple registration. This move further lowers the barrier to entry for multimodal AI, fostering both academic research and commercial applications. To date, the MiniCPM series has garnered over a million downloads on GitHub and Hugging Face, establishing itself as a benchmark in edge AI.

The launch of MiniCPM-V4.5 not only solidifies FaceWall Intelligence's leadership in multimodal large models but also points the way for the democratization of edge AI. From real-time video analysis to intelligent document processing and multilingual interaction, MiniCPM-V4.5's versatility opens new possibilities for industries like education, healthcare, and content creation.

AIbase believes that with the rapid advancement of edge computing power and continuous model optimization, MiniCPM-V4.5 has the potential to become the "new normal" for edge devices, rivaling cloud-based AI.

Project: https://huggingface.co/openbmb/MiniCPM-V-4_5