Baichuan Intelligence Releases Baichuan2-192K Large Model Capable of Processing Approximately 350,000 Chinese Characters
Baichuan Intelligence has released the Baichuan2-192K large model, featuring the world's longest context window, capable of processing approximately 350,000 Chinese characters.
Measured in Chinese characters processed, that is about 4.4 times the capacity of Claude 2, currently the leading long-context model, and 14 times that of GPT-4.
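The multiples quoted above can be sanity-checked with a little arithmetic. The sketch below is a rough check, not an official figure. It makes two assumptions that are not stated in the article: each multiple is read as "N times as much text", and "192K" is taken to mean 192,000 tokens. The function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the ratios quoted in the article.
# Only these three figures come from the article; everything else is derived.
BAICHUAN_CHARS = 350_000   # approx. Chinese characters Baichuan2-192K can process
RATIO_VS_CLAUDE2 = 4.4     # quoted multiple relative to Claude 2
RATIO_VS_GPT4 = 14         # quoted multiple relative to GPT-4

# Assumption (not from the article): "192K" denotes 192,000 tokens.
ASSUMED_WINDOW_TOKENS = 192_000


def implied_capacity(reference_chars: int, ratio: float) -> int:
    """Characters the other model handles if the reference handles `ratio` times as many."""
    return round(reference_chars / ratio)


claude2_chars = implied_capacity(BAICHUAN_CHARS, RATIO_VS_CLAUDE2)
gpt4_chars = implied_capacity(BAICHUAN_CHARS, RATIO_VS_GPT4)
chars_per_token = BAICHUAN_CHARS / ASSUMED_WINDOW_TOKENS

print(f"Implied Claude 2 capacity: ~{claude2_chars:,} characters")
print(f"Implied GPT-4 capacity:    ~{gpt4_chars:,} characters")
print(f"Implied characters per token: ~{chars_per_token:.2f}")
```

Under these assumptions the quoted multiples imply roughly 80,000 characters for Claude 2 and 25,000 for GPT-4, so the comparison reflects characters processed rather than a ratio of token-window sizes.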
Baichuan2-192K excels in long-context text generation, comprehension, Q&A, summarization, and more, achieving SOTA (state-of-the-art) results in 7 out of 10 long-text evaluation benchmarks.
Baichuan2-192K reportedly balances window length against model performance through algorithmic and engineering optimizations, using dynamic sampling for positional encoding and a 4D parallel distributed scheme.
Baichuan2-192K has entered internal testing and is collaborating with key partners in industries such as law, media, and finance; a full release is expected soon. The model can be applied to extracting and analyzing key information from long documents, summarization, review, drafting, and complex programming assistance. It also supports multimodal input and transfer learning.