Musk Suddenly Releases Grok-1.5! Context Length Soars 16x, Matching GPT-4

baoshi.rao

Just now, Elon Musk's AI startup xAI officially announced the launch of Grok-1.5. The announcement post carried no commentary, only a link, embodying the principle of 'few words, big impact'.

What does Grok-1.5 upgrade? Mainly two areas:

Long-context understanding

Grok-1.5 expands the context window 16-fold, from 8,192 tokens to 128K tokens, matching GPT-4.
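As a quick sanity check on the 16x figure, the arithmetic can be written out as below. This is only a sketch: the constant names are made up for illustration, and it assumes "128K" is read as 128 × 1,024 tokens.

```python
# Check the "16x" claim: the previous 8,192-token window scaled by 16.
OLD_CONTEXT_TOKENS = 8192   # previous context window, per the article
SCALE_FACTOR = 16           # increase reported for Grok-1.5

new_context_tokens = OLD_CONTEXT_TOKENS * SCALE_FACTOR

# Assumption: "K" is read as 1,024 tokens, not 1,000.
new_context_k = new_context_tokens // 1024

print(new_context_tokens)  # 131072
print(new_context_k)       # 128
```

Under that reading, 8,192 × 16 comes out to exactly 131,072 tokens, which is the 128K in the headline.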

This means Grok-1.5 can handle longer and more complex prompts while maintaining its ability to follow instructions.

In the Needle in a Haystack (NIAH) evaluation, Grok-1.5 demonstrated strong retrieval capabilities, achieving perfect retrieval results in contexts up to 128K tokens in length.
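For readers unfamiliar with NIAH, the general idea can be sketched in a few lines: hide one fact (the "needle") at various depths inside filler text (the "haystack") of various lengths, then check whether the model can recite it. The code below is a toy illustration, not xAI's actual harness. Every name in it (`build_haystack`, `run_niah`, `toy_model`, the filler text and the passphrase) is hypothetical, length is counted in sentences rather than tokens, and `toy_model` is a regex stand-in where a real LLM call would go.

```python
import re
from typing import Callable

# All strings below are invented for illustration.
FILLER = "The sky was clear and the market was quiet that day."
NEEDLE = "The secret passphrase is blue-falcon-42."
QUESTION = "What is the secret passphrase?"
EXPECTED = "blue-falcon-42"


def build_haystack(n_sentences: int, depth: float) -> str:
    """Return n_sentences of filler with the needle inserted at a
    relative depth (0.0 = start of the context, 1.0 = end)."""
    sentences = [FILLER] * n_sentences
    position = round(depth * n_sentences)
    sentences.insert(position, NEEDLE)
    return " ".join(sentences)


def run_niah(ask_model: Callable[[str], str],
             lengths: list[int],
             depths: list[float]) -> dict[tuple[int, float], bool]:
    """Query the model for every (length, depth) pair and record
    whether its answer contains the expected needle content."""
    results = {}
    for n in lengths:
        for d in depths:
            prompt = build_haystack(n, d) + "\n\n" + QUESTION
            answer = ask_model(prompt)
            results[(n, d)] = EXPECTED in answer
    return results


def toy_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): finds the passphrase
    with a regex, so it always retrieves the needle."""
    match = re.search(r"passphrase is ([\w-]+)", prompt)
    return match.group(1) if match else ""


results = run_niah(toy_model, lengths=[10, 100, 1000], depths=[0.0, 0.5, 1.0])
perfect = all(results.values())
print(perfect)  # True for the toy model
```

"Perfect retrieval" in this sense means every cell of the length-by-depth grid comes back as a hit. A real evaluation would replace `toy_model` with an actual model call and measure context length in tokens.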

Capabilities and reasoning

One of the biggest improvements in Grok-1.5 is its significantly enhanced ability to handle programming and math-related tasks, surpassing Grok-1, Mistral Large, and Claude 2 across the board.

In math, Grok-1.5 scored 50.6% on the MATH benchmark, surpassing Claude 3 Sonnet (medium), and 90% on GSM8K.

In programming, Grok-1.5 scored 74.1% on the HumanEval benchmark, surpassing Claude 3 Sonnet (medium), Gemini Pro 1.5, and GPT-4, second only to Claude 3 Opus (large).