Kuaishou Open-Sources KwaiAgents System Outperforming GPT-3.5

baoshi.rao

Kuaishou, in collaboration with Harbin Institute of Technology, recently open-sourced the KwaiAgents system, whose 7B and 13B models are reported to outperform GPT-3.5 in agent capability evaluations. The team attributes this result to Meta-Agent Tuning (MAT), a method that improves the general agent capabilities of large models. The project covers three parts: the system, the models, and the evaluation benchmark. All three are open-sourced on GitHub for researchers and developers to use.

Project address: https://github.com/KwaiKEG/KwaiAgents

The system uses a large model as its cognitive core and pairs it with a memory mechanism and a tool library to form an iterative, automated loop. The memory mechanism covers three types of memory: knowledge base, dialogue, and task history. In each round of conversation it retrieves the information it needs through a hybrid of vector retrieval and keyword retrieval. The tool library includes factuality-enhancing tools, and its heterogeneous search and browse mechanism aggregates knowledge from multiple sources, including web pages, text encyclopedias, and video encyclopedias. In each round of the loop, the system receives a question, updates and retrieves memory, calls the large model to plan the task, invokes tools as needed, and finally combines the accumulated information to produce an answer.
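The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the KwaiAgents implementation: the names `Memory`, `run_turn`, `stub_planner`, and `TOOLS` are hypothetical, the character-trigram "embedding" is a toy stand-in for a real vector model, and the planner is a stub where the real system would call the large model.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical planner signature: (question, context, observations) -> (tool_name, argument)
Planner = Callable[[str, List[str], List[str]], Tuple[str, str]]


def keyword_score(query: str, doc: str) -> float:
    """Fraction of query words that also appear in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def embed(text: str) -> Dict[str, float]:
    """Toy embedding: character-trigram counts (assumption, not a real model)."""
    t = text.lower()
    vec: Dict[str, float] = {}
    for i in range(len(t) - 2):
        gram = t[i:i + 3]
        vec[gram] = vec.get(gram, 0.0) + 1.0
    return vec


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Memory:
    entries: List[str] = field(default_factory=list)

    def update(self, text: str) -> None:
        self.entries.append(text)

    def retrieve(self, query: str, k: int = 2, alpha: float = 0.5) -> List[str]:
        """Hybrid retrieval: weighted mix of vector and keyword scores."""
        qv = embed(query)
        scored = [
            (alpha * cosine(qv, embed(e)) + (1 - alpha) * keyword_score(query, e), e)
            for e in self.entries
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [e for score, e in scored[:k] if score > 0]


def run_turn(question: str, memory: Memory, planner: Planner,
             tools: Dict[str, Callable[[str], str]], max_steps: int = 3) -> str:
    """One conversation round: retrieve, plan, call tools, answer, update memory."""
    context = memory.retrieve(question)
    memory.update(f"user: {question}")
    observations: List[str] = []
    for _ in range(max_steps):
        tool_name, arg = planner(question, context, observations)
        if tool_name == "finish":
            answer = arg
            break
        observations.append(tools[tool_name](arg))
    else:
        # Step budget exhausted: fall back to the collected observations.
        answer = "; ".join(observations)
    memory.update(f"assistant: {answer}")
    return answer


def stub_planner(question: str, context: List[str], observations: List[str]) -> Tuple[str, str]:
    """Stand-in for the large model: search once, then finish with the result."""
    if not observations:
        return ("search", question)
    return ("finish", observations[-1])


TOOLS: Dict[str, Callable[[str], str]] = {"search": lambda q: f"result for {q}"}
```

Calling `run_turn("...", Memory(), stub_planner, TOOLS)` runs one round. The weight `alpha` controls how much the vector score counts relative to the keyword score, and it is an assumed parameter rather than one documented by the project.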

To avoid overfitting to a single prompt template during training, the team proposed the MAT method. It has two phases: template generation and instruction fine-tuning. In the template generation phase, a Meta-Agent is designed to generate instantiated Agent Prompt templates. The candidate results are compared with and scored against those of open-source templates, and only high-scoring candidates are kept, yielding a high-quality Agent Prompt template library. In the instruction fine-tuning phase, the team built more than 200,000 agent instruction fine-tuning examples from tens of thousands of templates. This improves the model's task planning, tool use, and reflection capabilities while avoiding over-reliance on any single template.
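The two MAT phases can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names `filter_templates` and `build_instruction_data` are hypothetical, `respond` stands in for a call to a large model, and word-set overlap replaces whatever scoring the team actually used.

```python
from typing import Callable, Dict, List


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of word sets; a crude stand-in for a model-based scorer."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def filter_templates(candidates: List[str], references: List[str], queries: List[str],
                     respond: Callable[[str], str], threshold: float = 0.5) -> List[str]:
    """Phase 1: keep candidate templates whose outputs resemble reference-template outputs."""
    kept = []
    for cand in candidates:
        scores = []
        for q in queries:
            cand_out = respond(cand.format(query=q))
            # Score against the closest output produced by any reference template.
            scores.append(max(token_overlap(cand_out, respond(ref.format(query=q)))
                              for ref in references))
        if sum(scores) / len(scores) >= threshold:
            kept.append(cand)
    return kept


def build_instruction_data(templates: List[str], queries: List[str],
                           respond: Callable[[str], str]) -> List[Dict[str, str]]:
    """Phase 2: pair every query with every kept template to diversify prompts."""
    data = []
    for q in queries:
        for t in templates:
            prompt = t.format(query=q)
            data.append({"prompt": prompt, "response": respond(prompt)})
    return data


def stub_respond(prompt: str) -> str:
    """Stand-in for a large model: plans with tools only if the prompt lists any."""
    return "plan: call search, then answer" if "Tools:" in prompt else "answer directly"
```

Because each query appears under several templates in the resulting data, a model fine-tuned on it sees the same task phrased in many prompt formats, which is the intuition behind avoiding reliance on one template.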

KAgentBench is an out-of-the-box automated evaluation benchmark for agent capabilities, built on carefully annotated data. It covers inputs for different types of capabilities, and each query comes with multiple templates and human-edited reference responses, so that both accuracy and generalization can be evaluated. The evaluation results show that after MAT fine-tuning, the 7B and 13B models improve significantly across all capabilities and surpass GPT-3.5.
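One way to read "multiple templates per query" is that a model is scored under each template, so the average reflects accuracy and the variation across templates reflects generalization. The sketch below follows that reading. It is not the KAgentBench scoring code: the data layout, the `evaluate` function, and the exact-match metric are all assumptions.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the strings match after trimming and lowercasing, else 0.0."""
    return 1.0 if prediction.strip().lower() == reference.strip().lower() else 0.0


def evaluate(benchmark: List[dict], model: Callable[[str], str],
             metric: Callable[[str, str], float] = exact_match) -> dict:
    """Score a model on every (query, template) pair and summarize per template."""
    per_template: Dict[str, List[float]] = {}
    for item in benchmark:
        for name, template in item["templates"].items():
            prediction = model(template.format(query=item["query"]))
            per_template.setdefault(name, []).append(metric(prediction, item["reference"]))
    template_means = {name: mean(scores) for name, scores in per_template.items()}
    values = list(template_means.values())
    return {
        "accuracy": mean(values),          # average over templates
        "template_spread": pstdev(values), # lower means more stable across templates
        "per_template": template_means,
    }


# Hypothetical two-item benchmark with two prompt templates per query.
TEMPLATES = {"a": "Q: {query}", "b": "Please answer: {query}"}
BENCHMARK = [
    {"query": "capital of China?", "reference": "Beijing", "templates": TEMPLATES},
    {"query": "2+2?", "reference": "4", "templates": TEMPLATES},
]


def brittle_model(prompt: str) -> str:
    """Stub model that only works with the template it was 'trained' on."""
    if not prompt.startswith("Q:"):
        return "unknown"
    return "beijing" if "capital" in prompt else "4"
```

A model that answers correctly under only one template, like `brittle_model` here, gets a high spread, which is the kind of single-template over-reliance the benchmark design is meant to expose.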

The team stated that AI agents are a promising direction. They will continue to refine the core technologies and explore how agent technology can be integrated with Kuaishou's business, with the aim of building more interesting and valuable applications. The open-source release gives the community a system, models, and a benchmark that researchers can use as resources and references.