ByteDance Responds to Reports That It Used OpenAI Technology to Develop Large Language Models

baoshi.rao

AI Home News: Foreign media recently reported that ByteDance has been using OpenAI technology to develop its own large language model, in violation of OpenAI's terms of service. In response, a ByteDance spokesperson said the company stresses compliance with OpenAI's terms of use whenever it uses OpenAI's services.

Reportedly, when the technical team began its initial exploration of large models earlier this year, some engineers used GPT API services in experimental projects involving smaller models. These models were for testing only, with no plans for release or external use. The practice was discontinued after the company introduced compliance checks on GPT API usage in April.

As early as April this year, ByteDance's large model team had established clear internal guidelines prohibiting the inclusion of GPT-generated data in the training datasets for ByteDance's large models. Additionally, the engineering team was trained to comply with OpenAI's terms of service when using GPT.

In September, the company conducted another internal review and took further measures to ensure that its use of the GPT API meets the applicable requirements. For example, it sampled model outputs in batches and tested their similarity to GPT's results, in order to prevent data annotators from using GPT without authorization.

ByteDance said it will conduct another comprehensive review in the coming days to ensure strict compliance with the relevant terms of service.