How Are Large Language Models Impacting Scientific Writing?

A recent study by four researchers from the University of Tübingen, Germany, and Northwestern University has revealed the significant impact of large language models (LLMs) on scientific writing. The study, which uses a method called excess-word analysis, highlights a spike in the use of certain words since the introduction of LLMs at the end of 2022.

What Did the Study Analyze?

The researchers analyzed more than 14 million abstracts of articles indexed in PubMed between 2010 and 2024. They compared the relative frequency of each word before and after the start of the LLM era to identify changes in vocabulary choices. The results show that a number of words that were previously rare, such as “delves,” “showcasing,” and “underscores,” spiked in usage once LLMs came into common use.
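The before-and-after comparison described above can be illustrated with a minimal sketch. This is not the study's actual code: the function names `doc_frequency` and `excess_usage`, the tokenization rule, and the toy abstracts are all assumptions made for illustration. It measures the share of abstracts containing each word in two periods and reports the gap and the ratio between them.

```python
import re
from collections import Counter


def doc_frequency(abstracts):
    """Return the fraction of abstracts in which each word appears at least once."""
    if not abstracts:
        return {}
    counts = Counter()
    for text in abstracts:
        # Count each word once per abstract (assumed tokenization: runs of letters).
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    total = len(abstracts)
    return {word: count / total for word, count in counts.items()}


def excess_usage(before, after):
    """Map each word seen in `after` to (frequency gap, frequency ratio) versus `before`."""
    freq_before = doc_frequency(before)
    freq_after = doc_frequency(after)
    result = {}
    for word, p_after in freq_after.items():
        p_before = freq_before.get(word, 0.0)
        gap = p_after - p_before
        # A word never seen before has no finite ratio.
        ratio = p_after / p_before if p_before > 0 else float("inf")
        result[word] = (gap, ratio)
    return result


# Toy data, invented purely for illustration.
before = [
    "we study the effect of treatment on patients",
    "this paper reports results of a clinical trial",
    "the study delves into patient outcomes",
    "results show improved outcomes in patients",
]
after = [
    "this study delves into treatment outcomes",
    "our work delves into the effect of therapy",
    "results show improved outcomes in patients",
    "the analysis underscores the need for trials",
]

scores = excess_usage(before, after)
# Rank words by how much more often they appear in the later period.
top_words = sorted(scores, key=lambda w: scores[w][0], reverse=True)[:5]
print(top_words)
```

In this toy example "delves" appears in a quarter of the earlier abstracts and half of the later ones, so its gap is 0.25 and its ratio is 2.0. A real analysis would need a far larger corpus and some way to separate genuine style shifts from topic-driven changes.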

Why Is This Important?

Dr. Andreas Müller, one of the lead researchers from the University of Tübingen, explained that this increase points to the use of LLMs in writing scientific abstracts. “We found that at least 10% of abstracts published in 2024 used an LLM in the process,” he said. These findings underscore the importance of detecting LLM use: although the resulting text can read as if a human wrote it, it may contain inaccurate references or false claims.

Comparisons with Global Events

The study also compared the post-LLM spike in word usage with the spikes seen during major world health events such as the COVID-19 pandemic. Dr. Müller explained that before the LLM era, spikes in word usage were generally tied to major global events, such as Ebola in 2015 and the COVID-19 pandemic from 2020 to 2022. Post-LLM spikes, however, are concentrated in style words such as verbs, adjectives, and adverbs.

Natural Language Evolution or Not?

Although the use of these words could increase naturally as language evolves, the researchers point out that such sudden and significant spikes were rarely seen before the LLM era. They also noted that LLM use may be more common among non-native English speakers who need help editing English text.

Future Implications

This discovery could make it easier for people to detect and remove unnatural style words from LLM-generated text. The researchers hope that knowledge of LLM marker words will help human editors screen generated text more effectively before it reaches the global scientific community.

The study was published as a preprint earlier this month and is expected to stimulate further discussion of the impact of generative AI technology on modern scientific communication.

