What Do Editors Do With Faked Reference Lists?

If you edit manuscripts for academic books and journal articles, you’ll likely run into fake reference lists. That’s because in the academic publish-or-perish setting, authors are desperate to write their articles quickly, and they may use large language models (LLMs), such as ChatGPT, to help them. And what is the one thing LLMs are really bad at? Reference lists. Fortunately, there’s a tool that can help you detect the fakes.

I use Edifix, subscription-based reference-checking software that users access online. Said Edifix on its blog recently:

“2023 has seen the explosion into the public consciousness of ChatGPT and other large language models (LLMs), and these AI [artificial intelligence] applications have been rapidly and widely adopted in educational and research writing settings. Much has already been written about the potential benefits and pitfalls of using LLMs in scholarly publishing, including 14 posts on the Scholarly Kitchen blog alone (listed chronologically under ‘Further Reading’ below).

“The rapid embrace of LLMs has brought with it another flavor of potential reference manipulation: fake references. …

“ChatGPT does not have a true understanding of the questions it is asked or the tasks it is set. Among the ‘nonsensical answers’ that ChatGPT can give, one type especially pertinent to research publishing is its inability to generate relevant and accurate citations.

“This failure was highlighted by Curtis Kendrick on the Scholarly Kitchen [blog] just two months after the public launch of ChatGPT. When he asked ChatGPT to provide a reference list for a piece it had written on racism and whiteness in academic libraries, the list of 29 references it provided revealed a number of eye-opening problems.

“First, half of the citations were from just two journals, and typically these references were incomplete, generally lacking volume and/or issue numbers. Partly this reflects the limitations of the dataset used to train the model, which, for example, had access only to open access articles. Much more worrying was that ChatGPT didn’t always admit to not knowing the answer, sometimes appearing to lie instead. Of the 29 references it came up with, only one was accurate; some contained elements of genuine references but with parts transposed, and others were completely fake.”

Now, how the heck are editors to straighten out this mess alone? Fortunately, they don’t have to.

First, in its blog post, Edifix gives us three clues to possible reference fakery:

  1. A low overall rate of links from the reference list’s entries to matching records on Crossref and PubMed (see the sketch after this list)
  2. Warnings from Crossref or PubMed Reference Correction about significant differences between the reference and the service’s metadata
  3. An excessive number of “unknown” references
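
Edifix automates all of this, but if you’re curious how clue 1 might be tested by hand, here is a minimal Python sketch, not Edifix’s own method, that asks the public Crossref REST API (api.crossref.org) whether each DOI in a list is registered. The function names and the contact address in the User-Agent header are placeholders of my own:

```python
import requests

CROSSREF_API = "https://api.crossref.org/works/"
# Crossref etiquette asks callers to identify themselves;
# the mailto address here is a placeholder.
HEADERS = {"User-Agent": "ref-check/0.1 (mailto:editor@example.com)"}

def doi_on_crossref(doi: str) -> bool:
    """Return True if Crossref has a record for this DOI (HTTP 200)."""
    resp = requests.get(CROSSREF_API + doi, headers=HEADERS, timeout=10)
    return resp.status_code == 200

def link_rate(dois: list[str]) -> float:
    """Fraction of a reference list's DOIs that Crossref recognizes.
    A low rate is clue 1 above."""
    if not dois:
        return 0.0
    return sum(doi_on_crossref(d) for d in dois) / len(dois)

if __name__ == "__main__":
    # Paste the DOIs from the manuscript's reference list here; this sample
    # uses the unregistered DOI that Edifix flags in the comment quoted below.
    sample = ["10.1016/j.jaridenv.2019.0"]
    print(f"{link_rate(sample):.0%} of DOIs resolve on Crossref")
```

A reference list where most DOIs fail a check like this deserves a much closer look.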

Second, Edifix parses all entries in each of your reference lists and gives you automated comments when something isn’t right, like this one:

Crossref does not recognize the DOI [digital object identifier; see https://en.wikipedia.org/wiki/Digital_object_identifier] “10.1016/j.jaridenv.2019.0”, and reports it is not registered at any other registration agency. Please check the accuracy of the DOI.
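
If you want to double-check a flagged DOI yourself, the doi.org proxy exposes a handle-lookup API that covers all registration agencies, not just Crossref. Here’s a minimal sketch (again, my own illustration, not how Edifix works internally):

```python
import requests

def doi_registered(doi: str) -> bool:
    """Ask the doi.org proxy whether a DOI is registered anywhere.
    The handle API returns responseCode 1 for a found handle and
    100 when the handle does not exist."""
    resp = requests.get("https://doi.org/api/handles/" + doi, timeout=10)
    return resp.json().get("responseCode") == 1

print(doi_registered("10.1016/j.jaridenv.2019.0"))  # the flagged DOI; expect False
```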

Then you’ll have to query the author about those particular references and also likely alert the managing editor or other appropriate person affiliated with the publisher or journal. Meanwhile, read the blog posts from the Scholarly Kitchen that are listed and linked at the end of Edifix’s blog post. You’ll find plenty of good stuff on AI.

#editor #academic #editing #Edifix #ArtificialIntelligence #ChatGPT #FakeReferences #ReferenceLists #PublishOrPerish
