Large Language Models – Where Are We and Where Are We Going?

How can generative AI help, and how can it hinder?

Published on Sep 11, 2023

The Story So Far

On August 20th, 2023, we had a great time co-presenting a webinar on “Large Language Models (LLMs) – Where Are We and Where Are We Going?” at the ISMTE 2023 Global Event. Caitlyn kick-started the session with a brief introduction to LLMs, which have taken the industry by storm in large part due to ChatGPT and OpenAI.

But what is an LLM? For those who may not be familiar, a large language model is “a type of Artificial Intelligence (AI) algorithm that uses deep learning techniques and massively large data sets to understand, summarize, generate, and predict new content. The term generative AI also is closely connected with LLMs, which are, in fact, a type of generative AI that has been specifically architected to help generate text-based content.” (TechTarget.com)

In some ways the term “AI” can seem misleading; LLMs are not, after all, the autonomous intelligences we see in science fiction. But LLMs are extremely powerful all the same and have had an incredible impact on the scholarly publishing conversation in an incredibly short amount of time. On November 30, 2022, OpenAI released the early demo of ChatGPT, which quickly went viral due to its impressive ability to answer questions and create content. The implications of such software soon began to affect many industries: scholarly publishing, of course, being among them. Several questions have been raised regarding the practical benefits of the model, but just as many (if not more) questions surrounding the ethical considerations began to penetrate Editorial Offices and their day-to-day operations. The first articles “authored” by ChatGPT (or where human authors attributed co-authorship to the platform) were submitted soon after the release, leading several publishers to take a stand against the use of the AI as an author of content. Nature published an article as early as January 18, 2023, discussing ChatGPT’s appearance as an author on papers (after already publishing on the topic of AI-as-an-author back in October 2022). Not long after, the publisher released a formal statement in its author guidelines that ChatGPT could not be attributed as an author, and that any use beyond that of a simple tool should be properly disclosed. Further ground rules were published in an Editorial, and other publishers began to follow suit.

What ethical considerations are at hand? Where do LLMs get their materials? Are AI companies scraping and using copyrighted content? If copyright isn’t an issue, attribution certainly is. Should authors use LLMs to proofread, edit, co-author, or author articles? What if reviewers use LLMs to review articles on their behalf? Where does the content that is loaded into an LLM go, and for how long is it stored? Are LLMs GDPR compliant, or is their use compliant in any way? How can editors spot LLM content if authors don’t disclose it directly? Could LLMs give paper mills and predatory publishers an even faster way to flood the publishing field? These are just a few of the questions that have sprung up in the last eight months, with more to come as we learn more about these models.

Thankfully, to help us navigate these waters, Chhavi dissected the scholarly publishing landscape into six parts: authorship, manuscript submission, peer review, publication, marketing, and policy. She discussed the uses of LLMs in each domain, both acceptable and unacceptable, with a focus on various stakeholders.

The Current Uses and Limitations of LLMs in Generating Scholarly Content

Generative AI can only be trained on previously published data, and it relies heavily on freely available content. Oftentimes this content has not undergone rigorous peer review or scientific scrutiny by experts (unlike the peer-reviewed, published data that generally sits behind paywalls), limiting its usefulness for adequately training these models. As a result, the content synthesized by generative AI tends to be generic, prompt-driven, prompt-specific, lacking in novelty, and devoid of the most recent trends and updates that have yet to be made publicly available. In fact, Rice and Stanford University researchers recently reported a phenomenon called Model Autophagy Disorder (MAD), in which output quality diminishes significantly once generative AI has been trained on AI-created data for about five iterations. Other researchers are exploring this “feedback loop” and finding that ChatGPT, to name one, has gotten significantly less accurate at answering certain questions over time. Chen et al. in their article “How Is ChatGPT’s Behavior Changing over Time?” reported that “GPT-4's success rate on ‘is this number prime? think step by step’ fell from 97.6% to 2.4% from March to June [2023].”
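To make this feedback loop concrete, here is a toy Python sketch (our own illustration, not the MAD researchers’ method): a trivial “model” is fit to some data, the next generation is trained only on the model’s own samples with the tails clipped, and the diversity of the data visibly collapses within a few generations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "real" training data from a rich distribution.
data = rng.normal(loc=0.0, scale=1.0, size=10_000)

for generation in range(1, 6):
    # "Train" a trivial model: estimate the data's mean and spread.
    mu, sigma = data.mean(), data.std()
    # Build the next training set entirely from the model's own output,
    # dropping the tails (models over-sample their most probable regions).
    samples = rng.normal(mu, sigma, size=10_000)
    data = samples[np.abs(samples - mu) < 1.5 * sigma]
    print(f"generation {generation}: spread = {data.std():.3f}")

# The printed spread shrinks every generation: the synthetic data grows
# steadily less diverse, a toy analogue of the degradation MAD describes.
```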

There are several clear perceived benefits of leveraging LLMs for generating content. For example, LLMs could be a tremendous asset to authors with English as their second language; however, LLMs should be used cautiously when generating what may be considered proprietary content. The input as well as the output from one user can be stored and easily serve as the output for another user. Thus, the use of LLMs should be carefully considered, and avoided altogether when working with unpublished ideas, as the novelty of that content may be compromised prior to its official publication. It is reasonable to use LLMs to create broad outlines for a scientific paper, or to generate summaries of articles to determine whether in-depth reading would benefit an author’s research. However, their use in synthesizing chunks of scientific papers is unacceptable. There is an urgent need for author awareness campaigns, especially for early-career researchers, to ensure that generative AI is not used to generate the substance of scholarly content, as it may compromise the novelty of such research as stated above. Similarly, reviewer awareness campaigns are needed to disallow any use of LLMs for conducting reviews of manuscripts (even partially). All of this is required to maintain the trust, sanctity, and high standards of scholarly publishing.

Currently, the majority of journals with instituted policies surrounding the use of generative AI prohibit LLMs/AI from being designated as authors, since they are non-human, non-legal entities that cannot be held accountable for the synthesized content. Many journals now seek author disclosures of the use of generative AI at submission to determine whether manuscripts qualify for peer review and subsequent publication (a determination performed by editorial office staff within the bounds of ethical considerations; at the moment, the use of LLMs does not disqualify work without human supervision). LLMs, when used in a sandbox model, can help improve efficiencies in the editorial office workflow by streamlining and harmonizing submission processes. LLMs could be used to assist with submission-related data such as author affiliations, disclosures, funding information, etc., by scanning documents to auto-populate relevant information during the submission process, thereby helping authors better adhere to the submission format and reducing back-and-forth correspondence between authors and editorial staff. Oftentimes these requests are missed, delaying the publication timeline and making the process cumbersome for all involved. However, in order to maintain the scientific integrity and novelty of the content and to ensure the peer-review process remains trustworthy, LLMs should not be used to pre-screen or screen submitted content in place of proper human scrutiny.
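As a rough illustration of the metadata auto-population described above, the sketch below asks an LLM to pre-fill submission fields for the author to verify. It assumes an OpenAI-style chat API; the model name, prompt, and JSON field names are our own hypothetical choices, not any vendor’s production workflow.

```python
import json
from openai import OpenAI  # assumes the openai package (v1+) and an API key in the environment

client = OpenAI()

PROMPT = """Extract the following fields from the manuscript text below and
return them as JSON with the keys: title, authors (list), affiliations (list),
funding_statement, disclosures. Use null for anything you cannot find.

Manuscript:
{text}
"""

def extract_submission_metadata(manuscript_text: str) -> dict:
    """Ask an LLM to pre-fill submission metadata for the author to verify."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable model would do
        messages=[{"role": "user", "content": PROMPT.format(text=manuscript_text)}],
        response_format={"type": "json_object"},  # request machine-readable output
    )
    return json.loads(response.choices[0].message.content)

# An editorial system would show this result to the author for confirmation,
# never auto-accept it: human verification stays in the loop.
```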

There is growing concern surrounding the (mis)use of AI-generated content, both text and images, with regard to copyright. If a human uses AI to generate a figure, does the human gain copyright for this image, or does the AI that actually generated it under the human’s prompt? If the AI alters existing content to generate a new image, does the original author still get credited, or the opportunity to provide copyright approval for this adaptation? How are copyright laws maintained when AI builds upon AI-generated images, which were themselves inspired by original author creations? According to the US Copyright Office, “it is well-established that copyright can protect only material that is the product of human creativity.” The Copyright Office goes as far as to clarify that “the term ‘author,’ which is used in both the Constitution and the Copyright Act, excludes non-humans.” There is evidently an imminent need to define policies for AI-generated content, especially in the context of copyright. The EU has been an early mover here, proposing a regulatory framework for AI back in April 2021. The goal of this new law is to classify AI systems by risk level, which would determine how much regulation is required. Additionally, as recently as August 21, 2023, United States District Court Judge Beryl A. Howell ruled that human beings, not programs, are an “essential part of a valid copyright claim,” and therefore copyright cannot be provided to or claimed by AI or AI-generated art. It logically follows that copyright could not be claimed for any AI-generated content, raising major questions for editorial offices and publishers that require authors to transfer said copyright to the publisher, society, or journal.

Recommended policy changes must be comprehensive enough to extend beyond internal organizations and guide external collaborations as well. These partnerships should include, but not be restricted to, contracting with independent editors, science writers, and marketing professionals leveraging LLMs for social media engagement. To summarize, new policies should include confidentiality clauses for reviewers, external content collaborators, and publication and marketing partners.

The conversation surrounding ChatGPT and LLMs doesn’t end there, however. After Chhavi led us through the implications from an author/reviewer/publisher standpoint, Jay took us through the latest updates from industry, where LLMs have shown golden promise while also fueling doomsday headlines.

Tool and Industry Updates

Before ChatGPT became hugely popular in late 2022, some AI writing tools already existed. Small groups of researchers, writers, and other people who work with text used these early tools to help them with their daily work. Tools like Paperpal, R Discovery, Scholarcy, Scite, Consensus, Elicit, Semantic Scholar, and Grammarly let users summarize texts, come up with ideas, check grammar, and more. But these early AI assistants weren't talked about as much as ChatGPT is today.

ChatGPT's launch on November 30, 2022, made AI writing very mainstream and popular, sparking a rush of interest in AI that can generate text. To many people, tools like ChatGPT feel new and revolutionary, but they actually build on years of steady progress in AI research and writing aids that came before. The early, lesser-known AI writing tools laid the foundation for the huge boom we're seeing today in AI.

The foundations of AI can be traced back over a century, to when the concept of human-like robots was introduced (recall the Tin Man from The Wizard of Oz and the robots in Metropolis). Artificial intelligence as an idea was born in the 1950s through the work of Alan Turing, John McCarthy, Marvin Minsky, Allen Newell, Cliff Shaw, and Herbert Simon. Over the next seven decades the journey was not always smooth; there were both successes and setbacks.

All this has gotten us to today, and we can say that the “Age of AI” has truly begun. All the right ingredients are there: the internet, high-speed broadband, smartphones, cloud computing, fast computers, vast quantities of user-created content, and capital waiting to be invested.

The image below, created by Daniela G. Duca, is a great illustration of the rapid growth of AI tools in scholarly publishing. It is an ever-expanding universe.

AI and LLMs have the potential to revolutionize research, writing, publishing, and education. These tools can save researchers time, speed up publishing, catch issues before and after publishing, let more people access and read the content, and customize education to promote better understanding. However, it is crucial to approach their use thoughtfully, weighing ethical considerations and potential limitations and risks.

Challenges

There must always be a balance in the universe: we cannot have opportunities without challenges, and this is true with AI as well. As AI technology keeps advancing, it raises a few key challenges for the scholarly publishing community:

  • The initial instinct might be to offload all your tedious work to AI, to use it for everything that is not a core function of your research or your work. However, putting too much faith in AI could lead to more work and to issues with quality, context, ethics, and integrity. There is no such thing as an unsupervised AI assistant; humans will be a necessary part of the process for many years to come. Do not rush out to replace your human experts with AI; instead, use AI to make your people smarter, faster, and happier.

  • There are copyright, ethical, and legal issues when using AI that need to be explored further. Users are not fully aware of what data was used to train the AI they are using, or the quality of the data. An author could mistakenly use output from an AI that violates copyright and be legally liable since the AI cannot be held responsible. Ethically, the author also has to ensure that what they submit and publish stands up to peer review, and that the output and references produced using an AI are legitimate and correct.

  • AI bias is another issue that users may not consider while generating content. Humans are prone to bias, and AI is trained on human-generated content, so it is reasonable to expect AI to reflect our biases. Since many of the widely used AI systems are trained on English content, studies have shown a bias against non-native English speakers. The bias is also present when you ask a generative AI image creator to show you what a CEO looks like (almost always a white male) or what a housekeeper looks like (almost always a minority female).

  • Paper mills are an old issue that has persistently gotten worse, and as AI lowers the barriers to creating convincing content, we can safely assume that paper mill submissions will increase rapidly. I know the quick answer is to use an AI detector, but those tools are so unreliable that it may be better to flip a coin.

  • The biggest challenge is that publishers are not tech companies, and AI is evolving rapidly. Keeping up will require cross-publisher collaboration and partnership with tech companies, subject matter experts, and other stakeholders in education and research. So do not rush to get rid of your people: they might just be your best defense.

Opportunities

But, as mentioned, with challenges come an equal harvest of opportunities. LLMs could spark a revolution in how we work and disseminate knowledge, lowering barriers globally and providing opportunities to address inequity:

  • Automated Summarization: This capability has been around for some time now (Scholarcy, Paper Digest, Elicit), but it has improved greatly with the introduction of GPT-3, 3.5, and now 4. Summarizing content in abstractive and extractive forms makes it more accessible, understandable, and useful (a minimal sketch of the extractive approach follows this list). Publishers and vendors are actively testing this to determine how it can help reduce the cost and time required to generate Plain Language Summaries.

  • Language Editing: It is really difficult for any author to get published, but it is even harder for English as an Additional Language (EAL) authors in English-language journals. AI and LLMs hold significant promise in making the publishing process easier for all authors, and can particularly help EAL authors improve the quality of their manuscripts and reduce rejection rates. Products such as Paperpal, InstaText, Grammarly, AJE, Trinka, Writefull, and QuillBot are already available to authors and publishers, and more products are being launched regularly.

  • Content Generation: For researchers, one of the most challenging parts of research is writing the manuscript. It is difficult to distill years of research into a few thousand words, and harder still to condense it into an abstract of under 250 words. The difficulty increases when English is not your first language. Enter AI and LLMs, which can help researchers explain their complicated studies in simple words and help people who speak languages other than English share their research clearly. This technology is changing how researchers talk about their work, making it easier for everyone to understand and learn from.

  • Text Data Mining: One of the most promising opportunities for publishers is utilizing text and data mining to better monetize content, uncover new insights, and further science. Researchers, institutions, and corporations are desperate to learn what knowledge lies in published research and how it can guide science in the future. Publishers can build knowledge models using existing content, build a searchable database by extracting illustrations and images, and develop chatbots to answer scientifically relevant questions. All of these services can be easily licensed in order to generate revenue and create new lines of business.

  • Translation and Accessibility: Open Access unleashed access to vast amounts of content, but it did not guarantee accessibility to all. The key hurdle is, of course, language: translating content from one language to another, especially STEM content, is expensive and time consuming. AI can perform translations and thereby provide greater accessibility, making it easier, faster, and cheaper to translate the content users need to study, conduct research, do their work, or help people in their community. Mobile users have been using AI-powered solutions like Google Translate and DeepL for several years, and recently R Discovery launched the ability to translate abstracts from English into twenty-plus languages in real time. For publishers seeking to expand their audience, there is no better option than offering real-time translation to users at the article level.
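As promised above, here is a minimal sketch of the extractive flavor of summarization mentioned in the first bullet: sentences are scored by TF-IDF weight and the highest-scoring ones are kept in document order. The sample abstract and naive sentence splitting are deliberately simplistic; production tools such as Scholarcy use far more sophisticated methods.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def extractive_summary(text: str, n_sentences: int = 2) -> str:
    """Score sentences by total TF-IDF weight and keep the top ones,
    preserving their original order -- a classic extractive approach."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(sentences)
    scores = np.asarray(tfidf.sum(axis=1)).ravel()   # one score per sentence
    top = sorted(np.argsort(scores)[-n_sentences:])  # keep document order
    return ". ".join(sentences[i] for i in top) + "."

# Toy abstract, invented purely for demonstration.
abstract = (
    "Large language models are transforming scholarly publishing. "
    "They can summarize, translate, and edit manuscripts. "
    "However, their output must be verified by human experts. "
    "Publishers are piloting these tools to produce plain language summaries."
)
print(extractive_summary(abstract))
```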

Both Chhavi’s and Jay’s presentations gave us a lot to consider and chew on. The promises versus the concerns of LLMs can seem to almost cancel each other out: how can we get to the marvelous opportunities promised while also navigating the ethical issues and challenges inherent in today’s systems? We spent some time after the session discussing these issues in the Lounge, but the conversation will continue for many years and AI iterations to come!

Future Perspectives: Things to Ponder Before Panicking

AI will not take over…not yet anyway. It is now evident that LLMs cannot surpass the creative and intellectual performance of humans, and therefore they should be used cautiously to improve select functions in the scholarly publishing domain. But while they cannot surpass human ingenuity and creativity, AI/LLMs can help humans perform certain tasks more efficiently, reduce time spent on repetitive tasks, and help improve the user experience. As we move to a mostly Open Access world, it becomes vital for publishers to improve the author experience, and AI can help. Imagine if the submission system could extract all the important details and ask the author only to verify the data, providing a quality check prior to submission that could improve the manuscript (pre-submission!) and reduce rejection rates. Editors could more easily identify the right reviewers based on the manuscript data, and invitations and follow-ups would be fully automated. A manuscript could be automatically transferred from one journal to another with zero editorial involvement. All of this is possible, and it does not replace humans but frees them up to focus on critical problems and thereby create more value for the journal. At the same time, we have to be aware that all of this depends on good-quality training data. As mentioned previously, repeated use of LLM-generated content to train AI leads to a drastic reduction in the quality of synthesized output. We only get out as much good as we put in.
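As a sketch of how such reviewer matching might work (our own illustration, not any existing system’s implementation), the snippet below ranks invented reviewer profiles by the cosine similarity between TF-IDF vectors of a submission’s abstract and each reviewer’s recent work. A real system would likely use richer embeddings and full publication records.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical candidate pool: each reviewer summarized by recent abstracts.
reviewers = {
    "Reviewer A": "deep learning methods for protein structure prediction",
    "Reviewer B": "peer review policy and research integrity in publishing",
    "Reviewer C": "large language models for scientific text summarization",
}

submission = "evaluating LLM-generated summaries of biomedical literature"

# Vectorize the submission together with the reviewer profiles so that
# all documents share one vocabulary.
corpus = [submission] + list(reviewers.values())
matrix = TfidfVectorizer(stop_words="english").fit_transform(corpus)

# Similarity of the submission (row 0) to each reviewer profile.
scores = cosine_similarity(matrix[0], matrix[1:]).ravel()
for name, score in sorted(zip(reviewers, scores), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")  # highest-scoring reviewers first
```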

We must be wary of increasing (instead of reducing) inequity. On the flip side, there is a growing concern that LLMs could exacerbate existing inequities in the scholarly publishing domain if the authors who leverage them to polish their content are identified as LLM “users” and flagged for it. There is also a deep concern that LLMs could create new inequities as their use is monetized, which may limit their benefits to those with resources and further push back contributors from resource-limited settings. Unequal access to high-quality training datasets may also skew the performance of the LLMs themselves, affecting individuals who leverage different models and providing less-satisfactory outcomes.

Those with their feet on the ground should “stay frosty.” Editorial office members should be aware of both the promises and pitfalls of LLMs and their usage. New tools and integrations could make it easier to process ever-increasing submission loads, providing more time for higher-level work and a better author experience during both submission and peer review. That being said, what helps those with integrity also helps those who seek to subvert the system. The threat of paper mills and predatory publishers utilizing LLMs to increase their output of fraudulent papers is real, as is the potential for ill-intentioned authors to generate fabricated content and images. Editors should be wary of a sudden and dramatic increase in submissions (always ask yourself: is it too good to be true?). Reviewers should likewise remain vigilant when reviewing submissions and data. Growth is almost always seen as a positive reflection of a journal’s health, but with the ease of generating pulp papers enhanced, that growth should be scrutinized through best practices at the editorial level and by enforcing rigorous peer review. There is an urgent need to institute frameworks and guidelines for the use of LLMs in the scholarly publishing domain. Doing so will allow publishers to benefit from all of their promising offerings while curtailing malicious use, which may otherwise overburden and overwhelm already strained editorial workflows. Publishers can collaborate with each other to identify trends and sources of fraudulent behavior, thereby reducing retractions and expressions of concern, which are of course damaging to journals, publishers, and scholarly integrity overall.

We have the opportunity to “build it better.” The scholarly publishing industry could also benefit by pooling resources (specifically the copyrighted, peer-reviewed, published data that generally sits behind paywalls) to train a domain-specific, more reliable, trustworthy LLM. Such an LLM could then be used by the entire industry, like a sandbox model, to help improve efficiencies while maintaining the trustworthiness and sanctity of the peer-review system. The rigor and reproducibility of the scientific literature could be evaluated, thereby laying a strong foundation for future research.

In summary, LLMs are here to stay and will continue to become integral parts of the scholarly publishing domain. With human supervision, we can maximize their responsible utilization to continue benefiting from their use while empowering our teams to be more creative and efficient. The human element will remain a necessary feature of expanding, enhancing, and polishing LLM use in scholarly publishing and elsewhere. But as with all innovations, we must remain vigilant that the advancements do not come at the cost of diversity, equity, accuracy, and integrity.

General Resources:

Guest Post - Academic Publishers Are Missing the Point on ChatGPT - The Scholarly Kitchen

New Article: “An Initial Scholarly AI Taxonomy”

[2304.06794] ChatGPT cites the most-cited articles and journals, relying solely on Google Scholar's citation counts. As a result, AI may amplify the Matthew Effect in environmental science

Guest Post: AI and Scholarly Publishing — A (Slightly) Hopeful View

Predicting the success of news: Using an ML-based language model in predicting the performance of news articles before publishing

Inside the AI Factory: the humans that make tech seem human - The Verge

Authorship Resources and Recommendations:

Prof. ChatGPT and Elsevier: ChatGPT as the Author of a Research Paper

Opinion Paper: “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy - ScienceDirect

Artificial intelligence (AI) in decision making | COPE: Committee on Publication Ethics

Chatbots, Generative AI, and Scholarly Manuscripts || WAME

CSE Guidance on Machine Learning and Artificial Intelligence Tools - Science Editor

Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals

Nonhuman “Authors” and Implications for the Integrity of Scientific Publication and Medical Knowledge | Health Informatics | JAMA

Artificial Intelligence (AI) | Nature Portfolio

Generative Artificial Intelligence and Copyright Law

Federal Register :: Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence

Generative AI Has an Intellectual Property Problem

Justices rule against Andy Warhol estate in copyright dispute - SCOTUSblog

US Copyright Office wants to hear what people think about AI and copyright - The Verge
