Polly Field, Tomas Rees and Richard White, Oxford PharmaGenesis, UK
Email your questions and comments on this article to TheMAP@ismpp.org.
A systematic literature review (SLR) allows us to find and evaluate existing evidence to answer a specific research question. But with increasing interest in reviews that are rapidly updated and cover a wide evidence base – one that is growing with the exponential rise in the volume of scientific literature – the resources needed to develop these reviews are under increasing demand, with multiple people required to ensure that the methods are reproducible and the evidence is accurate. This is one reason that many people are turning to artificial intelligence (AI), especially for large and complex SLRs, which would otherwise not be feasible because of cost or turnaround time.
Here we consider how and when AI should be used to develop SLRs for publication, drawing on our experience as AI users, SLR experts and publication professionals. We provide guidance on the potential applications of AI, and the issues that should be considered when choosing an AI-based approach, deliberately avoiding describing specific tools, as these are evolving daily.
What advantages can AI bring?
AI can help with the scale, efficiency, quality and understanding of some SLRs.
- Scale/volume of evidence: AI can enable literature reviews at scales not previously considered feasible. The larger the SLR, the greater the benefit.
- Efficiency: AI can learn from initial work, speeding up subsequent updates, for example by reusing large language model (LLM) screening questions (prompts) optimized in the initial review or by fine-tuning machine learning (ML) algorithms on the original data set.
- Quality: use of AI, with appropriate checks, has the potential to improve accuracy and reduce researcher bias.
- Understanding: AI can help summarize, group and visualize data, revealing patterns and trends in the underlying information.
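To illustrate the efficiency point above, a screening prompt optimized in an initial review can be stored as a template and refilled with each update's abstracts. The sketch below is a minimal, hypothetical Python example; the prompt wording, the criteria and the template structure are our assumptions, not any specific tool's approach, and a real workflow would send the filled prompt to an LLM and record its decision for human review.

```python
# Minimal sketch: a reusable LLM screening-prompt template for SLR updates.
# The template text and criteria below are illustrative assumptions only;
# in practice the filled prompt would be sent to an LLM API and the
# include/exclude decision checked by a human reviewer.

SCREENING_PROMPT = (
    "You are screening abstracts for a systematic literature review.\n"
    "Inclusion criteria:\n{criteria}\n\n"
    "Abstract:\n{abstract}\n\n"
    "Answer INCLUDE or EXCLUDE, and quote the sentence that supports your answer."
)

def build_screening_prompt(criteria: list[str], abstract: str) -> str:
    """Fill the reusable template with review-specific criteria and one abstract."""
    bullet_list = "\n".join(f"- {c}" for c in criteria)
    return SCREENING_PROMPT.format(criteria=bullet_list, abstract=abstract)

# Hypothetical criteria for a review update; only the abstracts change
# between the original review and each update.
criteria = [
    "Randomized controlled trial",
    "Adult patients with type 2 diabetes",
]
prompt = build_screening_prompt(criteria, "We report a randomized trial of ...")
print(prompt)
```

Keeping the validated template fixed between updates is what makes the initial prompt-optimization effort reusable, and it also aids reproducibility, because the exact prompt can be reported in the methods.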
How can AI help in the different stages of an SLR for publication?
AI can help across the full workflow of SLR development if used correctly and with people to check and adjust the output.2 In Figure 1, we describe the current major uses by stage.
Considerations for evaluating AI tools for use in SLRs for publication
Before using a tool or platform, it should be critically appraised to understand potential advantages and risks. This includes testing of usability – whether the team can easily interact with the AI and incorporate it into the SLR workflow – and testing of performance, including sensitivity, specificity, time and resource use. There are important ethical, practical, business and legal considerations when using AI relating to the following risks.
- Confidential, commercially sensitive or personal information
- Any data, information or files uploaded into an AI tool may essentially enter the public domain or be used to train a provider’s AI model, potentially breaching confidentiality and/or copyright. Choice of provider, and careful assessment of how they use any data uploaded to the tool, are key.
- Quality of output, including accuracy, hallucination, omissions and bias
- LLMs are trained on massive data sets of existing text. By analyzing these data, the AI learns the patterns and relationships within that content, which, after further training, gives it the ability to perform specific tasks such as answering questions about a document. AI does not understand what it is creating and cannot judge the quality, usefulness or correctness of what it generates.
- In general use, LLMs tend to adhere to text patterns that occur in their training set. They can easily generate material that is grammatically correct and convincingly written, but factually or logically incorrect. In the SLR context, this is mitigated by using retrieval-augmented generation, in which contextual content is provided to the LLM for use in developing the output. Many tools enable the user to easily validate LLM responses by highlighting the source text. This helps with fact-checking, but errors of omission remain a challenge.
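The performance testing described earlier (sensitivity, specificity) can be made concrete by comparing an AI tool's screening decisions against a human-adjudicated gold standard on a validation sample. Below is a minimal sketch; the function name and the sample decisions are illustrative assumptions, not data from any real evaluation.

```python
# Minimal sketch: sensitivity and specificity of AI screening decisions
# against a human gold standard. All labels below are illustrative.

def screening_performance(ai_decisions: list[bool],
                          human_decisions: list[bool]) -> tuple[float, float]:
    """Return (sensitivity, specificity); True means 'include the record'."""
    pairs = list(zip(ai_decisions, human_decisions))
    tp = sum(a and h for a, h in pairs)            # AI and human both include
    tn = sum(not a and not h for a, h in pairs)    # both exclude
    fn = sum(not a and h for a, h in pairs)        # AI misses a relevant record
    fp = sum(a and not h for a, h in pairs)        # AI includes an irrelevant one
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity

# Hypothetical validation sample: AI vs. dual human screening of 8 records.
ai    = [True, True, False, False, True, False, True, False]
human = [True, True, True,  False, True, False, False, False]
sens, spec = screening_performance(ai, human)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# → sensitivity=0.75, specificity=0.75
```

For screening, sensitivity is usually the priority, because a false negative (a relevant record the AI excludes) is an error of omission that no amount of downstream fact-checking will surface.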
Fundamental principles for the use of AI in SLRs for publication
The use of AI in any project should be aligned with relevant formal institutional and corporate policies for all parties involved, ensuring a consistent and ethical approach. A fundamental principle of any publication is that the authors of an SLR are responsible for the quality and accuracy of their work, whether developed entirely by themselves or with assistance from AI; the use of AI does not alter the authors’ ownership of quality.
- Explainability and reproducibility – these are key challenges with SLRs, even when conducted by humans. LLMs add to the problem because they are by nature stochastic. Furthermore, differences in training sets and decisions in model development are often not adequately disclosed by the developers (this affects some tools more than others).
- Appropriateness of AI to the question being asked and the available evidence – the potential benefits of using AI in the various stages of the SLR process will differ with every project. For example, a novel therapeutic modality used in a previously untreatable rare disease may have a small corpus of literature with a broad range of terminology and novel endpoints, posing challenges for the use of existing LLMs or the training of new ML approaches.
- Transparency in the use of AI and its reporting – everyone involved in an SLR should agree to the use of AI, and the approaches taken should be clearly described in subsequent publications in line with relevant journal or conference guidelines.
- Assuring quality and accuracy – humans providing expert knowledge and oversight are crucial; AI should be considered an augmentation, not a human replacement. A rigorous methodology with multiple layers of human validation should be developed and agreed by all authors involved in the SLR, noting that reviewing AI-generated material requires not just more checking, but a different focus while checking.
What guidelines exist for the use of AI in SLRs for publication?
The use of AI is captured in many guidelines related to publication and SLR methodology, and it is important to ensure compliance. The International Committee of Medical Journal Editors recommendations note that authors using AI to conduct a study (which we take to include AI use in literature reviews) should describe its use in the methods section in sufficient detail to enable replication of the approach.3 The recommendations further note that authors using AI for writing assistance should report this in the acknowledgments section.3

Many journals also have policies on the use of generative AI, and authors should check the policy of the specific journal before submitting a review to ensure compliance. Overall, policies accept that authors can use generative AI, provided that they maintain responsibility for the content and accuracy of the work. This can mean that AI is used to help with exploring ideas, searching and classifying literature, and improving the language used, but that authors remain accountable for the originality, validity and integrity of their work.

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines are widely used for the reporting of SLRs, and the 2020 statement includes guidance on how the use of AI should be reported.4 This is limited to the early stage of the review, with authors asked to report the number of records removed before screening.4
Conclusions
AI is here to stay and can help medical publication professionals develop high-quality work, including SLRs for publication. It is up to all of us to keep engaging, to be alert to new approaches and applications, and to evaluate, upskill and start to use these new and evolving tools. Above all, we need to stay transparent and acknowledge our use of AI, making clear that humans remain responsible for the quality of SLRs.
Acknowledgments
This article was critically reviewed by Kim Wager, Martin Callaghan, Gemma Carter and Jacob Willet from Oxford PharmaGenesis; Jody Filkowski from Medlior contributed to the original planning and content of the article.
Bibliography
- Higgins JPT, Thomas J, Chandler J et al. Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook.
- Teperikidis E, Boulmpou A, Potoupni V et al. Does the long-term administration of proton pump inhibitors increase the risk of adverse cardiovascular outcomes? A ChatGPT powered umbrella review. Acta Cardiol 2023;78:980–8.
- International Committee of Medical Journal Editors. Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals (updated January 2024). Available from https://www.icmje.org/news-and-editorials/icmje-recommendations_annotated_jan24.pdf (Accessed 12 August 2024).
- Page MJ, McKenzie JE, Bossuyt PM et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Syst Rev 2021;10:89.