Artificial Intelligence (AI) in the Academic Health Sciences: Overview

Introduction

This guide provides information and resources on the applications of Artificial Intelligence (AI) in the academic health sciences. This includes the use of AI in teaching, research, and publishing. The primary types of AI used in these areas are Generative AI, including AI chatbots, and Deep Learning technology. For definitions of key terms used in the discussion of AI, see our glossary of terms. 

Glossary

Some important concepts for understanding the artificial intelligence landscape:

Algorithm: "a set of rules or instructions that tell a machine what to do with the data input into the system."

Deep Learning: "a method of machine learning that lets computers learn in a way that mimics a human brain, by analyzing lots of information and classifying that information into categories. Deep learning relies on a neural network."

Generative AI: a "system [that] takes in data and then uses predictive algorithms (a set of step-by-step instructions) to create original content. In the case of a large language model (LLM), that content can take the form of original poems, songs, screenplays, and the like produced by AI chatbots such as ChatGPT and Google Bard. The 'large' in LLMs indicates that the language model is trained on a massive quantity of data. Although the outcome makes it seem like the computer is engaged in creative expression, the system is actually just predicting a set of tokens and then selecting one."

Hallucination: "a situation where an AI system produces fabricated, nonsensical, or inaccurate information. The wrong information is presented with confidence, which can make it difficult for the human user to know whether the answer is reliable."

Large Language Model (LLM): "a computer program that has been trained on massive amounts of text data such as books, articles, website content, etc. An LLM is designed to understand and generate human-like text based on the patterns and information it has learned from its training. LLMs use natural language processing (NLP) techniques to learn to recognize patterns and identify relationships between words. Understanding those relationships helps LLMs generate responses that sound human—it’s the type of model that powers AI chatbots such as ChatGPT."

Machine Learning (ML): "a type of artificial intelligence that uses algorithms which allow machines to learn and adapt from evidence (often historical data), without being explicitly programmed to learn that particular thing."

Natural Language Processing (NLP): "the ability of machines to use algorithms to analyze large quantities of text, allowing the machines to simulate human conversation and to understand and work with human language."

Neural Network: "a deep learning technique that loosely mimics the structure of a human brain. Just as the brain has interconnected neurons, a neural network has tiny interconnected nodes that work together to process information. Neural networks improve with feedback and training."
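The "interconnected nodes" in the definition above can be illustrated with a single node. This is a minimal, purely illustrative sketch (the inputs, weights, and bias values are arbitrary, and real networks contain many layers of such nodes): each node computes a weighted sum of its inputs and passes the result through an activation function, here a sigmoid.

```python
import math

def neuron(inputs, weights, bias):
    # One node: a weighted sum of inputs, squashed through a
    # sigmoid activation to produce an output between 0 and 1.
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))

# Arbitrary example values chosen for illustration only.
print(neuron([1.0, 0.5], [0.4, -0.2], 0.1))  # ≈ 0.599
```

Training a network means adjusting the weights and bias based on feedback so the outputs improve over time.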

Token: "the building block of text that a chatbot uses to process and generate a response. For example, the sentence 'How are you today?' might be separated into the following tokens: ['How', 'are', 'you', 'today', '?']. Tokenization helps the chatbot understand the structure and meaning of the input."
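The splitting described above can be sketched with a naive whitespace-and-punctuation split. Note this toy function is only illustrative: production chatbots use trained subword tokenizers (such as byte-pair encoding), which often split words into smaller pieces.

```python
import re

def toy_tokenize(text):
    # Naive illustration: split on word characters vs. punctuation.
    # Real LLM tokenizers operate on learned subword units instead.
    return re.findall(r"\w+|[^\w\s]", text)

print(toy_tokenize("How are you today?"))
# ['How', 'are', 'you', 'today', '?']
```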

Definitions from:

Monahan, J. (2023, July). Artificial intelligence, explained. Carnegie Mellon University's Heinz College. https://www.heinz.cmu.edu/media/2023/July/artificial-intelligence-explained

AI Limitations

Generative AI is an emerging topic in academic health sciences research. It is important to understand some of the limitations and ethical considerations in using AI for this purpose. Here are a few examples of those limitations:

  • Bias: The data used to train Generative AI models can contain bias that is then reflected or perpetuated in the outputs.
  • Data Privacy: Data may be collected and used in ways that aren't transparent or disclosed to the user. This is why UW-Madison restricts institutional data from being entered into most generative AI products or services. 
  • Copyright: The use of AI in research projects raises issues of authorship, in addition to intellectual property right concerns of publishers and creators of the content contained in the training data.
  • Currency/Content: Generative AI is limited by the sources it has access to and how up to date that information is. Sources behind paywalls or firewalls are generally not accessible, which impacts the quality of the AI model's answers.
  • Reproducibility: The same prompt can yield differing results across users and sessions.
  • Inaccuracy: Generative AI works by predicting what the user is looking for. This can lead to inaccurate answers and hallucinations, including invented references and citations.

For information on ethics and generative AI, see the UW-Madison Generative AI guide.