Abstract
ChatGPT, a lightweight and conversational variant of Generative Pretrained Transformer 4 (GPT-4) developed by OpenAI, is one of the milestone Large Language Models (LLMs), with billions of parameters. LLMs have stirred up much interest among researchers and practitioners owing to their impressive skills in natural language processing tasks, which profoundly impact various fields. This paper mainly discusses the future applications of LLMs in dentistry. We introduce two primary LLM deployment methods in dentistry, automated dental diagnosis and cross-modal dental diagnosis, and examine their potential applications. Notably, equipped with a cross-modal encoder, a single LLM can manage multi-source data and conduct advanced natural language reasoning to perform complex clinical operations. We also present cases to demonstrate the potential of a fully automatic Multi-Modal LLM AI system for dental clinical application. While LLMs offer significant potential benefits, challenges such as data privacy, data quality, and model bias need further study. Overall, LLMs have the potential to revolutionize dental diagnosis and treatment, indicating a promising avenue for clinical application and research in dentistry.
Introduction
Artificial intelligence (AI) has driven progress in digital health for many years.1,2 AI-equipped applications in dentistry have proved useful in analyzing medical imaging, including diagnosing dental caries,3,4 periodontitis,5 and implantitis,6 and in assisting oral and maxillofacial surgery with surgical planning.7 Beyond imaging data, audio data analysis can also benefit from deep-learning applications, as speech is one of the most important functions of the oral structure.8,9 Furthermore, dental education is another emerging application.10 GPT-4, released by OpenAI, ushers in a new era of AI-powered large language models (LLMs). ChatGPT, built upon GPT-4, has stirred up great interest among millions of scientists and engineers on account of its impressively human-like conversational responses as a chatbot.11 However, its potential to revolutionize a series of technologies is more significant. Unlike earlier applications, ChatGPT is trained conversationally on a tremendous knowledge base, enabling informative communication that improves decision-making. Before ChatGPT, most AI technologies focused on single-input, single-output systems whose performance relies on the amount of training data. With the influx of new data, re-training is required to update the existing model for more accurate decision-making. ChatGPT breaks through this mode and incorporates conversation to dynamically capture multiple sources of existing knowledge for question answering.2,12,13 This human-friendly feature facilitates the diagnosis process and significantly changes the status quo, and its advancement will also shape digital health in dentistry.1,14,15,16 The purpose of this paper is to provide an overview of the potential applications of ChatGPT in dentistry.
Journey of LLMs
Before LLMs garnered significant attention, language modeling underwent a series of revolutions over the past decade. Early natural language modeling was carried out with n-grams,17 a probabilistic approach that is simple yet effective for medical research.18,19 The first milestone after n-gram modeling was word embedding, which represents words in a vector space to understand natural language from a new quantitative perspective, promoting clinical research on document analysis.20,21 Among a range of representation models, ELMo,22 proposed by AllenNLP, changed the game with bi-directional model pretraining. This modeling approach also influenced medical language research20 and has been evaluated in that setting.23 Since then, large pretrained deep-learning models have been proposed, such as the bi-directional BERT24 and the autoregressive Generative Pretrained Transformer (GPT).25 Built upon these models, a range of medical language models have been proposed to accelerate medical research, such as a family of medical BERT models,26,27,28,29 and clinical researchers have found that increasing model size significantly improves a variety of medical applications.21,30,31,32 However, these models were limited to medium scales by architecture design and hardware support, although some efficient algorithms have been proposed for the medical domain.20,33,34 One of the most important LLMs is T5, proposed by Google, with 11 billion parameters.35 A rival model is GPT-3, developed by OpenAI, which contains 175 billion parameters. These billion-parameter models open a new chapter for LLMs and their applications. One of the most successful applications is ChatGPT, a variant of InstructGPT36 developed upon GPT-3 and optimized by conversational response training. ChatGPT is equipped with interactive training involving reinforcement learning from human feedback and exhibits powerful language skills, generating human-like text in real-time conversation.
This interactive modeling also influences medical research areas such as education.37,38 All of these advances rely on large-scale representation pretraining, which has become critical to complex problem-solving with cross-modal data, even for ChatGPT.
Large-scale vision-language pretraining
Vision-language pretraining is an important approach to solving text-to-image or image-to-text tasks, training a deep neural network on large image and text datasets. One vital training framework is CLIP, proposed by OpenAI, which was further developed by Salesforce into BLIP.39 For text-to-image, the GAN, a prototypical image generation model, can be integrated with text representations to generate diverse, authentic-looking but synthetic images.40 Recently, the diffusion model,41 a rival of the GAN with higher computational efficiency and image diversity, has been incorporated into vision-language representation pretraining. For example, DALL-E 242 leverages CLIP representations and a diffusion decoder to generate images from textual descriptions. To address medical domain-specific problems, a series of efficient representation learning models have also been developed to empower intelligent healthcare services. BERT provides an effective solution for efficient inference and disease analysis.43 Multi-modal learning has also been explored to improve medical visual question answering.44 Although various applications have been proposed, studies on integrating these powerful models into dental diagnosis remain limited.
Large-scale audio-language pretraining
Compared to vision-language models, audio-language pretraining is less prevalent, but representation learning with audio and text data still exhibits impressive audio-to-text performance. Medium-scale models such as MusCALL,45 CTAL,46 Wav2Seq,47 and LAVA48 indicate the superiority of representation pretraining for speech recognition. One important large-scale pretraining model is Whisper,49 released by OpenAI, which was trained on 680,000 h of diverse audio-text pairs from the web. Inspired by the success of these works, an improved medical speech-to-text pretraining model has been developed to more effectively link vocal signals to language generation and understanding.50 Since audio-language pretraining research is still in its exploratory phase, few studies demonstrate how to employ this pretraining framework to facilitate oral treatment.
Multi-modal LLM
With tremendous success in cross-modal training, more research attempts to incorporate multi-modal representation learning to empower LLMs. As one successful attempt, GPT-451 demonstrates the competence of LLMs in a multitude of NLP applications, such as achieving high scores on the GRE and other question-answering tasks. This implies great potential for Multi-Modal LLMs in various areas, such as digital health. For example, multi-modal learning has been conducted to facilitate medical services, incorporating images, audio, and text into training for a more comprehensive and robust model.43,45 However, due to limited data availability, research is still exploring the merits of multi-modal LLMs for medical fields, especially dental clinical research.
LLM as a ubiquitous solution
As LLMs become increasingly widely recognized, more representations will be embedded into the models to enhance their general problem-solving skills. The training process with a larger scale of data will yield a ubiquitous solution to problems of all kinds. For example, ChatGPT has served as a valuable tool to assist medical education for more effective instruction and analysis of teacher-student interaction.37,38 Medical writing can be assisted or even accomplished by ChatGPT,32 which enables efficient documentation. Language challenges in medical research or clinical processes can also be alleviated by ChatGPT.52
AI technology for clinical application
AI technology has advanced clinical applications by improving patient outcomes, streamlining processes, and reducing costs. In clinical practice, AI has achieved striking success in analyzing patient data (e.g., brain-tumor segmentation53), assisting clinical decision-making (e.g., epidemiological prediction54), and performing complex tasks such as surgery and rehabilitation, which indicates the potential to revolutionize healthcare services. In dentistry, convolutional neural networks have shown performance gains in detecting and classifying maxillofacial fractures from CT.55 However, subtle details of maxillofacial fractures may sometimes escape accurate detection owing to the limited resolution of CT scans, although more advanced CT scanners should provide higher-resolution images for future studies. Medical researchers have also explored detection methods and investigated the feasibility of an automated decision-making tool for dental age estimation using deep-learning and topological approaches, analyzing the third molar maturity index (I3M) from 456 mandibular radiographs.56 Another recent study proposed a more comprehensive AI system that can precisely identify individual teeth and alveolar bones from dental cone-beam CT (CBCT) images, enabling accurate and precise dental healthcare.57
The success of language modeling has also driven much research progress in representation learning for efficient medical services. For example, BioBERT58 was developed upon BERT as a large yet efficient text mining model for biomedical document analysis. ClinicalBERT59,60 carries out embedding training on a large volume of clinical documents to facilitate intelligent clinical diagnosis. SciBERT27 likewise builds a large language model for representation learning across various scientific research domains.
One milestone contribution to biomedical research is AlphaFold,61 developed by DeepMind. Its success in accurate 3D protein structure prediction demonstrates the power of large-scale training to tackle significant challenges in quantitative biomedical modeling. Since then, a series of innovative large-scale frameworks have been proposed to enhance AI-powered modeling, such as the open-source OpenFold62 and BioNeMo Megatron63 by NVIDIA. In addition, inspired by LLM pretraining schemes, NVIDIA developed ProT-VAE64 to advance functional protein design, which indicates the potential of large-scale biomolecule-language pretraining with an LLM. As technology continues to evolve, we can expect even more innovative applications of AI in clinical settings, ultimately leading to more effective healthcare services tailored to patients' needs.
On exploring the capability of LLMs in dentistry
Automated dental diagnosis with an LLM
Record analysis with text mining
Contemporary medical practice widely adopts electronic health records (EHRs) for documenting patient information. Although EHRs facilitate record generation and management, efficient analysis remains challenging since massive amounts of records mix structured and unstructured data. This challenge leaves substantial amounts of data underutilized and obstructs improvements in patient care and research. Text mining can tackle this challenge by drawing valuable conclusions and information from textual material of mixed structure. Some straightforward modeling frameworks have been developed to explore patterns, correlations, and trends within textual data.65,66,67,68 However, the performance of these models is inadequate for processing massive volumes of documents efficiently and accurately.
An LLM offers a workaround for this limitation through training on extensive document collections. Given its strong competence in semantic understanding, an LLM can manage documents independently of their structural format. As shown in Fig. 1, text mining can also retrieve pertinent facts from unstructured data, such as free-text notes from healthcare professionals. From such unstructured data, an LLM like ChatGPT can swiftly extract pertinent information, such as a patient's specific illnesses or adverse effects.
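This extraction workflow can be sketched in a few lines. The note text, field names, and the line-per-field answer format below are illustrative assumptions, and the model reply is mocked rather than fetched from an actual LLM API:

```python
# Sketch: prompting an LLM to pull structured fields out of a free-text
# dental note. Field names and the note are hypothetical examples; in
# practice build_extraction_prompt's output would be sent to an LLM.

FIELDS = ["chief complaint", "diagnosis", "medication"]

def build_extraction_prompt(note: str, fields=FIELDS) -> str:
    """Assemble an instruction asking the model to return one field per line."""
    field_list = "\n".join(f"- {f}" for f in fields)
    return (
        "Extract the following items from the dental note below. "
        "Answer with one 'field: value' line per item.\n"
        f"{field_list}\n\nNote:\n{note}"
    )

def parse_extraction(reply: str) -> dict:
    """Parse the 'field: value' lines the model is asked to return."""
    out = {}
    for line in reply.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            out[key.strip().lower()] = value.strip()
    return out

note = "Pt reports throbbing pain in lower left molar; caries on #19; ibuprofen given."
prompt = build_extraction_prompt(note)
# Mocked model reply, standing in for the LLM's answer:
reply = ("chief complaint: throbbing pain in lower left molar\n"
         "diagnosis: caries on #19\n"
         "medication: ibuprofen")
record = parse_extraction(reply)
```

The parsed `record` can then be stored alongside the original free-text note, giving downstream analytics a structured view without discarding the source.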
Treatment planning with natural language reasoning
As mentioned above, medical services experience an influx of a large volume of digital information. In addition to straightforward document analysis, these data can assist healthcare providers in customizing treatment plans.69 Although such data analytics is promising, the work is taxing because document analysis is labor-intensive. LLMs, reaping the benefits of large-scale pretraining, can easily automate document comprehension and make treatment plan analysis feasible. Furthermore, billions of documents help LLMs cultivate the capability of natural language reasoning (NLR) to perceive context. This NLR capability can help dental practitioners establish treatment plans tailored to patients' backgrounds more efficiently.70 For instance, NLR algorithms can examine adverse drug reaction (ADR) patterns linked to various dental procedures and drugs.71 Sometimes, drug administration can lead to gum bleeding and more severe diseases such as bisphosphonate-related osteonecrosis. By understanding the most typical ADRs linked to particular medications, dentists can modify their treatment plans to lower the likelihood of side effects. We maintain that an LLM can facilitate this process and provide a case of this application in Fig. 2. It has been found that NLR can identify comorbidities by analyzing patient records for common risk factors and symptoms, and can support the identification of ADRs,71 drug safety surveillance,72 and patient education.73
Medical documentation with natural language generation
In dental clinical practice, a concise yet faithfully representative EHR is essential for efficiently conveying medical information between healthcare providers and other medical professionals. Traditionally, this document preparation process is completed manually: given keywords, practitioners organize the content following medical record documentation standards. This process can be readily assisted by an LLM. Natural Language Generation (NLG) is one of the core tasks of an LLM. Generally, NLG accomplishes text generation from an understanding of natural language input, such as structured text or separate keywords. Since a well-trained LLM is highly skilled in this task, this strength can be leveraged to automate a series of common documentation tasks, such as generating reports on medical history, dental procedures, and treatment plans. For example, Fig. 3 shows an example of a medical report narrative generated by ChatGPT from only a few keywords. The generated text is well formatted in a professional structure and expresses all important information eloquently. The validity of the generated content has been confirmed.74,75
Compared to other application fields, the influence of AI in dentistry has unquestionably been slower and more constrained. This is mostly because patient privacy concerns have prevented patient data from being made widely available to the broader AI research community, while training data plays a crucial role in the advancement of AI methods. To address this dilemma and promote methodological advances in dentistry, using high-quality synthetic quasi-EHR data is a practical approach, since it facilitates data sharing between healthcare providers and external investigators. We employ ChatGPT 3.5, configured with specific parameters, to generate synthetic data, as shown in Fig. 4. The parameters include max tokens, frequency penalty, and presence penalty, which were set to enhance diversity in the generated text. The frequency penalty reduces the likelihood of selecting words based on their frequency of occurrence, while the presence penalty imposes a fixed cost on each word already present in the text. These penalties encourage the model to generate text with higher perplexity rather than relying solely on the most probable word choices. Additionally, temperature scaling adjusts the probability distribution over next tokens, and a top-p value of 1 ensures consideration of all available tokens. Post-processing is applied to refine the generated data and eliminate any artifacts introduced during generation; these post-processing rules are determined through manual examination. Such data can be quickly generated and harvested with the assistance of LLMs. Synthetic EHRs can be made more realistic by introducing variability into the generated data: LLMs can be guided to generate different patient profiles, medical histories, treatment plans, and outcomes, mimicking the diversity and complexity seen in real EHRs. This implies that an LLM is competent to efficiently prepare medical information while protecting patient privacy.
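The decoding settings described above can be expressed as a chat-completion-style request payload. This is a minimal sketch: only the top-p value of 1 is stated in the text, and the other numeric values and the prompt are illustrative assumptions, not the exact configuration used for Fig. 4:

```python
# Sketch: decoding parameters for synthetic quasi-EHR generation with
# ChatGPT 3.5. All numeric values except top_p are illustrative.

def make_synthetic_ehr_request(prompt: str) -> dict:
    """Build a request payload with diversity-encouraging penalties."""
    return {
        "model": "gpt-3.5-turbo",      # ChatGPT 3.5, as used above
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,             # bound the record length (illustrative)
        "temperature": 1.2,            # flatten the next-token distribution (illustrative)
        "top_p": 1.0,                  # consider all available tokens
        "frequency_penalty": 0.8,      # penalize frequently repeated words (illustrative)
        "presence_penalty": 0.6,       # fixed cost on words already present (illustrative)
    }

req = make_synthetic_ehr_request(
    "Generate a synthetic dental EHR for an adult patient with caries."
)
```

Such a payload would then be submitted to the chat-completion endpoint, with the manual post-processing rules described above applied to the returned text.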
Cross-modal dental diagnosis with LLMs
Vision-language deployment
Visual grounding
Traditionally, caries-related diagnosis is administered by dentists through visual and tactile examination. Before any treatment plan, a quick but comprehensive examination of oral health conditions is imperative. It often takes experts' effort and time to diagnose tooth conditions, possibly through X-ray images and CBCT, and reach a reliable conclusion. Existing research has explored the potential of AI-assistant models in diagnosing caries,3 periodontitis,5 medication-related osteonecrosis,76 maxillofacial bone fracture,55 oral squamous cell carcinoma,77 and temporomandibular disorders,78 all of which can be diagnosed based on medical imaging. AI-assistant models for imaging analysis also show potential in assisting dental treatment, including orthodontics,79 restorative dentistry,80 oral implantology,6 and oral and maxillofacial surgery.7
However, limited data representation hinders accurate diagnosis and treatment planning when the disease is intricate. The majority of studies have been confined to image-only approaches, which restrict the effective conveyance of information and leave the potential of AI models in dentistry untapped. LLMs open the possibility of data-fused diagnosis by leveraging cross-modal perception. An LLM is highly skilled at aligning textual and visual representations for image-text analysis, which can facilitate the diagnosis of tooth problems through X-ray image interpretation.
Specifically, inference by an LLM can be blended with specific visualization techniques to identify caries regions. For example, practitioners can provide keywords to query the ALBEF (Align Before Fuse) model, which is designed for image-text tasks and is integrated with Grad-CAM (Gradient-weighted Class Activation Mapping) to visualize the regions critical to the ALBEF model's decision. Warmer colors indicate the plausible regions corresponding to the queried words. As shown in Fig. 5, root canal therapy is plausibly required in the region with warmer color. Another tangible benefit of an LLM is training cost reduction: without fine-tuning on a large set of image data, an LLM can suggest plausibly affected teeth and likely locations of dental problems.
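The Grad-CAM step behind these heatmaps can be sketched numerically: channel activation maps from a chosen layer are weighted by their spatially averaged gradients, summed, and clipped to keep only positive evidence. The layer shapes and random inputs below are illustrative stand-ins for activations and gradients taken from an image encoder such as ALBEF's:

```python
import numpy as np

# Sketch of the Grad-CAM computation used to highlight image regions.
# activations/gradients would come from the vision encoder for the
# queried text; here they are random arrays with illustrative shapes.

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """activations, gradients: (channels, H, W) from the chosen layer."""
    weights = gradients.mean(axis=(1, 2))             # global-average-pool the gradients
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum over channels -> (H, W)
    cam = np.maximum(cam, 0.0)                        # keep only positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                         # normalize to [0, 1] for display
    return cam

rng = np.random.default_rng(0)
acts, grads = rng.random((8, 14, 14)), rng.random((8, 14, 14))
heatmap = grad_cam(acts, grads)   # overlay this on the X-ray as a warm-color map
```

The normalized heatmap is what gets overlaid on the radiograph, with values near 1 rendered in warmer colors.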
Visual question answering
In addition to visual examination of medical imaging data, diagnosis documentation is even more critical to patient-centered care. From the interpretation of X-ray images, healthcare professionals write down observations, analyses, and medication suggestions for patients. These documents are also essential to healthcare big data analytics, yet summarizing medical transcripts takes considerable time. An LLM can reduce the processing labor significantly through specific tasks such as visual question answering (VQA). Commonly, a VQA model converts encoded image representations into word embeddings for dental diagnosis questions. Given the diagnosis questions, answers are generated to facilitate diagnostic report generation. This VQA-assisted diagnosis can be performed to assess potential dental health issues.81 As shown in Fig. 6, the X-ray image of a patient's teeth is fed into an image encoder, such as BLIP-2, generating a natural language representation, i.e., an embedding based on image understanding. Meanwhile, different questions are fed into the LLM to generate a set of question embeddings. The question and image embeddings are mathematically combined to generate answers to the questions about the images.
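The combination step can be sketched as question tokens attending over image patches, with the attended visual context concatenated onto the question features before answer decoding. The dimensions and the plain softmax attention below are illustrative simplifications of the cross-attention used by encoders such as BLIP-2:

```python
import numpy as np

# Sketch of the embedding-fusion step in the VQA pipeline: question-token
# embeddings attend over image-patch embeddings, and the attended visual
# features are concatenated with the question features. Shapes are
# illustrative (196 patches, 12 tokens, 64-dim embeddings).

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse(image_emb: np.ndarray, question_emb: np.ndarray) -> np.ndarray:
    """image_emb: (patches, d); question_emb: (tokens, d) -> (tokens, 2d)."""
    scores = question_emb @ image_emb.T / np.sqrt(image_emb.shape[1])
    attended = softmax(scores) @ image_emb        # (tokens, d) visual context
    return np.concatenate([question_emb, attended], axis=1)

rng = np.random.default_rng(1)
img, qst = rng.normal(size=(196, 64)), rng.normal(size=(12, 64))
fused = fuse(img, qst)   # per-token features carrying both text and vision
```

A real system would feed `fused` into a language decoder to produce the answer text; the sketch only shows how the two embedding sets meet.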
However, sometimes the raw image contains too much noise, or its resolution is not acceptable to the BLIP model, making it difficult to extract the desired information through the VQA model. To address this, training a semantic segmentation model that divides image areas with different properties into different classes is a potential solution, because it allows an LLM to learn each element separately.82 For example, as Fig. 7 shows, the soft tissue envelope and the nasal septum/concha are classified into orange and blue segments, respectively, which is expected to improve model performance and enhance image understanding when extracting the morphology of the nasal cartilages, as the cartilages are small and embedded in the soft tissue.83 Figure 7 also demonstrates the potential of training machine learning models to help reconstruct the nasal cartilage from MRI for patients with orofacial clefts, who may suffer significant nasal deformity.84,85,86,87 Due to the limitations of current imaging software, the differences between cartilage and soft tissue cannot easily be delineated, but with trained machine learning models this goal should be achievable in the future.
Visual data generation
DALL-E 2, which integrates a CLIP-based encoder with a diffusion model decoder, can be utilized in the medical field to generate synthetic medical images. Once an LLM has been trained on real EHR data, it can be used to generate synthetic quasi-EHRs. By providing prompts or specific instructions to the model, such as patient characteristics or medical conditions, the LLM can generate realistic synthetic records that resemble real-world EHRs. For example, by describing a patient's CBCT scan in text, including details of any abnormalities such as an odontogenic cyst or alveolar cleft with obvious lesions of the alveolar bone structure, and feeding the description into DALL-E 2, synthetic medical images matching the description can be produced efficiently in large quantities and used as training data to improve the performance of deep-learning models. Figure 8 shows that synthetic medical images can be generated with varying levels of noise, contrast, or resolution to create images with specific properties relevant to the medical condition being studied. Moreover, patient privacy can be protected, since synthetic medical images are generated from textual descriptions rather than real medical data. This technique is valuable for medical research based on any 3D medical imaging technique, including CBCT, CT, and MRI, and for improving patient care by training machine learning models while maintaining patient privacy.
Alternatively, LLMs can also be employed to generate medical illustrations or diagrams based on textual descriptions. For example, a description of a surgical procedure can be fed into illustration software to create an accurate and detailed illustration of the procedure.
Audio-language deployment
Besides imaging and dialogs, a patient’s voice is also critical to medical diagnosis. A person’s voice can potentially reveal important clues about their speech function, as certain vocal characteristics may be indicative of the function of teeth, tongue, pharyngeal structure, and muscles. Analyzing these vocal attributes can assist healthcare professionals in identifying potential health concerns. One of the common medical diagnosis applications is waveform-spectrogram analysis on patients’ audio recordings, which are collected by requesting the patients to read certain words or paragraphs. The waveform is a curve-based representation of an audio signal, the shape of which enables acoustic analysis. The spectrogram is an alternative representation of sounds in the frequency domain, which facilitates signal processing and analysis.
Velopharyngeal insufficiency, related to cleft palate, oronasal fistula, and other conditions that affect the contact between the soft palate and the posterior pharyngeal wall or alter the needed separation between the oral and nasal cavities,87,88,89 exhibits typical marks on voice waveforms and spectrograms. In velopharyngeal insufficiency, for example, nasal emission can lead to distinct variation in speech.90 A person with velopharyngeal insufficiency may exhibit a waveform with reduced sound-wave intensity at certain frequencies or periods, leading to altered speech patterns because of abnormal airflow in the oral and nasal cavities.91,92,93,94 Figure 9 shows an example comparison of the waveforms and spectrograms of healthy speakers and patients with velopharyngeal insufficiency. It can be observed that healthy speakers have a more intense waveform and a continuous spectrogram, while the patients' samples are more dispersed and fragmented.
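The waveform/spectrogram pair described above can be computed with a plain short-time Fourier transform. The sample rate, frame length, and 220 Hz test tone below are illustrative; a real analysis would use recorded patient speech:

```python
import numpy as np

# Sketch: computing a log-magnitude spectrogram from a mono waveform via
# a short-time Fourier transform. Parameters are illustrative.

def spectrogram(wave: np.ndarray, frame: int = 256, hop: int = 128) -> np.ndarray:
    """Return log-magnitude STFT, shape (num_frames, frame // 2 + 1)."""
    window = np.hanning(frame)
    frames = [
        np.abs(np.fft.rfft(window * wave[i:i + frame]))
        for i in range(0, len(wave) - frame + 1, hop)
    ]
    return np.log1p(np.array(frames))

sr = 8000                                  # illustrative sample rate (Hz)
t = np.arange(sr) / sr                     # one second of audio
wave = np.sin(2 * np.pi * 220 * t)         # 220 Hz test tone standing in for speech
spec = spectrogram(wave)                   # frames x frequency bins
```

Plotting `wave` against time gives the waveform panel, and `spec` rendered as an image gives the spectrogram panel; gaps and weak bands in the latter correspond to the dispersed, fragmented patterns described for patients.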
Furthermore, the waveforms and spectrograms of different patients can be fed into pretrained LLMs such as GPT-4 for potential diagnosis of disease and severity. As shown in Fig. 10, a pair of graphs is input into GPT-4 with a request to deduce the disease, and the model provides several answers for reference. Although the final answers say little about velopharyngeal insufficiency specifically, the output mentions muscle dysfunction. More precise output could be achieved by further fine-tuning with more labeled patient audio data. In addition, NLG can be used in conjunction with speech recognition software to convert voice commands into written text, such as when dictating clinical notes or treatment plans.
Other potential cross-modal deployments
Biopsy
A biopsy is a medical procedure in which a small sample of tissue is removed from a person's body to be examined under a microscope. This is typically done to diagnose or investigate conditions such as cancer, infection, or inflammation. Visualization techniques can be applied to biopsies to better understand tissue structure and cell morphology and to identify abnormalities. Common visualization approaches include light microscopy, immunohistochemistry (IHC), immunofluorescence (IF), and confocal microscopy. With the embedding generated by a vision transformer, the image input can be projected into the language space and used to understand the characteristics of disease identification. For instance, prostate cancer in biopsies95 and pre-implantation kidney biopsy pathology practice96 have been regarded as potential application fields, which also suggests potential for biopsy and histological analyses in dentistry and oral and maxillofacial surgery.97,98
Blood test
LLMs can help users understand the results of their blood tests by providing information on the normal ranges for different biomarkers and explaining the potential implications of high or low values. Changes in test indicators also provide rich information about the body's condition and can track recovery or disease deterioration. These changing conditions can affect the patient's treatment planning for dental problems.99 For instance, anemia may present with low hemoglobin, hematocrit, and red blood cell counts, while liver disease may present with elevated liver enzymes (ALT, AST, and ALP). Such abnormal parameters may postpone treatments like oral and maxillofacial surgeries, as most of these surgeries are elective. The internal relationships and connections between these indicators can be well captured by LLMs, and thus potential diseases can be linked to the input information.
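A first screening pass of this kind is easy to sketch: flag each analyte against a reference range before handing the flagged panel to an LLM for interpretation. The ranges below are common textbook adult values used purely for illustration, not clinical guidance:

```python
# Sketch: flagging blood-test values against reference ranges before LLM
# interpretation. Ranges are illustrative textbook adult values only.

REFERENCE_RANGES = {            # analyte: (low, high, unit)
    "hemoglobin": (13.5, 17.5, "g/dL"),
    "ALT": (7, 56, "U/L"),
    "AST": (10, 40, "U/L"),
    "ALP": (44, 147, "U/L"),
}

def flag_results(results: dict) -> dict:
    """Label each analyte as 'low', 'normal', or 'high'."""
    flags = {}
    for name, value in results.items():
        low, high, _unit = REFERENCE_RANGES[name]
        flags[name] = "low" if value < low else "high" if value > high else "normal"
    return flags

# A hypothetical panel: low hemoglobin and elevated ALT.
flags = flag_results({"hemoglobin": 10.2, "ALT": 88, "AST": 35, "ALP": 120})
```

The flagged panel, rather than raw numbers alone, can then be included in the prompt so the LLM's explanation is anchored to which values are out of range.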
Gene detection
Gene detection is the process of identifying and analyzing specific genes or genetic sequences. Classic and more recent approaches can obtain the gene sequence, including DNA sequencing, RNA sequencing, and single-cell sequencing. The genes or genetic sequences can be projected into language embeddings with a corresponding encoder and then input to LLMs. As the sequence contains the underlying logic of a gene's properties, LLMs can help uncover this logic by learning from large collections of gene samples.100,101 Potential applications include understanding gene function, identifying genetic variations or mutations, and studying the relationships between genes and various biological processes or diseases, which can further influence the development of dental problems and treatments related to genetic disorders.
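The projection step usually begins with tokenization: a DNA sequence is split into overlapping k-mers whose integer IDs can then be embedded like words. The choice k = 3 and the exhaustive A/C/G/T vocabulary below are illustrative assumptions:

```python
from itertools import product

# Sketch: tokenizing a DNA sequence into overlapping k-mers, the common
# first step before a sequence encoder maps it into embedding space.
# k = 3 and the vocabulary construction are illustrative.

K = 3
VOCAB = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=K))}

def tokenize(seq: str, k: int = K) -> list:
    """Map a sequence to the integer IDs of its overlapping k-mers."""
    seq = seq.upper()
    return [VOCAB[seq[i:i + k]] for i in range(len(seq) - k + 1)]

ids = tokenize("ACGTAC")   # tokens: ACG, CGT, GTA, TAC
```

Each ID would then index a learned embedding table, after which the sequence is handled by the model like any other token stream.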
AI system for dentistry application with a fully automatic multi-modal LLM
To demonstrate the effectiveness and potential of LLMs’ application in dentistry, we present a framework of a fully automatic diagnosis system based on Multi-Modal LLMs. The system consists of three input modules from different models: vision input, audio input, and language input.
The image input could be dental X-rays, cone-beam computed tomography, or other medical imaging. For semantic classification, we focus on optimizing the capture of the critical elements. By applying vision-language models, the condition of the tooth is evaluated, potential anomalies or diseases are detected, and specific diagnoses and corresponding suggestions can be given.
In this system, audio sources have two uses: voice anomaly detection and understanding patients' narratives. For the first, the system receives voice input from patients, plots the waveform and spectrogram, and then performs amplitude and frequency analysis. For the second, patients' narratives are collected and converted into text using speech recognition techniques. Afterward, key elements, such as the symptoms patients state, can be extracted and summarized into reports or bullet points for doctors' reference.
Targeting automatic diagnosis for dentistry, the AI system can be embedded into the dental clinics’ internal communication systems. Thus, a fully developed automatic application can encompass patients’ information from multiple sources and provide a professional medical diagnosis, as shown in Fig. 11.
A specific case for the multi-modal LLM AI system for dentistry clinical application
To demonstrate the application of the multi-modal LLM AI system in dentistry, we use a sample with dental caries to explain how it works, covering vision-language modeling and treatment planning with natural language reasoning. As shown in Fig. 12, an X-ray of the tooth is input into the system, and abnormal morphology, such as decay, can be located on the X-ray by vision-language modeling; this answers the first question: dental caries is present on the tooth. The next question is what the treatment plan for this problem should be; querying the LLM again yields seven steps: communicating with the patient, developing a treatment plan, discussing treatment options, conducting necessary dental procedures, providing oral hygiene instruction, scheduling follow-up appointments, and emphasizing preventive measures. However, the X-ray also reveals potential bone loss near the distal root, which the system did not detect in this pilot study. Thus, further study is needed to improve the system.
Issues and limitations
While there is much excitement around the potential applications of LLMs in the field of dentistry, some issues and limitations must be addressed before these models can be widely adopted.
Data quality
Despite rigorous efforts to sanitize and filter the vast amount of training data, it is challenging to eliminate all harmful or inappropriate content, which may inadvertently propagate through the responses generated by LLMs.102 Inherently, these LLMs operate as sophisticated pattern-matching machines without a genuine understanding of the data they are trained on, which occasionally leads to nonsensical or inappropriate responses.103 Compounding these issues, they lack the capacity to validate the information they generate and cannot access real-time data or verify the current status of events after training. Moreover, the knowledge base of an LLM is static, fixed at training time, so the model cannot update its knowledge or assimilate new developments in the evolving data landscape. One possible solution is to develop a “human-in-the-loop” system: pairing LLM systems with human supervisors can safeguard important decisions, helping to catch and correct mistakes the LLM might make.104
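One way to realize such a human-in-the-loop system is a confidence gate: low-confidence LLM outputs are escalated to a human supervisor before release. The threshold value and the reviewer callable below are illustrative assumptions.

```python
def gated_answer(answer, confidence, review, threshold=0.9):
    """Release the LLM's answer directly only when its confidence clears
    the threshold; otherwise a human supervisor reviews (and may correct)
    it before it reaches the patient record."""
    if confidence >= threshold:
        return answer, "auto-released"
    return review(answer), "human-reviewed"

# The reviewer here is a stand-in for a clinician correcting the draft.
clinician = lambda draft: draft + " [verified by clinician]"
```

In practice the confidence signal could come from model log-probabilities or an auxiliary verifier; the gate simply ensures that uncertain outputs never bypass a human.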
Model bias
Biased clinicopathologic analysis results are the first noteworthy issue. Because LLMs are purely data-driven models that learn the features and patterns in their training data, the correctness of an LLM depends heavily on the quality and adequacy of that data.105 Although LLMs are evolving iteratively, even in the era of GPT-4 we cannot fully trust AI-generated clinicopathologic analysis results, and human-in-the-loop validation remains necessary. In the future, neural-symbolic models are a promising research direction: they combine two approaches, neural networks and symbolic reasoning, by using neural networks to learn statistical patterns in large datasets and then applying symbolic reasoning to perform logical operations on the learned representations.106
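The neural-symbolic idea can be sketched as two layers: a network supplies per-condition probabilities, and hand-written logical rules then confirm or veto each candidate. The conditions, rules, and threshold below are hypothetical illustrations, not clinical guidance.

```python
def neurosymbolic_filter(scores, facts, rules, threshold=0.5):
    """Neural part: `scores` maps each condition to a probability learned
    from data. Symbolic part: a condition is asserted only if the network
    is confident AND every logical precondition in `rules` holds among the
    known patient `facts`."""
    return {cond for cond, p in scores.items()
            if p >= threshold and all(f in facts for f in rules.get(cond, []))}

scores = {"caries": 0.92, "peri-implantitis": 0.81}   # hypothetical outputs
rules = {"peri-implantitis": ["has_implant"]}          # no implant, no implantitis
```

The symbolic layer catches a class of errors the network alone cannot: a high score for peri-implantitis is rejected outright for a patient with no implant.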
Data privacy
Patient data breaches are another major issue, especially in today’s privacy-sensitive world. Fine-tuning LLMs for the dentistry domain is expected to require a huge amount of patient data, and a breach is likely during this process if healthcare providers and developers do not take appropriate measures to safeguard it. It is crucial to implement strict data handling protocols and use secure communication channels for transmitting and storing patient data. In addition to these security measures, it is equally essential to inform patients about the use of their data in advance and obtain their consent.107 Another possible point of data leakage is dental diagnosis itself: inputting patient data is necessary for diagnostic applications, which creates a risk of violating patient privacy and confidentiality. One possible solution is to use an offline LLM such as Meta’s LLaMA, running the model locally on the device or an edge server rather than through a centralized server API call.108
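Strict data handling can begin before any model sees a record: direct identifiers are replaced with salted hashes so that the LLM, whether local or remote, never receives them. The field names and salt handling below illustrate the idea only and are not a complete de-identification scheme.

```python
import hashlib

SALT = "clinic-local-secret"  # kept on-premises, never transmitted with the data

def pseudonymize(record, identifier_fields=("name", "phone")):
    """Replace direct identifiers with short stable hashes; clinical fields
    pass through unchanged so the model can still reason over them."""
    out = dict(record)
    for field in identifier_fields:
        if field in out:
            digest = hashlib.sha256((SALT + str(out[field])).encode())
            out[field] = digest.hexdigest()[:12]
    return out
```

Because the hash is stable, records from the same patient still link together inside the system, while the raw identifier never leaves the clinic.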
Computational cost
Computational resources can also limit the application of LLMs in dentistry. Given data sensitivity, it is reasonable to expect that dentistry LLMs will be operated locally. First, fine-tuning an LLM for the dentistry domain with local computational resources can be challenging. Second, running a full general-purpose LLM to support a dentistry application is unnecessary and wastes computational resources. A sparse expert model, a type of LLM that incorporates a set of specialized expert models, can be a future solution: it reduces the computational resources required to train and run LLMs while handling specific tasks or domains more efficiently than a monolithic model.
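The routing idea behind a sparse expert model can be illustrated with a toy keyword gate: only the matching specialist is activated for a query, rather than the full model. Real mixture-of-experts systems use a learned gating network; the experts and keywords below are hypothetical.

```python
def route(query, experts, default):
    """Dispatch the query to the first expert whose keywords match; only
    that expert runs, which is where a sparse setup saves computation."""
    q = query.lower()
    for keywords, expert in experts:
        if any(k in q for k in keywords):
            return expert(query)
    return default(query)

# Hypothetical specialists; each would be a smaller fine-tuned model.
experts = [
    (("caries", "decay"), lambda q: "restorative specialist"),
    (("gum", "periodont"), lambda q: "periodontics specialist"),
]
general = lambda q: "general model"
```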
Conclusions
The utilization of language models like ChatGPT holds significant potential for advancing clinical applications and research in dentistry. By employing these models in a rational manner, a paradigm shift can be achieved in dental diagnosis and treatment planning. Further exploration based on diverse medical examination data will facilitate the realization of precision medicine and personalized healthcare in dentistry. A crucial future endeavor of practical deployment involves fine-tuning language models with dentistry domain-specific knowledge. This entails training the models with dentistry teaching materials, patient records, and other relevant domain information, resulting in enhanced accuracy by capturing pertinent patterns, terminology, and context. Consequently, the models acquire a profound comprehension of dentistry concepts, enabling them to generate contextually relevant and insightful responses. Customizing the outputs in alignment with domain requirements and preferences enhances efficiency, saving valuable time and resources. These benefits substantially contribute to improved performance and usability, rendering fine-tuned language models invaluable tools for research paper composition. Concurrently, the adoption of LLMs will further reduce medical costs and enhance medical efficiency.
References
Kurian, N., Cherian, J. M., Sudharson, N. A., Varghese, K. G. & Wadhwa, S. AI is now everywhere. Br. Dent. J. 234, 72–72 (2023).
Johnson, S. B. et al. Using ChatGPT to evaluate cancer myths and misconceptions: artificial intelligence and cancer information. JNCI Cancer Spectr. 7, https://doi.org/10.1093/jncics/pkad015 (2023).
Mohammad-Rahimi, H. et al. Deep learning for caries detection: a systematic review. J. Dent. 122, 104115 (2022).
Urban, R. et al. AI-assisted CBCT data management in modern dental practice: benefits, limitations and innovations. Electronics 12, 1710 (2023).
Revilla-León, M. et al. Artificial intelligence models for diagnosing gingivitis and periodontal disease: a systematic review. J. Prosthet. Dent. https://doi.org/10.1016/j.prosdent.2022.01.026 (2022).
Mohammad-Rahimi, H. et al. Deep learning in periodontology and oral implantology: a scoping review. J. Periodontal Res. 57, 942–951 (2022).
Minnema, J. et al. A review on the application of deep learning for CT reconstruction, bone segmentation and surgical planning in oral and maxillofacial surgery. Dentomaxillofac. Radiol. 51, 20210437 (2022).
He, L. et al. Automatic initial and final segmentation in cleft palate speech of Mandarin speakers. PLoS ONE 12, e0184267 (2017).
Fu, J., He, F., Yin, H. & He, L. Automatic detection of pharyngeal fricatives in cleft palate speech using acoustic features based on the vocal tract area spectrum. Comput. Speech Lang. 68, 101203 (2021).
Thurzo, A., Strunga, M., Urban, R., Surovková, J. & Afrashtehfar, K. I. Impact of artificial intelligence on dental education: a review and guide for curriculum update. Educ. Sci. 13, 150 (2023).
Zheng, O., Abdel-Aty, M., Wang, D., Wang, Z. & Ding, S. ChatGPT is on the horizon: could a large language model be all we need for Intelligent Transportation? Preprint at arXiv:2303.05382 (2023).
Schuppe, K. et al. Atypical Nelson syndrome following right partial and left total nephrectomy with incidental bilateral total adrenalectomy of renal cell carcinoma: a chat generative pre-trained transformer (ChatGPT)-assisted case report and literature review. Cureus 15, e36042 (2023).
Şendur, H. N., Şendur, A. B. & Cerit, M. N. ChatGPT from radiologists’ perspective. Br. J. Radiol. https://doi.org/10.1259/bjr.20230203 (2023).
Alhaidry, H., Fatani, B., Alrayes, J., Almana, A. & Alfhaed, N. ChatGPT in dentistry: a comprehensive review. Cureus https://doi.org/10.7759/cureus.38317 (2023).
Eggmann, F., Weiger, R., Zitzmann, N. U. & Blatz, M. B. Implications of large language models such as ChatGPT for dental medicine. J. Esthet. Restor. Dent. https://doi.org/10.1111/jerd.13046 (2023).
Fatani, B. ChatGPT for future medical and dental research. Cureus 15, e37285 (2023).
Damashek, M. Gauging similarity with n-grams: language-independent categorization of text. Science 267, 843–848 (1995).
Eichstaedt, J. C. et al. Facebook language predicts depression in medical records. Proc. Natl Acad. Sci. 115, 11203–11208 (2018).
Marafino, B. J., Davies, J. M., Bardach, N. S., Dean, M. L. & Dudley, R. A. N-gram support vector machines for scalable procedure and diagnosis classification, with applications to clinical free text data from the intensive care unit. J. Am. Med. Inf. Assoc. 21, 871–875 (2014).
Romanov, A. & Shivade, C. Lessons from natural language inference in the clinical domain. Preprint at arXiv:1808.06752 (2018).
Choi, E., Xiao, C., Stewart, W. & Sun, J. MiME: multilevel medical embedding of electronic health records for predictive healthcare. Adv. Neural Inf. Process. Syst. 31, 19 (2018).
Sarzynska-Wawer, J. et al. Detecting formal thought disorder by deep contextualized word representations. Psychiatry Res. 304, 114135 (2021).
Peng, Y., Yan, S. & Lu, Z. Transfer learning in biomedical natural language processing: an evaluation of BERT and ELMo on ten benchmarking datasets. Preprint at arXiv:1906.05474 (2019).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. Bert: pre-training of deep bidirectional transformers for language understanding. Preprint at arXiv:1810.04805 (2018).
Radford, A., Narasimhan, K., Salimans, T. & Sutskever, I. Improving language understanding by generative pre-training. (2018).
Michalopoulos, G., Wang, Y., Kaka, H., Chen, H. & Wong, A. Umlsbert: Clinical domain knowledge augmentation of contextual embeddings using the unified medical language system metathesaurus. Preprint at arXiv:2010.10391 (2020).
Beltagy, I., Lo, K. & Cohan, A. SciBERT: a pretrained language model for scientific text. Preprint at arXiv:1903.10676 (2019).
Rasmy, L., Xiang, Y., Xie, Z., Tao, C. & Zhi, D. Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. NPJ Digit. Med. 4, 86 (2021).
Liu, W., et al. K-bert: Enabling language representation with knowledge graph. Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34. No. 03 (2020).
Gu, Y. et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans. Comput. Healthc. 3, 1–23 (2021).
Devaraj, A., Marshall, I., Wallace, B. C. & Li, J. J. Paragraph-level simplification of medical texts. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 4972–4984 (2021).
Roy, A., & Pan, S. Incorporating medical knowledge in BERT for clinical relation extraction. In Proceedings of the 2021 conference on empirical methods in natural language processing, 5357–5366 (2021).
Neumann, M., King, D., Beltagy, I. & Ammar, W. ScispaCy: fast and robust models for biomedical natural language processing. Preprint at arXiv:1902.07669 (2019).
Rae, J. W. et al. Scaling language models: methods, analysis & insights from training gopher. Preprint at arXiv:2112.11446 (2021).
Raffel, C. et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 21, 5485–5551 (2020).
Ouyang, L. et al. Training language models to follow instructions with human feedback. Preprint at arXiv:2203.02155 (2022).
Kung, T. H. et al. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLoS Digit. Health 2, e0000198 (2023).
Gilson, A. et al. How does CHATGPT perform on the United States Medical Licensing Examination? The implications of large language models for medical education and knowledge assessment. JMIR Med. Educ. 9, e45312 (2023).
Li, J., Li, D., Savarese, S., & Hoi, S. Blip-2: bootstrapping language-image pre-training with frozen image encoders and large language models. Preprint at arXiv:2301.12597 (2023).
Saharia, C. et al. Photorealistic text-to-image diffusion models with deep language understanding. Preprint at arXiv:2205.11487 (2022).
Ramesh, A., Pavlov, M., Goh, G., Gray, S., Voss, C., Radford, A. & Sutskever, I. Zero-shot text-to-image generation. In International Conference on Machine Learning, pp. 8821–8831 (PMLR, 2021).
Rombach, R., Blattmann, A., Lorenz, D., Esser, P. & Ommer, B. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 10684–10695 (2022).
He, Y., Zhu, Z., Zhang, Y., Chen, Q. & Caverlee, J. Infusing disease knowledge into BERT for health question answering, medical inference and disease name recognition. Preprint at arXiv:2010.03746 (2020).
Khare, Y., Bagal, V., Mathew, M., Devi, A., Priyakumar, U. D. & Jawahar, C. V. Mmbert: Multimodal bert pretraining for improved medical vqa. In 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), 1033–1036 (IEEE, 2021).
Manco, I., Benetos, E., Quinton, E. & Fazekas, G. Contrastive audio-language learning for music. Preprint at arXiv:2208.12208 (2022).
Li, H., Kang, Y., Liu, T., Ding, W. & Liu, Z. CTAL: Pre-training cross-modal transformer for audio-and-language representations. Preprint at arXiv:2109.00181 (2021).
Wu, F. et al. Wav2Seq: Pre-training speech-to-text encoder-decoder models using pseudo languages. In ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1–5, Rhodes Island, Greece. https://doi.org/10.1109/ICASSP49357.2023.10096988 (2023).
Gurram, S., Chan, D., Fang, A., & Canny, J. LAVA: Language Audio Vision Alignment for Data-Efficient Video Pre-Training. In First Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward at ICML (2023).
Radford, A. et al. Robust speech recognition via large-scale weak supervision. In International Conference on Machine Learning (PMLR, 2023).
Huh, J., Park, S., Lee, J. E. & Ye, J. C. Improving medical speech-to-text accuracy with vision-language pre-training model. Preprint at arXiv:2303.00091 (2023).
OpenAI. GPT-4 Technical Report. Preprint at arXiv:2303.08774 (2023).
Baumgartner, C. The potential impact of ChatGPT in clinical and translational medicine. Clin. Transl. Med. 13, e1206 (2023).
Wang, D., Zhang, S. & Wang, L. Deep epidemiological modeling by black-box knowledge distillation: an accurate deep learning model for COVID-19. Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 35. No. 17 (2021).
Wang, D., Gong, B. & Wang, L. On calibrating semantic segmentation models: analyses and an algorithm. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023).
Warin, K. et al. Maxillofacial fracture detection and classification in computed tomography images using convolutional neural network-based models. Sci. Rep. 13, 3434 (2023).
Bui, R. et al. Artificial intelligence as a decision-making tool in forensic dentistry: a pilot study with I3M. Int. J. Environ. Res. Public Health 20, 4620 (2023).
Cui, Z. et al. A fully automatic AI system for tooth and alveolar bone segmentation from cone-beam CT images. Nat. Commun. 13, 2096 (2022).
Lee, J. et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 1234–1240 (2020).
Alsentzer, E. et al. Publicly available clinical BERT embeddings. Preprint at arXiv:1904.03323 (2019).
Huang, K., Altosaar, J. & Ranganath, R. Clinicalbert: modeling clinical notes and predicting hospital readmission. Preprint at arXiv:1904.05342 (2019).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Ahdritz, G. et al. OpenFold: retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization. Preprint at bioRxiv https://doi.org/10.1101/2022.11.20.517210 (2022).
Carbajosa, G., Trigo, A., Valencia, A. & Cases, I. Bionemo: molecular information on biodegradation metabolism. Nucleic Acids Res. 37, D598–D602 (2009).
Sevgen, E. et al. ProT-VAE: protein transformer variational autoencoder for functional protein design. Preprint at bioRxiv https://doi.org/10.1101/2023.01.23.525232 (2023).
Jensen, P. B., Jensen, L. J. & Brunak, S. Mining electronic health records: towards better research applications and clinical care. Nat. Rev. Genet. 13, 395–405 (2012).
Kocbek, S. et al. Text mining electronic hospital records to automatically classify admissions against disease: measuring the impact of linking data sources. J. Biomed. Inform. 64, 158–167 (2016).
Sun, W. et al. Data processing and text mining technologies on electronic medical records: a review. J. Healthc. Eng. 2018, 4302425 (2018).
Rajkomar, A. et al. Scalable and accurate deep learning with electronic health records. NPJ Digit. Med. 1, 18 (2018).
Huber, M. T., Highland, J. D., Krishnamoorthi, V. R. & Tang, J. W.-Y. Utilizing the electronic health record to improve advance care planning: a systematic review. Am. J. Hosp. Palliat. Med. 35, 532–541 (2018).
Luo, Y. et al. Natural language processing for EHR-based pharmacovigilance: a structured review. Drug Saf. 40, 1075–1089 (2017).
Hirschberg, J. & Manning, C. D. Advances in natural language processing. Science 349, 261–266 (2015).
Natsiavas, P., Maglaveras, N., & Koutkias, V. A public health surveillance platform exploiting free-text sources via natural language processing and linked data: application in adverse drug reaction signal detection using PubMed and Twitter. In Knowledge Representation for Health Care: HEC 2016 International Joint Workshop, KR4HC/ProHealth 2016, 51–67, Munich, Germany (Springer International Publishing, 2017).
Geng, W. et al. Model-based reasoning of clinical diagnosis in integrative medicine: real-world methodological study of electronic medical records and natural language processing methods. JMIR Med. Inform. 8, e23082 (2020).
Lee, S. H. Natural language generation for electronic health records. NPJ Digit. Med. 1, 63 (2018).
Hoogi, A., Mishra, A., Gimenez, F., Dong, J. & Rubin, D. Natural language generation model for mammography reports simulation. IEEE J. Biomed. Health Inform. 24, 2711–2717 (2020).
Wongratwanich, P. et al. Do various imaging modalities provide potential early detection and diagnosis of medication-related osteonecrosis of the jaw? A review. Dentomaxillofac Radiol. 50, 20200417 (2021).
Alabi, R. O. et al. Machine learning in oral squamous cell carcinoma: current status, clinical concerns and prospects for future—a systematic review. Artif. Intell. Med. 115, 102060 (2021).
Jha, N., Lee, K. S. & Kim, Y. J. Diagnosis of temporomandibular disorders using artificial intelligence technologies: a systematic review and meta-analysis. PLoS ONE 17, e0272715 (2022).
Monill-González, A., Rovira-Calatayud, L., d’Oliveira, N. G. & Ustrell-Torrent, J. M. Artificial intelligence in orthodontics: where are we now? a scoping review. Orthod. Craniofac Res. 24, 6–15 (2021).
Revilla-León, M. et al. Artificial intelligence applications in restorative dentistry: a systematic review. J. Prosthet. Dent. 128, 867–875 (2022).
Schwendicke, F. A., Samek, W. & Krois, J. Artificial intelligence in dentistry: chances and challenges. J. Dent. Res. 99, 769–774 (2020).
Amer, Y. Y. & Aqel, M. J. An efficient segmentation algorithm for panoramic dental images. Procedia Comput. Sci. 65, 718–725 (2015).
Shi, B. & Huang, H. Computational technology for nasal cartilage-related clinical research and application. Int. J. Oral. Sci. 12, 21 (2020).
Huang, H., Cheng, X., Luo, X., Shi, B. & Li, J. Biomechanical analyses of common suspension sutures in primary cleft lip rhinoplasty. Head Face Med. 15, 20 (2019).
Huang, H. et al. Mechanical analyses of critical surgical maneuvers in the correction of cleft lip nasal deformity. PLoS ONE 13, e0195583 (2018).
Huang, H., Luo, X., Cheng, X., Shi, B. & Li, J. Biomechanical simulation of correcting primary unilateral cleft lip nasal deformity. PLoS ONE 13, e0199964 (2018).
Huang, H. et al. Recapitulation of unilateral cleft lip nasal deformity on normal nasal structure: a finite element model analysis. J. Craniofac. Surg. 29(8), 2220–2225 (2018).
Sakran, K. A. et al. Early cleft palate repair by a modified technique without relaxing incisions. Cleft Palate Craniofac. J. https://doi.org/10.1177/10556656221135288 (2022).
Sakran, K. A. et al. Evaluation of late cleft palate repair by a modified technique without relaxing incisions. J. Stomatol. Oral Maxillofac. Surg. 124, 101403 (2023).
Huang, H. et al. Validation of the Chinese velopharyngeal insufficiency effects on life outcomes instrument. Laryngoscope 129, E395–E401 (2019).
Huang, H. et al. Analysis of velopharyngeal functions using computational fluid dynamics simulations. Ann. Otol. Rhinol. Laryngol. 128, 742–748 (2019).
Huang, H. et al. Computational fluid dynamic analysis of different velopharyngeal closure patterns. Ann. Otol. Rhinol. Laryngol. 129, 157–163 (2019).
Huang, H. et al. Airflow of the two-port velopharyngeal closure: study using computational fluid dynamics. J. Craniofac. Surg. 31, 2188–2192 (2020).
Yang, C. et al. Inspiration after posterior pharyngeal flap palatoplasty: a preliminary study using computational fluid dynamic analysis. Front. Pediatr. 10, 823777 (2022).
Ström, P. et al. Artificial intelligence for diagnosis and grading of prostate cancer in biopsies: a population-based, diagnostic study. Lancet Oncol. 21, 222–232 (2020).
Girolami, I. et al. Artificial intelligence applications for pre-implantation kidney biopsy pathology practice: a systematic review. J. Nephrol. 35, 1801–1808 (2022).
Wan, A. & Savage, N. Biopsy and diagnostic histopathology in dental practice in Brisbane: usage patterns and perceptions of usefulness. Aust. Dent. J. 55, 162–169 (2010).
Ilhan, B., Lin, K., Guneri, P. & Wilder-Smith, P. Improving oral cancer outcomes with imaging and artificial intelligence. J. Dent. Res. 99, 241–248 (2020).
Miller, C. S. & Westgate, P. M. Implications of medical screenings of patients arriving for dental treatment: the results of a comprehensive laboratory screening. J. Am. Dent. Assoc. 145, 1027–1035 (2014).
Yu, X. et al. GPT paternity test: GPT generated text detection with GPT genetic inheritance. Preprint at arXiv:2305.12519 (2023).
Zhang, N. et al. Ontoprotein: protein pretraining with gene ontology embedding. Preprint at arXiv:2201.11147 (2022).
Bubeck, S. et al. Sparks of artificial general intelligence: early experiments with GPT-4. Preprint at arXiv:2303.12712 (2023).
Caufield, J. H. et al. Structured prompt interrogation and recursive extraction of semantics (SPIRES): a method for populating knowledge bases using zero-shot learning. Preprint at arXiv:2304.02711 (2023).
Sakran, K. A. et al. Evaluation of postoperative outcomes in two cleft palate repair techniques without relaxing incisions. Plast. Reconstr. Surg. 152, 145–154 (2023).
McNichols, H., Zhang, M. & Lan, A. Algebra error classification with large language models. Preprint at arXiv:2305.06163 (2023).
Lamb, L. C. et al. Graph neural networks meet neural-symbolic computing: a survey and perspective. Preprint at arXiv:2003.00330 (2020).
Al Omar, A., Bhuiyan, M. Z. A., Basu, A., Kiyomoto, S. & Rahman, M. S. Privacy-friendly platform for healthcare data in cloud based on blockchain environment. Future Gener. Comput. Syst. 95, 511–521 (2019).
Sharma, S. et al. From occlusion to insight: object search in semantic shelves using large language models. Preprint at arXiv:2302.12915 (2023).
Funding
This work was supported by the Research and Development Program, West China Hospital of Stomatology, Sichuan University (RD-02-202107), Sichuan Province Science and Technology Support Program (2022NSFSC0743), and Sichuan Postdoctoral Science Foundation (TB2022005) grant to H. Huang.
Author information
Contributions
H.H. and O.Z. contributed to the conception and design of the work. H.H., O.Z., D.W., J.Y., Z.W., S.D., H.Y., C.X., and R.Y. performed data acquisition, analysis, and interpretation. O.Z. and D.W. performed data acquisition. H.H., O.Z., and D.W. performed analysis and interpretation. H.H., O.Z., D.W., J.Y., Z.W., S.D., H.Y., C.X., R.Y., Q.Z., and B.S. drafted and critically revised the paper. All authors gave final approval and agreed to be accountable for all aspects of the work.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Huang, H., Zheng, O., Wang, D. et al. ChatGPT for shaping the future of dentistry: the potential of multi-modal large language model. Int J Oral Sci 15, 29 (2023). https://doi.org/10.1038/s41368-023-00239-y