Clinical Evaluation of AI-assisted Diagnostic Medical Device Software in China – A new NMPA guidance

Bingshuo Li, PhD "I am excited to be a member of the Qserve global team to assist medical device developers and manufacturers with their regulatory and quality challenges and be a catalyst for the exciting revolution!"

The new NMPA guidance CMDE 2023 No. 38 outlines the agency’s expectations on the clinical evaluation of AI-assisted diagnostic medical device software in China. It includes recommendations on clinical trial design, study subjects, evaluation metrics, clinical reference, sample size and statistics for AI-assisted software devices. This blog post provides a short English summary of this guidance document.

On November 7, 2023, the Center for Medical Device Evaluation (CMDE) of China's National Medical Products Administration (NMPA) released the guidance document “Guidelines for the Registration Review of the Clinical Evaluation of AI-assisted Diagnostic Medical Devices (Software)” (CMDE 2023 No. 38). The document is intended to guide manufacturers of AI-assisted diagnostic medical device software (MDSW) in preparing the clinical evaluation of this type of MDSW, and NMPA reviewers in reviewing it.


The guidance focuses on AI-assisted MDSW for clinical decision support. This refers to MDSW, either standalone or built-in, that is based on AI algorithms and may include functions such as pattern recognition and data analysis. Through methods such as identification, labeling, and highlighting, this MDSW prompts physicians to focus on potential areas of abnormality or lesions, thereby assisting them in making the corresponding diagnostic and treatment decisions. It may also include non-decision-support functions such as report generation, before-and-after image comparison, segmentation of normal anatomical structures, dimension measurement, and CT value measurement, as well as non-clinical functions.

Note that the following types of AI-assisted MDSW are excluded from the scope of this guidance document:

  • MDSW that identify malignancy, disease stage, or subtype
  • MDSW that predict the probability of disease occurrence 
  • MDSW that assist in detecting and distinguishing multiple lesions simultaneously 
  • MDSW for triage and referral
  • MDSW used in conjunction with IVD products

Nonetheless, manufacturers of these MDSW can use relevant principles outlined in this guidance as a reference. 

Key Takeaways

  • Trial design
    •  Clinical trials of these MDSW shall focus on their diagnostic performance. In addition, their usability and safety can also be investigated. 
    •  As the clinical significance of these AI-assisted MDSW lies in improving the detection accuracy of physicians, controlled trials are typically needed. Depending on the product's characteristics and clinical practices, relevant trial designs include randomized parallel control, crossover self-control, or multiple-reader multiple-case (MRMC) trials.  
  • Investigational subjects
    •  Imaging data from the intended population is typically used as the investigational subject of a trial. For clinical trials of MDSW for real-time imaging-based detection assistance, it is recommended to collect imaging data prospectively.
    •  Imaging data should be independent of the data used to develop the device and its predecessor (i.e., the training and test sets).
    •  Collect data with considerations of disease spectrum distribution, such as subtypes and stages.
    •  Gather comprehensive disease-related information when leveraging existing clinical data.
    •  Because physicians' performance varies and interacts with both patient variability and the AI, it is generally advisable to include the physicians the MDSW is intended to assist as subjects in the trial.
    •  For non-real-time imaging assistive products, MRMC design is advisable as it requires fewer samples.
  • Evaluation metrics
    •  The selection of evaluation metrics should take the product's design features into account. Generally, metrics such as sensitivity, specificity, and the receiver operating characteristic (ROC) curve or its derivatives are less affected by differences in disease prevalence, making them preferable.
    •  Regardless of the metrics chosen, the trial design should address overall effectiveness, e.g., through the area under the ROC curve, superior sensitivity with non-inferior specificity, or improved detection rates.
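
To illustrate why sensitivity and specificity are described as less affected by disease prevalence, here is a minimal Python sketch. It is not taken from the guidance: the function names, the 1,000-case cohort, and the reader performance figures (90% sensitivity, 80% specificity) are hypothetical values chosen for illustration. It compares the same reader performance at two prevalence levels and shows that positive predictive value (PPV), unlike sensitivity and specificity, shifts with prevalence.

```python
def confusion_counts(n_cases, prevalence, sensitivity, specificity):
    """Expected TP, FN, TN, FP counts for fixed sensitivity and specificity.

    All inputs are hypothetical illustration values, not figures from the guidance.
    """
    diseased = n_cases * prevalence
    healthy = n_cases - diseased
    tp = diseased * sensitivity   # diseased cases correctly flagged
    fn = diseased - tp            # diseased cases missed
    tn = healthy * specificity    # healthy cases correctly cleared
    fp = healthy - tn             # healthy cases wrongly flagged
    return tp, fn, tn, fp


def metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity and PPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Same assumed performance, two different disease prevalences.
low = metrics(*confusion_counts(1000, 0.05, 0.90, 0.80))
high = metrics(*confusion_counts(1000, 0.50, 0.90, 0.80))

for label, m in (("prevalence 5%", low), ("prevalence 50%", high)):
    print(label, {name: round(value, 3) for name, value in m.items()})
```

Under these assumed inputs, sensitivity and specificity are identical in both cohorts, while PPV is far lower in the low-prevalence cohort. A metric that depends on prevalence would therefore make results harder to compare across trial populations.
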
  • Clinical reference (ground truth)
    • Manufacturers should provide detailed information on the selection of, construction methods for, and rationale behind the clinical reference that serves as the ground truth. Available methods for constructing clinical references include clinical confirmation and expert panel judgment. The guidance provides detailed requirements for the construction of each type of reference.
  • Sample size estimation and statistical analysis
    •  Sample size estimation should consider clinical trial design, primary evaluation metrics, and statistical requirements. Manufacturers should provide information on calculation formulas, relevant parameters, justification, and the statistical software used.
    •  For the sample size calculation of parallel controlled trials, the manufacturer shall refer to the NMPA guidance document “Guidelines for the Design of Clinical Trials for Medical Devices” (CMDE 2018 No. 6).
    •  For MRMC trials, sample size calculation needs to take the