LANDMARK: Evaluating the reliability of drug-induced sleep endoscopy

Interrater Reliability of Drug-Induced Sleep Endoscopy

Kezirian E, White D, Malhotra A, Ma W, McCulloch C, Goldberg A.

Kezirian EJ, White DP, Malhotra A, Ma W, McCulloch CE, Goldberg AN. Interrater reliability of drug-induced sleep endoscopy. Arch Otolaryngol Head Neck Surg. 2010;136(4):393-7.

Take Home Points:

    • Overall, interrater reliability of DISE using the proposed grading system is moderate to substantial*.
    • The scoring method is both region- and structure-based, which allows the pattern of obstruction to be characterized and may improve selection of surgical treatment (it can help determine which structures and which level are most responsible for obstruction).

The Details:

    • Study Type: Prospective cohort study.
    • Inclusion: Patients 18 years and older with sleep apnea (apnea-hypopnea index [AHI] greater than 5 on sleep study) who were unable to tolerate positive airway pressure therapy.
    • Exclusion: Pregnancy and allergies to propofol or its components.
    • Patients underwent DISE in the OR using continuous IV propofol for sedation.
    • Video images from the procedure were independently reviewed by two surgeons:
        • Unblinded – The lead author, who performed the procedure.
        • Blinded – Had no knowledge of the patient other than whether a tonsillectomy had previously been performed.
    • Grading of the procedure video was based on:
        • Analysis 1: Global assessment of presence of obstruction at level of palate and hypopharynx.
        • Analysis 2: Degree of palatal and hypopharyngeal obstruction (none or mild <50%, moderate 50-75%, severe >75%).
        • Analysis 3: (1) Determine the primary structure contributing to obstruction at the level of the palate and hypopharynx. (2) Determine each individual structure's contribution to obstruction.
    • 108 patients met inclusion criteria.
    • All patients were shown to have airway obstruction (mean AHI 39.6).
    • Most subjects showed obstruction at the levels of both the palate and hypopharynx (81 of 108 [75%] for the unblinded reviewer and 85 of 108 [79%] for the blinded reviewer). (see Table 1)
    • In Analysis 1, the interrater reliability of the global assessment of obstruction was 0.79 for the palate and 0.76 for the hypopharynx. This was higher than the reliability for the degree of obstruction in Analysis 2 (0.60 for the palate and 0.44 for the hypopharynx).
    • In Analysis 3, interrater reliability was greater for identifying the primary structure contributing to airway obstruction (0.70-0.86) than for assessing individual structures independent of level (0.42-0.71).
    • Evaluation of hypopharyngeal structures showed higher interrater reliability than evaluation of palatal structures.
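
The three-level grading used in Analysis 2 can be sketched as a simple threshold function. This is only an illustration: in the study the reviewers graded obstruction from video, so the numeric percent_narrowing input and the function name obstruction_grade are assumptions introduced here. The cut-offs follow the ranges listed above.

```python
def obstruction_grade(percent_narrowing):
    """Map a percent airway narrowing to the Analysis 2 categories.

    Hypothetical helper: the numeric input is an assumption for
    illustration, not something the study reports measuring.
    """
    if not 0 <= percent_narrowing <= 100:
        raise ValueError("percent_narrowing must be between 0 and 100")
    if percent_narrowing < 50:
        return "none or mild"   # less than 50%
    if percent_narrowing <= 75:
        return "moderate"       # 50-75%
    return "severe"             # more than 75%


for value in (30, 60, 90):
    print(value, obstruction_grade(value))
```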


Limitations:

    • Single-institution study.
    • Grading completed by two reviewers (sleep surgeons who developed this scoring method).
    • Drug-induced sedation cannot perfectly replicate natural sleep (or its effect on the airway).

* Cohen’s kappa (κ) is used to evaluate interrater reliability. A value of 1 means perfect agreement between raters, while 0 indicates no more agreement than would be expected by chance alone. Values of 0.41-0.60 are considered moderate and values of 0.61-0.80 substantial, which explains the authors’ conclusion, since most values fell within these ranges. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med (Zagreb). 2012;22(3):276-82.
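
As a rough illustration of how κ is calculated, the sketch below computes it for two raters as (observed agreement − chance agreement) / (1 − chance agreement). The ratings are invented for illustration and are not data from the study. The function names cohens_kappa and describe are hypothetical, and only the two bands quoted above are labeled.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over all labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


def describe(kappa):
    """Label kappa using only the two bands quoted in the footnote."""
    if 0.41 <= kappa < 0.61:
        return "moderate"
    if 0.61 <= kappa <= 0.80:
        return "substantial"
    return "outside the bands quoted here"


# Made-up example: 20 videos rated for obstruction at one level.
# 12 both "yes", 2 yes/no, 2 no/yes, 4 both "no".
unblinded = ["yes"] * 14 + ["no"] * 6
blinded = ["yes"] * 12 + ["no"] * 2 + ["yes"] * 2 + ["no"] * 4

kappa = cohens_kappa(unblinded, blinded)
print(f"kappa = {kappa:.2f} ({describe(kappa)})")  # kappa = 0.52 (moderate)
```

In this made-up example the raters agree on 16 of 20 videos (80%), but because both call "yes" most of the time, 58% agreement would be expected by chance, so κ is only about 0.52.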

Summary contributed by Elizabeth Shay