Abstract
Objective
YouTube is widely used in medical education, particularly for procedural skills such as regional anesthesia. However, concerns remain about the quality and reliability of its content. This study aimed to assess the educational value, reliability, and viewer engagement of regional anesthesia videos on YouTube.
Method
A total of 174 English-language videos were analyzed cross-sectionally. Five anesthesiologists independently evaluated each video using the DISCERN instrument, global quality score (GQS), and JAMA Benchmark Criteria. Viewer interaction was quantified through like ratio, view ratio, and video power index (VPI). Statistical analysis employed the Kruskal-Wallis test and Spearman’s correlation.
Results
The median scores were 57 [50-64] for DISCERN, 4 [3-4] for GQS, and 3 [3-3] for JAMA. Video source was significantly associated with DISCERN Q#16, DISCERN, GQS, JAMA, view ratio, and VPI scores (p=0.010, p=0.005, p=0.048, p<0.001, p<0.001, and p<0.001, respectively), with videos from educational organizations and medical device companies generally showing higher quality scores. Ultrasound-guided videos performed best, while those from anonymous sources and neurostimulator-only techniques had the lowest scores. Correlations between quality and engagement were weak, though view ratio and VPI showed moderate associations with educational value.
Conclusion
The educational quality of YouTube videos on regional anesthesia is inconsistent. Professional, ultrasound-based content offers superior value, while “likes” are unreliable indicators of quality. Evidence-based, peer-reviewed contributions are needed to optimize open-access medical education.
Introduction
Due to the increasing global demand for accessible medical knowledge, video-sharing platforms, particularly YouTube, have become one of the primary sources of education for healthcare providers and the general public. Although YouTube is considered an educational platform, the quality, accuracy, and reliability of its medical content are still in question (1, 2).
Regional anesthesia has become a popular search term as its use in surgical procedures has expanded. It is a technically complex procedure and the guidance techniques are evolving quickly (e.g., ultrasound versus neurostimulation) (3-5). Because these techniques are visual and skill-based, video demonstrations have become an integral part of residency training and continuing medical education in this field.
This study evaluates the educational quality, reliability and viewer engagement of YouTube videos related to regional anesthesia using DISCERN, the global quality score (GQS), the JAMA Benchmark Criteria and viewer engagement metrics (6-8). We aimed to determine whether video popularity aligns with educational merit and whether video source or guidance method is associated with higher-quality content.
Materials and Methods
The present cross-sectional descriptive study was conducted to evaluate the quality, reliability and educational value of YouTube videos related to regional anesthesia. The independent review and assessment of all eligible videos was conducted from February 17 to March 27, 2025. The video search was performed using the keyword “regional anesthesia” in the YouTube search bar. To prioritize widely viewed content, the default sort setting was changed from “sort by relevance” to “sort by view count”. All searches were conducted via a web browser after clearing cache, cookies and search history and without logging into any personal account. This method was employed to minimize algorithm-driven personalization of search results.
A preliminary collection of 326 videos exceeding 60 seconds in duration was identified. Exclusion criteria included non-English language, irrelevant content (including central neuraxial blocks), duplicate entries, muted or soundless videos, animal-based demonstrations and videos aimed at medical specialties that were not related to anesthesiology (e.g., dentistry, veterinary medicine, ophthalmology or obstetric surgery). Moreover, five videos that had accumulated fewer than 10 views despite being online for over a year were excluded on the grounds of minimal exposure. After applying these criteria, 174 videos remained for final analysis, including 44 (25.3%) upper-extremity peripheral nerve blocks, 52 (29.9%) lower-extremity nerve blocks and 78 (44.8%) truncal/fascial plane blocks and mixed regional anesthesia education videos. The primary analytical framework centered on block videos, which were classified according to the guidance method as ultrasound, neurostimulation, or both. Consequently, central neuraxial blocks were not included in this study.
Each video was independently assessed by five anesthesiologists who were selected on the basis of active clinical practice in regional anesthesia. All reviewers had at least five years of post-residency anesthesia experience and routine clinical experience with ultrasound-guided and/or neurostimulator-assisted upper-extremity, lower-extremity and truncal/fascial-plane techniques. The reviewers were unaware of the evaluations conducted by their peers and were instructed to evaluate each video independently, without collaboration, discussion or the use of supplementary reference materials. This approach was designed to reduce potential bias and ensure consistency in judgment across evaluators. In order to assess the educational quality and reliability of each video, three validated scoring tools were employed: The DISCERN instrument, the GQS and the JAMA benchmark criteria.
The DISCERN tool, developed by the University of Oxford, is a widely utilized instrument for evaluating the quality of health information with a particular focus on treatment options. It assesses reliability, treatment-specific content and overall quality. Its first 15 questions are each rated on a scale from 1 to 5, resulting in a total score ranging from 15 to 75; question 16, the overall quality rating, is reported separately as DISCERN Q#16. The videos were then categorized based on their total score with the following classifications: Excellent (63-75), good (51-62), fair (39-50), poor (27-38) or very poor (15-26) (9, 10).
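As an illustration only, the score bands above can be written as a small lookup. The following Python sketch is not part of the study's workflow; the function names discern_total and discern_category are hypothetical, and the code assumes that the total is the sum of 15 item ratings of 1 to 5 each, as described above.

```python
def discern_total(item_ratings):
    """Sum 15 DISCERN item ratings (each 1-5) into a total score of 15-75."""
    if len(item_ratings) != 15:
        raise ValueError("expected exactly 15 item ratings")
    if any(not 1 <= r <= 5 for r in item_ratings):
        raise ValueError("each rating must be between 1 and 5")
    return sum(item_ratings)


def discern_category(total):
    """Map a DISCERN total score to the quality band used in this study."""
    if not 15 <= total <= 75:
        raise ValueError("DISCERN total must be between 15 and 75")
    if total >= 63:
        return "excellent"
    if total >= 51:
        return "good"
    if total >= 39:
        return "fair"
    if total >= 27:
        return "poor"
    return "very poor"


# A total of 57, the median reported in this study, falls in the "good" band.
print(discern_category(57))
```

Checking the bounds first keeps out-of-range totals from silently falling into the lowest band.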
The GQS is a five-point Likert-type scale that provides a structured yet subjective assessment of the educational value, clarity and organization of multimedia health content. This score is indicative of the efficacy with which a video conveys pertinent and comprehensible information to its intended audience (7, 11).
The JAMA benchmark criteria, originally proposed by Silberg et al., comprise four binary items evaluating authorship, attribution, disclosure and currency. Each criterion is assigned a score of either 1 (present) or 0 (absent), with a maximum total score of 4 indicating a high degree of transparency and credibility in the content (12).
For each included video, the following metadata were extracted and recorded in Microsoft Excel (Microsoft Corp., Redmond, WA, USA): Video title, total view count, duration (in seconds), upload date, evaluation date (used to calculate time since upload in days), number of likes and dislikes and total number of comments. Furthermore, three viewer engagement metrics were calculated to allow for standardized comparison across videos with varying visibility and age:
Like ratio (%) = (Number of Likes × 100) ÷ (Likes + Dislikes)
View ratio = Total views ÷ Days since upload
Video power index (VPI) = (Like ratio × View ratio) ÷ 100
The VPI, developed by Erdem et al., is a quantitative metric that evaluates a video’s popularity and audience engagement on social media. These indices enabled a comparative evaluation of each video’s popularity and audience interaction, independent of its publication date (13).
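The three formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used (the study recorded its data in Microsoft Excel). The function name engagement_metrics and the example values are hypothetical, and returning None when a video has no likes or dislikes is an assumption, since the handling of that case is not described here.

```python
from datetime import date


def engagement_metrics(views, likes, dislikes, upload_date, evaluation_date):
    """Return (like_ratio, view_ratio, vpi) for a single video."""
    days_online = (evaluation_date - upload_date).days
    if days_online <= 0:
        raise ValueError("evaluation date must be after the upload date")

    total_votes = likes + dislikes
    # Assumption: the like ratio (and therefore the VPI) is undefined
    # when a video has received no likes or dislikes.
    like_ratio = likes * 100 / total_votes if total_votes else None

    view_ratio = views / days_online
    vpi = like_ratio * view_ratio / 100 if like_ratio is not None else None
    return like_ratio, view_ratio, vpi


# Hypothetical video: 5000 views, 90 likes, 10 dislikes, online for 100 days.
like_ratio, view_ratio, vpi = engagement_metrics(
    views=5000,
    likes=90,
    dislikes=10,
    upload_date=date(2024, 11, 17),
    evaluation_date=date(2025, 2, 25),
)
print(like_ratio, view_ratio, vpi)  # 90.0 50.0 45.0
```

Because the like ratio cannot exceed 100, the VPI of a video computed this way can never exceed its view ratio.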
Ethical approval and informed consent were not required for this study because it was based exclusively on publicly available data (YouTube videos) that contained no human participants or identifiable personal information.
Statistical Analysis
The data obtained from the evaluated YouTube videos were recorded and analyzed using IBM SPSS Statistics version 31.0 (IBM Corp., Armonk, NY, USA).
The distribution of continuous variables was assessed using the Shapiro-Wilk test. Descriptive statistics were presented according to the distributional characteristics of the variables. Normally distributed continuous variables were summarized as mean ± standard deviation, whereas non-normally distributed continuous variables were summarized as median [interquartile range (IQR)]. In the group comparison tables, when a variable did not conform to normal distribution in at least one comparison group, the same variable was presented as median [IQR] across all groups to maintain a consistent row-wise presentation. For descriptive completeness, minimum and maximum values were also reported where appropriate. Since several variables deviated from normality, non-parametric statistical methods were used for group comparisons. The Kruskal-Wallis test was used to compare DISCERN Q#16, DISCERN, GQS, JAMA, like ratio, view ratio, and VPI scores across video source categories and guidance method groups. When a significant difference was detected, post hoc pairwise comparisons were performed using the Bonferroni-corrected Mann-Whitney U test. The relationships between educational quality scores and viewer engagement metrics were assessed using Spearman’s rank correlation coefficient (ρ). All statistical tests were two-sided, and a p-value of less than 0.05 was considered statistically significant.
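The analyses in this study were run in SPSS. Purely as an illustration of two of the steps described above, the following Python sketch computes Spearman's ρ as the Pearson correlation of average ranks and applies a Bonferroni adjustment by multiplying each p-value by the number of comparisons, which is one common formulation. The function names and the example data are hypothetical and do not come from the study.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position holding the same value as position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical quality scores and view ratios for five videos.
quality = [57, 50, 64, 42, 61]
view_ratio = [3.3, 0.6, 16.6, 0.9, 5.0]
print(round(spearman_rho(quality, view_ratio), 3))  # 0.9
```

Ranking with averaged ties matters here because ordinal scores such as GQS and JAMA take only a few distinct values, so ties are frequent.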
Results
A total of 326 videos were initially retrieved from YouTube using the keyword “regional anesthesia”, applying a minimum video duration threshold of 60 seconds. Following the screening process, 152 videos were excluded based on predefined criteria (Table 1). The following results summarize the content characteristics, quality assessment scores, and engagement metrics of the 174 regional anesthesia videos included in the final analysis.
Descriptive analysis revealed marked variability across both video engagement metrics and quality scores. The median number of views was 4102 [649-25659], while the median numbers of likes and comments were 57 [10-269] and 2 [0-13], respectively. The median DISCERN, GQS, and JAMA scores were 57 [50-64], 4 [3-4], and 3 [3-3], respectively. The median view ratio was 3.27 [0.59-16.61], and the median VPI was 3.68 [0.62-17.23] (Table 2).
When videos were compared according to source, medical device company videos had the highest median DISCERN score (61 [50-75]) and GQS score (5 [3-5]), whereas videos categorized as “others” had the lowest DISCERN and GQS scores (50 [42-57] and 3 [2-4], respectively). JAMA scores were also highest in medical device company videos (4 [3-4]). Educational organization videos showed the highest viewer engagement, with a median view ratio of 24.70 [2.41-77.51] and a median VPI of 24.49 [2.28-75.50]. Significant differences were observed across video source groups for DISCERN Q#16, DISCERN, GQS, JAMA, like ratio, view ratio, and VPI scores (p=0.010, p=0.005, p=0.048, p<0.001, p<0.001, p<0.001, and p<0.001, respectively) (Table 3).
According to guidance method, neurostimulation-only videos had the lowest median DISCERN Q#16, DISCERN, and GQS scores (3 [2-4], 52 [42-55], and 3 [2-4], respectively). In comparison, ultrasound-guided videos had higher median DISCERN Q#16, DISCERN, and GQS scores (4 [3-5], 59 [52-66], and 4 [3-5], respectively), while videos using both ultrasound and neurostimulation had a median DISCERN score of 60 [52-64]. Significant differences were observed in DISCERN Q#16, DISCERN, GQS, JAMA, view ratio, and VPI scores (p<0.001, p<0.001, p<0.001, p<0.001, p=0.042, and p=0.016, respectively). Like ratio did not differ significantly among guidance method groups (p=0.411) (Table 4).
DISCERN Q#16 showed strong positive correlations with DISCERN and GQS scores (ρ=0.883 and ρ=0.947, respectively; both p<0.001), and DISCERN was also strongly correlated with GQS (ρ=0.877, p<0.001). View ratio was positively correlated with DISCERN Q#16, DISCERN, and GQS scores (ρ=0.342, p<0.001; ρ=0.235, p=0.002; and ρ=0.271, p<0.001, respectively). Similarly, VPI was positively correlated with DISCERN Q#16, DISCERN, and GQS scores (ρ=0.336, p<0.001; ρ=0.224, p=0.003; and ρ=0.262, p=0.001, respectively). In contrast, like ratio showed only a weak negative correlation with DISCERN Q#16 (ρ=-0.178, p=0.021) and was not significantly correlated with DISCERN, GQS, or JAMA scores (Table 5).
Discussion
This study provides a comprehensive evaluation of the educational quality, reliability and viewer engagement of YouTube videos related to regional anesthesia. It uses three respected tools—DISCERN, GQS and the JAMA Benchmark Criteria—along with engagement measurements such as like ratio, view ratio and VPI. The findings reveal the complex relationship between content quality and viewer popularity on open-access platforms and underscore the variability and inconsistency of publicly available educational materials.
The results demonstrate that, although the average video quality ranged from fair to good, the content spectrum was wide. Some videos scored as “excellent,” while others fell into the “very poor” category. This type of variance is to be expected on an unregulated digital platform such as YouTube, which hosts videos that are neither peer-reviewed nor editorially controlled. Consistent findings from prior studies of medical education content indicate that such heterogeneity stems from open-access policies and a lack of curation (1, 4). These findings underscore the need for content governance and professional involvement to ensure educational value.
One of the most important findings of our analysis is the significant association between video source and the quality and reliability metrics. Videos uploaded by academic institutions and medical device companies received higher DISCERN, GQS, and JAMA scores than those published by individual trainers or non-professional sources. Specifically, medical device companies obtained the highest median scores on all three quality tools: DISCERN (61 [50-75]), GQS (5 [3-5]) and JAMA (4 [3-4]). These results suggest that medical device companies apply rigorous standards in their content production. Educational organizations demonstrated the highest view ratio (24.70 [2.41-77.51]) and VPI (24.49 [2.28-75.50]), indicative of both quality and dissemination capacity. These results support previous findings that institutional origin is a strong predictor of content quality, suggesting that educational videos associated with established organizations are more likely to adhere to evidence-based practices (13, 14).
Post hoc pairwise comparisons further corroborated these disparities, with educational organization videos attaining significantly higher DISCERN scores than training physician videos (p=0.029). Medical device companies exhibited higher JAMA scores than the other source categories, including educational organizations, medical education platforms, training physicians, and others (p=0.005, p=0.004, p<0.001, and p=0.001, respectively). It is noteworthy that educational institutions exhibited lower like ratios than other entities, even though their other engagement metrics (VPI, view ratio) were superior. This raises the question of whether superficial measures of approval, such as “likes”, are an effective measure of pedagogical value.
The superior performance of ultrasound-guided videos across all assessed dimensions is a particularly noteworthy outcome. The videos obtained high DISCERN Q#16 (4 [3-5]), GQS (4 [3-5]), view ratio (5.01 [0.60-49.77]), and VPI (6.51 [0.68-51.88]) scores, suggesting that they not only provide strong educational value but also resonate well with viewers. This finding aligns with contemporary trends in clinical education practices, where ultrasound has become the prevailing standard in regional anesthesia training. The utilization of ultrasound technology in this context offers distinct advantages, including its capacity to provide clear anatomical visualization and real-time dynamic imaging (4, 15). Conversely, neurostimulator-only videos received the lowest quality scores across all measures (DISCERN 52 [42-55]; GQS 3 [2-4]; JAMA 3 [3-3]), reflecting a clear difference in value and relevance of different guidance techniques.
It is noteworthy that videos combining neurostimulation and ultrasound received the highest ratings in the DISCERN and JAMA assessments, with median scores of 60 [52-64] and 3 [3-4], respectively. However, when evaluated against ultrasound-only videos, the combined technique ranked second. This suggests that combining techniques may lend content technical depth and credibility without necessarily increasing its reach or impact among online viewers.
Our study also reveals that viewer engagement metrics, particularly the “like ratio”, are not a reliable indicator of the video’s educational value. The like ratio exhibited a weak or even negative correlation with DISCERN and GQS scores, while the view ratio and VPI both demonstrated a moderately positive correlation with quality scores. This suggests that the popularity of a video does not necessarily equate to its educational value or the extent to which it is supported by data. These findings are consistent with the conclusions of previous research by Osman et al. (8), which cautioned against using popularity as a proxy for quality in health-related content. Moreover, the findings of this study corroborate the conclusions of Szmuda et al. (15), who observed that user feedback metrics frequently fall short in accurately reflecting the scientific accuracy or educational value of online videos.
Despite the presence of high-quality videos, our findings indicate that a considerable portion of YouTube content on regional anesthesia remains suboptimal in quality and reliability. Videos from anonymous sources or classified as “others” performed particularly poorly on all metrics. These videos may lack authorship transparency, contain outdated information, or lack methodological rigor, thereby posing a risk of misinformation. This lack of quality control is alarming and indicates the urgent need for better content governance, especially since YouTube is increasingly used by trainees and students for clinical learning (1, 16). Consistent with the findings of Alver et al. (16), YouTube appears inadequate as a standalone platform for procedural education because of several limitations, including substandard video quality, restricted language options, difficulty accessing contemporary techniques, and a lack of explanatory material on critical anatomical landmarks (17). These issues underscore the need to supplement YouTube content with structured, curriculum-based educational interventions.
Study Limitations
This study has several limitations. First, it included only English-language videos identified using a single keyword (“regional anesthesia”), potentially overlooking relevant content under alternative terms or in other languages. Second, the dynamic nature of YouTube—where views, likes, and recommendations evolve over time—may affect reproducibility. Third, while DISCERN, GQS, and JAMA are validated tools, they may not fully capture the procedural nuances of visual learning. Finally, although all videos were independently assessed by five anesthesiologists with active clinical experience in regional anesthesia, the reviewers’ individual procedural volumes and block-specific case numbers were not formally documented. Therefore, some degree of reviewer-related variability in the assessment of block-specific video content cannot be entirely excluded.
Conclusion
This study identifies a significant disparity in the quality and reliability of regional anesthesia videos posted on YouTube. Videos from academic centers and medical device companies, particularly those employing ultrasound guidance, demonstrate consistently higher levels of educational value, credibility, and systematic presentation than videos from other sources. By contrast, anonymous or unverified sources can introduce a risk of misinformation, particularly for trainees who rely on the internet for their clinical education.
Specifically, the study corroborates the hypothesis that audience engagement metrics, particularly the like ratio, are not effective indicators of educational quality. This underscores the necessity for learners and instructors to employ critical appraisal strategies when selecting online learning materials.
Given the centrality of YouTube to medical education, educational institutions, specialty societies, and credible organizations must assume a more proactive role in the production and dissemination of high-quality, evidence-based, open digital learning materials. Subsequent studies must transcend the limitations of quality scoring and assess the tangible value of these videos in terms of learning, procedural competence, and patient outcomes. Cross-platform analyses have the potential to offer further insight into the optimization of digital education across various social media platforms.


