Factors Contributing to Fatalities in Helicopter Emergency Medical Service Accidents
INTRODUCTION: This study aimed to update and reinforce previous research on helicopter emergency medical service accidents in the United States. By investigating predictors of fatalities after helicopter emergency medical service crashes through the application of machine learning techniques, we updated existing data sets and sought to uncover patterns that traditional analysis might not reveal.
METHODS: Using the National Transportation Safety Board database, we analyzed a dataset of 267 helicopter emergency medical service accidents that occurred from 1991 to 2022. We first calculated fatality odds ratios for each condition. We then plotted the geospatial locations of all reported accidents. Finally, we used XGBoost regression to identify the most important features contributing to fatality after an accident.
RESULTS: The findings reaffirm previous research and identify significant predictors of fatalities in helicopter emergency medical service accidents. Key factors such as adverse flight conditions (weather), the absence of a copilot, and postcrash fires are highlighted as critical to understanding and mitigating risks of fatality.
DISCUSSION: These findings emphasize the utility of machine learning in extracting meaningful insights from accident data, suggesting that such techniques offer a more nuanced understanding of the conditions leading to fatalities. They point to the potential of these methods not only to enhance aviation safety but also to be applied across other sectors. We conclude by underlining the significant potential of techniques like XGBoost in advancing safety measures within helicopter emergency medical service and possibly other aviation sectors.
Korentsides J, Keebler JR, Berezovski M, Chaparro A. Factors contributing to fatalities in helicopter emergency medical service accidents. Aerosp Med Hum Perform. 2025; 96(2):111–115.
Helicopter emergency medical service (HEMS) is a means of transporting patients from an accident to a designated hospital quickly and safely. However, contrary to this intended purpose, evidence shows that, compared to other modes of transport, HEMS has had the highest accident-related fatality rate. Simonson et al. used Bayesian models to understand fatal HEMS accidents,1,2 focusing on the factors affecting the odds of a fatality when a HEMS crash occurred. Their work used Bayesian logistic regression, a form of Bayesian inference that integrates previous knowledge and current data into a model of conditional probabilities, allowing for a more nuanced understanding of the factors that influence fatality rates. The authors reported that flying at night, flying under instrument flight rules (IFR), and postcrash fires significantly contributed to a higher likelihood of a fatality. Further, crash rates have not changed over the past few decades, suggesting that interventions have arguably not been very effective.1,2
The aim of this research was to identify and analyze the key factors that contribute to fatalities in HEMS accidents using both traditional Bayesian models and modern machine learning techniques to provide a comprehensive understanding and to suggest actionable safety improvements. Additionally, the purpose of this research was to update the data used by Simonson et al.1,2 and to conduct additional analyses using a different approach based in machine learning (i.e., XGBoost).3 Machine learning offers the ability to analyze vast datasets with numerous variables, identifying patterns and predictors of accidents that traditional statistical methods may not reveal. Algorithmic models, in the context of large data handling, refer to computational methods and processes that automatically analyze and interpret complex data sets, identifying key patterns and relationships that might be missed by human analysis. This approach aligns with recent research by Mehta et al.,4 who demonstrated the potential of machine learning in improving aviation safety by accurately predicting accident severities. By deploying a wide spectrum of algorithms, including advanced ensemble techniques such as the Stacking Ensemble Model, their research underscores the significant role of machine learning in refining safety measures and operational practices. Notably, XGBoost, a decision-tree-based ensemble machine learning algorithm, is known for its speed and performance in both classification and regression tasks. This method calculates feature importance by analyzing how much each feature is used to split the data across all of the trees in the model, providing a detailed understanding of the factors contributing to accident severity. The analysis by Mehta et al. revealed that the Stacking Ensemble Model, by integrating the strengths of various machine learning approaches, achieved an exceptional prediction accuracy of 91.66%.
This outcome emphasizes the advantage of leveraging a combination of machine learning models to bolster the accuracy of aviation crash severity predictions, marking a significant contribution to the enhancement of aviation safety protocols.4 Furthermore, the application of machine learning in analyzing HEMS accidents is supported by the broader literature on aviation safety. For example, Baker et al.5 explored the use of predictive analytics in reducing the risk of helicopter mishaps, suggesting that data-driven strategies could lead to significant improvements in safety outcomes. Similarly, Mosier et al.6 investigated the role of technology in enhancing the decision-making capabilities of pilots, indicating that advanced analytics and real-time data processing could mitigate some of the risks associated with HEMS operations.
Studies have shown that HEMS operations carry a high-risk profile, particularly in challenging conditions such as night flights and adverse weather. For instance, Boyd and Macchiarella7 found that while the overall HEMS accident rate decreased over the years, the fraction of fatal accidents remained high, emphasizing the need for continued safety improvements.7 Additionally, Aherne et al.8 highlighted the critical role of pilot experience in night HEMS operations, noting that pilots with less domain task experience were more likely to be involved in fatal accidents due to poor decision-making in hazardous conditions.9
The National Transportation Safety Board (NTSB)10 has a public database of all recorded aviation accidents in the United States. By incorporating updated data from the NTSB, this research aims to build on the existing body of knowledge, offering fresh insights into the factors that contribute to HEMS accidents and fatalities. The goal is not only to validate previous findings but also to explore new avenues for improving safety through the application of advanced analytical techniques. Understanding the causal factors behind HEMS mishaps, as well as what occurs in a crash that leads to fatalities and injuries, is important for finding ways to prevent future accidents or, at least, to prevent injury or death when accidents do occur.
METHODS
This research used a public database of aircraft accidents from the NTSB.10 In the work by Simonson et al.,1,2 the authors used a dataset of 131 HEMS crashes that occurred from April 31, 2005, to April 26, 2018. Using these data, they applied a Bayesian inference method, which is an approach that integrates previous knowledge and current data into a model designed to comprehend conditional probabilities.1,2 The data for the current study were updated to a final set of 267 crashes (8.6 crashes per year, on average), with 96 being fatal and 171 being nonfatal. We developed pivot tables in Excel with a complete dataset of all emergency medical helicopter accidents that occurred between January 26, 1991, and May 18, 2022. The conditions analyzed included: flying at night, flying under IFR, presence of postcrash fires, condition of the accident site, pilot’s flight rating, flight time in the last 30 and 90 d, presence of a second pilot, and whether the pilot had a level 2 medical certification.
Our approach used three different data analysis techniques: a test of odds ratios for fatalities by condition; a geospatial map of all crashes and whether they were fatal or not; and an XGBoost model, a feature-based decision-tree regression that explains and ranks conditions by their importance with regard to fatalities occurring after a crash. In this context, the machine learning method refers specifically to the use of the XGBoost algorithm, which is a scalable and efficient implementation of gradient boosting for decision trees. XGBoost is particularly suitable for handling large datasets and can accommodate different types of data while being robust to noisy data. This algorithm was chosen for its strong performance in both classification and regression tasks, and it offers automated feature selection, which helps in identifying the most significant variables affecting the outcomes.
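The idea of gradient boosting for decision trees can be illustrated with a minimal sketch. This is not XGBoost itself and not the model fitted in this study: it is a plain gradient-boosting loop over one-split trees (stumps) on binary features, written with the standard library only. The function names, the learning rate, the number of rounds, and the toy data are all assumptions introduced for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_boosted_stumps(X, y, n_rounds=20, learning_rate=0.3):
    """Fit one-split trees (stumps) to binary features by gradient boosting on log loss."""
    n_features = len(X[0])
    scores = [0.0] * len(X)           # running log-odds for each accident
    stumps = []                       # (feature index, step if 0, step if 1)
    split_counts = [0] * n_features   # how often each feature was chosen
    for _ in range(n_rounds):
        # Negative gradient of log loss: observed outcome minus predicted probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        best = None
        for j in range(n_features):
            groups = {0: [], 1: []}
            for row, r in zip(X, residuals):
                groups[row[j]].append(r)
            means = {k: (sum(v) / len(v) if v else 0.0) for k, v in groups.items()}
            # Squared error left over if we split on feature j.
            sse = sum((r - means[row[j]]) ** 2 for row, r in zip(X, residuals))
            if best is None or sse < best[0]:
                best = (sse, j, means[0], means[1])
        _, j, mean0, mean1 = best
        step0, step1 = learning_rate * mean0, learning_rate * mean1
        stumps.append((j, step0, step1))
        split_counts[j] += 1
        scores = [s + (step1 if row[j] else step0) for row, s in zip(X, scores)]
    return stumps, split_counts

def predict_probability(stumps, row):
    """Predicted probability of a fatal outcome for one row of binary features."""
    return sigmoid(sum(step1 if row[j] else step0 for j, step0, step1 in stumps))

# Hypothetical toy data: columns are [night, postcrash_fire]; outcome 1 = fatal.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 0, 1, 0, 1, 0, 1]
stumps, split_counts = fit_boosted_stumps(X, y)
```

Each round fits a small tree to what the current model still gets wrong and adds a damped correction. The `split_counts` list records how often each feature was selected, which is the kind of quantity a split-based feature importance ranking is built from.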
The XGBoost model was trained on the dataset to classify the accidents as fatal or nonfatal based on the conditions mentioned. The model’s performance was evaluated using training and validation accuracy, which reached 94.40% and 79.00%, respectively. The model also provided a feature importance ranking, highlighting the conditions that had the most significant impact on the likelihood of a fatality. This approach allowed us not only to validate the findings from previous studies but also to uncover additional insights and patterns that traditional statistical methods might not reveal. Compared with the previous Bayesian analyses, XGBoost offers some unique advantages, including flexibility with different types of data, robustness to noisy data, and automated feature selection.3
RESULTS
Table I lists the conditions that were included in the following models. Our first aim was to understand whether each condition increased or decreased the odds of a fatality, so we calculated fatality odds ratios for each of the above factors. The fatality ratio is defined as the number of fatal injuries divided by the number of nonfatal injuries across all crashes. Nonfatal injuries include major, minor, and no injuries. Fatality ratios over 1 indicate a higher probability of a fatality occurring (i.e., a fatality ratio of 3 indicates a threefold increase in probability of fatality), while values below 1 indicate a lowered probability of a fatality.
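As a rough illustration of the arithmetic, the sketch below computes a fatal-to-nonfatal ratio and an odds ratio comparing a condition being present with it being absent. It works at the crash level as a simplification. The totals of 96 fatal and 171 nonfatal crashes come from the Methods; the split by condition is hypothetical and chosen only so that the groups add up to those totals. The function names are assumptions, not part of the original analysis.

```python
def fatality_odds(fatal, nonfatal):
    """Ratio of fatal to nonfatal counts (an odds, not a probability)."""
    if nonfatal == 0:
        raise ValueError("nonfatal count must be greater than zero")
    return fatal / nonfatal

def odds_ratio(fatal_with, nonfatal_with, fatal_without, nonfatal_without):
    """Odds of a fatality with the condition divided by the odds without it."""
    return (fatality_odds(fatal_with, nonfatal_with)
            / fatality_odds(fatal_without, nonfatal_without))

# Overall crash-level odds from the totals reported in the Methods.
overall_odds = fatality_odds(96, 171)

# Hypothetical split for one condition (illustration only):
# 30 fatal and 10 nonfatal crashes with the condition,
# 66 fatal and 161 nonfatal crashes without it.
example_ratio = odds_ratio(30, 10, 66, 161)
```

A value of `example_ratio` above 1 would mean the condition is associated with higher odds of a fatality; a value below 1 would mean lower odds.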
We also used Google Maps and coordinates from the NTSB reports to create a map of all crashes, including crashes that did (red circles) and did not (yellow circles) lead to fatalities (Fig. 1). This allowed us to better understand where crashes occur and supports the commonsense notion that there are more crashes in and around urban areas throughout the United States. However, in the absence of data such as hours flown or the number of flights in metropolitan vs. rural areas, it is not possible to assign any meaningful conclusion regarding the clustering of crashes near metropolitan areas. This limitation has been taken into consideration when interpreting these findings.
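One way to prepare such a map layer is sketched below. The study used Google Maps; the GeoJSON output here is a swapped-in, tool-neutral format chosen for illustration, and the `marker_color` property name, the function name, and the example coordinates are all hypothetical rather than taken from NTSB reports.

```python
import json

def crashes_to_geojson(crashes):
    """Convert (latitude, longitude, fatal) records into a GeoJSON FeatureCollection."""
    features = []
    for lat, lon, fatal in crashes:
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {
                "fatal": bool(fatal),
                # Red for fatal and yellow for nonfatal, matching the figure legend.
                "marker_color": "red" if fatal else "yellow",
            },
        })
    return {"type": "FeatureCollection", "features": features}

# Hypothetical coordinates for illustration only.
example_crashes = [(39.74, -104.99, True), (29.76, -95.37, False)]
collection = crashes_to_geojson(example_crashes)
geojson_text = json.dumps(collection, indent=2)
```

The resulting text can be loaded into most mapping tools that accept GeoJSON; how the color property is rendered depends on the tool.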
Citation: Aerospace Medicine and Human Performance 96, 2; 10.3357/AMHP.6461.2025

Utilizing XGBoost,3 a machine learning model for regression, allowed us to determine feature importance. Feature importance indicates how much each feature contributes to the overall prediction accuracy of the model, similar to a standardized beta-weight in a regression equation. XGBoost calculates feature importance by analyzing how much each feature is used to split the data across all of the trees in the model. The higher the value, the more important the feature is for making predictions. Any individual feature score is a percent of the overall contribution of that feature to the prediction accuracy of the model. An XGBoost model was fit to predict binary classification from 10 predictor variables in a sample of 268 events using 100 trees. The model achieved a training accuracy of 94.40% (192 of 204 correct predictions) with a 95.00% confidence interval of [0.92, 0.96] and a validation accuracy of 79.00% (51 of 64 correct predictions) with a 95.00% confidence interval of [0.77, 0.81]. From this analysis, we also get an estimate of mean squared error (MSE), which indicates a good fitting model the closer it is to zero. The MSE is calculated as follows:
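In its standard form, assumed here, the MSE averages the squared differences between observed and predicted outcomes. The symbols are notation introduced for this sketch: n is the number of events, y_i is the observed outcome for event i (1 = fatal, 0 = nonfatal), and \hat{y}_i is the model's prediction for that event.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```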
The MSE for this model was 0.056, indicating excellent fit. Feature importance of the final model is listed in Fig. 2.
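Expressing each feature's score as a percent of the total can be sketched as follows. The raw scores below are hypothetical split counts, not values from the fitted model, and the function and feature names are assumptions introduced for illustration.

```python
def importance_shares(raw_scores):
    """Convert raw per-feature scores into percent shares, sorted from largest to smallest."""
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("raw scores must sum to a positive number")
    shares = {name: 100.0 * value / total for name, value in raw_scores.items()}
    return dict(sorted(shares.items(), key=lambda item: item[1], reverse=True))

# Hypothetical split counts for three features (illustration only).
example_scores = {"night": 10, "postcrash_fire": 25, "second_pilot": 15}
shares = importance_shares(example_scores)
```

Because the shares sum to 100, statements such as "the top three features account for almost 50% of the model" can be read directly off the sorted result.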

DISCUSSION
Our results demonstrate a variety of factors that can contribute to the odds of a fatality after a crash, including IFR conditions at the site of the accident, the pilot’s flight rating, fire after crash, the number of hours flown by the pilot in the last 1–3 mo, whether the flight was during the day or at night, and whether the pilot had a level 2 medical certification. This is in line with previous work by Simonson et al.,1,2 but adds detail to previous findings through the assessment of additional contributing factors and a novel approach (XGBoost) for ranking factors from most to least significant. Given these findings, it may be beneficial to add better fire-resistant materials and extinguishing systems to current HEMS platforms to prevent or mitigate fires.11 The effect of time of day also demonstrates that night flying is more dangerous and may be an area where having a second pilot should be considered.
The geospatial data gathered in this research suggest that crashes may happen more frequently, and subsequently have a higher propensity for fatalities, in and around metropolitan areas. However, as mentioned in the results, in the absence of data such as hours flown or the number of flights in metropolitan vs. rural areas, we are not able to assign any meaningful conclusion regarding the clustering of crashes near metropolitan areas. Even so, the conditions identified in this study may provide useful guidance on when to use a second pilot: perhaps a second pilot should be introduced to HEMS flights that are at night, in metropolitan areas, and/or in IFR conditions to prevent fatalities if a crash occurs.12
The machine learning model demonstrated similar factors to the odds ratio analysis, with conditions at the accident site, having a second pilot, and a postcrash fire being the most prominent contributors to the model, accounting for almost 50% of the overall prediction accuracy of the model. This study aimed to replicate and extend the findings of Simonson et al.,1,2 with a focus on employing machine learning techniques to refine the models predicting fatalities in HEMS crashes. Our analysis not only corroborates the findings of the previous study but also enhances our understanding by providing detailed odds ratios for factors associated with fatality risk, prioritizing these conditions by their significance, and introducing geospatial visualization of HEMS crash sites across the United States. Consistently, factors such as adverse flight conditions, the absence of a copilot, and postcrash fires emerged as the most significant contributors to fatalities in the aftermath of a HEMS crash. The geospatial analysis revealed a tendency for crashes to cluster around major metropolitan areas, a finding that aligns with expectations given the higher volume of HEMS operations in densely populated regions. This insight, while anticipated, underscores the potential of geospatial data to inform future research and targeted safety interventions.
The application of machine learning, particularly through a feature-based predictive model like XGBoost, has proven invaluable in dissecting the multifaceted nature of crash data. While many of the significant factors identified by XGBoost are consistent with those found in traditional analyses, the machine learning approach provided a more nuanced understanding of the relative importance of each factor and revealed complex interactions that might not be easily detected with traditional methods. This study not only underscores the efficacy of machine learning in enhancing our understanding of HEMS flight safety, but also suggests its broader applicability to analyzing other types of aviation accidents. By pinpointing specific risk factors amenable to intervention, this approach offers a promising avenue for reducing the incidence of accidents and fatalities across various high-risk contexts. However, it is important to acknowledge that the insights gained from machine learning should be complemented with domain expertise and further validated through additional studies.
To conclude, the integration of machine learning into the analysis of HEMS crash data represents a significant advancement in the ability to identify and mitigate the factors contributing to aviation fatalities. This research not only supports and expands upon previous studies but also demonstrates the potential of machine learning to provide deeper insights and identify new patterns of data in aviation safety. We recommend further research to continue exploring these methodologies to enhance safety across the broader aviation industry, while also considering the limitations and ensuring a balanced interpretation of the findings.

Fig. 1. Geospatial visualization of HEMS crashes from 1991–2022. Yellow markers indicate a nonfatal crash; red markers indicate a fatal crash.

Fig. 2. Feature importance of XGBoost regression.