Integrated feature selection and classification algorithm in the prediction of work-related accidents in the retail sector: a comparative study (Conference Paper)


  • Assessing the different factors that contribute to accidents in the workplace is essential to ensure the safety and well-being of employees. Given the importance of risk identification in hazard prediction, this work proposes a comparative study of different feature selection techniques (χ2 test and Forward Feature Selection) combined with learning algorithms (Support Vector Machine, Random Forest, and Naive Bayes), applied to a database of a leading company in the retail sector in Portugal. The goal is to determine which factors of each database have the most significant impact on the occurrence of accidents. The initial databases include accident records, ergonomic workplace analysis, hazard intervention and risk assessment, climate databases, and holiday records. Each method was evaluated based on its accuracy in forecasting the occurrence of an accident. The results showed that the Forward Feature Selection-Random Forest pair performed best among the assessed combinations for the case study database. In addition, the accident records and the ergonomic workplace analysis contribute the largest number of features with a significant impact on accident prediction. Future studies will evaluate factors from other databases that may hold meaningful information for predicting accidents.
  • The authors are grateful to the Foundation for Science and Technology (FCT, Portugal) for financial support through national funds FCT/MCTES (PIDDAC) to CeDRI (UIDB/05757/2020 and UIDP/05757/2020) and SusTEC (LA/P/0007/2021). This work has been supported by NORTE-01-0247-FEDER-072598 iSafety: Intelligent system for occupational safety and well-being in the retail sector. Inês Sena was supported by FCT PhD grant UI/BD/153348/2022.
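
As a rough illustration of the Forward Feature Selection idea named in the abstract, the sketch below greedily adds whichever feature most improves validation accuracy and stops when the gain becomes negligible. It is not the authors' implementation: the nearest-centroid classifier is a simple stand-in for the Support Vector Machine, Random Forest, and Naive Bayes models that were compared, the data are synthetic rather than the case study database, and every name (`forward_feature_selection`, `nearest_centroid_accuracy`, `make_synthetic_records`, `min_gain`) is hypothetical.

```python
import random


def nearest_centroid_accuracy(train, test, features):
    """Accuracy of a nearest-centroid classifier restricted to `features`.

    Stand-in classifier (assumption): the study compared SVM, Random Forest
    and Naive Bayes, none of which is reproduced here.
    """
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(features))
        for i, f in enumerate(features):
            acc[i] += x[f]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(x):
        # Pick the class whose centroid is closest in squared distance.
        return min(
            centroids,
            key=lambda y: sum((x[f] - c) ** 2 for f, c in zip(features, centroids[y])),
        )

    return sum(predict(x) == y for x, y in test) / len(test)


def forward_feature_selection(train, valid, n_features, min_gain=0.01):
    """Greedy wrapper: add one feature at a time while accuracy improves."""
    selected, best_score = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        scored = [
            (nearest_centroid_accuracy(train, valid, selected + [f]), f)
            for f in remaining
        ]
        score, feature = max(scored, key=lambda pair: pair[0])
        if score - best_score < min_gain:  # hypothetical stopping rule
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


def make_synthetic_records(n, seed=0):
    """Synthetic data (assumption): features 0 and 2 carry signal, 1 and 3 are noise."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = [
            rng.gauss(2.0 * label, 1.0),
            rng.gauss(0.0, 1.0),
            rng.gauss(-1.5 * label, 1.0),
            rng.gauss(0.0, 1.0),
        ]
        records.append((x, label))
    return records


records = make_synthetic_records(600)
train, valid = records[:400], records[400:]
selected, score = forward_feature_selection(train, valid, n_features=4)
print("selected feature indices:", selected, "validation accuracy:", round(score, 3))
```

Swapping `nearest_centroid_accuracy` for another scoring function is the only change needed to pair the same selection loop with a different classifier, which mirrors the selector-classifier pairing the study compares.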

publication date

  • January 1, 2022