
Fairness-Aware Processing Techniques in Survival Analysis: Promoting Equitable Predictions

  • Conference paper
Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track (ECML PKDD 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14174)

Abstract

As machine learning (ML) systems become pervasive in high-stakes applications, the issue of ML fairness is receiving increasing attention. A large variety of fair ML solutions have been developed to ensure that bias and inaccuracies in the data and model do not lead to decisions that treat individuals unfavorably on the basis of sensitive characteristics. While most of the fair ML literature focuses on classification and regression settings, fairness in survival analysis for time-to-event outcomes is under-explored. In contrast to existing fair survival analysis solutions, which typically incorporate fairness constraints into the learning mechanism, we propose several pre-processing and post-processing approaches. Because pre-processing and post-processing methods are model-agnostic, they may offer more flexible fairness interventions. Additionally, they tend to be more intuitive and explainable than in-processing methods. We carry out experimental studies with medical and non-medical data sets to evaluate the proposed fairness methods.
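To illustrate what a model-agnostic pre-processing intervention can look like, the sketch below applies classic reweighing (in the style used for fair classification) to survival records. It is an assumption made for illustration, not necessarily one of the methods proposed in the paper: the function names `reweigh` and `weighted_event_rate` are hypothetical, and balancing only the event indicator ignores follow-up time, which a full survival treatment would also need to consider.

```python
from collections import Counter


def reweigh(groups, events):
    """Return one weight per record: P(group) * P(event) / P(group, event).

    After weighting, the sensitive group and the event indicator are
    statistically independent in the (weighted) training sample.
    Illustrative only; not taken from the paper.
    """
    n = len(groups)
    if n == 0 or n != len(events):
        raise ValueError("groups and events must be non-empty and equally long")
    group_counts = Counter(groups)
    event_counts = Counter(events)
    cell_counts = Counter(zip(groups, events))
    return [
        group_counts[g] * event_counts[e] / (n * cell_counts[(g, e)])
        for g, e in zip(groups, events)
    ]


def weighted_event_rate(groups, events, weights, group):
    """Weighted share of observed events (event == 1) within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    hits = sum(w * e for g, e, w in zip(groups, events, weights) if g == group)
    return hits / total


# Toy data: group "a" has mostly observed events, group "b" mostly censored.
groups = ["a"] * 4 + ["b"] * 4
events = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweigh(groups, events)
rate_a = weighted_event_rate(groups, events, weights, "a")
rate_b = weighted_event_rate(groups, events, weights, "b")
```

The resulting weights could then be passed as sample weights to any survival learner that accepts them, which is what makes this kind of intervention independent of the model.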

Author information

Corresponding author

Correspondence to Zhouting Zhao.

Ethics declarations

Ethical Statement

This research explores discrimination-mitigation techniques (both pre-processing and post-processing) in time-to-event analysis. The following ethical considerations have been addressed in the research design and procedures:

1. Voluntary participation: This study does not involve primary data collection.

2. Confidentiality: All data used in this study are publicly available and are properly cited.

3. Potential for harm: There is minimal risk or potential harm associated with this study.

4. Gender consideration: This study investigates potential biases and unfairness associated with using gender as a feature in survival analysis and proposes several approaches to address the unfairness.

5. Results communication: The study’s results will serve academic purposes exclusively and will be reported in academic journals or conferences. The results will be presented accurately and without bias, acknowledging any limitations or ethical dilemmas encountered throughout the study.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, Z., Ng, T.L.J. (2023). Fairness-Aware Processing Techniques in Survival Analysis: Promoting Equitable Predictions. In: De Francisci Morales, G., Perlich, C., Ruchansky, N., Kourtellis, N., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14174. Springer, Cham. https://doi.org/10.1007/978-3-031-43427-3_28

  • DOI: https://doi.org/10.1007/978-3-031-43427-3_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43426-6

  • Online ISBN: 978-3-031-43427-3

  • eBook Packages: Computer Science, Computer Science (R0)
