Abstract
As society advances and living standards improve, students’ physical health has become an increasingly prominent concern. At present, however, the assessment and analysis of students’ physical health rely heavily on conventional statistical methods, and even where data mining methods are applied, the exploitation of physical health big data remains limited. This paper uses an improved Canopy-K-means algorithm based on principal component analysis (PCA) to construct and analyze a portrait of students’ physical fitness and health. The method combines dimensionality reduction with cluster analysis, and in ablation experiments it achieved the best overall performance among the algorithms compared, for both the male and the female student data groups. The algorithm was applied to group the physical fitness test data of male and female students, to construct and analyze their physical fitness and health portraits, and to provide exercise prescriptions for the different categories of students. The experiments used physical health test data collected from students of Yunnan Agricultural University in 2020–2022. The results show that physical health status at this university differs by gender, grade, and major, and that the physical health status of students in the Classes of 2018 and 2019 is generally poor. Across majors, students of the Faculty of Agricultural Sciences score slightly better than those of the Faculty of Science and Technology, who in turn score slightly better than those of the Faculty of Humanities and Social Sciences.
Our study offers novel methods and ideas for the assessment and analysis of students’ physical health, holding significant implications for schools and related departments in formulating scientific and effective physical education policies and health promotion strategies.
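The pipeline described in the abstract (PCA for dimensionality reduction, a Canopy pass to choose the number of clusters and their initial centers, then K-means) can be illustrated with a minimal sketch. The abstract does not specify the authors' improvements, so this is a plain Canopy-seeded K-means under stated assumptions, not the paper's implementation: the function names (`pca_reduce`, `canopy_centers`, `kmeans`), the single distance threshold `t2`, and the synthetic data are all hypothetical and stand in for the non-public test data.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Z-score each feature, then project onto the top principal components.
    Assumes no feature is constant (non-zero standard deviation)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:n_components].T

def canopy_centers(X, t2):
    """Simplified single-threshold Canopy pass (assumption: the paper's
    improved variant may differ). Take the first remaining point as a
    center, discard every point within t2 of it, and repeat."""
    remaining = X.copy()
    centers = []
    while len(remaining) > 0:
        c = remaining[0]
        centers.append(c)
        dist = np.linalg.norm(remaining - c, axis=1)
        remaining = remaining[dist > t2]
    return np.array(centers)

def kmeans(X, centers, n_iter=100):
    """Lloyd's K-means, started from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        # Distance of every point to every center, shape (n_points, k).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in data: three well-separated groups of 50 "students",
# each with 5 made-up test indicators.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(loc=m, scale=1.0, size=(50, 5)) for m in (0.0, 10.0, 20.0)])

reduced = pca_reduce(data, n_components=2)   # dimensionality reduction
seeds = canopy_centers(reduced, t2=1.5)      # Canopy chooses k and the initial centers
labels, centers = kmeans(reduced, seeds)     # refine the grouping with K-means

print("clusters found:", len(centers))
print("cluster sizes:", np.bincount(labels))
```

The point of seeding K-means with Canopy centers is that the number of clusters and their starting positions come from the data rather than from a random draw. The threshold `t2` still has to be chosen for the data at hand; the value used here suits only this synthetic example.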
Data availability
To protect personal privacy and to comply with the data management requirements of the institutions that supplied the data, the datasets used in this study are not publicly available. They can be obtained from the corresponding author upon reasonable request, for example for validation purposes.
Change history
10 June 2024
A Correction to this paper has been published: https://doi.org/10.1007/s11227-024-06157-y
Funding
This study was supported by the Yunnan Provincial Department of Education Research Fund Project “Research on the Construction of Student Portrait Model Based on the Physical Health Test Data of Agricultural College Students” (Project No.: 2023Y1039), the Yunnan Provincial Department of Education Research Fund Project “Research on the Construction of University Physical Education Learner Portraits Based on Big Data” (Project No.: 2023Y0939), and the Major Project of Yunnan Science and Technology (Project No.: 202302AE09002003).
Author information
Contributions
RJ wrote the main part of the manuscript and performed the experiments. JY put forward further suggestions on the model of the algorithm. JY put forward constructive comments on the analysis of the experimental results. YW and YL assisted in completing the data preprocessing. RL and JC participated in the discussion and made suggestions that were very helpful for code debugging. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and publication of this article.
Consent to participate
All authors of this paper volunteered to participate in this research.
Consent for publication
This paper has been published with the consent of all authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: “In this article Jianke Yang should have been denoted as an equally contributing author instead of Jianping Yang”.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ji, R., Yang, J., Wu, Y. et al. Construction and analysis of students’ physical health portrait based on principal component analysis improved Canopy-K-means algorithm. J Supercomput 80, 15940–15973 (2024). https://doi.org/10.1007/s11227-024-06091-z
DOI: https://doi.org/10.1007/s11227-024-06091-z