Robust Hand Gesture Recognition Using a Deformable Dual-Stream Fusion Network Based on CNN-TCN for FMCW Radar
Figure 1. Processing of radar data cubes by FFT.
Figure 2. RTM for left swipe before noise reduction. In the RTM, pixel color, x-axis, and y-axis correspond to Doppler power, range, and time, respectively.
Figure 3. RTM for left swipe after noise reduction. In the RTM, pixel color, x-axis, and y-axis correspond to Doppler power, range, and time, respectively.
Figure 4. Comparison of RAMS before and after noise reduction. (a) RAMS for left swipe before noise reduction. (b) RAMS for left swipe after noise reduction. The rows represent the time series of four frames. In the RAMS, pixel color, x-axis, and y-axis correspond to Doppler power, range, and AoA, respectively.
Figure 5. Structure of the DDF-CT network.
Figure 6. Structure of the TCN_se.
Figure 7. Dynamic hand gestures. (a) PH and PL. (b) RS and LS. (c) CT and AT.
Figure 8. Confusion matrix for the model without DeformConv and SEnet (0: PH, 1: PL, 2: LS, 3: RS, 4: CT, 5: AT).
Figure 9. Confusion matrix for the model with DeformConv (0: PH, 1: PL, 2: LS, 3: RS, 4: CT, 5: AT).
Figure 10. Confusion matrix for the model with DeformConv and SEnet (0: PH, 1: PL, 2: LS, 3: RS, 4: CT, 5: AT).
Figure 11. Confusion matrix for CNN-LSTM (0: PH, 1: PL, 2: LS, 3: RS, 4: CT, 5: AT).
Figure 12. Confusion matrix for CNN-BiGRU (0: PH, 1: PL, 2: LS, 3: RS, 4: CT, 5: AT).
Figure 13. Confusion matrix for the DDF-CT network (0: PH, 1: PL, 2: LS, 3: RS, 4: CT, 5: AT).
Abstract
1. Introduction
2. Signal Processing
2.1. Principle of the FMCW Radar
2.2. Acquisition of Datasets
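The caption "Processing of radar data cubes by FFT" points to the usual two-stage FFT on FMCW data. The following is a minimal NumPy sketch of that idea, assuming a complex data cube of shape (chirps, samples) for one antenna and one frame. The function name `range_doppler_map` and the synthetic single-target signal are hypothetical and are not taken from the paper.

```python
import numpy as np

def range_doppler_map(cube: np.ndarray) -> np.ndarray:
    """Return the magnitude range-Doppler map of one frame.

    cube: complex array shaped (chirps, samples) for a single antenna.
    Output rows are Doppler bins (zero velocity centered), columns are range bins.
    """
    # Range FFT along fast time (samples within each chirp).
    range_fft = np.fft.fft(cube, axis=1)
    # Doppler FFT along slow time (across chirps), shifted so zero Doppler is centered.
    rd = np.fft.fftshift(np.fft.fft(range_fft, axis=0), axes=0)
    return np.abs(rd)

# Synthetic target (assumption for illustration): range bin 20, Doppler bin +10.
n_chirps, n_samples = 128, 128
fast = np.arange(n_samples)
slow = np.arange(n_chirps)[:, None]
cube = np.exp(2j * np.pi * (20 * fast / n_samples + 10 * slow / n_chirps))

rd = range_doppler_map(cube)
doppler_idx, range_idx = np.unravel_index(np.argmax(rd), rd.shape)
print(doppler_idx, range_idx)
```

For this synthetic input the peak sits at range bin 20 and, because of the shift, at Doppler row 74 (64 + 10). Stacking one range profile per frame would give a range-time style map, and an additional FFT across antennas would give a range-angle map; both are left out of this sketch.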
Algorithm 1: Noise Reduction
Input: total number of frames; FFT size L; Doppler bin threshold; scale factor of the angle bin power threshold; Doppler power threshold; Range Doppler Matrix RD; Range Angle Matrix RA
Output: Range Angle Map Sequence
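Only the inputs and output of Algorithm 1 are listed here, so its actual steps are not reproduced. The sketch below is an illustrative guess at how thresholds of this kind could be combined on a single frame, not the authors' algorithm. The function `denoise_frame`, its parameter names, and the toy matrices are all hypothetical.

```python
import numpy as np

def denoise_frame(rd, ra, doppler_bin_thr, doppler_power_thr, angle_scale):
    """Suppress static clutter and weak cells in one range-angle map.

    rd: (doppler, range) magnitudes with zero Doppler at the center row.
    ra: (angle, range) magnitudes for the same frame.
    All thresholds are assumptions chosen for illustration.
    """
    n_doppler = rd.shape[0]
    center = n_doppler // 2
    doppler_offset = np.abs(np.arange(n_doppler) - center)
    # Ignore Doppler bins close to zero velocity (static reflectors).
    moving = rd[doppler_offset > doppler_bin_thr, :]
    # Keep range bins that contain sufficiently strong moving energy.
    keep_range = moving.max(axis=0) > doppler_power_thr
    out = ra * keep_range[None, :]
    # Zero out angle cells well below the strongest remaining cell.
    if out.max() > 0:
        out = np.where(out >= angle_scale * out.max(), out, 0.0)
    return out

# Toy frame: static clutter at range bin 1, a moving hand at range bin 3.
rd = np.zeros((8, 6))
rd[4, 1] = 10.0   # zero-Doppler clutter
rd[7, 3] = 5.0    # moving target
ra = np.full((4, 6), 0.1)
ra[1, 1] = 3.0    # clutter in the range-angle map
ra[2, 3] = 2.0    # hand in the range-angle map

out = denoise_frame(rd, ra, doppler_bin_thr=1, doppler_power_thr=1.0, angle_scale=0.5)
print(np.argwhere(out > 0))
```

In this toy case only the cell at angle bin 2, range bin 3 survives. Applying such a function frame by frame would produce a cleaned range-angle map sequence.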
3. Proposed Network
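The figure captions and tables name a "TCN_se" block and an "SEnet" component. As a generic illustration of those two building blocks, the NumPy sketch below shows a dilated causal 1-D convolution followed by squeeze-and-excitation channel gating. It is not the DDF-CT architecture; the layer sizes, the random weights, and the function names are assumptions made for this example.

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation):
    """x: (channels_in, time); w: (channels_out, channels_in, kernel)."""
    c_out, _, k = w.shape
    t = x.shape[1]
    pad = (k - 1) * dilation
    # Left-pad only, so output at time n never sees inputs after n.
    xp = np.pad(x, ((0, 0), (pad, 0)))
    y = np.zeros((c_out, t))
    for j in range(k):
        # Tap j looks back (k - 1 - j) * dilation steps.
        y += w[:, :, j] @ xp[:, j * dilation : j * dilation + t]
    return y

def squeeze_excite(y, w1, w2):
    """y: (channels, time); w1: (hidden, channels); w2: (channels, hidden)."""
    s = y.mean(axis=1)                       # squeeze: average over time
    h = np.maximum(w1 @ s, 0.0)              # bottleneck with ReLU
    g = 1.0 / (1.0 + np.exp(-(w2 @ h)))      # sigmoid gate per channel
    return y * g[:, None]                    # excite: rescale each channel

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 10))
w = rng.standard_normal((4, 2, 3))
y = causal_dilated_conv1d(x, w, dilation=2)
z = squeeze_excite(y, rng.standard_normal((2, 4)), rng.standard_normal((4, 2)))

# Causality check: changing inputs from t = 6 onward leaves earlier outputs unchanged.
x_future = x.copy()
x_future[:, 6:] += 1.0
y_future = causal_dilated_conv1d(x_future, w, dilation=2)
print(np.allclose(y[:, :6], y_future[:, :6]))
```

Because the gate lies strictly between 0 and 1, the squeeze-and-excitation step can only attenuate channels relative to one another; it never changes the temporal length of the sequence.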
4. Experiment and Analysis
4.1. Experimental Platform
4.2. Dataset
4.3. Ablation Studies
4.4. Comparison of Different Methods
4.5. Comparison of Different Inputs
4.6. Testing in a New Environment
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
| Parameter | Value |
|---|---|
| Number of transmitter antennas | 2 |
| Number of receiver antennas | 4 |
| Frame periodicity | 50 ms |
| Total bandwidth | 3999.48 MHz |
| Number of sample points | 128 |
| Number of chirps in one frame | 128 |
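A few derived quantities follow directly from the radar parameters in the table. The short calculation below uses the standard FMCW relation for range resolution, c / (2B). The virtual-array size assumes every transmitter-receiver pair contributes one virtual element, which the table does not state; the variable names are illustrative.

```python
C = 299_792_458.0            # speed of light in m/s

bandwidth_hz = 3999.48e6     # total bandwidth from the table
frame_period_s = 50e-3       # frame periodicity from the table
n_tx, n_rx = 2, 4            # antenna counts from the table

# Range resolution of an FMCW radar: c / (2 * B).
range_resolution_m = C / (2 * bandwidth_hz)

# Frames per second implied by the frame periodicity.
frame_rate_hz = 1 / frame_period_s

# Assumption: a MIMO virtual array with one element per Tx-Rx pair.
n_virtual_antennas = n_tx * n_rx

print(f"range resolution: {range_resolution_m * 100:.2f} cm")
print(f"frame rate: {frame_rate_hz:.0f} frames/s")
print(f"virtual antennas: {n_virtual_antennas}")
```

With these values the range resolution comes to roughly 3.75 cm and the frame rate to 20 frames per second.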

| Model | Dataset | Accuracy (%) |
|---|---|---|
| CNN-TCN | RAMS | 94.44 |
| CNN-TCN with DeformConv | RAMS | 95.27 |
| CNN-TCN with DeformConv and SEnet | RAMS | 96.94 |

| Model | Dataset | Accuracy (%) |
|---|---|---|
| CNN | RT | 91.77 |
| CNN with DeformConv | RT | 93.83 |

| Model | Accuracy (%) |
|---|---|
| 3D-CNN | 84.16 |
| CNN-GRU | 90.55 |
| CNN-LSTM | 93.05 |
| CNN-BiGRU | 91.94 |
| Ours | 98.61 |

| Model | Dataset | Accuracy (%) |
|---|---|---|
| TFM stream of the DDF-CT network | RT | 93.83 |
| SMS stream of the DDF-CT network | RAMS | 96.94 |
| Entire DDF-CT network | RAMS + RT | 98.61 |

| Model | Accuracy (%) |
|---|---|
| 3D-CNN | 79.00 |
| CNN-GRU | 84.33 |
| CNN-LSTM | 86.66 |
| CNN-BiGRU | 82.66 |
| DDF-CT network | 97.22 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhu, M.; Zhang, C.; Wang, J.; Sun, L.; Fu, M. Robust Hand Gesture Recognition Using a Deformable Dual-Stream Fusion Network Based on CNN-TCN for FMCW Radar. Sensors 2023, 23, 8570. https://doi.org/10.3390/s23208570