Abstract
This chapter examines how digital pen interaction can be expanded by detecting hand postures formed primarily by the hand gripping the pen. Three systems using different types of sensors are considered: an EMG armband, the raw capacitive image of the touchscreen, and a pen-top fisheye camera. In each case, deep neural networks perform classification or regression to detect hand postures and gestures. Additional analyses demonstrate the benefit of deep learning over conventional machine-learning methods and explore how model accuracy is affected by the number of postures to be recognised, the choice of user-dependent versus user-independent models, and the amount of training data. Examples of posture-based pen interaction in applications are discussed, and a number of usability aspects identified in user evaluations are presented. The chapter concludes with perspectives on the recognition and design of posture-based pen interaction for future systems.
Acknowledgements
We would like to acknowledge our co-authors on the three publications on which this article is based: Drini Cami, Brian Vogel, Richard G. Calland, Naoki Kimura and Riku Arakawa. While all evaluations of the neural networks presented in this article are new, we wish to recognise their contributions in the original publications.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Matulic, F., Vogel, D. (2021). Deep Learning-Based Hand Posture Recognition for Pen Interaction Enhancement. In: Li, Y., Hilliges, O. (eds) Artificial Intelligence for Human Computer Interaction: A Modern Approach. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-030-82681-9_7
DOI: https://doi.org/10.1007/978-3-030-82681-9_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-82680-2
Online ISBN: 978-3-030-82681-9
eBook Packages: Computer Science, Computer Science (R0)