Abstract
Humans can feel, weigh and grasp diverse objects, and simultaneously infer their material properties while applying the right amount of force—a challenging set of tasks for a modern robot [1]. Mechanoreceptor networks that provide sensory feedback and enable the dexterity of the human grasp [2] remain difficult to replicate in robots. Whereas computer-vision-based robot grasping strategies [3,4,5] have progressed substantially with the abundance of visual data and emerging machine-learning tools, there are as yet no equivalent sensing platforms and large-scale datasets with which to probe the use of the tactile information that humans rely on when grasping objects. Studying the mechanics of how humans grasp objects will complement vision-based robotic object handling. Importantly, the inability to record and analyse tactile signals currently limits our understanding of the role of tactile information in the human grasp itself—for example, how tactile maps are used to identify objects and infer their properties is unknown [6]. Here we use a scalable tactile glove and deep convolutional neural networks to show that sensors uniformly distributed over the hand can be used to identify individual objects, estimate their weight and explore the typical tactile patterns that emerge while grasping objects. The sensor array (548 sensors) is assembled on a knitted glove, and consists of a piezoresistive film connected by a network of conductive thread electrodes that are passively probed. Using a low-cost (about US$10) scalable tactile glove sensor array, we record a large-scale tactile dataset with 135,000 frames, each covering the full hand, while interacting with 26 different objects. This set of interactions with different objects reveals the key correspondences between different regions of a human hand while it is manipulating objects. Insights from the tactile signatures of the human grasp—through the lens of an artificial analogue of the natural mechanoreceptor network—can thus aid the future design of prosthetics [7], robot grasping tools and human–robot interactions [1,8,9,10].
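As a rough, illustrative sketch of the kind of model the abstract describes (a deep convolutional network that classifies tactile frames into object categories), the following PyTorch snippet defines a minimal classifier for 32 × 32 tactile maps, the layout used by the readout described in Extended Data Fig. 1. The architecture, layer sizes and number of classes are assumptions for illustration only and are not the network reported in the paper.

```python
# Minimal sketch (not the authors' network): a small CNN that maps a
# 32 x 32 tactile frame to one of 26 object classes. All layer sizes
# and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class TactileCNN(nn.Module):
    def __init__(self, num_classes=26):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):                         # x: (batch, 1, 32, 32)
        h = self.features(x)
        return self.classifier(h.flatten(1))

# Example: classify a batch of four normalized tactile frames.
frames = torch.rand(4, 1, 32, 32)
logits = TactileCNN()(frames)
predicted = logits.argmax(dim=1)                  # object indices in [0, 25]
```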
Code availability
Custom code used in the current study is available from the corresponding author on request.
Data availability
Source data for key figures in the manuscript are included as interactive maps in Supplementary Data 1–3. Please load (and refresh) all ‘*.html’ pages in Firefox or Chrome. The tactile datasets generated and analysed during this study are available from the corresponding author on request.
References
Bartolozzi, C., Natale, L., Nori, F. & Metta, G. Robots with a sense of touch. Nat. Mater. 15, 921–925 (2016).
Johansson, R. & Flanagan, J. Coding and use of tactile signals from the fingertips in object manipulation tasks. Nat. Rev. Neurosci. 10, 345–359 (2009).
Mahler, J., Matl, M., Satish, V., Danielczuk, M., DeRose, B., McKinley, S. & Goldberg, K. Learning ambidextrous robot grasping policies. Sci. Robot. 4, eaau4984 (2019).
Levine, S., Finn, C., Darrell, T. & Abbeel, P. End-to-end training of deep visuomotor policies. J. Mach. Learn. Res. 17, 1334–1373 (2016).
Morrison, D., Corke, P. & Leitner, J. Closing the loop for robotic grasping: a real-time, generative grasp synthesis approach. In Proc. Robotics: Science and Systems https://doi.org/10.15607/RSS.2018.XIV.021 (RSS Foundation, 2018).
Saal, H., Delhaye, B., Rayhaun, B. & Bensmaia, S. Simulating tactile signals from the whole hand with millisecond precision. Proc. Natl Acad. Sci. USA 114, E5693–E5702 (2017).
Osborn, L. et al. Prosthesis with neuromorphic multilayered e-dermis perceives touch and pain. Sci. Robot. 3, eaat3818 (2018).
Okamura, A. M., Smaby, N. & Cutkosky, M. R. An overview of dexterous manipulation. In Proc. IEEE International Conference on Robotics and Automation (ICRA’00) 255–262 https://doi.org/10.1109/ROBOT.2000.844067 (2000).
Cannata, G., Maggiali, M., Metta, G. & Sandini, G. An embedded artificial skin for humanoid robots. In Proc. International Conference on Multisensor Fusion and Integration for Intelligent Systems 434–438 https://doi.org/10.1109/MFI.2008.4648033 (IEEE, 2008).
Romano, J., Hsiao, K., Niemeyer, G., Chitta, S. & Kuchenbecker, K. Human-inspired robotic grasp control with tactile sensing. IEEE Trans. Robot. 27, 1067–1079 (2011).
Marzke, M. Precision grips, hand morphology, and tools. Am. J. Phys. Anthropol. 102, 91–110 (1997).
Niewoehner, W., Bergstrom, A., Eichele, D., Zuroff, M. & Clark, J. Manual dexterity in Neanderthals. Nature 422, 395 (2003).
Feix, T., Kivell, T., Pouydebat, E. & Dollar, A. Estimating thumb-index finger precision grip and manipulation potential in extant and fossil primates. J. R. Soc. Interface 12, https://doi.org/10.1098/rsif.2015.0176 (2015).
Chortos, A., Liu, J. & Bao, Z. Pursuing prosthetic electronic skin. Nat. Mater. 15, 937–950 (2016).
Li, R. et al. Localization and manipulation of small parts using GelSight tactile sensing. In Proc. International Conference Intelligent Robots and Systems 3988–3993 https://doi.org/10.1109/IROS.2014.6943123 (IEEE/RSJ, 2014).
Yamaguchi, A. & Atkeson, C. G. Combining finger vision and optical tactile sensing: reducing and handling errors while cutting vegetables. In Proc. IEEE 16th International Conference on Humanoid Robots (Humanoids) 1045–1051 https://doi.org/10.1109/HUMANOIDS.2016.7803400 (IEEE-RAS, 2016).
Wettels, N. & Loeb, G. E. Haptic feature extraction from a biomimetic tactile sensor: force, contact location and curvature. In Proc. International Conference on Robotics and Biomimetics 2471–2478 (IEEE, 2011).
Park, J., Kim, M., Lee, Y., Lee, H. & Ko, H. Fingertip skin-inspired microstructured ferroelectric skins discriminate static/dynamic pressure and temperature stimuli. Sci. Adv. 1, e1500661 (2015).
Yau, J., Kim, S., Thakur, P. & Bensmaia, S. Feeling form: the neural basis of haptic shape perception. J. Neurophysiol. 115, 631–642 (2016).
Bachmann, T. Identification of spatially quantised tachistoscopic images of faces: how many pixels does it take to carry identity? Eur. J. Cogn. Psychol. 3, 87–103 (1991).
Torralba, A., Fergus, R. & Freeman, W. 80 million tiny images: a large dataset for nonparametric object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1958–1970 (2008).
D’Alessio, T. Measurement errors in the scanning of piezoresistive sensors arrays. Sens. Actuators A 72, 71–76 (1999).
Ko, J., Bhullar, S., Cho, Y., Lee, P. & Byung-Guk Jun, M. Design and fabrication of auxetic stretchable force sensor for hand rehabilitation. Smart Mater. Struct. 24, 075027 (2015).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 https://doi.org/10.1109/CVPR.2016.90 (IEEE, 2016).
Russakovsky, O. et al. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115, 211–252 (2015).
Brodie, E. & Ross, H. Sensorimotor mechanisms in weight discrimination. Percept. Psychophys. 36, 477–481 (1984).
Napier, J. The prehensile movements of the human hand. J. Bone Joint Surg. Br. 38-B, 902–913 (1956).
Lederman, S. & Klatzky, R. Hand movements: a window into haptic object recognition. Cognit. Psychol. 19, 342–368 (1987).
Feix, T., Romero, J., Schmiedmayer, H., Dollar, A. & Kragic, D. The GRASP taxonomy of human grasp types. IEEE Trans. Hum. Mach. Syst. 46, 66–77 (2016).
Simon, T., Joo, H., Matthews, I. & Sheikh, Y. Hand keypoint detection in single images using multiview bootstrapping. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 4645–4653 https://doi.org/10.1109/CVPR.2017.494 (IEEE, 2017).
Lazzarini, R., Magni, R. & Dario, P. A tactile array sensor layered in an artificial skin. In Proc. IEEE International Conference on Intelligent Robots and Systems (Human Robot Interaction and Cooperative Robots) 114–119 https://doi.org/10.1109/IROS.1995.525871 (IEEE/RSJ, 1995).
Newell, F., Ernst, M., Tjan, B. & Bülthoff, H. Viewpoint dependence in visual and haptic object recognition. Psychol. Sci. 12, 37–42 (2001).
Higy, B., Ciliberto, C., Rosasco, L. & Natale, L. Combining sensory modalities and exploratory procedures to improve haptic object recognition in robotics. In Proc. 16th International Conference on Humanoid Robots (Humanoids) 117–124 https://doi.org/10.1109/HUMANOIDS.2016.7803263 (IEEE-RAS, 2016).
Klatzky, R., Lederman, S. & Metzger, V. Identifying objects by touch: an “expert system”. Percept. Psychophys. 37, 299–302 (1985).
Lederman, S. & Klatzky, R. Haptic perception: a tutorial. Atten. Percept. Psychophys. 71, 1439–1459 (2009).
Kappassov, Z., Corrales, J. & Perdereau, V. Tactile sensing in dexterous robot hands. Robot. Auton. Syst. 74, 195–220 (2015).
Gao, Y., Hendricks, L. A., Kuchenbecker, K. J. & Darrell, T. Deep learning for tactile understanding from visual and haptic data. In Proc. International Conference on Robotics and Automation (ICRA) 536–543 https://doi.org/10.1109/ICRA.2016.7487176 (IEEE, 2016).
Meier, M., Walck, G., Haschke, R. & Ritter, H. J. Distinguishing sliding from slipping during object pushing. In Proc. IEEE Intelligent Robots and Systems (IROS) 5579–5584 https://doi.org/10.1109/IROS.2016.7759820 (2016).
Baishya, S. S. & Bäuml, B. Robust material classification with a tactile skin using deep learning. In Proc. IEEE Intelligent Robots and Systems (IROS) 8–15 https://doi.org/10.1109/IROS.2016.7758088 (2016).
Tompson, J., Goroshin, R., Jain, A., LeCun, Y. & Bregler, C. Efficient object localization using convolutional networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 648–656 https://doi.org/10.1109/CVPR.2015.7298664 (IEEE, 2015).
Paszke, A. et al. Automatic differentiation in PyTorch. In Proc. 31st Conference on Neural Information Processing Systems (NIPS) 1–4 (2017).
Bau, D., Zhou, B., Khosla, A., Oliva, A. & Torralba, A. Network dissection: quantifying interpretability of deep visual representations. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 3319–3327 https://doi.org/10.1109/CVPR.2017.354 (IEEE, 2017).
Flanagan, J. & Bandomir, C. Coming to grips with weight perception: effects of grasp configuration on perceived heaviness. Percept. Psychophys. 62, 1204–1219 (2000).
van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
Acknowledgements
S.S. thanks M. Baldo, V. Bulovic and J. Lang for their comments and discussions. P.K. and S.S. thank K. Myszkowski for discussions. We gratefully acknowledge support from the Toyota Research Institute.
Reviewer information
Nature thanks Giulia Pasquale and Alexander Schmitz for their contribution to the peer review of this work.
Author information
Contributions
S.S. conceived the sensor and hardware designs, performed experiments, was involved in all aspects of the work and led the project. P.K. performed all data analysis with input from all authors. Y.L. performed network dissection. S.S. and P.K. generated the results. A.T. and W.M. supervised the work. All authors discussed ideas and results and contributed to the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Figure 1 STAG images and readout circuit architecture.
a, Image of the finished STAG just before the electrodes are insulated. b, Scan of the STAG. c, Electrical-grounding-based signal isolation circuit (based on ref. 22). The active row during readout is selected by grounding one of the 32 single-pole double-throw (SPDT) switches. A 32:1 analogue switch is used to select one of the 32 columns at a time. Here Rc is the charging resistor, Vref is the reference voltage and Rg sets the amplifier gain. d, Fabricated printed circuit board that interfaces with the STAG. The two connectors shown on the top right and bottom connect to the column and row electrodes of the sensor matrix. The charging resistors (Rc) are on the back of the printed circuit board.
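To make the scan order in panel c concrete, the Python sketch below walks through a full-frame readout under the architecture described in the caption: one row is activated at a time by grounding it, and the 32 columns are multiplexed through sequentially. The functions ground_row, select_column and read_adc are hypothetical placeholders for the microcontroller-side drivers and are not part of the published code.

```python
# Sketch of the row/column scan implied by the readout architecture in panel c.
# ground_row(), select_column() and read_adc() are hypothetical placeholders
# for the hardware drivers; only the scan order follows the caption.
import numpy as np

N_ROWS, N_COLS = 32, 32

def ground_row(i):
    """Hypothetical: set SPDT switch i to ground, leaving the others at Vref."""

def select_column(j):
    """Hypothetical: route column j through the 32:1 analogue switch."""

def read_adc():
    """Hypothetical: return the amplifier output (placeholder value)."""
    return 0.0

def scan_frame():
    frame = np.zeros((N_ROWS, N_COLS))
    for i in range(N_ROWS):          # activate one row at a time by grounding it
        ground_row(i)
        for j in range(N_COLS):      # multiplex through the 32 columns
            select_column(j)
            frame[i, j] = read_adc() # reading reflects the element's resistance
    return frame
```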
Extended Data Figure 2 Characteristics of the STAG sensing elements.
a, The resistance of a single sensing element shows the linear working range (in logarithmic force units). The sensor is insensitive below about 20 mN of force and its response saturates when the applied load exceeds 0.8 N. b, Response of three separate sensors in the force range 20 mN to 0.5 N. The sensors show minimal hysteresis (17.5 ± 2.8%; see Supplementary Fig. 2). c, The sensor response after 10, 100 and 1,000 cycles of linear force ramps up to 0.5 N for three separate devices. The resistance measurements over the entire set of cycles are shown in d. e, Differential scanning calorimetry measurements of the FSF material show a two-polymer blend response with softening/melting temperatures of around 100 °C and 115.1 °C. f, Through-film resistance of an unloaded sensor after treatment at different temperatures in a convection oven for 10 min. The film becomes insulating above about 80 °C.
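The working range described in panel a (resistance roughly linear in logarithmic force between about 20 mN and 0.8 N) lends itself to a simple per-element calibration. The sketch below fits such a log-linear model and inverts it to estimate force from resistance; the data points are made-up placeholders, not measurements from the paper.

```python
# Sketch of a per-element calibration assuming resistance varies roughly
# linearly with log(force) inside the 20 mN - 0.8 N working range.
# The data points are placeholders, not measured values.
import numpy as np

force_n = np.array([0.02, 0.05, 0.1, 0.2, 0.5])             # applied force (N), placeholders
resistance_kohm = np.array([80.0, 55.0, 40.0, 25.0, 8.0])   # placeholders

# Fit resistance = a * log10(force) + b over the working range.
a, b = np.polyfit(np.log10(force_n), resistance_kohm, 1)

def estimate_force(r_kohm):
    """Invert the fit; only meaningful inside the working range."""
    return float(10 ** ((r_kohm - b) / a))

print(estimate_force(40.0))   # ~0.1 N for these placeholder values
```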
Extended Data Figure 3 Sensor architectures and regular 32 × 32 arrays.
a, A simplified version of the sensor laminate architecture. b, The sensor is assembled by laminating an FSF with orthogonal electrodes on each side, which are held in place and insulated by a layer of two-sided adhesive and a stretchable LDPE film (see Methods). c, Fixture used to assemble parallel electrodes. The individual electrodes can be threaded through the structure (like a needle) to create parallel electrodes with a spacing of 2.5 mm. d, Assembled version of the architecture shown in a. e, A regular 32 × 32 array version of the STAG based on the design in b.
Extended Data Figure 4 Sample recordings of nine objects on regular 32 × 32 arrays on a flat surface.
Nine different objects are manipulated on a regular sensor array (Extended Data Fig. 3d) placed on a flat surface. The resting patterns of these objects are easily seen. Pressing the tactile array with sharp objects, such as a pen or the needles of a kiwano, yields signals with single-sensor resolution.
Extended Data Figure 5 Auxetic designs for stretchable sensor arrays.
a, Standard auxetic design laser-cut from the FSF. b, The actual design of the auxetic includes holes to route the electrodes (shown in red and blue), and slots that allow the square sensing islands to rotate, enhancing the stretchability of the sensor array. c, Close-up of the fabricated array showing the conductive thread electrodes before insulation. d, A fully fabricated 10 × 10 array with an auxetic design. e, Auxetic patterning allows the sensor array to be folded, crushed and stretched easily without damage. f, The array can also be stretched in multiple directions (see Supplementary Video 2).
Extended Data Figure 6 Dataset objects.
In total, 26 objects are used in our dataset; images of 24 of them are shown here. In addition, our dataset includes two cola cans (one empty and one full).
Extended Data Figure 7 Confusion maps and learned convolution filters.
a–h, Confusion matrices showing the actual and predicted object labels for different networks, each taking N inputs (N = 1 to 8), where for N > 1 each input is obtained from a distinct cluster (approach shown in Fig. 2e; see Methods). These matrices correspond to the ‘clustering’ curve in Fig. 2b. Objects with similar shapes, sizes or weights are more likely to be confused with each other. For example, the empty can and full can are easily mistaken for each other when they are resting on the table. Likewise, lighter objects such as the safety glasses, plastic spoon or coin are more likely to be confused with each other or with other objects. Large, heavy objects with distinct signatures, such as the tea box, have high detection accuracy across different numbers of inputs (N). i, Original first-layer convolution filters (3 × 3) learned by the network shown in Fig. 2a for N = 1 inputs. j, Visualization of the first-layer convolution filters of ResNet-18 trained on ImageNet.
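The cluster-based selection of N inputs referred to above could look roughly like the following sketch: frames from one recording are grouped with k-means, and the frame closest to each cluster centre is taken as one input. The clustering feature (flattened raw frames) and all other details are assumptions; the paper's Methods describe the actual procedure.

```python
# Rough sketch of cluster-based input selection for N > 1: one representative
# frame per k-means cluster. Frames here are random placeholders.
import numpy as np
from sklearn.cluster import KMeans

frames = np.random.rand(200, 32, 32)         # placeholder recording of one grasp
N = 4                                        # number of network inputs

flat = frames.reshape(len(frames), -1)
km = KMeans(n_clusters=N, n_init=10, random_state=0).fit(flat)

# Pick, for each cluster, the frame closest to its centroid.
representatives = []
for c in range(N):
    members = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(flat[members] - km.cluster_centers_[c], axis=1)
    representatives.append(members[dists.argmin()])

selected = frames[representatives]           # shape (N, 32, 32), fed jointly to the CNN
```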
Extended Data Figure 8 Weight estimation examples and performance.
a, Four representative examples from the weight estimation dataset, in which the objects are lifted using multi-finger grasps from the top (see Supplementary Video 6 for an example recording). b, The weight estimation performance in each weight interval, shown as the mean absolute error and the mean relative error (the absolute error normalized to the weight of each object). The relative error is analogous to the Weber fraction. The CNN outperforms the linear baseline both with and without the hand pose signal removed; the overall errors of the two linear baselines are comparable.
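For clarity, the two error metrics named in panel b can be computed as below: the mean absolute error in grams, and the relative error obtained by normalizing each absolute error by the true object weight (the Weber-fraction-like quantity). The weights here are placeholder values, not results from the paper.

```python
# Sketch of the error metrics in panel b. Values are placeholders.
import numpy as np

true_g = np.array([120.0, 300.0, 450.0, 80.0])   # true weights (g), placeholders
pred_g = np.array([135.0, 280.0, 470.0, 95.0])   # predicted weights (g), placeholders

abs_err = np.abs(pred_g - true_g)
rel_err = abs_err / true_g                       # relative error, Weber-fraction-like

print(abs_err.mean())    # mean absolute error in grams
print(rel_err.mean())    # mean relative error (dimensionless)
```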
Extended Data Figure 9 Correspondence maps for six individual sensors using the decomposed hand pose signal.
The hand pose signal decomposed from object interactions is used to collectively extract correlations between the sensors and the full hand (analogous to Fig. 3b, where the decomposed object-related signal is used). The pixels at the fingertips show less structured correlations with the remaining fingers, unlike in Fig. 3b.
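A correspondence map of this kind can be approximated, for raw (undecomposed) signals, by correlating one sensor's time series with every sensor on the hand over a recording. The sketch below uses random placeholder frames and an arbitrary sensor index; the published maps are computed from the decomposed hand pose and object-related signals, which are not reproduced here.

```python
# Sketch of a single-sensor correspondence map: correlation of one chosen
# sensor's time series with every sensor, computed over placeholder frames.
import numpy as np

frames = np.random.rand(2000, 32, 32)        # placeholder tactile recording
flat = frames.reshape(len(frames), -1)       # (time, 1024)

sensor_idx = 100                             # arbitrary sensor, chosen for illustration
corr = np.corrcoef(flat, rowvar=False)[sensor_idx]   # correlation with all sensors
corr_map = corr.reshape(32, 32)              # 32 x 32 correspondence map
```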
Extended Data Figure 10 Hand pose signals from articulated hands.
a, Images of the hand poses used in the hand pose dataset. Poses G1 to G7 are taken from a recent grasp taxonomy. In the recordings, each pose is continuously articulated from the neutral, empty-hand pose. b, When the tactile data from this dataset are clustered using t-SNE, each distinct group represents a hand pose. Sample tactile maps are shown on the right, with the corresponding samples marked in red (see Supplementary Data 3). c, The hand pose signals can be classified with 89.4% accuracy (average of ten runs with 3,080 training frames and 1,256 distinct test frames) using the same CNN architecture shown in Fig. 2a. Each element of the confusion matrix denotes how often a given hand pose (column) is classified as each of the possible hand poses (rows). Hand poses G1 and G6 are sometimes misidentified, whereas the other hand poses are identified nearly perfectly.
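A minimal version of the t-SNE embedding used in panel b could be produced as follows. The frames and pose labels below are random placeholders standing in for the hand pose dataset, and the perplexity value is an arbitrary assumption; with real data, plotting the embedding coloured by pose label should reveal one group per hand pose.

```python
# Sketch of a t-SNE embedding of tactile frames, as in panel b.
# Frames and labels are random placeholders, not the hand pose dataset.
import numpy as np
from sklearn.manifold import TSNE

frames = np.random.rand(1000, 32, 32)        # placeholder tactile frames
labels = np.random.randint(0, 8, size=1000)  # placeholder pose labels (neutral + G1-G7)

embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(
    frames.reshape(len(frames), -1))
# embedding has shape (1000, 2); colour points by `labels` when plotting.
```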
Supplementary information
Supplementary Information
This file contains Supplementary Table 1, Supplementary References and Supplementary Figures 1-6
Supplementary Data 1
This zipped file contains the interactive version of the k-means clustering example in Fig. 2e
Supplementary Data 2
This zipped file contains ‘sensor-level.html’, ‘region-level.html’ and ‘finger-level.html’, which contain interactive maps of the correlations seen over the entire dataset between individual sensors, different hand regions and fingers, respectively. ‘sensor-level.html’ is the interactive version of the map in Fig. 3b
Supplementary Data 3
This zipped file contains an interactive map of the t-SNE clustered hand pose data shown in Extended Data Fig. 10b
Video 1
Video shows the bendability of the STAG and includes a demonstration of folding a paper plane while wearing the STAG
Video 2
Auxetic version of the sensor array with 10 × 10 elements (speed – 3x)
Video 3
Interaction from the STAG dataset – Mug (speed – 3x)
Video 4
Interaction from the STAG dataset – Cat [stone] (speed – 3x)
Video 5
Interaction from the STAG dataset – Safety glasses (speed – 3x)
Video 6
Example sequence of the dataset used for weight estimation – Multimeter (speed – 3x)
About this article
Cite this article
Sundaram, S., Kellnhofer, P., Li, Y. et al. Learning the signatures of the human grasp using a scalable tactile glove. Nature 569, 698–702 (2019). https://doi.org/10.1038/s41586-019-1234-z