DOI: 10.1145/2370216.2370248
Research article, UbiComp Conference Proceedings

Fine-grained kitchen activity recognition using RGB-D

Published: 05 September 2012

Abstract

We present a first study of using RGB-D (Kinect-style) cameras for fine-grained recognition of kitchen activities. Our prototype system combines depth (shape) and color (appearance) to solve a number of perception problems crucial for smart-space applications: locating hands, identifying objects and their functionalities, recognizing actions, and tracking object state changes through actions. Our proof-of-concept results demonstrate the great potential of RGB-D perception: without the need for instrumentation, our system can robustly track and accurately recognize detailed steps in cooking activities, for instance how many spoons of sugar are in a cake mix or how long it has been mixing. A robust RGB-D solution to fine-grained activity recognition in real-world conditions will bring the intelligence of pervasive and interactive systems to the next level.
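The abstract's examples of tracking object state changes through recognized actions (counting spoons of sugar, timing how long a mix has been stirred) can be pictured as a simple event-driven state tracker sitting downstream of the action recognizer. The sketch below is purely illustrative and not the authors' system: the event names (`scoop_sugar`, `start_mixing`, `stop_mixing`) and the `CakeMixState` class are hypothetical stand-ins for whatever events an RGB-D perception pipeline would emit.

```python
# Illustrative sketch only: turn a stream of recognized (action, timestamp)
# events into object state, in the spirit of "tracking object state changes
# through actions". All event names here are hypothetical.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CakeMixState:
    """Hypothetical state of a cake mix, updated by recognized actions."""
    spoons_of_sugar: int = 0
    mixing_seconds: float = 0.0
    _mix_start: Optional[float] = field(default=None, repr=False)

    def update(self, action: str, t: float) -> None:
        """Apply one recognized action event observed at time t (seconds)."""
        if action == "scoop_sugar":
            self.spoons_of_sugar += 1
        elif action == "start_mixing":
            self._mix_start = t
        elif action == "stop_mixing" and self._mix_start is not None:
            self.mixing_seconds += t - self._mix_start
            self._mix_start = None

# An event stream as it might come from an RGB-D action recognizer:
events = [("scoop_sugar", 1.0), ("scoop_sugar", 4.5),
          ("start_mixing", 6.0), ("stop_mixing", 36.0)]
state = CakeMixState()
for action, t in events:
    state.update(action, t)
print(state.spoons_of_sugar, state.mixing_seconds)  # 2 30.0
```

The point of the sketch is that once perception reliably produces discrete action events, the "detailed steps" queries in the abstract reduce to bookkeeping over those events.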




    Published In
    UbiComp '12: Proceedings of the 2012 ACM Conference on Ubiquitous Computing
    September 2012
    1268 pages
    ISBN:9781450312240
    DOI:10.1145/2370216
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. RGB-D
    2. action recognition
    3. activity tracking
    4. kitchen
    5. object recognition
    6. smart spaces

    Qualifiers

    • Research-article

    Conference

    UbiComp '12: The 2012 ACM Conference on Ubiquitous Computing
    September 5–8, 2012
    Pittsburgh, Pennsylvania

    Acceptance Rates

    UbiComp '12 paper acceptance rate: 58 of 301 submissions (19%)
    Overall acceptance rate: 764 of 2,912 submissions (26%)


    Cited By
    • (2024) "A Contextual Inquiry of People with Vision Impairments in Cooking." Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–14. DOI: 10.1145/3613904.3642233
    • (2024) "A Wearable Multi-modal Edge-Computing System for Real-Time Kitchen Activity Recognition." Human Activity Recognition and Anomaly Detection, 132–145. DOI: 10.1007/978-981-97-9003-6_9
    • (2024) "Contactless Activity Identification Using Commodity WiFi." Mobile Technologies for Smart Healthcare System Design, 13–47. DOI: 10.1007/978-3-031-57345-3_2
    • (2023) "Back-Guard: Wireless Backscattering Based User Sensing With Parallel Attention Model." IEEE Transactions on Mobile Computing, 22(12):7466–7481. DOI: 10.1109/TMC.2022.3215012
    • (2023) "Activities of Daily Living." Supportive Smart Homes, 113–125. DOI: 10.1007/978-3-031-37337-4_10
    • (2022) "A Survey of Human Action Recognition and Posture Prediction." Tsinghua Science and Technology, 27(6):973–1001. DOI: 10.26599/TST.2021.9010068
    • (2022) "Smart-Badge: A wearable badge with multi-modal sensors for kitchen activity recognition." Adjunct Proceedings of the 2022 ACM International Joint Conference on Pervasive and Ubiquitous Computing and the 2022 ACM International Symposium on Wearable Computers, 356–363. DOI: 10.1145/3544793.3560391
    • (2022) "Leveraging Sound and Wrist Motion to Detect Activities of Daily Living with Commodity Smartwatches." Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 6(2):1–28. DOI: 10.1145/3534582
    • (2022) "VMA: Domain Variance- and Modality-Aware Model Transfer for Fine-Grained Occupant Activity Recognition." 2022 21st ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), 259–270. DOI: 10.1109/IPSN54338.2022.00028
    • (2022) "A fusion of a deep neural network and a hidden Markov model to recognize the multiclass abnormal behavior of elderly people." Knowledge-Based Systems, 252. DOI: 10.1016/j.knosys.2022.109351
