Watch Do: A Smart IoT Interaction System with Object Detection and Gaze Estimation

The Internet of Things (IoT) aims to let people access internet-connected devices, applications, and services anytime and anywhere. However, providing an efficient and intuitive method of interaction between people and IoT devices remains an open challenge. In this work, we propose a novel interaction system called Watch Do, in which users control an IoT device by gazing at it and performing simple hand gestures. The proposed system consists mainly of 1) an object detection module, 2) a gaze estimation module, 3) a hand gesture recognition module, and 4) an IoT controller module. The target device is identified by combining deep learning-based gaze estimation and object detection techniques. Hand gesture recognition is then applied to generate a device control command, which is transmitted to the IoT platform. Experimental results and case studies demonstrate the feasibility of the proposed system and suggest directions for future research.
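As an illustration of how the four modules might fit together, the following minimal Python sketch selects the detected object whose bounding box contains the estimated gaze point and then maps a recognized gesture to a control command. It is a sketch under stated assumptions, not the system's actual implementation: the names `Detection`, `select_target`, `build_command`, and `GESTURE_TO_ACTION`, the gesture labels, and the command format are all hypothetical, and the gaze point is assumed to be already projected into image coordinates.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Detection:
    """One object detection result (hypothetical structure)."""
    label: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


# Hypothetical gesture vocabulary; the real system's gestures are not specified here.
GESTURE_TO_ACTION: Dict[str, str] = {"open_palm": "turn_on", "fist": "turn_off"}


def select_target(gaze: Tuple[float, float],
                  detections: List[Detection]) -> Optional[Detection]:
    """Return the detection whose box contains the gaze point.

    If several boxes contain the point, pick the one whose center is closest
    to the gaze point. Return None if no box contains it.
    """
    gx, gy = gaze
    hits = [d for d in detections
            if d.box[0] <= gx <= d.box[2] and d.box[1] <= gy <= d.box[3]]
    if not hits:
        return None

    def squared_distance_to_center(d: Detection) -> float:
        cx = (d.box[0] + d.box[2]) / 2
        cy = (d.box[1] + d.box[3]) / 2
        return (cx - gx) ** 2 + (cy - gy) ** 2

    return min(hits, key=squared_distance_to_center)


def build_command(target: Optional[Detection],
                  gesture: str) -> Optional[Dict[str, str]]:
    """Combine the gazed-at device and a recognized gesture into a command.

    Returns None when there is no target or the gesture is unknown.
    """
    action = GESTURE_TO_ACTION.get(gesture)
    if target is None or action is None:
        return None
    return {"device": target.label, "action": action}


if __name__ == "__main__":
    detections = [
        Detection("lamp", (0, 0, 100, 100)),
        Detection("tv", (200, 0, 400, 200)),
    ]
    target = select_target((250, 50), detections)
    print(build_command(target, "fist"))
```

In a full pipeline, the resulting command would be handed to the IoT controller module for transmission to the IoT platform; that step is omitted here because it depends on the platform's interface.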

