The goal of this project is to collect a large dataset of daily activities, such as cooking, and to make it freely available on the web for experiments in activity recognition and monitoring. The dataset includes video and audio recorded from multiple fixed locations in the space, together with data from a body-worn camera and audio recorder, accelerometers, and motion capture, the last serving as ground truth.
Beyond developing our own algorithms, we will make the resulting dataset publicly available as an important by-product of this activity, so that our colleagues can evaluate their algorithms on it and so that conferences and journals can use it for benchmarking. In this respect, we can make a large contribution by focusing the attention of the vision and other sensor-processing communities on QoLT problems. Within QoLT, we plan to use the data initially to develop more general versions of our learning and prediction techniques.
An important feature of the Grand Challenge data is its wide selection of sensors, which will enable systematic evaluation and comparison of sensor modalities across applications, the development of sensor fusion strategies, and experimentation with less common sensors. In particular, we include accelerometers as well as video and audio sensors worn by the user, so that perceptual data is collected from the user's perspective rather than from the environment's perspective (a mode of sensing termed "inside-out" sensing).
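To make the sensor-fusion idea concrete, the sketch below shows the simplest common strategy, feature-level ("early") fusion: per-window summary features are computed independently for each modality and concatenated into one vector before classification. Everything here is a hypothetical illustration under assumed synchronized, windowed streams; it is not the project's actual pipeline, and the activity labels and feature choices are made up for the example.

```python
def window_features(samples):
    """Summarize one window of raw samples as (mean, energy).

    These two statistics are a stand-in for whatever per-modality
    features a real system would extract.
    """
    n = len(samples)
    mean = sum(samples) / n
    energy = sum(x * x for x in samples) / n
    return [mean, energy]


def fuse(accel_window, audio_window):
    """Early fusion: concatenate per-modality feature vectors."""
    return window_features(accel_window) + window_features(audio_window)


def nearest_centroid(train, query):
    """Classify a fused feature vector by its nearest class centroid.

    `train` maps an activity label to a list of fused feature vectors.
    """
    centroids = {}
    for label, vecs in train.items():
        dim = len(vecs[0])
        centroids[label] = [sum(v[i] for v in vecs) / len(vecs)
                            for i in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab], query))


# Tiny synthetic example: a vigorous activity has high accelerometer
# energy, an idle one does not; audio features are fused alongside.
train = {
    "stirring": [fuse([2.0, -2.0, 2.0, -2.0], [0.5, 0.5, 0.5, 0.5])],
    "idle":     [fuse([0.1, -0.1, 0.1, -0.1], [0.05, 0.05, 0.05, 0.05])],
}
print(nearest_centroid(train, fuse([1.8, -1.9, 2.1, -2.0],
                                   [0.4, 0.6, 0.5, 0.5])))
```

A decision-level ("late") fusion variant would instead classify each modality separately and combine the per-modality decisions, which is the main alternative such a dataset allows one to compare.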