3D Reconstruction and Understanding with Video and Sound


One of the driving factors for innovation in computer vision is the availability of new types of input sensors and algorithms to process the sensor data. With the proliferation of commodity RGBD cameras, it is increasingly common to see new techniques that take advantage of the additional depth channel for tasks such as reconstruction, segmentation, and understanding.


For the scope of this workshop, we aim to investigate (or reintroduce) another readily available but often ignored source of information: audio or acoustic sensors. These can be combined with visual cameras, resulting in audio-visual sensors and a new generation of algorithms and applications.


There are two major thrusts for this workshop. The first is the multimodal analysis of videos with sound for enhanced recognition accuracy, including application areas such as audio-visual speech recognition, video categorization or classification, and event detection in videos, as well as technical areas such as early vs. late fusion and end-to-end training of models. The second is to explore the use of acoustic sensors to facilitate the reconstruction and understanding of 3D objects/models beyond the capability of current 3D RGBD sensors. This could include robust handling of scenes with specular/transparent objects, or even reconstruction around corners (i.e., non-line-of-sight) and through obstacles, or capturing other material characteristics (e.g., acoustic material properties for aural rendering). In this context, acoustic sensors refer to a broad frequency range of sound from subsonic to ultrasound.


This workshop will have invited speakers from a broad spectrum of research areas, from traditional multimodal visual analysis, to digital signal processing, to inverse acoustic simulation. We hope to seed the vision community with a variety of promising ideas from other disciplines and thereby inspire a new class of algorithms. In the spirit of advancing sensing capability, there will also be an industrial panel with leading companies in 3D sensors and mobile handsets, presenting an opportunity to exchange ideas between academic research and industrial development. We hope the discussion will lead to a roadmap for the next generation of 3D sensors.