

For a more up-to-date description of my research, please see my GRAVITY Lab web page.



Space-time Light Field Rendering

Until now, extending light field rendering to dynamic scenes has been treated simply as the rendering of static light fields stacked in time. This type of approach requires strictly synchronized input video sequences and allows only discrete exploration in the temporal domain, at instants determined by the capture rate. In this paper we propose a novel framework, space-time light field rendering, which allows continuous exploration of a dynamic scene in both the spatial and temporal domains with unsynchronized input video sequences.

In order to synthesize novel views from any viewpoint at any time instant, we develop a two-stage rendering algorithm. We first interpolate in the temporal domain to generate globally synchronized images, using a robust spatio-temporal image registration algorithm followed by edge-preserving image morphing. We then interpolate these software-synchronized images in the spatial domain to synthesize the final view. Our experimental results show that the approach is robust and maintains photo-realistic quality.
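The two-stage pipeline above can be sketched as follows. This is a minimal illustration, not the paper's method: the spatio-temporal registration and edge-preserving morphing are replaced by a plain linear cross-fade, and the spatial stage uses simple inverse-distance blending weights.

```python
import numpy as np

def temporal_interpolate(frame_a, frame_b, t_a, t_b, t):
    """Stage 1 stand-in: synthesize a frame at time t from the two frames
    bracketing it. (The paper uses registration plus edge-preserving
    morphing; a linear cross-fade is used here for illustration.)"""
    w = (t - t_a) / (t_b - t_a)
    return (1.0 - w) * frame_a + w * frame_b

def spatial_blend(synced_frames, cam_positions, view_position):
    """Stage 2 stand-in: blend the software-synchronized images with
    weights that fall off with camera distance from the desired viewpoint
    (a simple light-field-style blend)."""
    d = np.linalg.norm(cam_positions - view_position, axis=1)
    w = 1.0 / (d + 1e-6)
    w /= w.sum()
    return np.tensordot(w, synced_frames, axes=1)
```

With real footage, stage 1 runs once per camera to produce a virtually synchronized image set; stage 2 then renders the final view from that set.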

ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D) 2005. Full Paper (PDF, 2.8 MB) | Video (DivX, 4.1 MB)

3D Physically-based 2D View Synthesis

As part of my thesis work, I am developing a new statistical approach for view synthesis. It is particularly effective for textureless regions and specular highlights, two major problems that most existing reconstruction techniques have difficulty with. Some initial results are shown on the left: the top row shows several input images, while the bottom row shows the reconstructed point cloud.

Real-time Stereo

A multi-resolution stereo algorithm that can be implemented on commodity graphics hardware.

Real-time View Synthesis on Graphics Hardware

We present a novel use of commodity graphics hardware that effectively combines a plane-sweeping algorithm with view synthesis for real-time, online 3D scene acquisition and view synthesis. The heart of our method is to use programmable Pixel Shader technology to compute squared intensity differences between reference image pixels, and then to choose the final color that corresponds to the minimum difference, i.e., the most consistent color. We filed an invention disclosure with UNC.
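The per-pixel consistency test at the core of the plane sweep can be sketched in a few lines. This is an illustrative CPU version, not the Pixel Shader implementation: it assumes the reference images have already been warped onto each candidate depth plane, scores each plane by squared differences from the mean color, and keeps the most consistent plane's color per pixel.

```python
import numpy as np

def plane_sweep_colors(warped):
    """warped: array of shape [planes, cams, H, W] holding, for each
    candidate depth plane, the reference images projected onto it.
    Returns an [H, W] image: per pixel, the mean color of the plane
    where the cameras agree best (minimum sum of squared differences)."""
    mean = warped.mean(axis=1)                          # [planes, H, W]
    ssd = ((warped - mean[:, None]) ** 2).sum(axis=1)   # [planes, H, W]
    best = ssd.argmin(axis=0)                           # [H, W]
    h, w = best.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    return mean[best, rows, cols]
```

On the GPU, the squared-difference and minimum-selection steps map naturally onto one shader pass per depth plane, which is what makes the method real-time.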

Eye-Gaze Correction

This project focused on maintaining eye contact in desktop video teleconferencing. We took a model-based approach that combines a detailed, individualized three-dimensional head model with stereoscopic analysis. The approach is very effective; the results are among the most realistic in the published literature on eye-gaze correction (see the images on the left; the middle image is the synthesized view that preserves eye contact, while the other two are the input images). As a byproduct, we also obtain very accurate 3D tracking of the head pose. The images below show the face model projected onto the tracked head. MSR has filed two patent applications for our algorithms and systems.


Group Teleconferencing

We want to design a system that facilitates many-to-many teleconferencing. Instead of providing a perceptually correct view for every single user, we strive to provide the best approximate view for the entire group as a whole. We demonstrate two real-time acquisition-through-rendering algorithms: one based on view-dependent texture mapping with automatically acquired approximate geometry, and the other using an array of cameras to perform light-field-style rendering.

3D Tele-Immersion

The goal of tele-immersion is to enable users at geographically distributed sites to collaborate in real time in a shared, simulated environment as if they were in the same physical room. While the entire project was an interdisciplinary, multi-site collaboration, I was mainly involved in real-time data capture and distribution.

2D Immersive Teleconferencing

We worked on improving the field of view and resolution of 2D video teleconferencing. The result is a simple yet effective technique for producing geometrically correct imagery in teleconferencing environments. The necessary image transformations are derived by finding a direct one-to-one mapping between a capture device and a display device for a fixed viewer location, completely avoiding the need for any intermediate, complex representation of screen geometry, capture and display distortions, or viewer location. Using this technique, we can easily build an immersive teleconferencing system from multiple projectors and cameras.
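For a planar screen and a fixed viewer, a direct capture-to-display mapping of this kind can be represented by a single 3x3 homography per camera/projector pair. The sketch below is an illustrative assumption, not the project's exact formulation: it fits such a homography from point correspondences using the standard direct linear transform.

```python
import numpy as np

def fit_homography(src, dst):
    """Fit the 3x3 homography H mapping capture-device pixels (src) to
    display-device pixels (dst) from >= 4 point correspondences, via the
    direct linear transform. The point pairs would come from observing
    known features for the fixed viewer location."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_homography(H, pt):
    """Map one point through H (homogeneous divide)."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]
```

Once fitted, the homography is applied per frame to pre-warp the captured imagery so that it appears geometrically correct on the display.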


PixelFlex: A Reconfigurable Multi-Projector Display System

The PixelFlex system is composed of ceiling-mounted projectors, each with computer-controlled pan, tilt, zoom and focus; and a camera for closed-loop calibration. Working collectively, these controllable projectors function as a single logical display capable of being easily modified into a variety of spatial formats. The left image shows a stacked configuration that can be used for stereo display. 

Automatic Projector Display Surface Estimation Using Every-Day Imagery 

We introduce a new method for continuous display-surface auto-calibration. Using a camera that observes the display surface, we match image features in whatever imagery is being projected with the corresponding features that appear on the display surface, continually refining an estimate of the display surface geometry. In effect, we enjoy the high signal-to-noise ratio of "structured" light (without getting to choose the structure) and the unobtrusive nature of passive correlation-based methods.
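One way to turn each feature match into a surface-geometry estimate (an illustrative sketch under simplified assumptions, not the paper's exact machinery) is to intersect the calibrated projector ray that emitted the feature with the camera ray that observed it:

```python
import numpy as np

def triangulate_rays(o1, d1, o2, d2):
    """Least-squares 'intersection' of two 3D rays o + t * d: solve for the
    parameters that bring the rays closest together and return the midpoint
    of the closest approach. With a calibrated projector and camera, each
    matched projected/observed feature yields one display-surface point."""
    A = np.stack([d1, -d2], axis=1)                 # 3x2 system in (t1, t2)
    t, *_ = np.linalg.lstsq(A, o2 - o1, rcond=None)
    p1 = o1 + t[0] * d1                             # closest point on ray 1
    p2 = o2 + t[1] * d2                             # closest point on ray 2
    return 0.5 * (p1 + p2)
```

Accumulating such points over many frames of ordinary projected imagery is what allows the surface estimate to be refined continuously without dedicated calibration patterns.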

Past Projects and Research