How to animate a digital model of a person from images collected from the Internet

UW researchers have reconstructed 3-D models of celebrities such as Tom Hanks from large Internet photo collections. The models can also be controlled and animated by photos or videos of another person. (credit: University of Washington)

University of Washington researchers have demonstrated that it’s possible for machine learning algorithms to capture the “persona” and create a digital model of a well-photographed person like Tom Hanks from the vast number of images of them available on the Internet. With enough visual data to mine, the algorithms can also animate the digital model of Tom Hanks to deliver speeches that the real actor never performed.

Tom Hanks has appeared in many acting roles over the years, playing young and old, smart and simple. Yet we always recognize him as Tom Hanks. Why? Is it his appearance? His mannerisms? The way he moves? “One answer to what makes Tom Hanks look like Tom Hanks can be demonstrated with a computer system that imitates what Tom Hanks will do,” said lead author Supasorn Suwajanakorn, a UW graduate student in computer science and engineering.

The technology relies on advances in 3-D face reconstruction, tracking, alignment, multi-texture modeling, and puppeteering that have been developed over the last five years by a research group led by UW assistant professor of computer science and engineering Ira Kemelmacher-Shlizerman. The new results will be presented in an open-access paper at the International Conference on Computer Vision in Chile on Dec. 16.
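For readers who want a concrete picture of the “alignment” and “multi-texture modeling” steps, the minimal Python sketch below (not the authors’ code) warps a set of photos with known 2-D facial landmarks into a common reference frame and takes a per-pixel median as a rough average texture. The photo list, landmark arrays, and reference shape are assumed inputs; a real system layers 3-D reconstruction, expression handling, and puppeteering on top of this.

```python
# Minimal sketch (not the authors' code): align face photos with known 2-D
# landmarks to a reference shape, then build a robust "average texture".
import numpy as np
import cv2  # OpenCV, used here only for similarity warps

def align_to_reference(image, landmarks, ref_landmarks, out_size=(256, 256)):
    """Warp `image` so its landmarks best match `ref_landmarks` (similarity fit)."""
    src = np.asarray(landmarks, dtype=np.float32)
    dst = np.asarray(ref_landmarks, dtype=np.float32)
    # Estimate rotation + uniform scale + translation from landmark pairs.
    M, _ = cv2.estimateAffinePartial2D(src, dst)
    return cv2.warpAffine(image, M, out_size)

def average_texture(images, landmark_sets, ref_landmarks):
    """Per-pixel median of aligned photos; the median suppresses occlusions,
    glasses, hands, etc. that appear in only a few of the photos."""
    aligned = [align_to_reference(img, lms, ref_landmarks)
               for img, lms in zip(images, landmark_sets)]
    stack = np.stack(aligned).astype(np.float32)
    return np.median(stack, axis=0).astype(np.uint8)

# Usage (hypothetical inputs): `photos` is a list of images of one person,
# `landmarks` the matching list of (N, 2) facial-landmark arrays from any
# detector, and `ref` the landmark positions of a chosen reference photo:
# texture = average_texture(photos, landmarks, ref)
```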


Supasorn Suwajanakorn | What Makes Tom Hanks Look Like Tom Hanks

The team’s latest advances include the ability to transfer expressions and the way a particular person speaks onto the face of someone else — for instance, mapping former president George W. Bush’s mannerisms onto the faces of other politicians and celebrities.

It’s one step toward a grand goal shared by the UW computer vision researchers: creating fully interactive, three-dimensional digital personas from family photo albums and videos, historic collections or other existing visuals.

As virtual and augmented reality technologies develop, they envision using family photographs and videos to create an interactive model of a relative living overseas or a far-away grandparent, rather than simply Skyping in two dimensions.

“You might one day be able to put on a pair of augmented reality glasses and there is a 3-D model of your mother on the couch,” said senior author Kemelmacher-Shlizerman. “Such technology doesn’t exist yet — the display technology is moving forward really fast — but how do you actually re-create your mother in three dimensions?”

One day the reconstruction technology could be taken a step further, researchers say.

“Imagine being able to have a conversation with anyone you can’t actually get to meet in person — LeBron James, Barack Obama, Charlie Chaplin — and interact with them,” said co-author Steve Seitz, UW professor of computer science and engineering. “We’re trying to get there through a series of research steps. One of the true tests is can you have them say things that they didn’t say but it still feels like them? This paper is demonstrating that ability.”


Supasorn Suwajanakorn | George Bush driving crowd

Existing technologies to create detailed three-dimensional holograms or digital movie characters like Benjamin Button often rely on bringing a person into an elaborate studio. They painstakingly capture every angle of the person and the way they move — something that can’t be done in a living room.

Other approaches still require a person to be scanned by a camera to create basic avatars for video games or other virtual environments. But the UW computer vision experts wanted to digitally reconstruct a person based solely on a random collection of existing images.

Learning in the wild

To reconstruct celebrities like Tom Hanks, Barack Obama and Daniel Craig, the machine learning algorithms mined a minimum of 200 Internet images taken over time in various scenarios and poses — a process known as learning “in the wild.”

“We asked, ‘Can you take Internet photos or your personal photo collection and animate a model without having that person interact with a camera?’” said Kemelmacher-Shlizerman. “Over the years we created algorithms that work with this kind of unconstrained data, which is a big deal.”

Suwajanakorn more recently developed techniques to capture expression-dependent textures — small differences that occur when a person smiles or looks puzzled or moves his or her mouth, for example.

By manipulating the lighting conditions across different photographs, he developed a new approach to densely map the differences from one person’s features and expressions onto another person’s face. That breakthrough enables the team to “control” the digital model with a video of another person, and could potentially enable a host of new animation and virtual reality applications.
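The published approach transfers expressions densely in image space; the idea is easier to see in a simplified linear face model, where identity and expression live in separate coefficient vectors. The sketch below is an illustration under that assumption, not the paper’s algorithm: it keeps person B’s identity coefficients and swaps in person A’s expression coefficients frame by frame.

```python
# Illustrative sketch only: expression transfer in a linear (3DMM-style) face
# model where a face is mean + id_basis @ id_coeffs + expr_basis @ expr_coeffs.
# The bases and coefficient vectors here are hypothetical placeholders.
import numpy as np

def puppeteer_frame(mean_shape, id_basis, expr_basis,
                    target_id_coeffs, driver_expr_coeffs):
    """Rebuild the target person's face with the driver's expression.

    mean_shape:         (3N,) mean face vertices, flattened
    id_basis:           (3N, K_id) identity basis
    expr_basis:         (3N, K_ex) expression basis
    target_id_coeffs:   (K_id,) fitted to person B (keeps B's identity)
    driver_expr_coeffs: (K_ex,) fitted to person A's current video frame
    """
    return (mean_shape
            + id_basis @ target_id_coeffs       # who the face is (person B)
            + expr_basis @ driver_expr_coeffs)  # what the face is doing (person A)

# Driving a whole clip: fit driver_expr_coeffs per frame of A's video, hold
# target_id_coeffs fixed, and render each reconstructed mesh with B's textures.
```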

“How do you map one person’s performance onto someone else’s face without losing their identity?” said Seitz. “That’s one of the more interesting aspects of this work. We’ve shown you can have George Bush’s expressions and mouth and movements, but it still looks like George Clooney.”

Perhaps this could be used to create VR experiences by integrating the animated images in 360-degree sets?


Abstract of What Makes Tom Hanks Look Like Tom Hanks

We reconstruct a controllable model of a person from a large photo collection that captures his or her persona, i.e., physical appearance and behavior. The ability to operate on unstructured photo collections enables modeling a huge number of people, including celebrities and other well photographed people without requiring them to be scanned. Moreover, we show the ability to drive or puppeteer the captured person B using any other video of a different person A. In this scenario, B acts out the role of person A, but retains his/her own personality and character. Our system is based on a novel combination of 3D face reconstruction, tracking, alignment, and multi-texture modeling, applied to the puppeteering problem. We demonstrate convincing results on a large variety of celebrities derived from Internet imagery and video.

Google Glass helps cardiologists complete difficult coronary artery blockage surgery

Google Glass allowed the surgeons to clearly visualize the distal coronary vessel and verify the direction of the guide wire advancement relative to the course of the occluded vessel segment. (credit: Maksymilian P. Opolski et al./Canadian Journal of Cardiology)

Cardiologists from the Institute of Cardiology in Warsaw, Poland, have used Google Glass in a challenging procedure, successfully clearing a blockage in the right coronary artery of a 49-year-old male patient and restoring blood flow, reports the Canadian Journal of Cardiology.

Chronic total occlusion, a complete blockage of the coronary artery, sometimes referred to as the “final frontier in interventional cardiology,” represents a major challenge for catheter-based percutaneous coronary intervention (PCI), according to the cardiologists.

That’s because of the difficulty of recanalizing (reopening a channel through the obstruction), combined with poor visualization of the occluded coronary arteries.

Coronary computed tomography angiography (CTA) is increasingly used to provide physicians with guidance when performing PCI for chronic total occlusions. The 3-D CTA data can be projected on monitors, but this technique is expensive and technically difficult, the cardiologists say.

So a team of physicists from the Interdisciplinary Centre for Mathematical and Computational Modelling of the University of Warsaw developed a way to use Google Glass to clearly visualize the distal coronary vessel and verify the direction of the guide-wire advancement relative to the course of the blocked vessel segment.

Three-dimensional reconstructions displayed on Google Glass revealed the exact trajectory of the distal right coronary artery (credit: Maksymilian P. Opolski et al./Canadian Journal of Cardiology)

The procedure was completed successfully, including implantation of two drug-eluting stents.

“This case demonstrates the novel application of wearable devices for display of CTA data sets in the catheterization laboratory that can be used for better planning and guidance of interventional procedures, and provides proof of concept that wearable devices can improve operator comfort and procedure efficiency in interventional cardiology,” said lead investigator Maksymilian P. Opolski, MD, PhD, of the Department of Interventional Cardiology and Angiology at the Institute of Cardiology, Warsaw, Poland.

“We believe wearable computers have a great potential to optimize percutaneous revascularization, and thus favorably affect interventional cardiologists in their daily clinical activities,” he said. He also advised that “wearable devices might be potentially equipped with filter lenses that provide protection against X-radiation.”


Abstract of First-in-Man Computed Tomography-Guided Percutaneous Revascularization of Coronary Chronic Total Occlusion Using a Wearable Computer: Proof of Concept

We report a case of successful computed tomography-guided percutaneous revascularization of a chronically occluded right coronary artery using a wearable, hands-free computer with a head-mounted display worn by interventional cardiologists in the catheterization laboratory. The projection of 3-dimensional computed tomographic reconstructions onto the screen of virtual reality glass allowed the operators to clearly visualize the distal coronary vessel, and verify the direction of the guide wire advancement relative to the course of the occluded vessel segment. This case provides proof of concept that wearable computers can improve operator comfort and procedure efficiency in interventional cardiology.

BitDrones: Interactive quadcopters allow for ‘programmable matter’ explorations

Could an interactive swarm of flying “3D pixels” (voxels) allow users to explore virtual 3D information by interacting with physical self-levitating building blocks? (credit: Roel Vertegaal)

We’ll find out Monday, Nov. 9, when Human Media Lab professor Roel Vertegaal of Queen’s University in Canada and his students will unleash their “BitDrones” at the ACM Symposium on User Interface Software and Technology in Charlotte, North Carolina.

Programmable matter

Vertegaal believes his BitDrones invention is the first step towards creating interactive self-levitating programmable matter — materials capable of changing their 3D shape in a programmable fashion, using swarms of tiny quadcopters. Possible applications: real-reality 3D modeling, gaming, molecular modeling, medical imaging, robotics, and online information visualization.

“BitDrones brings flying programmable matter closer to reality,” says Vertegaal. “It is a first step towards allowing people to interact with virtual 3D objects as real physical objects.”

Vertegaal and his team at the Human Media Lab created three types of BitDrones, each representing self-levitating displays of distinct resolutions.

PixelDrones are equipped with one LED and a small dot matrix display. Users could physically explore a file folder by touching the folder’s associated PixelDrone, for example. When the folder opens, its contents are shown by other PixelDrones flying in a horizontal wheel below it. Files in this wheel are browsed by physically swiping drones to the left or right.
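As a rough sense of how such a “wheel” of file drones could be laid out (an assumption about the interaction, not the Human Media Lab’s code), the helper below places the child drones on a horizontal circle below the parent folder drone and spins the circle as the user swipes.

```python
# Hypothetical layout helper: place N "file" drones on a horizontal circle
# below a parent "folder" drone, rotated by a swipe offset (in items).
import math

def wheel_positions(parent_xyz, n_items, radius=0.6, drop=0.4, swipe_offset=0.0):
    px, py, pz = parent_xyz
    positions = []
    for i in range(n_items):
        angle = 2 * math.pi * (i + swipe_offset) / n_items
        positions.append((px + radius * math.cos(angle),
                          py + radius * math.sin(angle),
                          pz - drop))            # hover below the folder drone
    return positions

# Swiping left or right just animates swipe_offset, so the wheel of drones spins.
```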

PixelDrone (credit: Roel Vertegaal)

ShapeDrones are augmented with a lightweight mesh and a 3D-printed geometric frame; they serve as building blocks for real-time, complex 3D models.

ShapeDrones (credit: Roel Vertegaal)

DisplayDrones are fitted with a curved flexible high-resolution touchscreen, a forward-facing video camera, and an Android smartphone board. A remote user could “move around” the local space through a DisplayDrone running Skype for telepresence: the DisplayDrone automatically tracks and replicates all of the remote user’s head movements, allowing the remote user to virtually inspect a location and making it easier for the local user to understand the remote user’s actions.

DisplayDrone (credit: Roel Vertegaal)

All three BitDrone types are equipped with reflective markers, allowing them to be individually tracked and positioned in real time via motion capture technology. The system also tracks the user’s hand motion and touch, allowing users to manipulate the voxels in space.
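With motion capture reporting each drone’s pose many times per second, holding a voxel in place reduces to a per-drone position controller. The sketch below shows a generic proportional controller toward an assigned target position; it is not the BitDrones flight code, and the gains, speed limit, and data layout are made up for illustration.

```python
# Generic sketch of the control idea, not the BitDrones flight code:
# each tracked drone is nudged toward its assigned voxel position.
import numpy as np

KP = 1.5          # proportional gain (made up)
MAX_SPEED = 0.5   # m/s velocity clamp (made up)

def velocity_command(current_xyz, target_xyz):
    """Simple P controller: command a velocity toward the target voxel."""
    error = np.asarray(target_xyz, float) - np.asarray(current_xyz, float)
    v = KP * error
    speed = np.linalg.norm(v)
    if speed > MAX_SPEED:
        v *= MAX_SPEED / speed
    return v

def control_step(tracked_poses, voxel_targets):
    """tracked_poses / voxel_targets: dicts keyed by drone id -> (x, y, z)."""
    return {drone_id: velocity_command(pos, voxel_targets[drone_id])
            for drone_id, pos in tracked_poses.items()
            if drone_id in voxel_targets}

# Each motion-capture frame: read poses, compute commands, send them to the
# drones by radio; a user's touch simply rewrites entries in voxel_targets.
```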

“We call this a ‘real reality’ interface rather than a virtual reality interface. This is what distinguishes it from technologies such as Microsoft HoloLens and the Oculus Rift: you can actually touch these pixels, and see them without a headset,” says Vertegaal.

The system currently supports only a dozen comparatively large drones, 2.5 to 5 inches in size, but the team is working to scale it up to support thousands of drones measuring under half an inch, allowing users to render more seamless, high-resolution programmable matter.

Other forms of programmable matter

BitDrones are somewhat related to MIT Media Lab scientist Neil Gershenfeld’s “programmable pebbles” — reconfigurable robots that self-assemble into different configurations (see A reconfigurable miniature robot); MIT’s “swarmbots” — self-assembling swarming microbots that snap together into different shapes (see MIT inventor unleashes hundreds of self-assembling cube swarmbots); J. Storrs Hall’s “utility fog” concept, in which a swarm of nanobots called “foglets” can take the shape of virtually anything and change shape on the fly (see Utility Fog: The Stuff that Dreams Are Made Of); and Autodesk Research’s Project Cyborg, a cloud-based meta-platform of design tools for programming matter across domains and scales.


Human Media Lab | BitDrones: Interactive Flying Microbots Show Future of Virtual Reality is Physical


Abstract of BitDrones: Towards Levitating Programmable Matter Using Interactive 3D Quadcopter Displays

In this paper, we present BitDrones, a platform for the construction of interactive 3D displays that utilize nano quadcopters as self-levitating tangible building blocks. Our prototype is a first step towards supporting interactive mid-air, tangible experiences with physical interaction techniques through multiple building blocks capable of physically representing interactive 3D data.

Minority Report, Limitless TV shows launch Monday, Tuesday

A sequel to Steven Spielberg’s epic movie, MINORITY REPORT is set in Washington, D.C., 10 years after the demise of Precrime, a law enforcement agency tasked with identifying and eliminating criminals … before their crimes were committed. Now, in 2065, crime-solving is different, and justice leans more on sophisticated and trusted technology than on the instincts of the precogs. Sept. 21 series premiere; Mondays, 9/8c

LIMITLESS, based on the feature film, is a fast-paced drama about Brian Finch, who discovers the brain-boosting power of the mysterious drug NZT and is coerced by the FBI into using his extraordinary cognitive abilities to solve complex cases for them. Sept. 22 series premiere Tuesdays 10/9c

A user-friendly 3-D printing interface for customizing designs

A new browser-based interface for design novices allows a wide range of modifications to a basic design — such as a toy car — that are guaranteed to be both structurally stable and printable on a 3-D printer. (credit: the researchers, edited by MIT News)

Researchers at MIT and the Interdisciplinary Center Herzliya in Israel have developed a system that automatically turns CAD files into visual models that users can modify in real time, simply by moving virtual sliders on a Web page. Once the design meets their specifications, they can hit the print button to send it to a 3-D printer.

Currently, 3-D printing an object from any but the simplest designs requires expertise with computer-aided design (CAD) applications. Even for the experts, the design process is immensely time-consuming.

“We envision a world where everything you buy can potentially be customized,” says Masha Shugrina, an MIT graduate student in computer science and engineering and one of the new system’s designers.

The researchers presented their new system, dubbed “Fab Forms,” at the Association for Computing Machinery’s Siggraph conference in August.

How Fab Forms works

Fab Forms begins with a design created by a seasoned CAD user. It then sweeps through a wide range of values for the design’s parameters — the numbers that a CAD user would typically change by hand — calculating the resulting geometries and storing them in a database.

For each of those geometries, the system also runs a battery of tests, specified by the designer, and it again stores the results.

An automatically created Web app using Fab Forms (credit: Maria Shugrina et al.)

Finally, the system generates a user interface, a Web page that can be opened in an ordinary browser. The interface consists of a central window, which displays a 3-D model of an object, and a group of sliders, which vary the parameters of the object’s design. The system automatically weeds out all the parameter values that lead to unprintable or unstable designs, so the sliders are restricted to valid designs.

Moving one of the sliders — changing the height of a shoe’s heel, say, or the width of a mug’s base — sweeps through visual depictions of the associated geometries, presenting in real time what would take hours to calculate with a CAD program. “The sample density is high enough that it looks continuous to the user,” says co-author Wojciech Matusik of MIT.

If, however, a particularly sharp-eyed user wanted a value for a parameter that fell between two of the samples stored in the database, the system can call up the CAD program, calculate the associated geometry, and then run tests on it. That might take several minutes, but at that point, the user will have a good idea of what the final design should look like.
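In code terms, this two-stage design amounts to caching the expensive CAD-plus-validation work over a sampled parameter grid offline and answering slider moves from that cache at runtime, falling back to the CAD engine for unsampled values. The sketch below is a schematic of that idea, not the Fab Forms implementation; cad_evaluate and passes_tests are hypothetical stand-ins for the CAD engine and the designer-specified validity tests.

```python
# Schematic of the precompute/runtime split, not the actual Fab Forms system.
# `cad_evaluate(params) -> geometry` and `passes_tests(geometry) -> bool`
# stand in for the CAD engine and the designer-specified validity tests.
import itertools

def precompute(param_ranges, cad_evaluate, passes_tests):
    """Sweep a sampled grid of parameter values; cache only the valid designs."""
    cache = {}
    names = list(param_ranges)
    for values in itertools.product(*(param_ranges[n] for n in names)):
        params = dict(zip(names, values))
        geometry = cad_evaluate(params)          # hours of work, done offline
        if passes_tests(geometry):               # printable, stable, etc.
            cache[values] = geometry
    return names, cache

def on_slider_move(values, names, cache, cad_evaluate, passes_tests):
    """Runtime: answer from the cache; fall back to CAD for unsampled values."""
    key = tuple(values)
    if key in cache:
        return cache[key]                            # instant preview
    geometry = cad_evaluate(dict(zip(names, key)))   # minutes, computed on demand
    if passes_tests(geometry):
        cache[key] = geometry
        return geometry
    return None   # invalid region: the generated UI keeps sliders away from it
```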


Abstract of Fab Forms: Customizable Objects for Fabrication with Validity and Geometry Caching

We address the problem of allowing casual users to customize parametric models while maintaining their valid state as 3D-printable functional objects. We define Fab Form as any design representation that lends itself to interactive customization by a novice user, while remaining valid and manufacturable. We propose a method to achieve these Fab Form requirements for general parametric designs tagged with a general set of automated validity tests and a small number of parameters exposed to the casual user. Our solution separates Fab Form evaluation into a precomputation stage and a runtime stage. Parts of the geometry and design validity (such as manufacturability) are evaluated and stored in the precomputation stage by adaptively sampling the design space. At runtime the remainder of the evaluation is performed. This allows interactive navigation in the valid regions of the design space using an automatically generated Web user interface (UI). We evaluate our approach by converting several parametric models into corresponding Fab Forms.

Disney researchers develop 2-legged robot that walks like an animated character

Robot mimics character’s movements (credit: Disney Research)

Disney researchers have found a way for a robot to mimic an animated character’s walk, bringing a cartoon (or other) character to life in the real world.

Beginning with an animation of a diminutive, peanut-shaped character that walks with a rolling, somewhat bow-legged gait, Katsu Yamane and his team at Disney Research Pittsburgh analyzed the character’s motion to design a robotic frame that could duplicate the walking motion using 3D-printed links and servo motors, while also fitting inside the character’s skin. They then created control software that could keep the robot balanced while duplicating the character’s gait as closely as possible.
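As a toy illustration of the retargeting problem (not Disney’s controller), the sketch below clamps an animated joint-angle trajectory to the servos’ reachable range and slew rate; whatever deviation that introduces from the original animation is, roughly, the “style” the team works to preserve.

```python
# Toy illustration of retargeting an animated gait to servos; not Disney's method.
import numpy as np

def retarget_trajectory(anim_angles, joint_limits, max_step_rad=0.05):
    """anim_angles:  (T, J) array of joint angles per animation frame, radians
       joint_limits: (J, 2) array of min/max reachable angle per servo
       max_step_rad: per-frame slew limit the servos can actually follow
    """
    lo, hi = joint_limits[:, 0], joint_limits[:, 1]
    out = np.empty_like(anim_angles)
    out[0] = np.clip(anim_angles[0], lo, hi)
    for t in range(1, len(anim_angles)):
        desired = np.clip(anim_angles[t], lo, hi)               # respect joint range
        step = np.clip(desired - out[t - 1], -max_step_rad, max_step_rad)
        out[t] = out[t - 1] + step                              # respect servo speed
    return out  # deviation from anim_angles is a rough measure of lost "style"
```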

“The biggest challenge is that designers don’t necessarily consider physics when they create an animated character,” said Yamane, senior research scientist. Roboticists, however, wrestle with physical constraints throughout the process of creating a real-life version of the character.

“It’s important that, despite physical limitations, we do not sacrifice style or the quality of motion,” Yamane said. The robots will need to not only look like the characters but also move the way people are accustomed to seeing those characters move.

(credit: Disney Research)

The researchers are describing the techniques and technologies they used to create the bipedal robot at the IEEE International Conference on Robotics and Automation, ICRA 2015, May 26–30 in Seattle.


DisneyResearchHub | Development of a Bipedal Robot that Walks Like an Animation Character