Aug 31

Reported by Dave Mosher in Wired Science, 28 Aug. 2012.

Seeking a way to crowdsource better computer vision, roboticists have launched a website that allows users to record pieces of their environments in 3-D with a Kinect camera.

Called Kinect@Home, the open-source, browser-based effort remains in its infancy. Users have uploaded only a few dozen models of their living room couches, kitchen countertops and themselves. Should the project catch on, however, researchers may be on the cusp of an unprecedented way to amass 3-D data to improve the navigation and object-recognition algorithms that allow robots to cruise and manipulate indoor environments.

“For robots to work in everyday space and homes, we need lots of 3-D data. Big data is where it’s at, as Google understands with its efforts,” said roboticist Alper Aydemir of the Royal Institute of Technology in Sweden. “But no one has been able to do this efficiently yet [with 3-D data].”

With the advent of Microsoft’s low-cost yet highly effective 3-D camera system, called Kinect, and sanctioned ways to hack the device, computer vision research is experiencing a revolution.

“I think we’ve developed a win-win situation,” said Aydemir, who leads the Kinect@Home effort. “Users get access to 3-D models they can embed anywhere on the internet, and we use this data to create better computer vision algorithms.”

Populations are growing older, health insurance costs are rising and care systems are increasingly stretched, so autonomous robots offer a dreamy vision of the future for many people.

The trouble is that most automatons can only bumble through crowded human environments. Incorporating building blueprints into navigation algorithms pushes them only so far because such plans lack couches, tables, dogs and other oddities that people cram into indoor spaces.

What’s more, helper robots are only useful if they can recognize and interact with a dizzying variety of objects. Some crowdsourced schemes use Amazon Mechanical Turk to categorize objects in 2-D images acquired by robots, but these images reveal nothing about an item’s 3-D shape or how it behaves.

Helper robots must be able to distinguish a refrigerator from an oven, for example, and open these labyrinthine 3-D objects to cook a casserole or deliver a cold beer to beckoning human owners.

“If you can get real-world 3-D data for 5,000 refrigerators, you can develop an algorithm to generalize a refrigerator and then test a robot’s ability to generalize them,” Aydemir said.

In hopes of gathering these and other data that define human environments, Aydemir created Kinect@Home. Users install a plugin, attach their Kinect to a computer and start recording whatever they please.

“I think making 3-D models should be as easy as making a YouTube clip,” said Aydemir, who recognizes the ambition in his new enterprise. “The long-term vision is that experiencing 3-D places should be easy, whether you’re trying to sell a couch or seeking advice to remodel a kitchen.”

Kinect@Home’s 3-D models are by no means perfect. Gaping holes appear in textures, and images are fuzzy because 95 percent of details are removed to make models download in seconds.
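Kinect@Home hasn’t published how it strips out that detail, but a common way to thin a dense point cloud is voxel-grid downsampling: bucket the points into small cubes and keep one representative point per cube. A minimal sketch (the function name and numbers are illustrative, not from the project):

```python
# Voxel-grid downsampling: bucket points into cubic voxels, then
# replace each voxel's points with their centroid. Detail is traded
# for a far smaller model, which is why the results look fuzzy.

from collections import defaultdict
from math import floor

def voxel_downsample(points, voxel_size):
    """Reduce a list of (x, y, z) points to one centroid per voxel."""
    buckets = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / voxel_size),
               floor(y / voxel_size),
               floor(z / voxel_size))
        buckets[key].append((x, y, z))
    # Average each voxel's points coordinate-by-coordinate.
    return [tuple(sum(coord) / len(pts) for coord in zip(*pts))
            for pts in buckets.values()]

# Three nearby points inside one 10 cm voxel collapse to a single point.
cloud = [(0.01, 0.02, 0.00), (0.03, 0.01, 0.02), (0.02, 0.04, 0.01)]
print(voxel_downsample(cloud, voxel_size=0.1))
```

Larger voxels mean fewer surviving points and smaller downloads, which matches the heavy reduction the site applies to keep models loading in seconds.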

As computing power, server bandwidth and the algorithms used to render the data improve, Aydemir says all of the models will be reprocessed to fill in the holes, increase detail and improve their overall realism.

“All of this was not possible a year ago,” he said. “Our current capability is at the limit of current state-of-the-art research.”

To conservatively test Kinect@Home, Wired booted up a 7-year-old Windows Vista laptop and installed drivers supplied by the website, which totaled about 100MB and downloaded in a few minutes. Installation took much longer and forced a reboot.

One small browser plugin installation later, we clicked “record” and filmed our first 3-D model of three Wired magazine issues on a couch. Per the site’s instructions, we moved the Kinect slowly around the magazines from the top, left, right and below, and then clicked “stop.”

Next we moved the magazines to a busy bookshelf and made a second model (above). Filming took roughly as long, and the data recorded locally by the camera totaled about 100 megabytes.

A third recording of your author on a couch turned out hazy and indiscernible. “The tracking got a little lost,” Aydemir wrote in an e-mail to Wired after our upload. “The … Kinect does not give back any depth information when you’re too close to things. So it takes some time to master how to build good 3-D models, but not too much.”

The good news is that Kinect@Home proactively uploads data during recording. The bad news: It’s not yet possible to preview a model before it’s beamed to a remote server, where it’s rendered in the cloud. So if your model stinks, you won’t know until dozens of megabytes later.

Aydemir said he’s met with the Kinect team in Seattle and asked them to craft a version of the Kinect software that doesn’t include motor, audio or other capabilities, since that would slim the hefty download and speed up modeling performance.

As the team continues improving its service, which began merely as a side project for Aydemir, and internet service providers loosen their iron grip on broadband speeds, he hopes Kinect@Home will catch on.

“One of my colleagues has joked, ‘You’re trying to replace every image on the web with a 3-D image,’” he said. “I told him Google has done something similar by indexing almost every word on the internet, so why not?”

[Interactive 3-D model. (Dave Mosher and Tad Greenleaf/Kinect@Home)]

