{"id":4487,"date":"2015-12-23T02:43:52","date_gmt":"2015-12-23T02:43:52","guid":{"rendered":"http:\/\/www.kurzweilai.net\/?p=269421"},"modified":"2015-12-24T04:55:15","modified_gmt":"2015-12-24T04:55:15","slug":"how-to-teach-machines-to-see","status":"publish","type":"post","link":"https:\/\/hoo.central12.com\/fugic\/2015\/12\/23\/how-to-teach-machines-to-see\/","title":{"rendered":"How to teach machines to see"},"content":{"rendered":"<div id=\"attachment_269567\" class=\"wp-caption aligncenter\" style=\"width: 655px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;\"><a href=\"http:\/\/www.kurzweilai.net\/images\/SegNet-test.jpg\"><img class=\" wp-image-269567 \" title=\"SegNet test\" src=\"http:\/\/www.kurzweilai.net\/images\/SegNet-test.jpg\" alt=\"\" width=\"645\" height=\"285\" \/><\/a><p style=' padding: 0 4px 5px; margin: 0;'  class=\"wp-caption-text\">In a test by KurzweilAI using a Google Maps image of Market Street in San Francisco, the SegNet system accurately identified the various elements, even hard-to-see pedestrians (shown in brown on the left) and road markings. (credit: KurzweilAI\/Cambridge University\/Google)<\/p><\/div>\n<p>Two new technologies that use deep-learning techniques to help machines see and analyze images (such as roads and people) could improve visual performance for driverless cars and create a new generation of smarter smartphones and cameras.<\/p>\n<p>Designed by <a href=\"http:\/\/www.cam.ac.uk\/\" >University of Cambridge<\/a> researchers, the systems can recognize their own location and surroundings. Most driverless cars currently in development use radar and\u00a0LIDAR sensors, which often cost more than the car itself. 
(See &#8220;<a href=\"http:\/\/www.kurzweilai.net\/new-laser-design-could-dramatically-shrink-autonomous-vehicle-3-d-laser-ranging-systems\" >New laser design could dramatically shrink autonomous-vehicle 3-D laser-ranging systems<\/a>&#8221; for another solution.)<\/p>\n<p>Together, the systems can identify a user\u2019s location and orientation, even in places where GPS does not function, and can identify the various components of a road scene in real time, using only a regular camera or smartphone (see image above, or try SegNet yourself <a href=\"http:\/\/mi.eng.cam.ac.uk\/projects\/segnet\/\" >here<\/a>).<\/p>\n<p>SegNet can take an image of a street scene it hasn\u2019t seen before and classify it, sorting objects into 12 different categories &#8212; such as roads, street signs, pedestrians, buildings, and cyclists &#8212; in real time. It can deal with light, shadow, and night-time environments, and currently labels more than 90% of pixels correctly, according to the researchers. Previous systems using expensive laser- or radar-based sensors have not been able to reach this level of accuracy while operating in real time, the researchers say.<\/p>\n<p>To create SegNet, Cambridge undergraduate students manually labeled every pixel in each of 5000 images, with each image taking about 30 minutes to complete. Once the labeling was finished, the researchers &#8220;trained&#8221; the system, which was successfully tested on both city roads and motorways.<\/p>\n<p>\u201cIt\u2019s remarkably good at recognizing things in an image, because it\u2019s had so much practice,\u201d said Alex Kendall, a PhD student in the Department of Engineering. \u201cHowever, there are a million knobs that we can turn to fine-tune the system so that it keeps getting better.\u201d<\/p>\n<p>SegNet was primarily trained in highway and urban environments, so it still has some learning to do for rural, snowy, or desert environments. 
The system is not yet at the point where it can be used to control a car or truck, but it could be used as a warning system, similar to the anti-collision technologies currently available on some passenger cars.<\/p>\n<p>But teaching a machine to see is far more difficult than it sounds, said Professor Roberto Cipolla, who led the research. &#8220;There are three key technological questions that must be answered to design autonomous vehicles: where am I, what\u2019s around me and what do I do next?&#8221;<\/p>\n<p>SegNet addresses the second question. The researchers&#8217; Visual Localization system answers the first question. Using deep learning, it can determine a user\u2019s location and orientation from a single color image of a busy urban scene. The researchers say the system is far more accurate than GPS and works in places where GPS does not, such as indoors, in tunnels, or in cities where a reliable GPS signal is not available.<\/p>\n<div id=\"attachment_269574\" class=\"wp-caption aligncenter\" style=\"width: 651px;  border: 1px solid #dddddd; background-color: #f3f3f3; padding-top: 4px; margin: 10px; text-align:center; display: block; margin-right: auto; margin-left: auto;\"><a href=\"http:\/\/www.kurzweilai.net\/images\/Visual_Localization-test.jpg\"><img class=\" wp-image-269574 \" title=\"Visual_Localization test\" src=\"http:\/\/www.kurzweilai.net\/images\/Visual_Localization-test.jpg\" alt=\"\" width=\"641\" height=\"223\" \/><\/a><p style=' padding: 0 4px 5px; margin: 0;'  class=\"wp-caption-text\">In a KurzweilAI test of the Visual Localization system (using an image in the Central Cambridge UK demo), the system accurately identified a Cambridge building, displaying the correct Google Maps street view, and marked its location on a Google map (credit: KurzweilAI\/Cambridge University\/Google)<\/p><\/div>\n<p>It has been tested along a kilometer-long stretch of King\u2019s Parade in central Cambridge, and it is able to determine both location and 
orientation within a few meters and a few degrees, which is far more accurate than GPS &#8212; a vital consideration for driverless cars, according to the researchers. (Try it <a href=\"http:\/\/mi.eng.cam.ac.uk\/projects\/relocalisation\/\" >here<\/a>.)<\/p>\n<p>The localization system uses the geometry of a scene to learn its precise location, and is able to determine, for example, whether it is looking at the east or west side of a building, even if the two sides appear identical.<\/p>\n<p>\u201cIn the short term, we\u2019re more likely to see this sort of system on a domestic robot &#8212; such as a robotic vacuum cleaner, for instance,\u201d said Cipolla. \u201cIt will take time before drivers can fully trust an autonomous car, but the more effective and accurate we can make these technologies, the closer we are to the widespread adoption of driverless cars and other types of autonomous robotics.\u201d<\/p>\n<p>The researchers are presenting\u00a0<a href=\"http:\/\/www.cv-foundation.org\/openaccess\/content_iccv_2015\/papers\/Kendall_PoseNet_A_Convolutional_ICCV_2015_paper.pdf\" >details<\/a>\u00a0of the two technologies at the International Conference on Computer Vision in Santiago, Chile.<\/p>\n<p style=\"text-align: center;\"><iframe frameborder=\"0\" height=\"360\" src=\"https:\/\/www.youtube.com\/embed\/MxximR-1ln4?rel=0\" width=\"640\"><\/iframe><br \/>\n<em>Cambridge University | Teaching machines to see<\/em><\/p>\n<hr \/>\n<h4>Abstract of\u00a0<em>PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization<\/em><\/h4>\n<p>We present a robust and real-time monocular six degree of freedom relocalization system. Our system trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner with no need of additional engineering or graph optimisation. The algorithm can operate indoors and outdoors in real time, taking 5ms per frame to compute. 
It obtains approximately 2m and 3\u00b0 accuracy for large scale outdoor scenes and 0.5m and 5\u00b0 accuracy indoors. This is achieved using an efficient 23 layer deep convnet, demonstrating that convnets can be used to solve complicated out of image plane regression problems. This was made possible by leveraging transfer learning from large scale classification data. We show that the PoseNet localizes from high level features and is robust to difficult lighting, motion blur and different camera intrinsics where point based SIFT registration fails. Furthermore we show how the pose feature that is produced generalizes to other scenes allowing us to regress pose with only a few dozen training examples.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Two new technologies that use deep-learning techniques to help machines see and analyze images (such as roads and people) could improve visual performance for driverless cars and create a new generation of smarter smartphones and cameras. Designed by University of Cambridge researchers, the systems can recognize their own location and surroundings. 
Most driverless cars currently [&#8230;]<\/p>\n","protected":false},"author":13,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[46,43],"tags":[],"class_list":["post-4487","post","type-post","status-publish","format-standard","hentry","category-airobotics","category-news"],"_links":{"self":[{"href":"https:\/\/hoo.central12.com\/fugic\/wp-json\/wp\/v2\/posts\/4487"}],"collection":[{"href":"https:\/\/hoo.central12.com\/fugic\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hoo.central12.com\/fugic\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hoo.central12.com\/fugic\/wp-json\/wp\/v2\/users\/13"}],"replies":[{"embeddable":true,"href":"https:\/\/hoo.central12.com\/fugic\/wp-json\/wp\/v2\/comments?post=4487"}],"version-history":[{"count":2,"href":"https:\/\/hoo.central12.com\/fugic\/wp-json\/wp\/v2\/posts\/4487\/revisions"}],"predecessor-version":[{"id":4507,"href":"https:\/\/hoo.central12.com\/fugic\/wp-json\/wp\/v2\/posts\/4487\/revisions\/4507"}],"wp:attachment":[{"href":"https:\/\/hoo.central12.com\/fugic\/wp-json\/wp\/v2\/media?parent=4487"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hoo.central12.com\/fugic\/wp-json\/wp\/v2\/categories?post=4487"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hoo.central12.com\/fugic\/wp-json\/wp\/v2\/tags?post=4487"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}