Practical artificial intelligence tools you can use today

(credit: KurzweilAI)

By Bob Gourley
Courtesy of CTOvision.com

Practical artificial intelligence has made its way out of the labs and into our daily lives. And judging from the pace of activity in the startup community and the major IT powerhouses, it will only grow in its ability to help us all get things done.

Most AI solutions today are fielded by the big players in IT: Apple’s Siri and the capabilities embedded directly in iOS 9, Google’s many savvy search solutions, Amazon’s very smart recommendation engine, and IBM’s Watson.

We expect to see a new wave of AI solutions that deliver value from smaller start-up companies as well. This is a very crowded space, with plenty of VC funding for entrepreneurs with capabilities across a wide range of AI disciplines.

Most of these firms seem to be on one of two paths: success, which will mean being acquired by Facebook, Apple, Microsoft, or IBM; or failure, which will see them acquired by the same firms at a lower price, for their talent. Either way, the innovation continues.

We are also on the lookout for hot firms that are not providing capabilities as part of other offerings or don’t seem to be on a path to be acquired by one of the big guys. This part of the market is very likely going to take off, since open-source platforms and open cloud APIs and services are making it so easy for others to create and field AI.

Here is a short list of capabilities we recommend all futurists, technologists and business executives track. The list is by no means comprehensive, but is definitely representative of what is available now:

On your phone:

  • Siri: Part of Apple’s iOS, watchOS, and tvOS. Intelligent personal assistant.
  • Cortana: Microsoft’s intelligent personal assistant. Designed for Windows Mobile but now on Android, and a limited version runs on Apple iOS. Also works on desktops and Xbox One.
  • Google Now: Available within Google Search mobile app for Android and iOS as well as the Google Chrome web browser on other devices. Delegates requests to web services powered by Google.

From the cloud:


IBM | How IBM Watson learns

Really, almost all of the AI on your phone and desktop is communicating with cloud services, so keep in mind that most solutions are blends that lean heavily on cloud capabilities. But we needed a place to talk about Echo, Watson, and soon many others, so:

  • Watson: A technology platform that uses natural language processing and machine learning to reveal insights from large amounts of data.
  • Echo: The device you buy is mostly a speaker and microphone with some commodity IT to connect it to the cloud. The real smarts come from Amazon Web Services.

For personal and business use:

  • Gluru: Organize your online documents, calendars, emails and other data and have AI present you with new insights and actionable information.
  • x.ai: Let AI coordinate schedules for you. Your own personal scheduler.
  • CrystalKnows: Using AI to help you know the best way to communicate with others.
  • RecordedFuture: Leverages natural language processing at massive scale in real time to collect and understand more than 700,000 web sources.
  • Tamr: Unique approaches to Big Data, leveraging machine learning.
  • LegalRobot: Automating legal document review in ways that can serve people and businesses.

For developers:

  • Vicarious: Building the next generation of AI algorithms.
  • Soar: A general cognitive architecture for developing systems that exhibit intelligent behavior.
  • Prediction.io: A service with easy-to-use, open templates for a variety of advanced AI workloads.
  • Jade: Java Agent Development Framework. Simplifies multi-agent system development.
  • Protege: A free, open-source ontology editor and framework for building intelligent systems.
  • h2o.ai: Build smarter machine learning/AI applications that are fast and scalable.
  • Seldon: An open, enterprise-grade machine learning platform that adds intelligence to organizations.
  • SigOpt: Run experiments and make better products with less trial and error.
  • Scaled Inference: Cloud-based models and an inference engine to help in model selection.
  • OpenCV: An open-source library of programming functions aimed mainly at real-time computer vision (see the short example after this list).
  • OpenCog: An open-source software project whose aim is to create an open-source framework for artificial general intelligence (AGI).
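
Many of these tools can be exercised with only a few lines of code. As a taste, here is a minimal sketch using OpenCV’s Python bindings; it assumes the opencv-python package is installed, and the filename and thresholds are illustrative placeholders, not anything prescribed by the library:

```python
import cv2

# Load an image from disk (the path is a placeholder) and convert it to grayscale.
image = cv2.imread("street_scene.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Canny edge detection: a classic first step in many computer-vision pipelines.
edges = cv2.Canny(gray, threshold1=100, threshold2=200)

cv2.imwrite("street_scene_edges.jpg", edges)
print("Edge map saved; non-zero pixels:", int((edges > 0).sum()))
```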

For healthcare:

  • Enlitic: Deep learning for healthcare and data-driven medicine.
  • Metamind.io: Automatic image recognition with many use cases, including medicine.
  • Zebra Medical Vision: Closing the gaps between research and result for patients with data and AI.
  • Deep Genomics: Machine learning and AI transforming precision medicine, genetic testing, diagnostics and therapies.
  • Atomwise: Using AI and analytics to predict medicines and discover drugs.
  • Flatiron.com: AI and machine learning delivering insights on treatments.

For robotics:

  • Mttr.net: Building flying vehicles powered by intelligent software.
  • Skycatch: Software for fully autonomous aerial systems.

For space:

  • SpaceKnow: Using AI to track global economic trends from Space.
  • OrbitalInsight: Space trends for understanding global issues.

For marketing and customer interaction:

  • DigitalGenius: Computer driven conversation with customers in ways that scale and serve.
  • Conversica: AI to help you find your next customer, including automated email conversations to qualify leads.

We know this is not a complete list. Please let us know who you think we should include. Contact us here.

Bob Gourley is a Co-founder and Partner at Cognitio and the publisher of CTOvision.com and ThreatBrief.com. Bob’s background is as an all-source intelligence analyst and an enterprise CTO. Find him on Twitter at @BobGourley.

The top A.I. breakthroughs of 2015

(credit: iStock)

By Richard Mallah
Courtesy of Future of Life Institute

Progress in artificial intelligence and machine learning has been impressive this year. Those in the field acknowledge progress is accelerating year by year, though it is still a manageable pace for us. The vast majority of work in the field these days actually builds on previous work done by other teams earlier the same year, in contrast to most other fields where references span decades.

Creating a summary of a wide range of developments in this field will almost invariably lead to descriptions that sound heavily anthropomorphic, and this summary does indeed. Such metaphors, however, are only convenient shorthands for talking about these functionalities.

It’s important to remember that even though many of these capabilities sound very thought-like, they’re usually not very similar to how human cognition works. The systems are all of course functional and mechanistic, and, though increasingly less so, each is still quite narrow in what it does. Be warned though: in reading this article, these functionalities may seem to go from fanciful to prosaic.

The biggest developments of 2015 fall into five categories of intelligence: abstracting across environments, intuitive concept understanding, creative abstract thought, dreaming up visions, and dexterous fine motor skills. I’ll highlight a small number of important threads within each that have brought the field forward this year.

Abstracting Across Environments

A long-term goal of the field of AI is to achieve artificial general intelligence, a single learning program that can learn and act in completely different domains at the same time, able to transfer some skills and knowledge learned in, e.g., making cookies and apply them to making brownies even better than it would have otherwise. A significant stride forward in this realm of generality was provided by Parisotto, Ba, and Salakhutdinov. They built on DeepMind’s seminal DQN, published earlier this year in Nature, which learns to play many different Atari games well.


ZeitgeistMinds | Demis Hassabis, CEO, DeepMind Technologies — The Theory of Everything

Instead of using a fresh network for each game, this team combined deep multitask reinforcement learning with deep-transfer learning to be able to use the same deep neural network across different types of games. This leads not only to a single instance that can succeed in multiple different games, but to one that also learns new games better and faster because of what it remembers about those other games. For example, it can learn a new tennis video game faster because it already gets the concept — the meaningful abstraction — of hitting a ball with a paddle from when it was playing Pong. This is not yet general intelligence, but it erodes one of the hurdles to get there.
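
To make the shared-abstraction idea concrete, here is a minimal PyTorch sketch, not the authors’ architecture: one convolutional torso is shared by every game, and only a small output head is game-specific, so features learned while playing Pong are available when learning the tennis game. The input format (four stacked 84×84 frames) and all layer sizes are conventional DQN-style assumptions.

```python
import torch
from torch import nn

class MultiGameDQN(nn.Module):
    """Shared convolutional torso with one Q-value head per game (illustrative sketch)."""
    def __init__(self, n_actions_per_game):
        super().__init__()
        self.torso = nn.Sequential(                      # shared across all games
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 9 * 9, 256), nn.ReLU(),
        )
        self.heads = nn.ModuleList(                      # one small head per game
            [nn.Linear(256, n) for n in n_actions_per_game]
        )

    def forward(self, frames, game_id):
        features = self.torso(frames)         # the "meaningful abstractions" are shared
        return self.heads[game_id](features)  # Q-values over that game's own actions

# Example: three games with 6, 4, and 18 actions respectively (hypothetical sizes).
net = MultiGameDQN([6, 4, 18])
q_values = net(torch.zeros(1, 4, 84, 84), game_id=1)   # shape (1, 4)
```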

Reasoning across different modalities has been another bright spot this year. The Allen Institute for AI and University of Washington have been working on test-taking A.I.s over the years, working up from 4th grade level tests to 8th grade level tests, and this year announced a system that addresses the geometry portion of the SAT. Such geometry tests contain combinations of diagrams, supplemental information, and word problems.

In narrower A.I. systems, these different modalities would typically be analyzed separately, essentially as different environments. This system combines computer vision and natural language processing, grounding both in the same structured formalism, and then applies a geometric reasoner to answer the multiple-choice questions, matching the performance of the average American 11th-grade student.

Intuitive Concept Understanding

A more general method of multimodal concept grounding has come about from deep learning in the past few years: Subsymbolic knowledge and reasoning are implicitly understood by a system rather than being explicitly programmed in or even explicitly represented. Decent progress has been made this year in the subsymbolic understanding of concepts that we as humans can relate to.

This progress helps with the age-old symbol grounding problem: how symbols or words get their meaning. The increasingly popular way to achieve this grounding these days is by joint embeddings — deep distributed representations where different modalities or perspectives on the same concept are placed very close together in a high-dimensional vector space.
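
As a concrete illustration of a joint embedding (a generic sketch, not the model from any of the papers cited here), two small projection networks map, say, image features and caption features into one shared vector space, and a margin-based loss pulls matching pairs together while pushing mismatched pairs apart. The dimensions and the margin are assumptions.

```python
import torch
from torch import nn
import torch.nn.functional as F

class JointEmbedder(nn.Module):
    """Maps two modalities (e.g. image features and text features) into one shared space."""
    def __init__(self, img_dim, txt_dim, embed_dim=128):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, embed_dim)
        self.txt_proj = nn.Linear(txt_dim, embed_dim)

    def forward(self, img_feats, txt_feats):
        z_img = F.normalize(self.img_proj(img_feats), dim=-1)   # unit-length embeddings
        z_txt = F.normalize(self.txt_proj(txt_feats), dim=-1)
        return z_img, z_txt

def contrastive_loss(z_img, z_txt, margin=0.2):
    """Pull matching pairs together; push the hardest non-matching pair apart."""
    sim = z_img @ z_txt.t()                     # cosine similarities (vectors are normalized)
    pos = sim.diag()                            # matching image/text pairs
    neg = sim - torch.eye(len(sim)) * 1e9       # mask out the positives
    hardest_neg = neg.max(dim=1).values
    return F.relu(margin + hardest_neg - pos).mean()

# Pseudo-usage with hypothetical feature sizes:
#   z_img, z_txt = JointEmbedder(2048, 300)(image_feats, caption_feats)
#   loss = contrastive_loss(z_img, z_txt)
```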

Last year, this technique helped power abilities like automated image caption writing, and this year a team from Stanford and Tel Aviv University has extended this basic idea to jointly embed images and 3D shapes to bridge computer vision and graphics. Rajendran et al. then extended joint embeddings to support the confluence of multiple meaningfully related mappings at once, across different modalities and different languages.

As these embeddings get more sophisticated and detailed, they can become workhorses for more elaborate A.I. techniques. Ramanathan et al. have leveraged them to create a system that learns a meaningful schema of relationships between different types of actions from a set of photographs and a dictionary.

As single systems increasingly do multiple things (something deep learning is predicated on), the lines between the features of the data and the learned concepts will blur away. Another demonstration of this deep feature grounding, by a team from Cornell and WUStL, uses a dimensionality reduction of a deep net’s weights to form a surface of convolutional features that can simply be slid along to meaningfully, automatically, and photorealistically alter particular aspects of photographs, e.g., changing people’s facial expressions or their ages, or colorizing photos.

(credit: Jacob R. Gardner et al.)

One hurdle in deep learning techniques is that they require a lot of training data to produce good results. Humans, on the other hand, are often able to learn from just a single example. Salakhutdinov, Tenenbaum, and Lake have overcome this disparity with a technique for human-level concept learning through Bayesian program induction from a single example. This system is then able to, for instance, draw variations on symbols in a way indistinguishable from those drawn by humans.

Creative Abstract Thought

Beyond understanding simple concepts lies grasping aspects of causal structure — understanding how ideas tie together to make things happen or tell a story in time — and to be able to create things based on those understandings. Building on the basic ideas from both DeepMind’s neural Turing machine and Facebook’s memory networks, combinations of deep learning and novel memory architectures have shown great promise in this direction this year. These architectures provide each node in a deep neural network with a simple interface to memory.

Kumar and Socher’s dynamic memory networks improved on memory networks with better support for attention and sequence understanding. Like the original, this system could read stories and answer questions about them, implicitly learning 20 kinds of reasoning, like deduction, induction, temporal reasoning, and path finding. It was never programmed with any of those kinds of reasoning. The new end-to-end memory networks of Weston et al. added the ability to perform multiple computational hops per output symbol, expanding modeling capacity and expressivity to be able to capture things like out-of-order access, long-term dependencies, and unordered sets, further improving accuracy on such tasks.
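
The mechanism these memory architectures share can be sketched in a few lines: form a soft attention distribution over memory slots, read a weighted sum, and fold it back into the query; repeating the step gives the multiple “hops” described above. This is an illustrative toy, not either paper’s full model, and the embedding function referenced in the comments is a placeholder.

```python
import torch
import torch.nn.functional as F

def memory_hop(query, memory):
    """One soft-attention read over memory.
    query: (d,) question vector; memory: (n_slots, d) matrix of sentence embeddings."""
    scores = memory @ query              # relevance of each memory slot to the query
    weights = F.softmax(scores, dim=0)   # soft attention over the slots
    read = weights @ memory              # weighted sum of the slots
    return query + read                  # updated query, ready for the next hop

# Pseudo-usage (embed(...) stands in for a learned sentence encoder):
#   q = embed(question)
#   for _ in range(3):                   # three computational hops
#       q = memory_hop(q, embed(story_sentences))
```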

Programs themselves are of course also data, and they certainly make use of complex causal, structural, grammatical, sequence-like properties, so programming is ripe for this approach. Last year neural Turing machines proved deep learning of programs to be possible.

This year Grefenstette et al. showed how programs can be transduced, or generatively figured out from sample output, much more efficiently than with neural Turing machines, by using a new type of memory-based recurrent neural network (RNN) whose nodes simply access differentiable versions of data structures such as stacks and queues. Reed and de Freitas of DeepMind have also recently shown how their neural programmer-interpreter can represent lower-level programs that control higher-level and domain-specific functionalities.

Another example of proficiency in understanding time in context, and applying that to create new artifacts, is a rudimentary but creative video summarization capability developed this year. Park and Kim from Seoul National U. developed a novel architecture called a coherent recurrent convolutional network, applying it to creating novel and fluid textual stories from sequences of images.

(credit: Cesc Chunseong Park and Gunhee Kim)

Another important modality that includes causal understanding, hypotheticals, and creativity in abstract thought is scientific hypothesizing. A team at Tufts combined genetic algorithms and genetic pathway simulation to create a system that arrived at the first significant new A.I.-discovered scientific theory of how exactly flatworms are able to regenerate body parts so readily as they do. In a couple of days, it had discovered what eluded scientists for a century. This should provide a resounding answer to those who question why we would ever want to make A.I.s curious in the first place.

Dreaming Up Visions

A.I. did not stop at writing programs, travelogues, and scientific theories this year. There are A.I.s now able to imagine, or using the technical term, hallucinate, meaningful new imagery as well. Deep learning isn’t only good at pattern recognition, but indeed pattern understanding and therefore also pattern creation.

A team from MIT and Microsoft Research has created a deep convolutional inverse graphics network, which, among other things, uses a special training technique to get neurons in its graphics code layer to correspond to meaningful transformations of an image. In so doing, they are deep-learning a graphics engine, able to understand the 3D shapes in novel 2D images it receives, and able to photorealistically imagine what it would be like to change things like camera angle and lighting.

A team from NYU and Facebook devised a way to generate realistic new images from meaningful and plausible combinations of elements it has seen in other images. Using a pyramid of adversarial networks — with some trying to produce realistic images and others critically judging how real the images look — their system gets better and better at imagining new photographs. Though the examples online are quite low-res, I’ve seen some impressive related high-res results offline.
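
Stripped of the Laplacian pyramid, the adversarial idea itself is compact: a generator invents images while a discriminator judges whether they look real, and each improves by training against the other. Below is a minimal single-scale sketch in PyTorch; the network shapes, flattened 28×28-style images, and hyperparameters are arbitrary assumptions, not the NYU/Facebook system.

```python
import torch
from torch import nn, optim

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())    # generator
D = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 1), nn.Sigmoid())  # discriminator
opt_g, opt_d = optim.Adam(G.parameters(), lr=2e-4), optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def gan_step(real_images):
    """One training step; real_images: (batch, 784) tensor scaled to [-1, 1]."""
    batch = real_images.size(0)
    fake = G(torch.randn(batch, 64))

    # Discriminator: call real images real and generated images fake.
    opt_d.zero_grad()
    d_loss = (bce(D(real_images), torch.ones(batch, 1)) +
              bce(D(fake.detach()), torch.zeros(batch, 1)))
    d_loss.backward()
    opt_d.step()

    # Generator: produce images the discriminator mistakes for real ones.
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(batch, 1))
    g_loss.backward()
    opt_g.step()
```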

Also significant in ’15 is the ability to deeply imagine entirely new imagery based on short English descriptions of the desired picture. While scene renderers taking symbolic, restricted vocabularies have been around for a while, this year saw the advent of a purely neural system doing this in a way that’s not explicitly programmed. A University of Toronto team applies attention mechanisms to generate images incrementally, based on the meaning of each component of the description, in any of a number of ways per request. So androids can now dream of electric sheep.

(credit: Elman Mansimov et al.)

There has even been impressive progress in computational imagination of new animated video clips this year. A team from U. Michigan created a deep analogy system, which recognizes complex implicit relationships in exemplars and is able to apply that relationship as a generative transformation of query examples. They’ve applied this in a number of synthetic applications, but most impressive is the demo (from the 10:10-11:00 mark of the video embedded below) where an entirely new short video clip of an animated character is generated based on a single still image of the never-before-seen target character and a comparable video clip of a different character at a different angle.


University of Michigan | Oral Session: Deep Visual Analogy-Making

While the generation of imagery was used in these for ease of demonstration, their techniques for computational imagination are applicable across a wide variety of domains and modalities. Picture these applied to voices or music, for instance.

Agile and Dexterous Fine Motor Skills

This year’s progress in A.I. hasn’t been confined to computer screens.

Earlier in the year, a German primatology team recorded the hand motions of primates in tandem with the corresponding neural activity, and was able to predict, based on brain activity, what fine motions were going on. The team has also been able to teach those same fine motor skills to robotic hands, aiming at neural-enhanced prostheses.

In the middle of the year, a team at U.C. Berkeley announced a much more general, and easier, way to teach robots fine motor skills. They applied guided policy search based on deep reinforcement learning to get robots to screw caps on bottles, use the back of a hammer to remove a nail from wood, and perform other seemingly everyday actions. These are the kind of actions that are typically trivial for people but very difficult for machines, and this team’s system matches human dexterity and speed at these tasks. It actually learns to do these actions by trying to do them using hand-eye coordination, and by practicing, refining its technique after just a few tries.

Watch This Space

This is by no means a comprehensive list of the impressive feats in A.I. and ML for the year. There are also many more foundational discoveries and developments that have occurred this year, including some that I fully expect to be more revolutionary than any of the above, but those are in early days and so out of the scope of these top picks.

This year has certainly provided some impressive progress. But we expect to see even more in 2016. Coming up next year, I expect to see some more radical deep architectures, better integration of the symbolic and subsymbolic, some impressive dialogue systems, an A.I. finally dominating the game of Go, deep learning being used for more elaborate robotic planning and motor control, high-quality video summarization, and more creative and higher-resolution dreaming, which should all be quite a sight.

What’s even more exciting are the developments we don’t expect.

Richard Mallah is the Director of A.I. Projects at technology beneficence nonprofit Future of Life Institute, and is the Director of Advanced Analytics at knowledge integration platform firm Cambridge Semantics, Inc.

How to teach machines to see

In a test by KurzweilAI using a Google Maps image of Market Street in San Francisco, the SegNet system accurately identified the various elements, even hard-to-see pedestrians (shown in brown on the left) and road markings. (credit: KurzweilAI/Cambridge University/Google)

Two new technologies that use deep-learning techniques to help machines see and analyze images (such as roads and people) could improve visual performance for driverless cars and create a new generation of smarter smartphones and cameras.

Designed by University of Cambridge researchers, the systems can recognize their own location and surroundings. Most driverless cars currently in development use radar and LIDAR sensors, which often cost more than the car itself. (See “New laser design could dramatically shrink autonomous-vehicle 3-D laser-ranging systems” for another solution.)

One of the systems, SegNet, can identify the various components of a road scene in real time on a regular camera or smartphone (see image above or try it yourself here); a companion localization system can identify a user’s location and orientation, even in places where GPS does not function.

SegNet can take an image of a street scene it hasn’t seen before and classify it, sorting objects into 12 different categories — such as roads, street signs, pedestrians, buildings and cyclists — in real time. It can deal with light, shadow, and night-time environments, and currently labels more than 90% of pixels correctly, according to the researchers. Previous systems using expensive laser- or radar-based sensors have not been able to reach this level of accuracy while operating in real time, the researchers say.

To create SegNet, Cambridge undergraduate students manually labeled every pixel in each of 5000 images, with each image taking about 30 minutes to complete. Once the labeling was finished, the researchers “trained” the system, which was successfully tested on both city roads and motorways.
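
Architecturally, SegNet is an encoder-decoder network whose decoder upsamples using the indices remembered from the encoder’s max-pooling layers, then emits a class score for every pixel. The toy PyTorch sketch below shows only that idea, with one block down and one block up; the channel counts are assumptions and the real network is far deeper.

```python
import torch
from torch import nn

class TinySegNet(nn.Module):
    """Encoder-decoder that reuses max-pooling indices for upsampling (the SegNet idea),
    drastically simplified: one conv block down, one up, 12 output classes per pixel."""
    def __init__(self, n_classes=12):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU())
        self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)
        self.unpool = nn.MaxUnpool2d(2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
                                 nn.Conv2d(64, n_classes, 3, padding=1))

    def forward(self, x):                # x: (B, 3, H, W), H and W even
        x = self.enc(x)
        x, idx = self.pool(x)            # remember where the maxima were
        x = self.unpool(x, idx)          # upsample using those indices
        return self.dec(x)               # per-pixel class scores, (B, 12, H, W)

# labels = TinySegNet()(images).argmax(dim=1)   # (B, H, W) map of 12 road-scene classes
```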

“It’s remarkably good at recognizing things in an image, because it’s had so much practice,” said Alex Kendall, a PhD student in the Department of Engineering. “However, there are a million knobs that we can turn to fine-tune the system so that it keeps getting better.”

SegNet was primarily trained in highway and urban environments, so it still has some learning to do for rural, snowy, or desert environments. The system is not yet at the point where it can be used to control a car or truck, but it could be used as a warning system, similar to the anti-collision technologies currently available on some passenger cars.

But teaching a machine to see is far more difficult than it sounds, said Professor Roberto Cipolla, who led the research. “There are three key technological questions that must be answered to design autonomous vehicles: where am I, what’s around me and what do I do next?”

SegNet addresses the second question. The researchers’ Visual Localization system answers the first. Using deep learning, it can determine a user’s location and orientation from a single color image of a busy urban scene. The researchers say the system is far more accurate than GPS and works in places where GPS does not, such as indoors, in tunnels, or in cities where a reliable GPS signal is not available.

In a KurzweilAI test of the Visual Localization system (using an image in the Central Cambridge UK demo), the system accurately identified a Cambridge building, displaying the correct Google Maps street view, and marked its location on a Google map (credit: KurzweilAI/Cambridge University/Google)

It has been tested along a kilometer-long stretch of King’s Parade in central Cambridge, and it is able to determine both location and orientation within a few meters and a few degrees, which is far more accurate than GPS — a vital consideration for driverless cars, according to the researchers. (Try it here.)

The localization system uses the geometry of a scene to learn its precise location, and is able to determine, for example, whether it is looking at the east or west side of a building, even if the two sides appear identical.

“In the short term, we’re more likely to see this sort of system on a domestic robot — such as a robotic vacuum cleaner, for instance,” said Cipolla. “It will take time before drivers can fully trust an autonomous car, but the more effective and accurate we can make these technologies, the closer we are to the widespread adoption of driverless cars and other types of autonomous robotics.”

The researchers are presenting details of the two technologies at the International Conference on Computer Vision in Santiago, Chile.


Cambridge University | Teaching machines to see


Abstract of PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization

We present a robust and real-time monocular six degree of freedom relocalization system. Our system trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner with no need of additional engineering or graph optimisation. The algorithm can operate indoors and outdoors in real time, taking 5 ms per frame to compute. It obtains approximately 2 m and 3° accuracy for large scale outdoor scenes and 0.5 m and 5° accuracy indoors. This is achieved using an efficient 23 layer deep convnet, demonstrating that convnets can be used to solve complicated out of image plane regression problems. This was made possible by leveraging transfer learning from large scale classification data. We show that the PoseNet localizes from high level features and is robust to difficult lighting, motion blur and different camera intrinsics where point based SIFT registration fails. Furthermore we show how the pose feature that is produced generalizes to other scenes allowing us to regress pose with only a few dozen training examples.
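
The training objective the abstract describes combines a position error (in metres) with a weighted orientation error over unit quaternions. A minimal PyTorch sketch of that kind of loss follows; the β value and the exact norms used here are illustrative assumptions rather than the paper’s reported settings.

```python
import torch
import torch.nn.functional as F

def pose_loss(x_pred, q_pred, x_true, q_true, beta=500.0):
    """Position plus weighted orientation error for 6-DOF relocalization (a sketch).
    x_*: (B, 3) camera positions in metres; q_*: (B, 4) orientation quaternions."""
    pos_err = torch.linalg.norm(x_pred - x_true, dim=-1)               # metres
    rot_err = torch.linalg.norm(F.normalize(q_pred, dim=-1) -
                                F.normalize(q_true, dim=-1), dim=-1)   # quaternion distance
    return (pos_err + beta * rot_err).mean()                           # beta balances the two
```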

AI ‘alarmists’ nominated for 2015 ‘Luddite Award’

An 1844 engraving showing a post-1820s Jacquard loom (credit: public domain/Penny Magazine)

The Information Technology and Innovation Foundation (ITIF) today (Dec. 21) announced 10 nominees for its 2015 Luddite Award. The annual “honor” recognizes the year’s most egregious example of a government, organization, or individual stymieing the progress of technological innovation.

ITIF also opened an online poll and invited the public to help decide the “winner.” The result will be announced in late January.

The nominees include (in no specific order):

1. Alarmists, including respected luminaries such as Elon Musk, Stephen Hawking, and Bill Gates, touting an artificial-intelligence apocalypse.

2. Advocates, including Hawking and Noam Chomsky, seeking a ban on “killer robots.”

3. Vermont and other states limiting automatic license plate readers.

4. Europe, China, and others choosing taxi drivers over car-sharing passengers.

5. The U.S. paper industry opposing e-labeling.

6. California’s governor vetoing RFID tags in driver’s licenses.

7. Wyoming effectively outlawing citizen science.

8. The Federal Communications Commission limiting broadband innovation.

9. The Center for Food Safety fighting genetically improved food.

10. Ohio and other states banning red light cameras.

‘Paranoia about evil machines’

(credit: Paramount Pictures)

“Just as Ned Ludd wanted to smash mechanized looms and halt industrial progress in the 19th century, today’s neo-Luddites want to foil technological innovation to the detriment of the rest of society,” said Robert D. Atkinson, ITIF’s founder and president.

“If we want a world in which innovation thrives, then everyone’s New Year’s resolution should be to replace neo-Luddism with an attitude of risk-taking and faith in the future.”

Atkinson notes that “paranoia about evil machines has swirled around in popular culture for more than 200 years, and these claims continue to grip the popular imagination, in no small part because these apocalyptic ideas are widely represented in books, movies, and music.

“The last year alone saw blockbuster films with a parade of digital villains, such as Avengers: Age of Ultron, Ex Machina, and Terminator: Genisys.”

He also cites statements in Oxford professor Nick Bostrom’s book Superintelligence: Paths, Dangers, Strategies, “reflecting the general fear that ‘superintelligence’ in machines could outperform ‘the best human minds in every field, including scientific creativity, general wisdom and social skills.’ Bostrom argues that artificial intelligence will advance to a point where its goals are no longer compatible with that of humans and, as a result, superintelligent machines will seek to enslave or exterminate us.”

“Raising such sci-fi doomsday scenarios just makes it harder for the public, policymakers, and scientists to support more funding for AI research,” Atkinson concludes. “Indeed, continuing the negative campaign against artificial intelligence could potentially dry up funding for AI research, other than money for how to control, rather than enable, AI. What legislator wants to be known as ‘the godfather of the technology that destroyed the human race’?”

Not mentioned in the ITIF statement is the recently announced non-profit “OpenAI” research company founded by Elon Musk and associates, committing $1 billion toward their goal to “advance digital intelligence in the way that is most likely to benefit humanity as a whole.”

The 2014 Luddite Award winners

The winners last year: the states of Arizona, Michigan, New Jersey, and Texas, for taking action to prevent Tesla from opening stores in their states to sell cars directly to consumers. Other nominees included:

  • The National Rifle Association (NRA), for its opposition to smart guns
  • “Stop Smart Meters,” for seeking to stop smart innovation in meters and cars
  • Free Press, for lobbying for rules to stop innovation in broadband networks
  • The media and pundits claiming that “robots” are killing jobs


‘Robot locust’ can jump 11 feet high

Locust-inspired TAUB robot (credit: Tel Aviv University)

A locust-inspired miniature robot that can jump 3.35 meters (11 ft.) high, covering a distance of 1.37 meters (4.5 ft.) horizontally in one leap, is designed to handle search-and-rescue and reconnaissance missions in rough terrain.

The new locust-inspired robot, dubbed “TAUB” (for “Tel Aviv University and Ort Braude College”), is 12.7 cm (5 in.) long and weighs 23 grams (less than one ounce). It was developed by Tel Aviv University and Ort Braude College researchers.

The robot’s ABS plastic body was 3D-printed; its legs are composed of stiff carbon rods and its torsion springs of steel wire. A small on-board battery powers the robot, which is remotely controlled via an on-board microcontroller.

Torsion springs

Locust vs. robot leg models (credit: Tel Aviv University)

A locust catapults itself in a three-stage process. First, the legs are bent in the preparation stage. Then the legs are locked in place at the joint. Finally, a sudden release of the flexor muscle on the upper leg unlocks the joint and causes a rapid release of energy.

This creates a fast-kicking movement of the legs that propels the locust into the air.

Like the locust, which uses stored mechanical energy to enhance the action of its leg muscles, the robot’s “high-jump” is due to its ability to store energy in its torsion springs.
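
A quick back-of-the-envelope check in Python, using only the figures quoted in this article (23-gram mass, 3.35-meter jump) and ignoring drag and losses in the legs, gives a feel for how much energy the torsion springs must release and how fast the robot must leave the ground:

```python
# Minimum stored-energy estimate for a purely vertical jump (no drag, no leg losses).
m = 0.023          # robot mass in kg (23 grams, as quoted above)
g = 9.81           # gravitational acceleration, m/s^2
h = 3.35           # quoted jump height in m

energy_needed = m * g * h            # minimum energy the torsion springs must release
takeoff_speed = (2 * g * h) ** 0.5   # speed needed at lift-off

print(f"Spring energy needed: {energy_needed:.2f} J")    # ~0.76 J
print(f"Take-off speed:       {takeoff_speed:.1f} m/s")  # ~8.1 m/s
```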

The researchers are currently working on a gliding mechanism that will enable the robot to extend its jumping range, lower its landing impact, execute multiple steered jumps, and stabilize while airborne, expanding the possible field applications of the robot.


Abstract of A locust-inspired miniature jumping robot

Unmanned ground vehicles are mostly wheeled, tracked, or legged. These locomotion mechanisms have a limited ability to traverse rough terrain and obstacles that are higher than the robot’s center of mass. In order to improve the mobility of small robots it is necessary to expand the variety of their motion gaits. Jumping is one of nature’s solutions to the challenge of mobility in difficult terrain. The desert locust is the model for the presented bio-inspired design of a jumping mechanism for a small mobile robot. The basic mechanism is similar to that of the semilunar process in the hind legs of the locust, and is based on the cocking of a torsional spring by wrapping a tendon-like wire around the shaft of a miniature motor. In this study we present the jumping mechanism design, and the manufacturing and performance analysis of two demonstrator prototypes. The most advanced jumping robot demonstrator is power autonomous, weighs 23 gr and is capable of jumping to a height of 3.35m, covering a distance of 1.37m.

Musk, others commit $1 billion to non-profit AI research company to ‘benefit humanity’

(credit: OpenAI)

Elon Musk and associates announced OpenAI, a non-profit AI research company, on Friday (Dec. 11), committing $1 billion toward their goal to “advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.”

The funding comes from a group of tech leaders including Musk, Reid Hoffman, Peter Thiel, and Amazon Web Services, but the venture expects to only spend “a tiny fraction of this in the next few years.”

The founders note that it’s hard to predict how much AI could “damage society if built or used incorrectly” or how soon. But the hope is to have a leading research institution that can “prioritize a good outcome for all over its own self-interest … as broadly and evenly distributed as possible.”

Brains trust

OpenAI’s co-chairs are Musk, who is also the principal funder of the Future of Life Institute, and Sam Altman, president of venture-capital seed-accelerator firm Y Combinator, who is also providing funding.

“I think the best defense against the misuse of AI is to empower as many people as possible to have AI. If everyone has AI powers, then there’s not any one person or a small set of individuals who can have AI superpower.” — Elon Musk on Medium

The founders say the organization’s patents (if any) “will be shared with the world. We’ll freely collaborate with others across many institutions and expect to work with companies to research and deploy new technologies.”

OpenAI’s research director is machine learning expert Ilya Sutskever, formerly at Google, and its CTO is Greg Brockman, formerly the CTO of Stripe. The group’s other founding members are “world-class research engineers and scientists” Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, John Schulman, Pamela Vagata, and Wojciech Zaremba. Pieter Abbeel, Yoshua Bengio, Alan Kay, Sergey Levine, and Vishal Sikka are advisors to the group. The company will be based in San Francisco.


If I’m Dr. Evil and I use it, won’t you be empowering me?

“There are a few different thoughts about this. Just like humans protect against Dr. Evil by the fact that most humans are good, and the collective force of humanity can contain the bad elements, we think it’s far more likely that many, many AIs will work to stop the occasional bad actors than the idea that there is a single AI a billion times more powerful than anything else. If that one thing goes off the rails or if Dr. Evil gets that one thing and there is nothing to counteract it, then we’re really in a bad place.” — Sam Altman in an interview with Steven Levy on Medium.


The announcement follows recent announcements by Facebook to open-source the hardware design of its GPU-based “Big Sur” AI server (used for large-scale machine learning software to identify objects in photos and understand natural language, for example); by Google to open-source its TensorFlow machine-learning software; and by Toyota Corporation to invest $1 billion in a five-year private research effort in artificial intelligence and robotics technologies, jointly with Stanford University and MIT.

To follow OpenAI: @open_ai or info@openai.com

When machines learn like humans

Humans and machines were given an image of a novel character (top) and asked to produce new versions. A machine generated the nine-character grid on the left (credit: Jose-Luis Olivares/MIT — figures courtesy of the researchers)

A team of scientists has developed an algorithm that captures human learning abilities, enabling computers to recognize and draw simple visual concepts that are mostly indistinguishable from those created by humans.

The work by researchers at MIT, New York University, and the University of Toronto, which appears in the latest issue of the journal Science, marks a significant advance in the field — one that dramatically shortens the time it takes computers to “learn” new concepts and broadens their application to more creative tasks, according to the researchers.

“Our results show that by reverse-engineering how people think about a problem, we can develop better algorithms,” explains Brenden Lake, a Moore-Sloan Data Science Fellow at New York University and the paper’s lead author. “Moreover, this work points to promising methods to narrow the gap for other machine-learning tasks.”

The paper’s other authors are Ruslan Salakhutdinov, an assistant professor of Computer Science at the University of Toronto, and Joshua Tenenbaum, a professor at MIT in the Department of Brain and Cognitive Sciences and the Center for Brains, Minds and Machines.

When humans are exposed to a new concept — such as a new piece of kitchen equipment, a new dance move, or a new letter in an unfamiliar alphabet — they often need only a few examples to understand its make-up and recognize new instances. But machines typically need to be given hundreds or thousands of examples to perform with similar accuracy.

“It has been very difficult to build machines that require as little data as humans when learning a new concept,” observes Salakhutdinov. “Replicating these abilities is an exciting area of research connecting machine learning, statistics, computer vision, and cognitive science.”

Salakhutdinov helped to launch recent interest in learning with “deep neural networks,” in a paper published in Science almost 10 years ago with his doctoral advisor Geoffrey Hinton. Their algorithm learned the structure of 10 handwritten character concepts — the digits 0-9 — from 6,000 examples each, or a total of 60,000 training examples.

Bayesian Program Learning

Simple visual concepts for comparing human and machine learning. 525 (out of 1623) character concepts, shown with one example each. (credit: Brenden M. Lake et al./Science)

In the work appearing in Science this week, the researchers sought to shorten the learning process and make it more akin to the way humans acquire and apply new knowledge: learning from a small number of examples and performing a range of tasks, such as generating new examples of a concept or generating whole new concepts.

To do so, they developed a “Bayesian Program Learning” (BPL) framework, where concepts are represented as simple computer programs. For instance, the form of the letter “A” is represented by computer code that generates examples of that letter when the code is run. Yet no programmer is required during the learning process. Also, these probabilistic programs produce different outputs at each execution. This allows them to capture the way instances of a concept vary, such as the differences between how different people draw the letter “A.”

This differs from standard pattern-recognition algorithms, which represent concepts as configurations of pixels or collections of features. The BPL approach learns “generative models” of processes in the world, making learning a matter of “model building” or “explaining” the data provided to the algorithm.

The researchers “explained” to the system that characters in human writing systems consist of strokes (lines demarcated by the lifting of the pen) and substrokes, demarcated by points at which the pen’s velocity is zero. With that simple information, the system then analyzed hundreds of motion-capture recordings of humans drawing characters in several different writing systems, learning statistics on the relationships between consecutive strokes and substrokes, as well as on the variation tolerated in the execution of a single stroke.

That means that the system learned the concept of a character and what to ignore (minor variations) in any specific instance.
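
A toy Python illustration of the “concept as a program” idea (a deliberately simplified sketch, not the BPL model itself): a character type is a small generative program made of stroke templates, and re-running the program with motor noise yields new, slightly different tokens of the same concept. The stroke counts and noise level are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_character_type(n_strokes=2, n_points=4):
    """A character 'type' is a tiny generative program: a set of stroke templates,
    each a short list of control points in the unit square (toy stand-in for BPL)."""
    return [rng.uniform(0, 1, size=(n_points, 2)) for _ in range(n_strokes)]

def sample_token(char_type, jitter=0.03):
    """Running the program again produces a new token of the same concept:
    the same strokes, perturbed by motor noise, so every rendering differs slightly."""
    return [stroke + rng.normal(0, jitter, size=stroke.shape) for stroke in char_type]

letter = sample_character_type()
variants = [sample_token(letter) for _ in range(9)]   # nine hand-drawn-looking variations
```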

The BPL model also “learns to learn” by using knowledge from previous concepts to speed learning on new concepts — for example, using knowledge of the Latin alphabet to learn letters in the Greek alphabet.

Cipher for Futurama Alien Language 1 (credit: The Infosphere, the Futurama Wiki)

The authors applied their model to more than 1,600 types of handwritten characters in 50 of the world’s writing systems, including Sanskrit and Tibetan — and even some invented characters such as those from the television series “Futurama.”

Visual Turing tests

In addition to testing the algorithm’s ability to recognize new instances of a concept, the authors asked both humans and computers to reproduce a series of handwritten characters after being shown a single example of each character, or in some cases, to create new characters in the style of those it had been shown. The scientists then compared the outputs from both humans and machines through “visual Turing tests.” Here, human judges were given paired examples of both the human and machine output, along with the original prompt, and asked to identify which of the symbols were produced by the computer.

While judges’ correct responses varied across characters, for each visual Turing test, fewer than 25 percent of judges performed significantly better than chance in assessing whether a machine or a human produced a given set of symbols.

“Before they get to kindergarten, children learn to recognize new concepts from just a single example, and can even imagine new examples they haven’t seen,” notes Tenenbaum. “I’ve wanted to build models of these remarkable abilities since my own doctoral work in the late nineties.

“We are still far from building machines as smart as a human child, but this is the first time we have had a machine able to learn and use a large class of real-world concepts — even simple visual concepts such as handwritten characters — in ways that are hard to tell apart from humans.”

Beyond deep-learning methods

The researchers argue that their system captures something of the elasticity of human concepts, which often have fuzzy boundaries but still seem to delimit coherent categories. It also mimics the human ability to learn new concepts from few examples.

It thus offers hope, they say, that the type of computational structure it’s built on, called a probabilistic program, could help model human acquisition of more sophisticated concepts as well.

“I feel that this is a major contribution to science, of general interest to artificial intelligence, cognitive science, and machine learning,” says Zoubin Ghahramani, a professor of information engineering at the University of Cambridge. “Given the major successes of deep learning, the paper also provides a very sobering view of the limitations of such deep-learning methods — which are very data-hungry and perform poorly on the tasks in this paper — and an important alternative avenue for achieving human-level machine learning.”

The work was supported by grants from the National Science Foundation to MIT’s Center for Brains, Minds and Machines, the Army Research Office, the Office of Naval Research, and the Moore-Sloan Data Science Environment at New York University.


Brenden Lake | NYU fellow Brenden Lake on human-level concept learning


Abstract of Human-level concept learning through probabilistic program induction

People learning new concepts can often generalize successfully from just a single example, yet machine learning algorithms typically require tens or hundreds of examples to perform with similar accuracy. People can also use learned concepts in richer ways than conventional algorithms—for action, imagination, and explanation. We present a computational model that captures these human learning abilities for a large class of simple visual concepts: handwritten characters from the world’s alphabets. The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion. On a challenging one-shot classification task, the model achieves human-level performance while outperforming recent deep learning approaches. We also present several “visual Turing tests” probing the model’s creative generalization abilities, which in many cases are indistinguishable from human behavior.

AI will replace smartphones within 5 years, Ericsson survey suggests

(credit: Ericsson ConsumerLab)

Artificial intelligence (AI) interfaces will take over from smartphones within five years, according to a survey of more than 5,000 smartphone customers in nine countries by Ericsson ConsumerLab, published in the fifth edition of its annual trend report, 10 Hot Consumer Trends 2016 (and beyond).

Smartphone users believe AI will take over many common activities, such as searching the net, getting travel guidance, and as personal assistants. The survey found that 44 percent think an AI system would be as good as a teacher and one third would like an AI interface to keep them company. A third would rather trust the fidelity of an AI interface than a human for sensitive matters; and 29 percent agree they would feel more comfortable discussing their medical condition with an AI system.

However, many of the users surveyed find smartphones limited.

Impractical. Constantly having a screen in the palm of your hand is not always a practical solution, such as when driving or cooking.

Battery capacity limits. One in three smartphone users wants a 7–8 inch screen, creating a battery drain vs. size and weight issue.

Not wearable. 85 percent of the smartphone users think intelligent wearable electronic assistants will be commonplace within 5 years, reducing the need to always touch a screen. And one in two users believes they will be able to talk directly to household appliances.

VR and 3D better. The smartphone users want movies that play virtually around the viewer, virtual tech support, and VR headsets for sports, and more than 50 percent of consumers think holographic screens will be mainstream within 5 years — capabilities not available in a small handheld device. Half of the smartphone users want a 3D avatar to try on clothes online, and 64 percent would like the ability to see an item’s actual size and form when shopping online. Half of the users want to bypass shopping altogether, with a 3D printer for printing household objects such as spoons, toys and spare parts for appliances; 44 percent even want to print their own food or nutritional supplements.

The 10 hot trends for 2016 and beyond cited in the report

  1. The Lifestyle Network Effect. Four out of five people now experience an effect where the benefits gained from online services increase as more people use them. Globally, one in three consumers already participates in various forms of the sharing economy.
  2. Streaming Natives. Teenagers watch more YouTube video content daily than other age groups. Forty-six percent of 16-19 year-olds spend an hour or more on YouTube every day.
  3. AI Ends The Screen Age. Artificial intelligence will enable interaction with objects without the need for a smartphone screen. One in two smartphone users think smartphones will be a thing of the past within the next five years.
  4. Virtual Gets Real. Consumers want virtual technology for everyday activities such as watching sports and making video calls. Forty-four percent even want to print their own food.
  5. Sensing Homes. Fifty-five percent of smartphone owners believe bricks used to build homes could include sensors that monitor mold, leakage and electricity issues within the next five years. As a result, the concept of smart homes may need to be rethought from the ground up.
  6. Smart Commuters. Commuters want to use their time meaningfully and not feel like passive objects in transit. Eighty-six percent would use personalized commuting services if they were available.
  7. Emergency Chat. Social networks may become the preferred way to contact emergency services. Six out of 10 consumers are also interested in a disaster information app.
  8. Internables. Internal sensors that measure well-being in our bodies may become the new wearables. Eight out of 10 consumers would like to use technology to enhance sensory perceptions and cognitive abilities such as vision, memory and hearing.
  9. Everything Gets Hacked. Most smartphone users believe hacking and viruses will continue to be an issue. As a positive side-effect, one in five say they have greater trust in an organization that was hacked but then solved the problem.
  10. Netizen Journalists. Consumers share more information than ever and believe it increases their influence on society. More than a third believe blowing the whistle on a corrupt company online has greater impact than going to the police.

Source: 10 Hot Consumer Trends 2016. Ericsson ConsumerLab, Information Sharing, 2015. Base: 5,025 iOS/Android smartphone users aged 15-69 in Berlin, Chicago, Johannesburg, London, Mexico City, Moscow, New York, São Paulo, Sydney and Tokyo

How robots can learn from babies

A collaboration between UW developmental psychologists and computer scientists aims to enable robots to learn in the same way that children naturally do. The team used research on how babies follow an adult’s gaze to “teach” a robot to perform the same task. (credit: University of Washington)

Babies learn about the world by exploring how their bodies move in space, grabbing toys, pushing things off tables, and by watching and imitating what adults are doing. So instead of laboriously writing code (or moving a robot’s arm or body to show it how to perform an action), why not just let robots learn like babies do?

That’s exactly what University of Washington (UW) developmental psychologists and computer scientists have now demonstrated in experiments suggesting that robots can “learn” much like kids — by amassing data through exploration, watching a human do something, and determining how to perform that task on their own.

That new method would allow someone who doesn’t know anything about computer programming to be able to teach a robot by demonstration — showing the robot how to clean your dishes, fold your clothes, or do household chores.

“But to achieve that goal, you need the robot to be able to understand those actions and perform them on their own,” said Rajesh Rao, a UW professor of computer science and engineering and senior author of an open-access paper in the journal PLoS ONE.

In the paper, the UW team developed a new probabilistic model aimed at solving a fundamental challenge in robotics: building robots that can learn new skills by watching people and imitating them. The roboticists collaborated with UW psychology professor and I-LABS co-director Andrew Meltzoff, whose seminal research has shown that children as young as 18 months can infer the goal of an adult’s actions and develop alternate ways of reaching that goal themselves.

In one example, infants saw an adult try to pull apart a barbell-shaped toy, but the adult failed to achieve that goal because the toy was stuck together and his hands slipped off the ends. The infants watched carefully and then decided to use alternate methods — they wrapped their tiny fingers all the way around the ends and yanked especially hard — duplicating what the adult intended to do.

Machine-learning algorithms based on play

This robot used the new UW model to imitate a human moving toy food objects around a tabletop. By learning which actions worked best with its own geometry, the robot could use different means to achieve the same goal — a key to enabling robots to learn through imitation. (credit: University of Washington)

Children acquire intention-reading skills, in part, through self-exploration that helps them learn the laws of physics and how their own actions influence objects, eventually allowing them to amass enough knowledge to learn from others and to interpret their intentions. Meltzoff thinks that one of the reasons babies learn so quickly is that they are so playful.

“Babies engage in what looks like mindless play, but this enables future learning. It’s a baby’s secret sauce for innovation,” Meltzoff said. “If they’re trying to figure out how to work a new toy, they’re actually using knowledge they gained by playing with other toys. During play they’re learning a mental model of how their actions cause changes in the world. And once you have that model you can begin to solve novel problems and start to predict someone else’s intentions.”

Rao’s team used that infant research to develop machine learning algorithms that allow a robot to explore how its own actions result in different outcomes. Then the robot uses that learned probabilistic model to infer what a human wants it to do and complete the task, and even to “ask” for help if it’s not certain it can.

How to follow a human’s gaze

The team tested its robotic model in two different scenarios: a computer simulation experiment in which a robot learns to follow a human’s gaze, and another experiment in which an actual robot learns to imitate human actions involving moving toy food objects to different areas on a tabletop.

In the gaze experiment, the robot learns a model of its own head movements and assumes that the human’s head is governed by the same rules. The robot tracks the beginning and ending points of a human’s head movements as the human looks across the room and uses that information to figure out where the person is looking. The robot then uses its learned model of head movements to fixate on the same location as the human.
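
The underlying inference can be sketched as a toy Bayesian calculation, with all numbers hypothetical and the actual UW model considerably richer: the robot inverts its own learned action model to ask which goal best explains the movement it just observed.

```python
import numpy as np

def infer_goal(observed_action, goals, action_model, prior):
    """Invert the robot's own forward model to read an intention:
    P(goal | action) is proportional to P(action | goal) * P(goal).
    action_model[g][a] is how likely the robot itself would take action a to reach goal g."""
    posterior = np.array([action_model[g][observed_action] * prior[g] for g in goals])
    return posterior / posterior.sum()

# Toy example with hypothetical numbers: two gaze targets, and the human turned left.
goals = ["left_object", "right_object"]
action_model = {"left_object":  {"turn_left": 0.9, "turn_right": 0.1},
                "right_object": {"turn_left": 0.2, "turn_right": 0.8}}
prior = {"left_object": 0.5, "right_object": 0.5}
print(infer_goal("turn_left", goals, action_model, prior))   # ~[0.82, 0.18]
```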

The team also recreated one of Meltzoff’s tests that showed infants who had experience with visual barriers and blindfolds weren’t interested in looking where a blindfolded adult was looking, because they understood the person couldn’t actually see. Once the team enabled the robot to “learn” what the consequences of being blindfolded were, it no longer followed the human’s head movement to look at the same spot.

Smart movements: beyond mimicking

In the second experiment, the team allowed a robot to experiment with pushing or picking up different objects and moving them around a tabletop. The robot used that model to imitate a human who moved objects around or cleared everything off the tabletop. Rather than rigidly mimicking the human action each time, the robot sometimes used different means to achieve the same ends.

“If the human pushes an object to a new location, it may be easier and more reliable for a robot with a gripper to pick it up to move it there rather than push it,” said lead author Michael Jae-Yoon Chung, a UW doctoral student in computer science and engineering. “But that requires knowing what the goal is, which is a hard problem in robotics and which our paper tries to address.”

Though the initial experiments involved learning how to infer goals and imitate simple behaviors, the team plans to explore how such a model can help robots learn more complicated tasks.

“Babies learn through their own play and by watching others,” says Meltzoff, “and they are the best learners on the planet — why not design robots that learn as effortlessly as a child?”

That raises a question: can babies also learn from robots they’ve taught — in a closed loop? And where might that eventually take education — and civilization?


Abstract of A Bayesian Developmental Approach to Robotic Goal-Based Imitation Learning

A fundamental challenge in robotics today is building robots that can learn new skills by observing humans and imitating human actions. We propose a new Bayesian approach to robotic learning by imitation inspired by the developmental hypothesis that children use self-experience to bootstrap the process of intention recognition and goal-based imitation. Our approach allows an autonomous agent to: (i) learn probabilistic models of actions through self-discovery and experience, (ii) utilize these learned models for inferring the goals of human actions, and (iii) perform goal-based imitation for robotic learning and human-robot collaboration. Such an approach allows a robot to leverage its increasing repertoire of learned behaviors to interpret increasingly complex human actions and use the inferred goals for imitation, even when the robot has very different actuators from humans. We demonstrate our approach using two different scenarios: (i) a simulated robot that learns human-like gaze following behavior, and (ii) a robot that learns to imitate human actions in a tabletop organization task. In both cases, the agent learns a probabilistic model of its own actions, and uses this model for goal inference and goal-based imitation. We also show that the robotic agent can use its probabilistic model to seek human assistance when it recognizes that its inferred actions are too uncertain, risky, or impossible to perform, thereby opening the door to human-robot collaboration.

Army ants’ ‘living’ bridges suggest collective intelligence

Creating “living” bridges, army ants of the species Eciton hamatum automatically assemble with a level of collective intelligence that could provide new insights into animal behavior and help develop cooperating robots. (credit: Courtesy of Matthew Lutz, Princeton University, and Chris Reid, University of Sydney)

Researchers from Princeton University and the New Jersey Institute of Technology (NJIT) report for the first time that army ants of the species Eciton hamatum that form “living” bridges across breaks and gaps in the forest floor are more sophisticated than scientists knew. The ants exhibit a level of collective intelligence that could provide new insights into animal behavior and even help in the development of intuitive robots that can cooperate as a group, the researchers said.

Ants of E. hamatum automatically form living bridges without any oversight from a “lead” ant, the researchers report in the journal Proceedings of the National Academy of Sciences. The action of each individual coalesces into a group unit that can adapt to the terrain and also operates by a clear cost-benefit ratio. The ants will create a path over an open space up to the point when too many workers are being diverted from collecting food and prey.

Collective computation

The researchers suggest that these ants are performing a collective computation. At the level of the entire colony, they’re saying they can afford this many ants locked up in this bridge, but no more than that. There’s no single ant overseeing the decision; they’re making that calculation as a colony.

The research could help explain how large groups of animals balance cost and benefit, about which little is known, said co-author Iain Couzin, a Princeton visiting senior research scholar in ecology and evolutionary biology, and director of the Max Planck Institute for Ornithology and chair of biodiversity and collective behavior at the University of Konstanz in Germany.

Previous studies have shown that single creatures use “rules of thumb” to weigh cost and benefit, said Couzin. This new work shows that in large groups, these same individual guidelines can eventually coordinate behavior group-wide — the ants acted as a unit although each ant only knew its immediate circumstances, he said.

Swarm intelligence for robots

Ant-colony behavior has been the basis of algorithms related to telecommunications and vehicle routing, among other areas. Ants exemplify “swarm intelligence,” in which individual-level interactions produce coordinated group behavior. E. hamatum crossings assemble when the ants detect congestion along their raiding trail, and disassemble when normal traffic has resumed.

Previously, scientists thought that ant bridges were static structures — their appearance over large gaps that ants clearly could not cross in midair was somewhat of a mystery. The researchers found, however, that the ants, when confronted with an open space, start from the narrowest point of the expanse and work toward the widest point, expanding the bridge as they go to shorten the distance their compatriots must travel to get around the expanse.

The researchers suggest that by extracting the rules used by individual ants about whether to initiate, join or leave a living structure, we could program swarms of simple robots to build bridges and other structures by connecting to each other.
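
A toy numerical version of that colony-level trade-off (the functions and constants below are hypothetical, chosen only to reproduce the qualitative behavior reported in the paper): the benefit of shortening the trail saturates as the bridge creeps toward the widest point, while the cost of locking up workers keeps growing, so the net foraging rate peaks at an intermediate bridge position.

```python
import numpy as np

def net_foraging_rate(p, traffic=100.0, worker_cost=80.0):
    """Toy colony-level trade-off. p is how far the bridge has moved toward the
    widest point of the gap (0 = narrowest, 1 = widest). The trail-shortening
    benefit saturates, while the number of workers locked into the structure
    keeps growing, so the net rate peaks somewhere in between."""
    benefit = traffic * (1 - np.exp(-2 * p))   # diminishing returns on trail shortening
    cost = worker_cost * p                     # workers diverted from foraging
    return benefit - cost

positions = np.linspace(0, 1, 201)
best = positions[np.argmax(net_foraging_rate(positions))]
print(f"Bridge stops about {best:.2f} of the way toward the widest point")   # ~0.46
```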


Matthew Lutz, Princeton University, and Chris Reid, University of Sydney


Matthew Lutz, Princeton University, and Chris Reid, University of Sydney


Abstract of Army ants dynamically adjust living bridges in response to a cost–benefit trade-off

The ability of individual animals to create functional structures by joining together is rare and confined to the social insects. Army ants (Eciton) form collective assemblages out of their own bodies to perform a variety of functions that benefit the entire colony. Here we examine ‟bridges” of linked individuals that are constructed to span gaps in the colony’s foraging trail. How these living structures adjust themselves to varied and changing conditions remains poorly understood. Our field experiments show that the ants continuously modify their bridges, such that these structures lengthen, widen, and change position in response to traffic levels and environmental geometry. Ants initiate bridges where their path deviates from their incoming direction and move the bridges over time to create shortcuts over large gaps. The final position of the structure depended on the intensity of the traffic and the extent of path deviation and was influenced by a cost–benefit trade-off at the colony level, where the benefit of increased foraging trail efficiency was balanced by the cost of removing workers from the foraging pool to form the structure. To examine this trade-off, we quantified the geometric relationship between costs and benefits revealed by our experiments. We then constructed a model to determine the bridge location that maximized foraging rate, which qualitatively matched the observed movement of bridges. Our results highlight how animal self-assemblages can be dynamically modified in response to a group-level cost–benefit trade-off, without any individual unit’s having information on global benefits or costs.