Carnegie Mellon AI beats top poker pros — a first

“Brains vs Artificial Intelligence” competition at the Rivers Casino in Pittsburgh (credit: Carnegie Mellon University)

Libratus, an AI developed by Carnegie Mellon University, has defeated four of the world’s best professional poker players in a marathon 120,000 hands of Heads-up, No-Limit Texas Hold’em poker played over 20 days, CMU announced today (Jan. 31) — joining Deep Blue (for chess), Watson, and AlphaGo as major milestones in AI.

Libratus led the pros by a collective $1,766,250 in chips.* The tournament, called “Brains Vs. Artificial Intelligence: Upping the Ante,” was held at the Rivers Casino in Pittsburgh from Jan. 11–30.

The developers of Libratus — Tuomas Sandholm, professor of computer science, and Noam Brown, a Ph.D. student in computer science — said the sizable victory is statistically significant and not simply a matter of luck. “The best AI’s ability to do strategic reasoning with imperfect information has now surpassed that of the best humans,” Sandholm said. “This is the last frontier, at least in the foreseeable horizon, in game-solving in AI.”

This new AI milestone has implications for any realm in which information is incomplete and opponents sow misinformation, said Frank Pfenning, head of the Computer Science Department in CMU’s School of Computer Science. Business negotiation, military strategy, cybersecurity, and medical treatment planning could all benefit from automated decision-making using a Libratus-like AI.

“The computer can’t win at poker if it can’t bluff,” Pfenning explained. “Developing an AI that can do that successfully is a tremendous step forward scientifically and has numerous applications. Imagine that your smartphone will someday be able to negotiate the best price on a new car for you. That’s just the beginning.”

How the pros taught Libratus about its weaknesses

Brains vs AI scorecard (credit: Carnegie Mellon University)

So how was Libratus able to improve from day to day during the competition? It turns out it was the pros themselves who taught Libratus about its weaknesses. “After play ended each day, a meta-algorithm analyzed what holes the pros had identified and exploited in Libratus’ strategy,” Sandholm explained. “It then prioritized the holes and algorithmically patched the top three using the supercomputer each night.

“This is very different than how learning has been used in the past in poker. Typically researchers develop algorithms that try to exploit the opponent’s weaknesses. In contrast, here the daily improvement is about algorithmically fixing holes in our own strategy.”

Sandholm also said that Libratus’ end-game strategy was a major advance. “The end-game solver has a perfect analysis of the cards,” he said. It was able to update its strategy for each hand in a way that ensured any late changes would only improve the strategy. Over the course of the competition, the pros responded by making more aggressive moves early in the hand, no doubt to avoid playing in the deep waters of the endgame where the AI had an advantage, he added.

Converging high-performance computing and AI

Professor Tuomas Sandholm, Carnegie Mellon School of Computer Science, with the Pittsburgh Supercomputing Center’s Bridges supercomputer (credit: Carnegie Mellon University)

Libratus’ victory was made possible by the Pittsburgh Supercomputing Center’s Bridges computer. Libratus recruited the raw power of approximately 600 of Bridges’ 846 compute nodes. Bridges’ total speed is 1.35 petaflops, about 7,250 times as fast as a high-end laptop, and its memory is 274 terabytes, about 17,500 times as much as you’d get in that laptop. This computing power gave Libratus the ability to play four of the best Texas Hold’em players in the world at once and beat them.
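For readers who want to sanity-check those comparisons, the quoted ratios are mutually consistent; dividing Bridges’ speed and memory by them implies a laptop of roughly 190 gigaflops with about 16 GB of RAM. A quick back-of-the-envelope check (the implied laptop specs are inferred here, not stated in the article):

```python
# Sanity check of the laptop comparisons quoted above (Bridges figures from the article;
# the implied "high-end laptop" specs are back-calculated, not given in the source).
bridges_flops = 1.35e15        # 1.35 petaflops
bridges_memory_tb = 274        # terabytes

laptop_flops = bridges_flops / 7250                      # ~1.9e11 flops, i.e. ~186 gigaflops
laptop_memory_gb = bridges_memory_tb * 1000 / 17500      # ~15.7 GB, i.e. a 16 GB laptop

print(f"Implied laptop speed:  {laptop_flops / 1e9:.0f} gigaflops")
print(f"Implied laptop memory: {laptop_memory_gb:.1f} GB")
```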

“We designed Bridges to converge high-performance computing and artificial intelligence,” said Nick Nystrom, PSC’s senior director of research and principal investigator for the National Science Foundation-funded Bridges system. “Libratus’ win is an important milestone toward developing AIs to address complex, real-world problems. At the same time, Bridges is powering new discoveries in the physical sciences, biology, social science, business and even the humanities.”

Sandholm said he will continue his research push on the core technologies involved in solving imperfect-information games and in applying these technologies to real-world problems. That includes his work with Optimized Markets, a company he founded to automate negotiations.

“CMU played a pivotal role in developing both computer chess, which eventually beat the human world champion, and Watson, the AI that beat top human Jeopardy! competitors,” Pfenning said. “It has been very exciting to watch the progress of poker-playing programs that have finally surpassed the best human players. Each one of these accomplishments represents a major milestone in our understanding of intelligence.”

Heads-Up No-Limit Texas Hold’em is a complex game, with 10^160 (a 1 followed by 160 zeroes) information sets — each set being characterized by the path of play in the hand as perceived by the player whose turn it is. The AI must make decisions without knowing all of the cards in play, while trying to sniff out bluffing by its opponent. As “no-limit” suggests, players may bet or raise any amount up to all of their chips.

Sandholm will be sharing Libratus’ secrets now that the competition is over, beginning with invited talks at the Association for the Advancement of Artificial Intelligence meeting Feb. 4–9 in San Francisco and in submissions to peer-reviewed scientific conferences and journals.

* The pros — Dong Kim, Jimmy Chou, Daniel McAulay and Jason Les — will split a $200,000 prize purse based on their respective performances during the event. McAulay, of Scotland, said Libratus was a tougher opponent than he expected, but it was exciting to play against it. “Whenever you play a top player at poker, you learn from it,” he said.


Carnegie Mellon University | Brains Vs. AI Rematch: Why Poker?

A deep learning algorithm outperforms some board-certified dermatologists in diagnosis of skin cancer

A dermatologist uses a dermatoscope, a type of handheld microscope, to look at skin. Stanford AI scientists have created a deep convolutional neural network algorithm for skin cancer that matched the performance of board-certified dermatologists. (credit: Matt Young)

Deep learning has been touted for its potential to enhance the diagnosis of diseases, and now a team of researchers at Stanford has developed a deep-learning algorithm that may make this vision a reality for skin cancer.*

The researchers, led by Dr. Sebastian Thrun, an adjunct professor at the Stanford Artificial Intelligence Laboratory, reported in the January 25 issue of Nature that their deep convolutional neural network (CNN) algorithm performed as well as or better than 21 board-certified dermatologists at diagnosing skin cancer. (See “Skin cancer classification performance of the CNN (blue) and dermatologists (red)” figure below.)

Diagnosing skin cancer begins with a visual examination. A dermatologist usually looks at the suspicious lesion with the naked eye and with the aid of a dermatoscope, which is a handheld microscope that provides low-level magnification of the skin. If these methods are inconclusive or lead the dermatologist to believe the lesion is cancerous, a biopsy is the next step. This deep learning algorithm may help dermatologists decide which skin lesions to biopsy.

“My main eureka moment was when I realized just how ubiquitous smartphones will be,” said Stanford Department of Electrical Engineering’s Andre Esteva, co-lead author of the study. “Everyone will have a supercomputer in their pockets with a number of sensors in it, including a camera. What if we could use it to visually screen for skin cancer? Or other ailments?”

It is projected that there will be 6.3 billion smartphone subscriptions by the year 2021, according to the Ericsson Mobility Report (2016), which could potentially provide low-cost universal access to vital diagnostic care.

Creating the deep convolutional neural network (CNN) algorithm

Deep CNN classification technique. Data flow is from left to right: an image of a skin lesion (for example, melanoma) is sequentially warped into a probability distribution over clinical classes of skin disease using Google Inception v3 CNN architecture pretrained on the ImageNet dataset (1.28 million images over 1,000 generic object classes) and fine-tuned on the team’s own dataset of 129,450 skin lesions comprising 2,032 different diseases. (credit: Andre Esteva et al./Nature)

Rather than building an algorithm from scratch, the researchers began with an algorithm developed by Google that was already trained to identify 1.28 million images from 1,000 object categories. It was designed primarily to be able to differentiate cats from dogs, but the researchers needed it to differentiate benign and malignant lesions. So they collaborated with dermatologists at Stanford Medicine, as well as Helen M. Blau, professor of microbiology and immunology at Stanford and co-author of the paper.

The algorithm was trained on nearly 130,000 images representing more than 2,000 different diseases, each image carrying an associated disease label, allowing the system to overcome variations in angle, lighting, and zoom. The algorithm was then tested against 1,942 images of skin that were digitally annotated with biopsy-proven diagnoses of skin cancer. Overall, the algorithm identified the vast majority of cancer cases with accuracy rates similar to those of expert clinical dermatologists.
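For a concrete picture of this transfer-learning setup, here is a minimal PyTorch sketch — not the Stanford team’s code — of taking an ImageNet-pretrained Inception v3, swapping its 1,000-way head for the skin-disease classes, and fine-tuning. The optimizer, learning rate, and placeholder batch are assumptions for illustration.

```python
# Minimal fine-tuning sketch (assumes torchvision >= 0.13; data pipeline omitted).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 2032  # disease classes in the Stanford training set, per the article

model = models.inception_v3(weights=models.Inception_V3_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)                      # replace 1,000-way head
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, NUM_CLASSES)  # and the auxiliary head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def fine_tune_step(images, labels):
    """One training step on a batch of labeled lesion images (299x299 RGB tensors)."""
    model.train()
    optimizer.zero_grad()
    outputs, aux_outputs = model(images)   # Inception v3 returns main + aux logits in train mode
    loss = criterion(outputs, labels) + 0.4 * criterion(aux_outputs, labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Placeholder batch just to show the call signature (real training uses labeled lesion images).
fine_tune_step(torch.rand(2, 3, 299, 299), torch.tensor([0, 1]))
```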

However, during testing, the researchers used only high-quality, biopsy-confirmed images provided by the University of Edinburgh and the International Skin Imaging Collaboration Project that represented the most common and deadliest skin cancers — malignant carcinomas and malignant melanomas.

Skin cancer classification performance of the CNN (blue) and dermatologists (red).** (credit: Andre Esteva et al./Nature)

The 21 dermatologists were asked whether, based on each image, they would proceed with biopsy or treatment, or reassure the patient. The researchers evaluated success by how well the dermatologists were able to correctly diagnose both cancerous and non-cancerous lesions in more than 370 images.***

However, Susan Swetter, professor of dermatology and director of the Pigmented Lesion and Melanoma Program at the Stanford Cancer Institute and co-author of the paper, notes that “rigorous prospective validation of the algorithm is necessary before it can be implemented in clinical practice, by practitioners and patients alike.”

* Every year there are about 5.4 million new cases of skin cancer in the United States, and while the five-year survival rate for melanoma detected in its earliest stages is around 97 percent, that drops to approximately 14 percent if it’s detected in its latest stages.

** “Skin cancer classification performance of the CNN and dermatologists. The deep learning CNN outperforms the average of the dermatologists at skin cancer classification using photographic and dermoscopic images. Our CNN is tested against at least 21 dermatologists at keratinocyte carcinoma and melanoma recognition. For each test, previously unseen, biopsy-proven images of lesions are displayed, and dermatologists are asked if they would: biopsy/treat the lesion or reassure the patient. Sensitivity, the true positive rate, and specificity, the true negative rate, measure performance. A dermatologist outputs a single prediction per image and is thus represented by a single red point. The green points are the average of the dermatologists for each task, with error bars denoting one standard deviation.” — Andre Esteva et al./Nature

*** The algorithm’s performance was measured through the creation of a sensitivity-specificity curve, where sensitivity represented its ability to correctly identify malignant lesions and specificity represented its ability to correctly identify benign lesions. It was assessed through three key diagnostic tasks: keratinocyte carcinoma classification, melanoma classification, and melanoma classification when viewed using dermoscopy. In all three tasks, the algorithm matched the performance of the dermatologists, with the area under the sensitivity-specificity curve amounting to at least 91 percent of the total area of the graph. An added advantage of the algorithm is that, unlike a person, it can be made more or less sensitive, allowing the researchers to tune its response depending on what they want it to assess. This ability to alter the sensitivity hints at the depth and complexity of the algorithm. Its underlying architecture, trained on seemingly irrelevant photos — including cats and dogs — helps it better evaluate the skin lesion images.
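The sensitivity-specificity analysis described in this footnote is essentially a ROC curve. A small illustrative sketch using scikit-learn (placeholder labels and scores, not the study’s data) shows how such a curve is computed and how an operating threshold can be tuned toward higher sensitivity:

```python
# Illustrative ROC sketch for a binary malignant-vs-benign classifier (toy data only).
import numpy as np
from sklearn.metrics import roc_curve, auc

# y_true: 1 = biopsy-proven malignant, 0 = benign; y_score: model probability of malignancy
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.3])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
sensitivity = tpr          # true positive rate
specificity = 1 - fpr      # true negative rate
print("Area under the curve:", auc(fpr, tpr))

# "Tuning" the classifier: pick the first threshold whose sensitivity reaches 95%,
# trading some specificity for fewer missed cancers.
idx = np.argmax(sensitivity >= 0.95)
print("Operating threshold:", thresholds[idx], "specificity:", specificity[idx])
```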


Abstract of Dermatologist-level classification of skin cancer with deep neural networks

Skin cancer, the most common human malignancy, is primarily diagnosed visually, beginning with an initial clinical screening and followed potentially by dermoscopic analysis, a biopsy and histopathological examination. Automated classification of skin lesions using images is a challenging task owing to the fine-grained variability in the appearance of skin lesions. Deep convolutional neural networks (CNNs) show potential for general and highly variable tasks across many fine-grained object categories. Here we demonstrate classification of skin lesions using a single CNN, trained end-to-end from images directly, using only pixels and disease labels as inputs. We train a CNN using a dataset of 129,450 clinical images—two orders of magnitude larger than previous datasets—consisting of 2,032 different diseases. We test its performance against 21 board-certified dermatologists on biopsy-proven clinical images with two critical binary classification use cases: keratinocyte carcinomas versus benign seborrheic keratoses; and malignant melanomas versus benign nevi. The first case represents the identification of the most common cancers, the second represents the identification of the deadliest skin cancer. The CNN achieves performance on par with all tested experts across both tasks, demonstrating an artificial intelligence capable of classifying skin cancer with a level of competence comparable to dermatologists. Outfitted with deep neural networks, mobile devices can potentially extend the reach of dermatologists outside of the clinic. It is projected that 6.3 billion smartphone subscriptions will exist by the year 2021 and can therefore potentially provide low-cost universal access to vital diagnostic care.

AI system performs better than 75 percent of American adults on standard visual intelligence test

An example question from the Raven’s Progressive Matrices standardized fluid-intelligence test.* (credit: Ken Forbus)

A Northwestern University team has developed a new visual problem-solving computational model that performs in the 75th percentile for American adults on a standard intelligence test.

The research is an important step toward making artificial-intelligence systems that see and understand the world as humans do, says Northwestern Engineering’s Ken Forbus, Walter P. Murphy Professor of Electrical Engineering and Computer Science at Northwestern’s McCormick School of Engineering. The research was published online in January 2017 in the journal Psychological Review.

The new computational model** is built on CogSketch, an AI platform previously developed in Forbus’ laboratory. It can solve visual problems and understand sketches to give immediate, interactive feedback. CogSketch also incorporates a computational model of analogy, based on Northwestern psychology professor Dedre Gentner’s structure-mapping engine.

The ability to solve complex visual problems is one of the hallmarks of human intelligence. Developing artificial intelligence systems that have this ability provides new evidence for the importance of symbolic representations and analogy in visual reasoning, and it could potentially shrink the gap between computer and human cognition, the researchers suggest.

A nonverbal fluid-intelligence test

The researchers tested the AI system on Raven’s Progressive Matrices, a nonverbal standardized test that measures abstract reasoning.*** All of the test’s problems consist of a matrix with one image missing. The test taker is given six to eight choices for completing the matrix.

“The problems that are hard for people are also hard for the model, providing additional evidence that its operation is capturing some important properties of human cognition,” said Forbus.

“The Raven’s test is the best existing predictor of what psychologists call ‘fluid intelligence,’ or the general ability to think abstractly, reason, identify patterns, solve problems, and discern relationships,” said co-author Andrew Lovett, now a researcher at the U.S. Naval Research Laboratory. “Our results suggest that the ability to flexibly use relational representations, comparing and reinterpreting them, is important for fluid intelligence.”

“Most artificial intelligence research today concerning vision focuses on recognition, or labeling what is in a scene rather than reasoning about it,” Forbus said. “But recognition is only useful if it supports subsequent reasoning. Our research provides an important step toward understanding visual reasoning more broadly.”

* The test taker should choose answer D because the relationships between it and the other elements in the bottom row are most similar to the relationships between the elements of the top rows.

** “The reader may download (Windows) the computational model and run it on example problems,” the authors note in their Psychological Review paper.

*** Raven’s Progressive Matrices (RPM) is an intelligence test that “requires that participants compare images in a (usually) 3×3 matrix, identify a pattern across the matrix, and solve for the missing image.” … Designed to measure a subject’s fluid intelligence, “it has remained popular for decades because it is highly successful at predicting a subject’s performance on other ability tests — not just visual tests, but verbal and mathematical as well,” the authors suggest in their Psychological Review paper.


Abstract of Modeling visual problem solving as analogical reasoning

We present a computational model of visual problem solving, designed to solve problems from the Raven’s Progressive Matrices intelligence test. The model builds on the claim that analogical reasoning lies at the heart of visual problem solving, and intelligence more broadly. Images are compared via structure mapping, aligning the common relational structure in 2 images to identify commonalities and differences. These commonalities or differences can themselves be reified and used as the input for future comparisons. When images fail to align, the model dynamically rerepresents them to facilitate the comparison. In our analysis, we find that the model matches adult human performance on the Standard Progressive Matrices test, and that problems which are difficult for the model are also difficult for people. Furthermore, we show that model operations involving abstraction and rerepresentation are particularly difficult for people, suggesting that these operations may be critical for performing visual problem solving, and reasoning more generally, at the highest level. (PsycINFO Database Record (c) 2016 APA, all rights reserved)

Tesla Autopilot predicts collision ahead seconds before it happens

Hans Noordsij, a Dutch Tesla driver, uploaded a Dec. 27 dashcam video that dramatically shows the new radar processing capability of Tesla’s Autopilot and the resulting automatic braking, DarkVision Hardware reports. The system’s radar saw ahead of the car in front and tracked two cars ahead on the road. Note the audible warning a second or so before the accident.

Apple’s first AI paper focuses on creating ‘superrealistic’ image recognition

Apple’s first paper on artificial intelligence, published Dec. 22 on arXiv (open access), describes a method for improving the ability of a deep neural network to recognize images.

To train neural networks to recognize images, AI researchers have typically labeled (identified or described) each image in a dataset. For example, last year, Georgia Institute of Technology researchers developed a deep-learning method to recognize images taken at regular intervals on a person’s wearable smartphone camera.

Example images from dataset of 40,000 egocentric images with their respective labels (credit: Daniel Castro et al./Georgia Institute of Technology)

The idea was to demonstrate that deep-learning can “understand” human behavior and the habits of a specific person, and based on that, the AI system could offer suggestions to the user.

The problem with that method is the huge amount of time required to manually label the images (40,000 in this case). So AI researchers have turned to using synthetic images (such as from a video) that are pre-labeled (in captions, for example).

Creating superrealistic image recognition

But that, in turn, also has limitations. “Synthetic data is often not realistic enough, leading the network to learn details only present in synthetic images and fail to generalize well on real images,” the authors explain.

Simulated+Unsupervised (S+U) learning (credit: Ashish Shrivastava et al./arXiv)

So instead, the researchers have developed a new approach called “Simulated+Unsupervised (S+U) learning.”

The idea is to still use pre-labeled synthetic images (like the “Synthetic” image in the above illustration), but refine their realism by matching synthetic images to unlabeled real images (in this case, eyes) — thus creating a “superrealistic” image (the “Refined” image above), allowing for more accurate, faster image recognition, while preserving the labeling.

To do that, the researchers used a relatively new method (created in 2014) called Generative Adversarial Networks (GANs), which uses two neural networks that sort of compete with each other to create a series of superrealistic images.*
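As a rough illustration of that adversarial setup, here is a hedged PyTorch sketch of a refiner network and a discriminator trained against each other, with a self-regularization term that keeps the refined image close to its synthetic input so the original label remains valid. The architectures, loss weighting, and image sizes are placeholders, not Apple’s.

```python
# Sketch of an adversarial refiner in the spirit of Simulated+Unsupervised learning.
import torch
import torch.nn as nn

refiner = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(64, 1, 3, padding=1))                  # placeholder architecture
discriminator = nn.Sequential(nn.Conv2d(1, 64, 3, stride=2), nn.ReLU(),
                              nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                              nn.Linear(64, 1))                          # real-vs-refined score

bce = nn.BCEWithLogitsLoss()
opt_r = torch.optim.Adam(refiner.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

def train_step(synthetic, real, reg_weight=0.5):
    refined = refiner(synthetic)

    # Discriminator: push real images toward label 1, refined images toward label 0.
    opt_d.zero_grad()
    d_loss = bce(discriminator(real), torch.ones(real.size(0), 1)) + \
             bce(discriminator(refined.detach()), torch.zeros(synthetic.size(0), 1))
    d_loss.backward()
    opt_d.step()

    # Refiner: fool the discriminator while staying close to the synthetic input.
    opt_r.zero_grad()
    r_loss = bce(discriminator(refined), torch.ones(synthetic.size(0), 1)) + \
             reg_weight * (refined - synthetic).abs().mean()
    r_loss.backward()
    opt_r.step()
    return d_loss.item(), r_loss.item()

# One update on random placeholder "eye image" batches.
d_loss, r_loss = train_step(torch.rand(4, 1, 35, 55), torch.rand(4, 1, 35, 55))
```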


A visual Turing test

“To quantitatively evaluate the visual quality of the refined images, we designed a simple user study where subjects were asked to classify images as real or refined synthetic. Each subject was shown a random selection of 50 real images and 50 refined images in a random order, and was asked to label the images as either real or refined. The subjects found it very hard to tell the difference between the real images and the refined images.” — Ashish Shrivastava et al./arXiv


So will Siri develop the ability to identify that person whose name you forgot and whisper it to you in your AirPods, or automatically bring up that person’s Facebook page and latest tweet? Or is that getting too creepy?

* Simulated+Unsupervised (S+U) learning is “an interesting variant of adversarial gradient-based methods,” Jürgen Schmidhuber, Scientific Director of IDSIA (Swiss AI Lab), told KurzweilAI.

“An image synthesizer’s output is piped into a refiner net whose output is classified by an adversarial net trained to distinguish real from fake images. The refiner net tries to convince the classifier that its output is real, while being punished for deviating too much from the synthesizer output. Very nice and rather convincing applications!”

Schmidhuber also briefly explained his 1991 paper [1] that introduced gradient-based adversarial networks for unsupervised learning “when computers were about 100,000 times slower than today. The method was called Predictability Minimization (PM).

“An encoder network receives real vector-valued data samples (such as images) and produces codes thereof across so-called code units. A predictor network is trained to predict each code component from the remaining components. The encoder, however, is trained to maximize the same objective function minimized by the predictor.

“That is, predictor and encoder fight each other, to motivate the encoder to achieve a ‘holy grail’ of unsupervised learning, namely, a factorial code of the data, where the code components are statistically independent, which makes subsequent classification easy. One can attach an optional autoencoder to the code to reconstruct data from its code. After perfect training, one can randomly activate the code units in proportion to their mean values, to read out patterns distributed like the original training patterns, assuming the code has become factorial indeed.

“PM and Generative Adversarial Networks (GANs) may be viewed as symmetric approaches. PM is directly trying to map data to its factorial code, from which patterns can be sampled that are distributed like the original data. While GANs start with a random (usually factorial) distribution of codes, and directly learn to decode the codes into ‘good’ patterns. Both PM and GANs employ gradient-based adversarial nets that play a minimax game to achieve their goals.”
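To make the PM idea concrete, the following is a very rough, modern PyTorch sketch of the scheme Schmidhuber describes: per-unit predictors minimize the error of predicting each code unit from the remaining units, while the encoder maximizes that same error. Layer sizes, the input dimension, and the optimizer are assumptions, not details from the 1991 paper.

```python
# Rough Predictability Minimization sketch (hypothetical sizes; binary-image input assumed).
import torch
import torch.nn as nn

INPUT_DIM, CODE_DIM = 784, 8
encoder = nn.Sequential(nn.Linear(INPUT_DIM, 64), nn.Tanh(),
                        nn.Linear(64, CODE_DIM), nn.Sigmoid())
predictors = nn.ModuleList(nn.Linear(CODE_DIM - 1, 1) for _ in range(CODE_DIM))
opt_enc = torch.optim.Adam(encoder.parameters(), lr=1e-3)
opt_pred = torch.optim.Adam(predictors.parameters(), lr=1e-3)

def prediction_error(code):
    """Mean squared error of predicting each code unit from the remaining units."""
    err = 0.0
    for i, p in enumerate(predictors):
        rest = torch.cat([code[:, :i], code[:, i + 1:]], dim=1)
        err = err + ((p(rest) - code[:, i:i + 1]) ** 2).mean()
    return err

def pm_step(x):
    # Predictors minimize the prediction error on a frozen code ...
    opt_pred.zero_grad()
    prediction_error(encoder(x).detach()).backward()
    opt_pred.step()
    # ... while the encoder maximizes the same objective, pushing code units toward
    # statistical independence (a factorial code).
    opt_enc.zero_grad()
    (-prediction_error(encoder(x))).backward()
    opt_enc.step()

pm_step(torch.rand(32, INPUT_DIM))   # one adversarial update on a random placeholder batch
```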

[1] J. Schmidhuber. Learning factorial codes by predictability minimization. Technical Report CU-CS-565-91, Dept. of Comp. Sci., University of Colorado at Boulder, December 1991. Later also published in Neural Computation, 4(6):863-879, 1992. More: http://people.idsia.ch/~juergen/ica.html

Nanoarray sniffs out and distinguishes ‘breathprints’ of multiple diseases

Schematic representation of the concept and design of the study, which involved collection of breath samples from 1404 patients diagnosed with one of 17 different diseases. One breath sample obtained from each subject was analyzed with the artificially intelligent nanoarray for disease diagnosis and classification (represented by patterns in the illustration), and a second sample was analyzed with gas chromatography–mass spectrometry to explore its chemical composition. (credit: Morad K. Nakhleh et al./ACS Nano)

An international team of 63 scientists across 14 clinical departments has identified unique “breathprints” for 17 diseases with 86% accuracy and has designed a noninvasive, inexpensive, miniaturized portable device that screens breath samples to classify and diagnose several types of diseases, the researchers report in an open-access paper in the journal ACS Nano.

As far back as around 400 B.C., doctors diagnosed some diseases by smelling a patient’s exhaled breath, which contains nitrogen, carbon dioxide, oxygen, and small amounts of more than 100 other volatile chemical components. The relative amounts of these substances vary depending on the state of a person’s health. For example, diabetes creates a sweet breath smell. More recently, several teams of scientists have developed experimental breath analyzers, but most of these instruments focus on a single disease, such as diabetes or melanoma, or at most a few diseases.

Detecting 17 diseases

The researchers developed an array of nanoscale sensors to detect the individual components in thousands of breath samples collected from 1404 patients who were either healthy or had one of 17 different diseases*, such as kidney cancer or Parkinson’s disease.

The team used mass spectrometry to identify the breath components associated with each disease. By analyzing the results with artificial intelligence techniques (binary classifiers), the team found that each disease produces a unique breathprint, based on differing amounts of 13 volatile organic chemical (VOC) components. They also showed that the presence of one disease would not prevent the detection of others — a prerequisite for developing a practical device to screen and diagnose various diseases.
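As a hedged illustration of the “binary classifiers” mentioned above, the sketch below trains one disease-vs-rest classifier per condition on 13-component VOC feature vectors. The data here is random placeholder, not the study’s nanoarray or mass-spectrometry measurements, and the classifier choice is an assumption.

```python
# Toy one-vs-rest breathprint classification on 13 VOC features (placeholder data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1404, 13))              # 1404 breath samples x 13 VOC features
diseases = rng.integers(0, 17, size=1404)    # placeholder disease labels (0..16)

classifiers = {}
for d in range(17):
    y = (diseases == d).astype(int)          # this disease vs. everything else
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    classifiers[d] = clf
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"disease {d}: cross-validated accuracy {acc:.2f}")
```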

Based on the research, the team designed an organic layer that functions as a sensing layer (recognition element) for adsorbed VOCs and an electrically conductive nanoarray based on resistive layers of molecularly modified gold nanoparticles and a random network of single-wall carbon nanotubes. The nanoparticles and nanotubes have different electrical conductivity patterns associated with different diseases.**

The authors received funding from the ERC and LCAOS of the European Union’s Seventh Framework Programme for Research and Technological Development, the EuroNanoMed Program under VOLGACORE, and the Latvian Council of Science.

* Lung cancer, colorectal cancer, head and neck cancer, ovarian cancer, bladder cancer, prostate cancer, kidney cancer, gastric cancer, Crohn’s disease, ulcerative colitis, irritable bowel syndrome, idiopathic Parkinson’s, atypical Parkinsonism, multiple sclerosis, pulmonary arterial hypertension, pre-eclampsia, and chronic kidney disease.

** During exposure to breath samples, interaction between the VOC components and the organic sensing layer changes the electrical resistance of the sensors. The researchers measured the relative change of each sensor’s resistance at the peak (beginning), middle, and end of the exposure, as well as the area under the curve of the whole measurement. All breath samples identified by the AI nanoarray were also examined using an independent lab-based analytical technique: gas chromatography linked with mass spectrometry.
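A minimal sketch of the feature extraction this footnote describes; the function name, baseline handling, and sampling details are assumptions for illustration, not taken from the paper.

```python
# Per-sensor response features for one breath exposure (illustrative only).
import numpy as np

def sensor_features(resistance, baseline):
    """resistance: 1-D array of one sensor's resistance readings during an exposure."""
    rel = (resistance - baseline) / baseline   # relative change vs. pre-exposure baseline
    n = len(rel)
    return {
        "start": rel[0],                       # relative change at the peak (beginning)
        "middle": rel[n // 2],                 # ... at the middle of the exposure
        "end": rel[-1],                        # ... at the end of the exposure
        "area": np.trapz(rel),                 # area under the whole response curve
    }

features = sensor_features(np.array([100.0, 103.0, 105.0, 104.0, 102.0]), baseline=100.0)
print(features)
```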


Abstract of Diagnosis and Classification of 17 Diseases from 1404 Subjects via Pattern Analysis of Exhaled Molecules

We report on an artificially intelligent nanoarray based on molecularly modified gold nanoparticles and a random network of single-walled carbon nanotubes for noninvasive diagnosis and classification of a number of diseases from exhaled breath. The performance of this artificially intelligent nanoarray was clinically assessed on breath samples collected from 1404 subjects having one of 17 different disease conditions included in the study or having no evidence of any disease (healthy controls). Blind experiments showed that 86% accuracy could be achieved with the artificially intelligent nanoarray, allowing both detection and discrimination between the different disease conditions examined. Analysis of the artificially intelligent nanoarray also showed that each disease has its own unique breathprint, and that the presence of one disease would not screen out others. Cluster analysis showed a reasonable classification power of diseases from the same categories. The effect of confounding clinical and environmental factors on the performance of the nanoarray did not significantly alter the obtained results. The diagnosis and classification power of the nanoarray was also validated by an independent analytical technique, i.e., gas chromatography linked with mass spectrometry. This analysis found that 13 exhaled chemical species, called volatile organic compounds, are associated with certain diseases, and the composition of this assembly of volatile organic compounds differs from one disease to another. Overall, these findings could contribute to one of the most important criteria for successful health intervention in the modern era, viz. easy-to-use, inexpensive (affordable), and miniaturized tools that could also be used for personalized screening, diagnosis, and follow-up of a number of diseases, which can clearly be extended by further development.

How to enable soft robots to better mimic biological motions

Researchers used mathematical modeling to optimize the design of an actuator to perform biologically inspired motions (credit: Harvard SEAS)

Harvard researchers have developed a method for automatically designing actuators that enable fingers and knees in a soft robot to move more organically, a robot arm to move more smoothly along a path, or a wearable robot or exoskeleton to help a patient move a limb more naturally.

Designing such actuators is currently a complex design challenge, requiring a sequence of actuator segments, each performing a different motion. “Rather than designing these actuators empirically, we wanted a tool where you could plug in a motion and it would tell you how to design the actuator to achieve that motion,” said Katia Bertoldi, the John L. Loeb Associate Professor of the Natural Sciences and coauthor of the paper.

Designing an actuator that replicates a complex input motion. (A) Analytical models of actuator segments that can extend, expand, twist, or bend are the first input to the design tool. (B) The second input to the design tool is the kinematics of the desired motion. (C) The design tool outputs the optimal segment lengths and fiber angles for replicating the input motion. (credit: Fionnuala Connolly et al./PNAS)

The method developed by the team uses mathematical modeling of fluid-powered, fiber-reinforced actuators that can produce a wide range of motions. It optimizes the actuator design for performing specific motions (kinematics), different geometries, material properties, and pressure required for extending, expanding, twisting, and bending.
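Conceptually, the design tool solves an inverse problem: given a desired motion, search for the segment parameters whose predicted motion best matches it. The toy sketch below illustrates that idea with scipy’s optimizer and an invented surrogate in place of the paper’s nonlinear-elasticity model; the parameters, bounds, and target trajectory are all placeholders.

```python
# Toy "trajectory matching" optimization (the forward model is a made-up surrogate).
import numpy as np
from scipy.optimize import minimize

target = np.linspace(0, np.pi / 2, 20)   # target bend angle vs. pressure (placeholder)

def predict_motion(params, pressures=np.linspace(0, 1, 20)):
    length, fiber_angle = params
    # Stand-in surrogate: bend angle grows with pressure, scaled by length and fiber angle.
    return length * np.sin(fiber_angle) * pressures * np.pi

def mismatch(params):
    return np.sum((predict_motion(params) - target) ** 2)

result = minimize(mismatch, x0=[1.0, 0.5], bounds=[(0.1, 2.0), (0.0, np.pi / 2)])
print("optimal segment length and fiber angle:", result.x)
```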

This soft actuator twists like a thumb when powered by a single pressure source (credit: Harvard SEAS)

The researchers from the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) and the Wyss Institute for Biologically Inspired Engineering tested the model by designing a soft robot that bends like an index finger and twists like a texting thumb when powered by a single pressure source.

The research was published this week in the journal Proceedings of the National Academy of Sciences. “Future work will also focus on developing a model that combines bending with other motions, to increase the versatility of the algorithm,” the authors note in the paper.

In a future robot design, the model could conceivably be integrated with Cornell University’s design for soft, stretchable optoelectronic sensors in fingers that detect shape and texture.

The new methodology will be included in the Soft Robotic Toolkit, an online, open-source resource developed at SEAS to help researchers, educators, and budding innovators design, fabricate, model, characterize, and control their own soft robots.


Abstract of Automatic design of fiber-reinforced soft actuators for trajectory matching

Soft actuators are the components responsible for producing motion in soft robots. Although soft actuators have allowed for a variety of innovative applications, there is a need for design tools that can help to efficiently and systematically design actuators for particular functions. Mathematical modeling of soft actuators is an area that is still in its infancy but has the potential to provide quantitative insights into the response of the actuators. These insights can be used to guide actuator design, thus accelerating the design process. Here, we study fluid-powered fiber-reinforced actuators, because these have previously been shown to be capable of producing a wide range of motions. We present a design strategy that takes a kinematic trajectory as its input and uses analytical modeling based on nonlinear elasticity and optimization to identify the optimal design parameters for an actuator that will follow this trajectory upon pressurization. We experimentally verify our modeling approach, and finally we demonstrate how the strategy works, by designing actuators that replicate the motion of the index finger and thumb.

A robotic hand with a human’s delicate sense of touch

A soft, sensitive robotic hand mounted on a robotic arm (credit: Cornell University)

Cornell University engineers have invented a new kind of robotic hand with a human’s delicate sense of touch.

Scenario: you lost part of your arm in a car accident. That artificial arm and hand you got from the hospital lets you feel and pick up things — even type on a keyboard. But not with the same sensitivity as a real hand. Now, an artificial prosthesis can even let you feel which one of three tomatoes is ripe (as shown in the video below).

The engineers’ trick was to use soft, stretchable optoelectronic (light + electronics) sensors in the fingers to detect shape and texture. (The sensors in existing prosthetic and robot hands use cruder tactile, or touch, sensors with bulky, rigid motors to measure strain.) The new prosthetic hand is a lot more sensitive. It can measure softness or hardness, how much the material stretches when touched, and how much force needs to be supplied to make the material deform.

How to make an (almost) human hand

A Cornell group led by Robert Shepherd, assistant professor of mechanical and aerospace engineering and principal investigator of Organic Robotics Lab, has published a paper describing this research in the debut edition of the journal Science Robotics (open access until Jan. 31).

Schematic of prosthetic hand structure and components (credit: Cornell University)

Unlike existing prosthetic and robot sensors, these sensors are integrated within the hand (instead of on the surface), so they can actually detect forces being transmitted through the thickness of the robot hand — simulating how a human hand feels. The more the prosthetic hand deforms (as it touches an object), the more light is lost. That variable loss of light, as detected by the photodiode, is what allows the prosthetic hand to “sense” its surroundings with high sensitivity and discrimination.*

This work was supported by a grant from Air Force Office of Scientific Research, and made use of the Cornell NanoScale Science and Technology Facility and the Cornell Center for Materials Research, both of which are supported by the National Science Foundation.

* The optoelectronic sensors are based on novel elastomeric optical waveguides, using a 3D-printed mold and a soft-lithography process to create a fluidically powered, stretchable material. Each high-precision waveguide incorporates an LED to generate light and a photodiode to measure the amount of light lost (which depends dynamically on the curvature, elongation, and force of the prosthetic hand).

To make the hand, Shepherd’s group used a four-step soft lithography process to produce the inside of the hand (core) and the cladding (outer surface of the waveguide), which also houses the LED (light-emitting diode) and the photodiode (light detector).


CNBC International | Scientists build a robotic hand with a soft touch | CNBC International


Abstract of Optoelectronically innervated soft prosthetic hand via stretchable optical waveguides

Because of their continuous and natural motion, fluidically powered soft actuators have shown potential in a range of robotic applications, including prosthetics and orthotics. Despite these advantages, robots using these actuators require stretchable sensors that can be embedded in their bodies for sophisticated functions. Presently, stretchable sensors usually rely on the electrical properties of materials and composites for measuring a signal; many of these sensors suffer from hysteresis, fabrication complexity, chemical safety and environmental instability, and material incompatibility with soft actuators. Many of these issues are solved if the optical properties of materials are used for signal transduction. We report the use of stretchable optical waveguides for strain sensing in a prosthetic hand. These optoelectronic strain sensors are easy to fabricate, are chemically inert, and demonstrate low hysteresis and high precision in their output signals. As a demonstration of their potential, the photonic strain sensors were used as curvature, elongation, and force sensors integrated into a fiber-reinforced soft prosthetic hand. The optoelectronically innervated prosthetic hand was used to conduct various active sensation experiments inspired by the capabilities of a real hand. Our final demonstration used the prosthesis to feel the shape and softness of three tomatoes and select the ripe one.

How to control a robotic arm with your mind — no implanted electrodes required

Research subjects at the University of Minnesota fitted with a specialized noninvasive EEG brain cap were able to move a robotic arm in three dimensions just by imagining moving their own arms (credit: University of Minnesota College of Science and Engineering)

Researchers at the University of Minnesota have achieved a “major breakthrough” that allows people to control a robotic arm in three dimensions, using only their minds. The research has the potential to help millions of people who are paralyzed or have neurodegenerative diseases.

The open-access study is published online today in Scientific Reports, a Nature research journal.


College of Science and Engineering, UMN | Noninvasive EEG-based control of a robotic arm for reach and grasp tasks

“This is the first time in the world that people can operate a robotic arm to reach and grasp objects in a complex 3D environment, using only their thoughts without a brain implant,” said Bin He, a University of Minnesota biomedical engineering professor and lead researcher on the study. “Just by imagining moving their arms, they were able to move the robotic arm.”

The noninvasive technique, based on a brain-computer interface (BCI) using electroencephalography (EEG), records weak electrical activity of the subjects’ brain through a specialized, high-tech EEG cap fitted with 64 electrodes. A computer then converts the “thoughts” into actions by advanced signal processing and machine learning.
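For a rough sense of what “advanced signal processing and machine learning” can look like in an EEG brain-computer interface, here is a deliberately simplified sketch, not the Minnesota group’s pipeline: band-power features in the 8-12 Hz “mu” rhythm from 64 channels feed a linear classifier that maps imagined movements to commands. The sampling rate, trial length, and data are placeholders.

```python
# Simplified motor-imagery decoding sketch (placeholder data and parameters).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

FS = 250           # assumed sampling rate, Hz
N_CHANNELS = 64    # electrodes in the EEG cap

def mu_band_power(trial):
    """Mean 8-12 Hz ('mu' rhythm) power per channel for one trial (channels x samples)."""
    spectrum = np.abs(np.fft.rfft(trial, axis=1)) ** 2
    freqs = np.fft.rfftfreq(trial.shape[1], d=1 / FS)
    band = (freqs >= 8) & (freqs <= 12)
    return spectrum[:, band].mean(axis=1)

# Placeholder training data: 100 two-second trials, labels 0 = imagine left arm, 1 = right arm.
rng = np.random.default_rng(1)
trials = rng.normal(size=(100, N_CHANNELS, 2 * FS))
labels = rng.integers(0, 2, size=100)

X = np.array([mu_band_power(t) for t in trials])
decoder = LinearDiscriminantAnalysis().fit(X, labels)

command = decoder.predict(mu_band_power(trials[0]).reshape(1, -1))[0]
print("decoded command:", "move left" if command == 0 else "move right")
```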

An example of an implanted electrode array, allowing for a patient to control a robot arm with her thoughts (credit: UPMC)

The system solves problems with previous BCI systems. Early efforts used invasive electrode arrays implanted in the cortex to control robotic arms, or the patient’s own arm using neuromuscular electrical stimulation. These early systems face the risk of post-surgery complications and infections and are difficult to keep working over time. They also limit broad use.

An EEG-based device that allows an amputee to grasp with a bionic hand, powered only by his thoughts (credit: University of Houston)

More recently, noninvasive EEG has been used. It doesn’t require risky, expensive surgery and it’s easy and fast to attach scalp electrodes. For example, in 2015, University of Houston researchers developed an EEG-based system that allows users to successfully grasp objects, including a water bottle and a credit card. The subject grasped the selected objects 80 percent of the time using a high-tech bionic hand fitted to the amputee’s stump.

Other EEG-based systems for patients have included ones capable of controlling a lower-limb exoskeleton and a thought-controlled robotic exoskeleton for the hand. However, these systems have not been suitable for multi-dimensional control of a robotic arm, allowing the patient to reach, grasp, and move an object in three-dimensional (3D) space.

Full 3D control of a robotic arm by just thinking — no implants

The new University of Minnesota EEG BCI system was developed to enable such natural, unimpeded movements of a robotic arm in 3D space, such as picking up a cup, moving it around on a table, and drinking from it — similar to the robot arm controlled by implanted electrodes shown in the photo above, in which a patient is served a chocolate bar.*

The new technology basically works in the same way as the robot system using implanted electrodes. It’s based on the motor cortex, the area of the brain that governs movement. When humans move, or think about a movement, neurons in the motor cortex produce tiny electric currents. Thinking about a different movement activates a new assortment of neurons, a phenomenon confirmed by cross-validation using functional MRI in He’s previous study. In the new study, the researchers sorted out the possible movements, using advanced signal processing.

User controls flight of a 3D virtual helicopter using brain waves (credit: Bin He/University of Minnesota)

This robotic-arm research builds upon He’s research published in 2011, in which subjects were able to fly a virtual quadcopter using noninvasive EEG technology, and later research allowing for flying a physical quadcopter.

The next step of He’s research will be to further develop this BCI technology, using a brain-controlled robotic prosthetic limb attached to a person’s body for patients who have had a stroke or are paralyzed.

The University of Minnesota study was funded by the National Science Foundation (NSF), the National Center for Complementary and Integrative Health, National Institute of Biomedical Imaging and Bioengineering, and National Institute of Neurological Disorders and Stroke of the National Institutes of Health (NIH), and the University of Minnesota’s MnDRIVE (Minnesota’s Discovery, Research and InnoVation Economy) Initiative funded by the Minnesota Legislature.

* Eight healthy human subjects completed the experimental sessions of the study wearing the EEG cap. Subjects gradually learned to imagine moving their own arms without actually moving them to control a robotic arm in 3D space. They started from learning to control a virtual cursor on computer screen and then learned to control a robotic arm to reach and grasp objects in fixed locations on a table. Eventually, they were able to move the robotic arm to reach and grasp objects in random locations on a table and move objects from the table to a three-layer shelf by only thinking about these movements.

All eight subjects could control a robotic arm to pick up objects in fixed locations with an average success rate above 80 percent and move objects from the table onto the shelf with an average success rate above 70 percent.


Abstract of Noninvasive Electroencephalogram Based Control of a Robotic Arm for Reach and Grasp Tasks

Brain-computer interface (BCI) technologies aim to provide a bridge between the human brain and external devices. Prior research using non-invasive BCI to control virtual objects, such as computer cursors and virtual helicopters, and real-world objects, such as wheelchairs and quadcopters, has demonstrated the promise of BCI technologies. However, controlling a robotic arm to complete reach-and-grasp tasks efficiently using non-invasive BCI has yet to be shown. In this study, we found that a group of 13 human subjects could willingly modulate brain activity to control a robotic arm with high accuracy for performing tasks requiring multiple degrees of freedom by combination of two sequential low dimensional controls. Subjects were able to effectively control reaching of the robotic arm through modulation of their brain rhythms within the span of only a few training sessions and maintained the ability to control the robotic arm over multiple months. Our results demonstrate the viability of human operation of prosthetic limbs using non-invasive BCI technology.

A machine-learning system that trains itself by surfing the web

MIT researchers have designed a new machine-learning system that can learn by itself to extract text information for statistical analysis when available data is scarce.

This new “information extraction” system turns machine learning on its head. It works like humans do. When we run out of data in a study (say, differentiating between fake and real news), we simply search the Internet for more data, and then we piece the new data together to make sense out of it all.

That differs from most machine-learning systems, which are fed as many training examples as possible to increase the chances that the system will be able to handle difficult problems by looking for patterns compared to training data.


Andrew Ng, Associate Professor of Computer Science at Stanford, Chief Scientist of Baidu, and Chairman and Co-founder of Coursera, is writing an introductory book, Machine Learning Yearning, intended to help readers build highly effective AI and machine learning systems. If you want to download a free draft copy of each chapter as it is finished (and previous chapters), you can sign up here for his mailing list. Ng is currently up to chapter 14.


“In information extraction, traditionally, in natural-language processing, you are given an article and you need to do whatever it takes to extract correctly from this article,” says Regina Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science and senior author of a new paper presented at the recent Association for Computational Linguistics’ Conference on Empirical Methods in Natural Language Processing.

“That’s very different from what you or I would do. When you’re reading an article that you can’t understand, you’re going to go on the web and find one that you can understand.”

Confidence boost from web data

Machine-learning systems determine whether they have enough data by assigning each of their classifications a confidence score — a measure of the statistical likelihood that the classification is correct, given the patterns discerned in the training data. If the score is too low, additional training data is required.

Fig. 1. Sample news article on a shooting case. Note how the article contains both the name of the shooter and the number of people killed, but both pieces of information require complex extraction schemes to make sense out of the information. (credit: Karthik Narasimhan et al.)

In the real world, that’s not always easy. For example, the researchers note in the paper that the example news article excerpt in Fig. 1, “does not explicitly mention the shooter (Scott Westerhuis), but instead refers to him as a suicide victim. Extraction of the number of fatally shot victims is similarly difficult as the system needs to infer that ‘A couple and four children’ means six people. Even a large annotated training set may not provide sufficient coverage to capture such challenging cases.”

Instead, with the researchers’ new information-extraction system, if the confidence score is too low, the system automatically generates a web search query designed to pull up texts likely to contain the data it’s trying to extract. It then attempts to extract the relevant data from one of the new texts and reconciles the results with those of its initial extraction. If the confidence score remains too low, it moves on to the next text pulled up by the search string, and so on.

The system learns how to generate search queries, gauge the likelihood that a new text is relevant to its extraction task, and determine the best strategy for fusing the results of multiple attempts at extraction.
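Schematically, the loop works like the toy Python sketch below. The extractor, the “web search,” and the reconciliation rule here are trivial placeholders invented for illustration; the real system learns those behaviors with reinforcement learning.

```python
# Toy confidence-driven extraction loop (placeholder extractor, search, and reconciliation).
import re

ARTICLES = {
    "shooting smith city": [
        "Police identified the shooter as John Smith after the incident.",
        "A suicide victim was found at the scene.",
    ],
}
CONFIDENCE_THRESHOLD = 0.9

def extract_shooter(text):
    """Toy extractor: look for 'identified the shooter as <Name>' and report a confidence."""
    m = re.search(r"identified the shooter as ([A-Z]\w+ [A-Z]\w+)", text)
    return (m.group(1), 0.95) if m else (None, 0.1)

def extract_with_queries(initial_text, query):
    value, conf = extract_shooter(initial_text)
    for candidate in ARTICLES.get(query, []):     # pull up related articles from the "web"
        if conf >= CONFIDENCE_THRESHOLD:
            break                                 # confident enough; stop searching
        new_value, new_conf = extract_shooter(candidate)
        if new_conf > conf:                       # trivial "reconciliation": keep the best guess
            value, conf = new_value, new_conf
    return value, conf

print(extract_with_queries("A suicide victim was found with a rifle.", "shooting smith city"))
```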

Testing the new information-extraction system

The MIT researchers say they tested their system with two information-extraction tasks. In each case, the system was trained on about 300 documents. One task was focused on collecting and analyzing data on mass shootings in the U.S. (an essential resource for any epidemiological study of the effects of gun-control measures).

“We collected data from the Gun Violence archive, a website tracking shootings in the United States,” the authors note. “The data contains a news article on each shooting and annotations for (1) the name of the shooter, (2) the number of people killed, (3) the number of people wounded, and (4) the city where the incident took place.”

The other task was the collection of similar data on instances of food contamination. The researchers used the Foodshield EMA database, “documenting adulteration incidents since 1980.” The researchers extracted food type, type of contaminant, and location.

For the mass-shootings task, the researchers asked the system to extract from websites (such as news articles, as in Fig. 1) the name of the shooter, the location of the shooting, the number of people wounded, and the number of people killed.

The goal was to find other documents that contain the information sought, expressed in a form that a basic extractor can “understand.”

Fig. 2. Two other articles on the same shooting case. The first article clearly mentions that six people were killed. The second one portrays the shooter in an easily extractable form. (credit: Karthik Narasimhan et al.)

For instance, Figure 2 shows two other articles describing the same event in which the entities of interest — the number of people killed and the name of the shooter — are expressed explicitly. That simplifies things.

From those articles, the system learned clusters of search terms that tended to be associated with the data items it was trying to extract. For instance, the names of mass shooters were correlated with terms like “police,” “identified,” “arrested,” and “charged.” During training, for each article the system was asked to analyze, it pulled up, on average, another nine or 10 news articles from the web.

The researchers then compared their system’s performance to that of several extractors trained using more conventional machine-learning techniques. For every data item extracted in both tasks, the new system outperformed its predecessors, usually by about 10 percent.

“The challenges … lie in (1) performing event coreference (retrieving suitable articles describing the same incident) and (2) reconciling the entities extracted from these different documents,” the authors note in the paper. “We address these challenges using a Reinforcement Learning (RL) approach that combines query formulation, extraction from new sources, and value reconciliation.”

UPDATE Dec. 8, 2016 — Added sources for test data.


Abstract of Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning

Most successful information extraction systems operate with access to a large collection of documents. In this work, we explore the task of acquiring and incorporating external evidence to improve extraction accuracy in domains where the amount of training data is scarce. This process entails issuing search queries, extraction from new sources and reconciliation of extracted values, which are repeated until sufficient evidence is collected. We approach the problem using a reinforcement learning framework where our model learns to select optimal actions based on contextual information. We employ a deep Q-network, trained to optimize a reward function that reflects extraction accuracy while penalizing extra effort. Our experiments on two databases – of shooting incidents, and food adulteration cases – demonstrate that our system significantly outperforms traditional extractors and a competitive meta-classifier baseline.