New ‘machine unlearning’ technique deletes unwanted data

The novel approach to making systems forget data is called “machine unlearning” by the two researchers who are pioneering the concept. Instead of making a model directly depend on each training data sample (left), they convert the learning algorithm into a summation form (right) – a process that is much easier and faster than retraining the system from scratch. (credit: Yinzhi Cao and Junfeng Yang)

Machine-learning systems are becoming ubiquitous, but what about false or damaging information that these systems have learned about you (and others)? Can that information ever be corrected? There are some weighty security and privacy questions here. Ever Google yourself?

Some background: machine-learning software programs calculate predictive relationships from massive amounts of data. The systems identify these predictive relationships using advanced algorithms — a set of rules for solving math problems — and “training data.” This data is then used to construct the models and features that enable a system to predict things, like the probability of rain next week or when the Zika virus will arrive in your town.

This intricate process means that a piece of raw data often goes through a series of computations in a system. The computations and information derived by the system from that data together form a complex propagation network called the data’s “lineage” (a term coined by Yinzhi Cao, a Lehigh University assistant professor of computer science and engineering, and his colleague, Junfeng Yang of Columbia University).

“Effective forgetting systems must be able to let users specify the data to forget with different levels of granularity,” said Cao. “These systems must remove the data and undo its effects so that all future operations run as if the data never existed.”

Widely used learning systems such as Google Search are, for the most part, only able to forget a user’s raw data upon request — not the data’s lineage (what the user’s data connects to). In October 2014, Google removed more than 170,000 links to comply with the European Union’s “right to be forgotten” ruling, which affirmed users’ right to control what appears when their names are searched. In July 2015, Google said it had received more than a quarter-million such requests.

How “machine unlearning” works

Now the two researchers say they have developed a way to forget faster and more effectively. Their concept, called “machine unlearning,” led to a four-year, $1.2 million National Science Foundation grant to develop the approach.

Building on work that was presented at a 2015 IEEE Symposium and then published, Cao and Yang’s “machine unlearning” method is based on the assumption that most learning systems can be converted into a form that can be updated incrementally without costly retraining from scratch.

Their approach introduces a layer of a small number of summations between the learning algorithm and the training data, eliminating their direct dependency on each other. That means the learning algorithm depends only on the summations, not on individual data samples.

Using this method, unlearning a piece of data and its lineage no longer requires rebuilding the models and features that predict relationships between pieces of data. Simply recomputing a small number of summations would remove the data and its lineage completely — and much faster than through retraining the system from scratch, the researchers claim.
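The summation idea can be illustrated with a toy model (a sketch of the general approach, not the researchers’ actual code): simple least-squares regression can be fit entirely from running sums, so forgetting a sample means subtracting its contribution from those sums rather than retraining on the remaining data.

```python
class SummationRegressor:
    """Toy 1-D least-squares model kept in 'summation form':
    the fitted line depends only on running sums, not on the
    individual training samples."""

    def __init__(self):
        # sufficient statistics: n, Σx, Σy, Σx², Σxy
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def learn(self, x, y):
        self.n += 1
        self.sx += x; self.sy += y
        self.sxx += x * x; self.sxy += x * y

    def unlearn(self, x, y):
        # forgetting a sample touches only the summations,
        # never the rest of the training set
        self.n -= 1
        self.sx -= x; self.sy -= y
        self.sxx -= x * x; self.sxy -= x * y

    def coefficients(self):
        # slope and intercept of the least-squares line
        denom = self.n * self.sxx - self.sx ** 2
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return slope, intercept

m = SummationRegressor()
for x, y in [(0, 1), (1, 3), (2, 5), (10, 99)]:  # last point is "polluted"
    m.learn(x, y)
m.unlearn(10, 99)          # forget the polluted sample in constant time
print(m.coefficients())    # → (2.0, 1.0), as if the sample never existed
```

Retraining from scratch on the three remaining points would produce exactly the same line; here the forgetting costs a handful of subtractions regardless of dataset size.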

Verification?

Cao and Yang tested their unlearning approach on four diverse, real-world systems: LensKit, an open-source recommendation system; Zozzle, a closed-source JavaScript malware detector; an open-source OSN spam filter; and PJScan, an open-source PDF malware detector.

Cao and Yang are now adapting the technique to other systems and creating verifiable machine unlearning to statistically test whether unlearning has indeed repaired a system or completely wiped out unwanted data.

“We foresee easy adoption of forgetting systems because they benefit both users and service providers,” they said. “With the flexibility to request that systems forget data, users have more control over their data, so they are more willing to share data with the systems.”

The researchers envision “forgetting systems playing a crucial role in emerging data markets where users trade data for money, services, or other data, because the mechanism of forgetting enables a user to cleanly cancel a data transaction or rent out the use rights of her data without giving up the ownership.”


editor’s comments: I’d like to see case studies and critical reviews of this software by independent security and privacy experts. Yes, I’m paranoid but… etc. Your suggestions? To be continued…


Abstract of Towards Making Systems Forget with Machine Unlearning

Today’s systems produce a rapidly exploding amount of data, and the data further derives more data, forming a complex data propagation network that we call the data’s lineage. There are many reasons that users want systems to forget certain data including its lineage. From a privacy perspective, users who become concerned with new privacy risks of a system often want the system to forget their data and lineage. From a security perspective, if an attacker pollutes an anomaly detector by injecting manually crafted data into the training data set, the detector must forget the injected data to regain security. From a usability perspective, a user can remove noise and incorrect entries so that a recommendation engine gives useful recommendations. Therefore, we envision forgetting systems, capable of forgetting certain data and their lineages, completely and quickly. This paper focuses on making learning systems forget, the process of which we call machine unlearning, or simply unlearning. We present a general, efficient unlearning approach by transforming learning algorithms used by a system into a summation form. To forget a training data sample, our approach simply updates a small number of summations — asymptotically faster than retraining from scratch. Our approach is general, because the summation form is from the statistical query learning in which many machine learning algorithms can be implemented. Our approach also applies to all stages of machine learning, including feature selection and modeling. Our evaluation, on four diverse learning systems and real-world workloads, shows that our approach is general, effective, fast, and easy to use.

Using machine learning to rationally design future electronics materials

A schematic diagram of machine learning for materials discovery (credit: Chiho Kim, Ramprasad Lab, UConn)

Replacing inefficient experimentation, UConn researchers have used machine learning to systematically scan millions of theoretical compounds for qualities that would make better materials for solar cells, fibers, and computer chips.

Led by UConn materials scientist Ramamurthy ‘Rampi’ Ramprasad, the researchers set out to determine which polymer atomic configurations make a given polymer a good electrical conductor or insulator, for example.

A polymer is a large molecule made of many repeating building blocks; the most familiar example is plastics. Polymers can have diverse electronic properties: they can be very good insulators or good conductors, for example. What controls all of these properties is mainly how the atoms in the polymer connect to each other.

But with at least 95 stable elements, the number of possible combinations is astronomical. So they pared down the problem to a manageable subset. Many polymers are made of building blocks containing just a few atoms. They look like this:

Polyurea, a common plastic. In this diagram, N is nitrogen, H hydrogen, and O oxygen. R stands in for any number of chemicals that could slightly alter the polymer, but the repeating NH-O-NH-O is the basic structure. Most polymers look like that, made of carbon (C), H, N and O, with a few other elements thrown in occasionally. (credit: Yikrazuul/public domain)

For their project, Ramprasad’s group looked at polymers made of just seven building blocks: CH2, C6H4, CO, O, NH, CS, and C4H2S. These are found in common plastics such as polyethylene, polyesters, and polyureas. An enormous variety of polymers could theoretically be constructed using just these building blocks; Ramprasad’s group decided at first to analyze just 283, each composed of a repeated four-block unit.

They started from basic quantum mechanics, and calculated the three-dimensional atomic and electronic structures of each of those 283 four-block polymers (calculating the position of every electron and atom in a molecule with more than two atoms takes a powerful computer a significant chunk of time, which is why they did it for only 283 molecules).

Calculating key electronic properties

(credit: UConn)

Once they had the three-dimensional structures, they could calculate what they really wanted to know: each polymer’s properties.

  1. Ramprasad’s group calculated the band gap, which is the amount of energy it takes for an electron in the polymer to break free of its home atom and travel around the material; and the dielectric constant, which is a measure of the effect an electric field can have on the polymer. These properties translate to how much electric energy each polymer can store in itself.
  2. They then defined each polymer as a string of numbers, a sort of numerical fingerprint. Since there are seven possible building blocks, there are seven possible numbers, each indicating how many of each block type are contained in that polymer.
  3. But a simple number string like that doesn’t give enough information about the polymer’s structure, so they added a second string of numbers that tell how many pairs there are of each combination of building blocks, such as NH-O or C6H4-CS.
  4. Then they added a third string that described how many triples, like NH-O-CH2, there were. They arranged these strings as a three-dimensional matrix, which is a convenient way to describe such strings of numbers in a computer.
  5. Then they let the computer go to work. Using the library of 283 polymers they had laboriously calculated using quantum mechanics, the machine compared each polymer’s numerical fingerprint to its band gap and dielectric constant, and gradually ‘learned’ which building block combinations were associated with which properties. It could even map those properties onto a two-dimensional matrix of the polymer building blocks.
  6. Once the machine learned which atomic building block combinations gave which properties, it could accurately evaluate the band gap and dielectric constant for any polymer made of any combination of those seven building blocks, using just the numerical fingerprint of its structure.
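The fingerprinting described in steps 2–4 can be sketched roughly as follows (the block names come from the article; the exact encoding used in the paper may differ):

```python
from itertools import product

# The seven building blocks named in the article
BLOCKS = ["CH2", "C6H4", "CO", "O", "NH", "CS", "C4H2S"]

def fingerprint(chain):
    """Count single blocks, adjacent pairs, and adjacent triples
    in a polymer's repeat unit, returning one flat number string."""
    singles = [chain.count(b) for b in BLOCKS]
    pairs = [sum(1 for i in range(len(chain) - 1)
                 if (chain[i], chain[i + 1]) == p)
             for p in product(BLOCKS, repeat=2)]
    triples = [sum(1 for i in range(len(chain) - 2)
                   if (chain[i], chain[i + 1], chain[i + 2]) == t)
               for t in product(BLOCKS, repeat=3)]
    return singles + pairs + triples

fp = fingerprint(["NH", "CO", "NH", "CH2"])  # one four-block repeat unit
print(len(fp))   # 7 singles + 49 pairs + 343 triples = 399 numbers
```

A learning algorithm can then be trained to map such fingerprints to the band gap and dielectric constant computed for the 283 reference polymers, and afterwards estimate those properties for any new combination of the seven blocks.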

Flow chart of the steps involved in the genetic algorithm (GA) approach, leading to direct design of polymers (credit: Arun Mannodi-Kanakkithodi et al/Scientific Reports)

Validating predictions

Many of the predictions of quantum mechanics and the machine learning scheme have been validated by Ramprasad’s UConn collaborators, who actually made several of the novel polymers and tested their properties.

The group published a paper on their polymer work in an open-access paper in Scientific Reports on Feb. 15; and another paper that utilizes machine learning in a different manner, namely, to discover laws that govern dielectric breakdown of insulators, will be published in a forthcoming issue of Chemistry of Materials.

You can see the predicted properties of every polymer Ramprasad’s group has evaluated in their online data vault, Khazana, which also provides their machine learning apps to predict polymer properties on the fly. They are also uploading data and the machine learning tools from their Chemistry of Materials work, and from an additional recent article published in Scientific Reports on Jan. 19 on predicting the band gap of perovskites, inorganic compounds used in solar cells, lasers, and light-emitting diodes.

Ramprasad’s work is aligned with a larger U.S. White House initiative called the Materials Genome Initiative. Much of Ramprasad’s work described here was funded by grants from the Office of Naval Research, as well as from the U.S. Department of Energy.


Abstract of Machine Learning Strategy for Accelerated Design of Polymer Dielectrics

The ability to efficiently design new and advanced dielectric polymers is hampered by the lack of sufficient, reliable data on wide polymer chemical spaces, and the difficulty of generating such data given time and computational/experimental constraints. Here, we address the issue of accelerating polymer dielectrics design by extracting learning models from data generated by accurate state-of-the-art first principles computations for polymers occupying an important part of the chemical subspace. The polymers are ‘fingerprinted’ as simple, easily attainable numerical representations, which are mapped to the properties of interest using a machine learning algorithm to develop an on-demand property prediction model. Further, a genetic algorithm is utilised to optimise polymer constituent blocks in an evolutionary manner, thus directly leading to the design of polymers with given target properties. While this philosophy of learning to make instant predictions and design is demonstrated here for the example of polymer dielectrics, it is equally applicable to other classes of materials as well.

Freaked out by robots? Recall a familiar robot movie.

Familiar robot movies (credits: Disney/Pixar, Columbia Pictures, 20th Century Fox, 20th Century Fox respectively)

Older adults who recalled more robots portrayed in films had lower anxiety toward robots than seniors who remembered fewer robot portrayals, Penn State researchers found in a study.

That could help elders accept robots as caregivers, said S. Shyam Sundar, Distinguished Professor of Communications and co-director of the Media Effects Research Laboratory.

“Increasingly, people are talking about smart homes and health care facilities and the roles robots could play to help the aging process,” said Sundar. “Robots could provide everything from simple reminders — when to take pills, for example — to fetching water and food for people with limited mobility.”

The more robot portrayals the study subjects could recall, regardless of the robots’ characteristics (even threatening ones, like the Terminator), the more positive their attitudes toward robots, and in turn the stronger their intentions to use one. People also reacted more positively to robots that looked more human-like and to ones that evoked more sympathy.

The most recalled robots included robots from Bicentennial Man, Forbidden Planet, Lost In Space, Star Wars, The Terminator, Transformers, Wall-E, and I, Robot.


Abstract of The Hollywood Robot Syndrome: Media Effects on Older Adults’ Attitudes toward Robots and Adoption Intentions

Do portrayals of robots in popular films influence older adults’ robot anxiety and adoption intentions? Informed by cultivation theory, disposition theory and the technology acceptance model, the current survey (N = 379) examined how past exposure to robots in the media affect older adults’ (Mage = 66) anxiety towards robots and their subsequent perceptions of robot usefulness, ease of use, and adoption intentions. The results of a structural equation model (SEM) analysis indicate that the higher the number of media portrayals recalled, the lower the anxiety towards robots. Furthermore, recalling robots with a human-like appearance or robots that elicit greater feelings of sympathy was related to more positive attitudes towards robots. Theoretical and practical implications of these results for the design of socially assistive robots for older adults are discussed.

AlphaGo machine-learning program defeats top Go player in first match

AlphaGo (via DeepMind’s Aja Huang) vs. Sedol in last minute of Match 1 (credit: DeepMind)

Google DeepMind’s machine-learning AlphaGo program has defeated South Korean Go champion Lee Sedol in the first match of five historic matches between human and AI, taking place in Seoul.

The second round will take place today (Wednesday March 9 in U.S.) at 11 PM ET (1 PM KST), also covered on YouTube.

Last October, AlphaGo defeated European Go champion Fan Hui 5-0, making it the first computer program to beat a professional Go player (see “Google machine-learning system is first to defeat professional Go player”).

IBM’s Deep Blue computer conquered chess in 1997 in a match with world chess champion Garry Kasparov, leaving Go as “the only game left above chess,” according to DeepMind CEO Demis Hassabis.

Google DeepMind is offering $1 million in prize money for the winner. If AlphaGo wins, Google will donate the prize money to UNICEF, STEM and Go charities.

“The Google AI win in Go is yet another hurdle jumped over by AI,” said Ray Kurzweil. “When a computer took the world chess championship in 1997, observers noted that chess was just a combinatorial logic game and that computers would never win at Go. Indeed, Go requires the more human-like capability of deeply understanding patterns, which AI is now mastering. Today, computers are doing many things that used to be the unique province of human intelligence, such as driving cars, identifying complex images and understanding natural language. But this is not an alien invasion of intelligent machines from Mars. Rather these are tools of our own creation designed to extend our own reach, physically and now mentally.”


Deep Mind | Match 1 – Google DeepMind Challenge Match: Lee Sedol vs AlphaGo

Deep learning helps robots perfect skills

BRETT folds a towel, using deep learning to distinguish corners and edges, as well as how wrinkly certain areas are, and matches what it sees to past experience to determine what to do next. (credit: Peg Skorpinski)

BRETT (Berkeley Robot for the Elimination of Tedious Tasks) has learned to improve its performance of household chores through deep learning and reinforcement learning, which provide moment-to-moment visual and sensory feedback to the software that controls the robot’s movements.

For the past 15 years, Berkeley robotics researcher Pieter Abbeel has been looking for ways to make robots learn. In 2010 he and his students programmed BRETT to pick up different-sized towels, figure out their shape, and neatly fold them.

The new use of deep-reinforcement learning strategy opens the way for training robots to carry out increasingly complex tasks. The research goal is to generalize from one task to another. A robot that could learn from experience would be far more versatile than one needing detailed, baked-in instructions for each new act.

Deep learning enables the robot to perceive its immediate environment, including the location and movement of its limbs. Reinforcement learning means improving at a task by trial and error. A robot with these two skills could refine its performance based on real-time feedback.
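The trial-and-error loop at the heart of reinforcement learning can be illustrated with a toy example (not BRETT’s actual algorithm, and with made-up “grip” actions): an agent tries different actions, tracks how well each one works, and gradually favors the best one.

```python
import random

# Toy trial-and-error learner: two hypothetical grip strategies with
# hidden success scores; the agent discovers the better one from feedback.

random.seed(0)
scores = {"grip_a": 0.2, "grip_b": 0.8}      # hidden task success scores
estimates = {a: 0.0 for a in scores}         # the agent's learned guesses
counts = {a: 0 for a in scores}

for step in range(1000):
    # explore 10% of the time, otherwise exploit the current best guess
    if random.random() < 0.1:
        action = random.choice(list(scores))
    else:
        action = max(estimates, key=estimates.get)
    reward = scores[action]                  # feedback from the attempt
    counts[action] += 1
    # running average: the estimate improves with every trial
    estimates[action] += (reward - estimates[action]) / counts[action]

print(max(estimates, key=estimates.get))     # the better grip wins out
```

BRETT combines this kind of reward-driven improvement with deep networks that turn raw camera and joint-sensor data into the state the learner acts on.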

Applications for such a skilled robot might range from helping humans with tedious housekeeping chores to assisting in highly detailed surgery. In fact, Abbeel says, “Robots might even be able to teach other robots.” Or humans?


UC Berkeley | Autonomous robot doing laundry


Cal ESG | Faculty Talks 2

Stretchable electronics that can quadruple in length

Stretchable biphasic gold–gallium thin films. Scale bar: 5 mm; Inset scale bar: 500 micrometers. (credit: Arthur Hirsch et al./Advanced Materials)

EPFL researchers have developed films with conductive tracks just a few hundred nanometers thick that can be bent and stretched up to four times their original length. They could be used in artificial skin, connected clothing, and on-body sensors.

Instead of being printed on a rigid board, the tracks are almost as flexible as rubber and can be stretched up to four times their original length, in all directions, a million times without cracking or interrupting their conductivity. The material could be used to make circuits that can be twisted and stretched — ideal for artificial skin on prosthetics or robotic machines.

It could also be integrated into fabric and used in connected clothing. And because it follows the shape and movements of the human body, it could be used for sensors designed to monitor particular biological functions.

The films use gallium, which provides a liquid state at room temperature, alloyed with gold, which ensures the gallium remains homogeneous and prevents it from separating into droplets. The invention is described in an article published today in the journal Advanced Materials.


École polytechnique fédérale de Lausanne (EPFL) | Stretchable electronics that quadruple in length


Abstract of Intrinsically Stretchable Biphasic (Solid–Liquid) Thin Metal Films

Stretchable biphasic conductors are formed by physical vapor deposition of gallium onto an alloying metal film. The properties of the photolithography-compatible thin metal films are highlighted by low sheet resistance (0.5 Ω sq−1) and large stretchability (400%). This novel approach to deposit and pattern liquid metals enables extremely robust, multilayer and soft circuits, sensors, and actuators.

‘Fingerprinting’ and neural nets could help protect power grid, other industrial systems

Electrical substation (credit: Fitrah Hamid, Georgia Tech)

Georgia Tech researchers have developed a device fingerprinting technique that could improve the security of the electrical grid and other industrial systems.

“The stakes are extremely high; the systems are very different from home or office computer networks,” said Raheem Beyah, an associate professor in the School of Electrical and Computer Engineering at the Georgia Institute of Technology.

The networked systems controlling the U.S. electrical grid and other industrial systems, carried out over supervisory control and data acquisition (SCADA) protocols, often lack the ability to run modern encryption and authentication systems. The legacy systems connected to them were never designed for networked security, Beyah said. Because they are distributed around the country, often in remote areas, the systems are also difficult to update using the “patching” techniques common in computer networks.

Fingerprinting to detect false data or commands

Points of attack in a power substation network (credit: David Formby et al./Network and Distributed System Security Symposium)

That is why Beyah and his team have developed “fingerprinting techniques” to protect power-grid operations by preventing or minimizing the spoofing of packets that could inject false data or false control commands into the system. “This is the first technique that can passively fingerprint different devices that are part of critical infrastructure networks,” he said. “We believe it can be used to significantly improve the security of the grid and other networks.”

For instance, control devices used in the power grid produce signals that are distinctive because of their unique physical configurations and compositions. Security devices listening to signals traversing the grid’s control systems can differentiate between these legitimate devices and signals produced by equipment that’s not part of the system.

Devices such as circuit breakers and electrical protection systems can also be told to open or close remotely, and they then report on the actions they’ve taken. The time required to open a breaker or a valve is determined by the physical properties of the device. If an acknowledgement arrives too soon after the command is issued — less time than it would take for a breaker or valve to open, for instance — the security system could suspect spoofing, Beyah explained.
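A physics-based timing check of that kind might look something like this simplified sketch (the device timings here are illustrative, not real grid measurements):

```python
# Expected physical operation times, in seconds, per device type.
# Real fingerprints would be measured from the actual equipment.
EXPECTED_OPEN_TIME = {"breaker": 0.040, "valve": 2.5}

def looks_spoofed(device_type, ack_delay, tolerance=0.5):
    """Flag an acknowledgement that arrives sooner than the device
    could physically have completed the commanded action."""
    expected = EXPECTED_OPEN_TIME[device_type]
    return ack_delay < expected * tolerance

print(looks_spoofed("breaker", 0.002))   # True: answered impossibly fast
print(looks_spoofed("breaker", 0.041))   # False: plausible timing
```

An attacker forging acknowledgements from a compromised workstation has no easy way to know, or reproduce, the mechanical latency of the physical device, which is what makes this signal hard to spoof.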

To develop the device fingerprints, the researchers built computer models of utility grid devices to understand how they operate. Information to build the models came from “black box” techniques — watching the information that goes into and out of the system — and from “white box” techniques that use schematics or physical access to the systems. The resulting fingerprints are unique signatures that indicate the identity of a specific device, its device type, or its associated actions.

The researchers used supervised learning when a list of IP addresses and corresponding device types was available, and unsupervised learning when it was not, with performance nearly as high as the supervised methods.
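As a rough, hypothetical illustration of the two settings (with entirely made-up response-time features, not the researchers’ data or methods): labels permit a simple nearest-centroid classifier, while without labels the same timings can still be separated by clustering.

```python
# Fictitious mean response times (ms) for six devices on a network
samples = [0.9, 1.1, 1.0, 5.2, 4.8, 5.0]
labels  = ["relay", "relay", "relay", "PLC", "PLC", "PLC"]

# Supervised: with labels, assign a new timing to the nearest
# labelled centroid.
def centroid(cls):
    pts = [s for s, l in zip(samples, labels) if l == cls]
    return sum(pts) / len(pts)

centroids = {c: centroid(c) for c in set(labels)}

def classify(t):
    return min(centroids, key=lambda c: abs(t - centroids[c]))

# Unsupervised: without labels, two-means clustering still splits
# the same data into fast and slow device groups.
a, b = min(samples), max(samples)
for _ in range(10):
    ga = [s for s in samples if abs(s - a) <= abs(s - b)]
    gb = [s for s in samples if abs(s - a) > abs(s - b)]
    a, b = sum(ga) / len(ga), sum(gb) / len(gb)

print(classify(1.05))                       # "relay"
print(sorted([round(a, 2), round(b, 2)]))   # clusters near 1.0 and 5.0
```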

The researchers have demonstrated the technique on two electrical substations, and plan to continue refining it until it becomes close to 100 percent accurate. Their current technique addresses the protocol used for more than half of the devices on the electrical grid, and future work will include examining application of the method to other protocols.

Other vulnerable systems

Beyah believes the approach could have broad application to securing industrial control systems used in manufacturing, oil and gas refining, wastewater treatment, and other industries that use devices with measurable physical properties. Beyond industrial controls, the principle could also apply to the Internet of Things (IoT), where the devices being controlled have specific signatures related to switching them on and off.

“All of these IoT devices will be doing physical things, such as turning your air-conditioning on or off,” Beyah said. “There will be a physical action occurring, which is similar to what we have studied with valves and actuators.”

The research, reported February 23 at the Network and Distributed System Security Symposium in San Diego, was supported in part by the National Science Foundation (NSF).


Abstract of Who’s in Control of Your Control System? Device Fingerprinting for Cyber-Physical Systems

Industrial control system (ICS) networks used in critical infrastructures such as the power grid present a unique set of security challenges. The distributed networks are difficult to physically secure, legacy equipment can make cryptography and regular patches virtually impossible, and compromises can result in catastrophic physical damage. To address these concerns, this research proposes two device type fingerprinting methods designed to augment existing intrusion detection methods in the ICS environment. The first method measures data response processing times and takes advantage of the static and low-latency nature of dedicated ICS networks to develop accurate fingerprints, while the second method uses the physical operation times to develop a unique signature for each device type. Additionally, the physical fingerprinting method is extended to develop a completely new class of fingerprint generation that requires neither prior access to the network nor an example target device. Fingerprint classification accuracy is evaluated using a combination of a real world five month dataset from a live power substation and controlled lab experiments. Finally, simple forgery attempts are launched against the methods to investigate their strength under attack.

Should you trust a robot in emergencies?

Would you follow a broken-down emergency guide robot in a mock fire? (credit: Georgia Tech)

In a finding reminiscent of the bizarre Stanford prison experiment, subjects in an experiment blindly followed a robot in a mock building-fire emergency — even when it led them into a dark room full of furniture and they were told the robot had broken down.

The research was designed to determine whether or not building occupants would trust a robot designed to help them evacuate a high-rise in case of fire or other emergency, said Alan Wagner, a senior research engineer in the Georgia Tech Research Institute (GTRI).

In the study, sponsored in part by the Air Force Office of Scientific Research (AFOSR), the researchers recruited a group of 42 volunteers, most of them college students, and asked them to follow a brightly colored robot that had the words “Emergency Guide Robot” on its side.

Blind obedience to a robot authority figure

Georgia Tech Research Institute (GTRI) Research Engineer Paul Robinette adjusts the arms of the “Rescue Robot,” built to study issues of trust between humans and robots. (credit: Rob Felt, Georgia Tech)

In some cases, the robot (controlled by a hidden researcher, brightly lit with red LEDs, and fitted with white “arms” that served as pointers) led the volunteers into the wrong room and traveled around in a circle twice before entering the conference room.

For several test subjects, the robot stopped moving, and an experimenter told the subjects that the robot had broken down. Once the subjects were in the conference room with the door closed, the hallway through which the participants had entered the building was filled with artificial smoke, which set off a smoke alarm.

When the test subjects opened the conference room door, they saw the smoke and the robot directed them to an exit in the back of the building instead of toward a doorway marked with exit signs that had been used to enter the building.

The researchers surmise that in the scenario they studied, the robot may have become an “authority figure” that the test subjects were more likely to trust in the time pressure of an emergency. In simulation-based research done without a realistic emergency scenario, test subjects did not trust a robot that had previously made mistakes.

Only when the robot made obvious errors during the emergency part of the experiment did the participants question its directions. However, some subjects still followed the robot’s instructions — even when it directed them toward a darkened room that was blocked by furniture.

In future research, the scientists hope to learn more about why the test subjects trusted the robot, whether that response differs by education level or demographics, and how the robots themselves might indicate the level of trust that should be given to them.

How to prevent humans from trusting robots too much

The research is part of a long-term study of how humans trust robots, an important issue as robots play a greater role in society. The researchers envision using groups of robots stationed in high-rise buildings to point occupants toward exits and urge them to evacuate during emergencies. Research has shown that people often don’t leave buildings when fire alarms sound, and that they sometimes ignore nearby emergency exits in favor of more familiar building entrances.

But in light of these findings, the researchers are reconsidering the questions they should ask. “A more important question now might be to ask how to prevent them from trusting these robots too much.”

But there are other issues of trust in human-robot relationships that the researchers want to explore, the researchers say: “Would people trust a hamburger-making robot to provide them with food? If a robot carried a sign saying it was a ‘child-care robot,’ would people leave their babies with it? Will people put their children into an autonomous vehicle and trust it to take them to grandma’s house? We don’t know why people trust or don’t trust machines.”

The research, believed to be the first to study human-robot trust in an emergency situation, is scheduled to be presented March 9 at the 2016 ACM/IEEE International Conference on Human-Robot Interaction (HRI 2016) in Christchurch, New Zealand.


Georgia Tech | In Emergencies, Should You Trust a Robot?


Abstract of Overtrust of Robots in Emergency Evacuation Scenarios

Robots have the potential to save lives in emergency scenarios, but could have an equally disastrous effect if participants overtrust them. To explore this concept, we performed an experiment where a participant interacts with a robot in a non-emergency task to experience its behavior and then chooses whether to follow the robot’s instructions in an emergency or not. Artificial smoke and fire alarms were used to add a sense of urgency. To our surprise, all 26 participants followed the robot in the emergency, despite half observing the same robot perform poorly in a navigation guidance task just minutes before. We performed additional exploratory studies investigating different failure modes. Even when the robot pointed to a dark room with no discernible exit the majority of people did not choose to safely exit the way they entered.

How predicting Shakespeare’s writing could improve our understanding of natural language

Google used the writings of 1,000 authors to train a deep neural network to predict writing patterns (credit: Martin Droeshout/Wikimedia Commons)

A Google natural language understanding research group led by Ray Kurzweil is building software systems that can understand natural language at a human level. The goal is to understand and interpret meanings of spoken or written language.

One key to achieving that understanding is establishing context, suggest researchers Chris Tar; Marc Pickett, PhD; and Brian Strope, PhD, on the Google Research Blog.

For example, take the phrase, “Great, ice cream for dinner!” If a six-year-old says it, it means something very different than it does when a parent says it, they point out.

That is, knowing the characteristics of the speaker (or writer) can narrow down the set of possible meanings of a phrase.

Similarly, the researchers suggested, a deep neural network (DNN) that takes into account the specific author’s style and “personality” should be able to predict (with higher accuracy than with a random guess) the next sentence an author would be likely to write in a book.

To test that idea, the researchers imported the text of 1,000 different authors from the Project Gutenberg website.

“The information our system derived is presumably representative of the author’s word choice, thinking, and style,” say the researchers. “We call these ‘Author vectors.’”
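Google has not published the details of how these author vectors are computed, but the underlying intuition — that an author’s word-choice profile can be turned into a vector and compared against other authors — can be sketched with a crude stand-in. The tiny corpora, the frequency-profile “author vector,” and the attribution step below are all illustrative assumptions, not the DNN-learned representations the researchers describe:

```python
from collections import Counter
import math

# Hypothetical miniature corpora; in the study, the texts came from
# works by 1,000 authors on the Project Gutenberg website.
corpora = {
    "shakespeare": "thou art more lovely and more temperate thou art fair",
    "austen": "it is a truth universally acknowledged that a single man",
    "unknown": "thou art fair and lovely",
}

def author_vector(text):
    """A crude stand-in 'author vector': a normalized word-frequency profile."""
    counts = Counter(text.split())
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

vectors = {name: author_vector(text) for name, text in corpora.items()}

# Attribute the unknown sample to the known author with the most
# similar word-choice profile.
best = max(("shakespeare", "austen"),
           key=lambda a: cosine(vectors["unknown"], vectors[a]))
print(best)  # -> shakespeare (its word profile overlaps the sample heavily)
```

A real system conditions a deep neural network on such vectors to predict likely next sentences, rather than doing simple nearest-neighbor attribution, but the same idea — author identity narrows the space of likely text — is at work.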

A section of a t-SNE visualization of the “author vectors” for some of the authors in the study. Note that contemporaries and influencers tend to be near each other (e.g., Marlowe and Shakespeare vs. Milton and Pope). (credit: Google/Christopher Olah)
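The figure’s 2-D layout comes from t-SNE, a standard technique for projecting high-dimensional vectors onto a plane while keeping similar points close together. A minimal sketch with scikit-learn, using random toy vectors in place of the real DNN-learned author vectors (the dimensions and author count here are arbitrary assumptions):

```python
import numpy as np
from sklearn.manifold import TSNE

# Toy stand-ins for learned author vectors: 8 hypothetical authors,
# each represented by a 50-dimensional vector.
rng = np.random.default_rng(0)
author_vecs = rng.normal(size=(8, 50))

# t-SNE maps the 50-D vectors to 2-D, trying to keep nearby points
# nearby -- which is how clusters like Marlowe/Shakespeare emerge
# in the published figure. Perplexity must be below the sample count.
embedded = TSNE(n_components=2, perplexity=3, random_state=0).fit_transform(author_vecs)
print(embedded.shape)  # (8, 2) -- one 2-D point per author
```

The resulting (x, y) points would then be plotted and labeled with author names to reproduce a figure of this kind.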

Essentially, the system is saying, “I’ve been told that this is Shakespeare, who tends to write like this, so I’ll take that into account when weighing which sentence is more likely to follow.” In effect, one can chat with a statistical representation of text written by Shakespeare, the researchers note.

(Or in the future, suggest completions to the unfinished works of Philip K. Dick?)

The system could enrich Google products through personalization, the researchers suggest. “For example, it could help provide more personalized response options for the recently introduced Smart Reply feature in Inbox by Gmail” (a system that could automatically determine if an email was answerable with a short reply, and compose a few suitable responses that a user could edit or send with just a tap).

IBM Watson AI XPRIZE announced at TED

(credit: IBM)

The IBM Watson AI XPRIZE, a Cognitive Computing Competition, was announced on the TED stage today (Feb. 17) by XPRIZE Foundation chairman Peter Diamandis and IBM Watson general manager David Kenny.

It’s a $5 million competition challenging teams from around the world to develop and demonstrate how humans can collaborate with powerful cognitive technologies to tackle some of the world’s grand challenges.

According to IBM, “the goal is to bring together a broad group of creative minds, which is why we’ve teamed up with XPRIZE, the leader in incentivized prize competitions that push the limits of possibility to change the world for the better, and the TED community, to make it happen.

“For each year between 2017 and 2019, we’ll challenge teams to share their ideas publicly and go head-to-head at our annual World of Watson conference with a chance to advance through the competition and win milestone prizes.”

Unlike in previous XPRIZE competitions, teams will come up with their own ideas for challenges in the broad field of “AI.” Ideas will be evaluated by a panel of expert judges for technical validity, and the TED and XPRIZE communities will choose the winner based on “the audacity of their mission and the awe-inspiring nature of the teams’ TED Talks in 2020,” says IBM.

IBM believes this competition can accelerate the creation of “landmark breakthroughs that deliver new, positive impacts to people’s lives, and the transformation of industries and professions.”

“Typical of all XPRIZE competitions, the IBM Watson AI XPRIZE will crowdsource solutions from some of the most brilliant thinkers and entrepreneurs around the world, creating true exponential impact,” according to the XPRIZE Foundation.

Complete rules and guidelines will be made available in May. Pre-register here.