
A small start-up in Oxford, England, says it has made an important breakthrough in artificial intelligence safety that could make self-driving cars, robots and other artificial intelligence-based products more reliable and widely available.
Aligned AI, a one-year-old company, says it has developed a new algorithm that allows artificial intelligence systems to form more complex associations, ones closer to human concepts. If the work is confirmed in real-world testing, it could overcome a common problem with today's AI systems: they often draw spurious correlations from the data they are trained on, with potentially catastrophic consequences outside the lab.
The dangers of such incorrect correlations (or "misgeneralization," in artificial intelligence terms) were tragically highlighted in 2018, when an Uber self-driving car struck and killed a woman crossing a street in Arizona. The training data Uber provided to the car's AI software depicted pedestrians only at crosswalks. So while Uber's engineers thought the software had learned to detect pedestrians, what it had actually learned was to detect crosswalks. When the car encountered a woman crossing the street away from a crosswalk, it did not recognize her as a pedestrian and ran into her.
Aligned AI co-founder and CEO Rebecca Gorman said the company's algorithm, called ACE (for "algorithm for concept extrapolation"), is better able to avoid making such spurious connections.
Gorman told Fortune that she sees potential uses for the new algorithm in areas such as robotics. Ideally, a robot could learn to pick up a cup in a simulator and then generalize that knowledge to picking up cups of different sizes and shapes, in different environments and lighting conditions, without retraining. Ideally, the robot would also know how to operate safely around humans, without needing to be confined to a cage the way many of today's industrial robots are.
“We need to find ways to allow AI that operates without constant human oversight to still operate in a safe manner,” she said. ACE could also be useful for content moderation on social media or online forums, she said; it has previously performed well in tests of detecting toxic language.
AI scores big at a video game similar to Sonic the Hedgehog
To demonstrate ACE's capabilities, Aligned AI set it loose on a simple video game called CoinRun.
CoinRun is a simplified version of games like Sonic the Hedgehog, but AI developers use it as a challenging benchmark for evaluating how well models overcome the tendency to learn spurious correlations. The player (in this case, an AI agent) must navigate a maze filled with obstacles and hazards, avoiding monsters while searching for a gold coin, and escape to the next level of the game.
CoinRun was created by researchers at OpenAI in 2018 as a simple environment for testing how well different AI agents generalize to new scenarios. The game can present an agent with a virtually unlimited number of levels, in which the exact configuration of challenges the agent must overcome (the locations of obstacles, pits, and monsters) constantly changes.
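CoinRun is open source, now distributed as part of OpenAI's Procgen Benchmark, so readers can try this level generation themselves. The sketch below, which assumes the `procgen` Python package is installed (`pip install procgen`), shows how the environment serves up an effectively endless stream of level layouts:

```python
# A minimal sketch of sampling procedurally generated CoinRun levels,
# using OpenAI's open-source `procgen` package (pip install procgen).
import gym

# num_levels=0 asks the generator for an unbounded supply of level layouts,
# so an agent virtually never sees the same configuration twice.
env = gym.make("procgen:procgen-coinrun-v0", num_levels=0, start_level=0)

obs = env.reset()
for _ in range(1000):
    # Random actions stand in for a trained policy in this sketch.
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()  # a fresh, differently configured level
env.close()
```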
But in 2021, researchers from Google DeepMind and several British and European universities realized that CoinRun could actually be used to test whether an agent has "misgeneralized," that is, learned spurious correlations. That is because in the original version of CoinRun, the agent always spawns in the upper left corner of the screen, and the coin always appears in the lower right corner, where the agent can exit to the next level. As a result, agents learn to always head for the bottom right corner. In fact, if the coin is placed elsewhere, an agent will usually ignore it and go to the lower right corner anyway. In other words, the original CoinRun was supposed to train agents to find coins, but instead trained agents to find the lower right corner. (A toy illustration of this failure appears below.)
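The failure mode is easy to reproduce in miniature. The toy gridworld below is an illustrative stand-in, not the DeepMind experiment itself: a policy that effectively learned "go to the bottom right corner" collects the coin perfectly as long as the coin sits in that corner, and fails as soon as the coin moves.

```python
# Illustrative toy example of goal misgeneralization: a policy that in
# effect learned "reach the bottom-right corner" rather than "reach the
# coin". A hypothetical stand-in for CoinRun, not the actual benchmark.

GRID = 5  # a 5x5 gridworld

def corner_seeking_policy(agent, coin):
    """What the agent actually learned: head for the bottom-right
    corner, ignoring where the coin is."""
    target = (GRID - 1, GRID - 1)
    dr = (target[0] > agent[0]) - (target[0] < agent[0])
    dc = (target[1] > agent[1]) - (target[1] < agent[1])
    return dr, dc

def run_episode(coin, policy, max_steps=20):
    agent = (0, 0)  # the agent always spawns in the top-left corner
    for _ in range(max_steps):
        if agent == coin:
            return True  # coin collected
        dr, dc = policy(agent, coin)
        agent = (agent[0] + dr, agent[1] + dc)
    return agent == coin

# Training-like condition: coin in the bottom-right corner -> success.
print(run_episode((GRID - 1, GRID - 1), corner_seeking_policy))  # True

# Test condition: coin moved elsewhere; the corner-seeker walks past it.
print(run_episode((2, 0), corner_seeking_policy))  # False
```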
It is in fact very difficult to prevent an agent from misgeneralizing in this way, especially when the agent can no longer receive new reward signals and can only follow the strategy it developed during training. Under those conditions, the previous best AI software obtained the coin only 59% of the time, only about 4% better than an agent that simply takes random actions. Agents trained with ACE, by contrast, got the coin 72% of the time. The researchers showed that the ACE agent now looks for the coin rather than heading straight for the lower right corner. It can even recognize situations where it can race to grab the coin and advance to the next level before being eaten by an approaching monster, whereas in the same situation the standard agent remains stuck in the left corner, too afraid of the monster to advance, because it thinks the goal of the game is to reach the bottom right corner of the screen, not to get the coin.
ACE works by noticing differences between the training data and new data (in this case, the position of the coin). It then forms two hypotheses about what its true goal might be based on those differences: one is the original goal learned in training (go to the bottom right corner), the other a different goal (get the coin). It then tests which hypothesis best explains the new data, and repeats the process until it finds a goal that fits the differences it observes in the data.
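Aligned AI has not published ACE's internals, so the sketch below is only a loose illustration of the hypothesis-testing loop described above, with all names, goal functions, and observations invented for the example: keep several candidate goals, score each against newly observed episodes, and retain whichever best explains the data.

```python
# Hypothetical sketch of the hypothesis-testing loop described above:
# keep candidate goals, score each against new observations, keep the
# best fit. An illustration of the idea, not Aligned AI's actual code.

def corner_goal(state):
    """Hypothesis 1 (learned in training): reward comes from reaching
    the bottom-right corner of a 5x5 grid."""
    agent, coin = state
    return agent == (4, 4)

def coin_goal(state):
    """Hypothesis 2 (alternative): reward comes from touching the coin."""
    agent, coin = state
    return agent == coin

# Newly observed (state, reward) pairs from levels where the coin moved.
# Each state is (agent_position, coin_position).
observations = [
    (((2, 0), (2, 0)), 1.0),  # reward given: agent on the coin, off-corner
    (((4, 4), (2, 0)), 0.0),  # no reward: agent in corner, coin elsewhere
    (((1, 3), (1, 3)), 1.0),  # reward given: agent on the coin again
]

def score(hypothesis, observations):
    """Fraction of observations the hypothesized goal explains."""
    hits = sum(1 for state, reward in observations
               if hypothesis(state) == (reward > 0))
    return hits / len(observations)

candidates = {"go to bottom-right corner": corner_goal,
              "collect the coin": coin_goal}
best = max(candidates, key=lambda name: score(candidates[name], observations))
print(best)  # "collect the coin" explains the new data best
```

In a real system, this kind of comparison would presumably be made over learned features rather than hand-written goal functions, but the underlying logic of pitting candidate goals against fresh evidence is the same.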
In the CoinRun benchmark, the ACE agent needed to see 50 examples of coins in different locations before learning that the correct goal was to get the coin rather than to go to the bottom right corner. But Stuart Armstrong, co-founder and chief technology officer of Aligned AI, said he has seen good progress with even half that number of examples, and the company aims to get the number down to what is known in AI as "zero-shot" learning, in which the system finds the correct goal the first time it encounters data that differs from its training examples. That is the sort of capability that would have been needed to save the woman killed by the Uber self-driving car.
Gorman said Aligned AI is currently seeking a first round of funding, and patent applications for ACE are pending.
Armstrong said ACE could also help improve the explainability of AI systems, because the people building a system can see what the software thinks its goals are. In the future, it may even be possible to combine something like ACE with a language model (such as the one that powers ChatGPT) to let the algorithm express its goal in natural language.