Will A.I. Soon Outsmart Humans? Play This Puzzle to Find Out.

In 2019, an artificial intelligence researcher, François Chollet, designed a puzzle game that was meant to be easy for humans but hard for machines.
The game, called ARC, has become an important way for experts to track the progress of artificial intelligence and to push back against the narrative that scientists are on the verge of building A.I. technology that will outsmart humanity.
Mr. Chollet’s colorful puzzles test the ability to quickly identify visual patterns based on just a few examples. To play, look closely at the examples and try to find the pattern.
Each example uses the pattern to transform a grid of colored squares into a new grid of colored squares:
The pattern is the same for every example.
Now, fill in the new grid by applying the pattern you learned in the examples above.
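For readers who think in code, the structure of such a puzzle can be sketched roughly as follows. This is only an illustration: the grids, the three candidate rules and the function names (`find_rule`, `solve`) are made up for this sketch and are not taken from Mr. Chollet’s benchmark, and trying a short list of fixed rules is not how people, or systems like ChatGPT, actually approach the test. It simply shows the idea of finding one rule that explains every example and then applying it to a new grid.

```python
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]            # each integer stands for one color
Example = Tuple[Grid, Grid]       # (input grid, output grid)


def flip_horizontal(grid: Grid) -> Grid:
    """Mirror every row left to right."""
    return [list(reversed(row)) for row in grid]


def flip_vertical(grid: Grid) -> Grid:
    """Reverse the order of the rows."""
    return [list(row) for row in reversed(grid)]


def transpose(grid: Grid) -> Grid:
    """Swap rows and columns."""
    return [list(col) for col in zip(*grid)]


# Hypothetical, deliberately tiny set of rules to try.
CANDIDATE_RULES: Dict[str, Callable[[Grid], Grid]] = {
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "transpose": transpose,
}


def find_rule(examples: List[Example]) -> Optional[str]:
    """Return the first rule that explains every example, or None."""
    for name, rule in CANDIDATE_RULES.items():
        if all(rule(inp) == out for inp, out in examples):
            return name
    return None


def solve(examples: List[Example], test_input: Grid) -> Optional[Grid]:
    """Apply the rule learned from the examples to a new grid."""
    name = find_rule(examples)
    return None if name is None else CANDIDATE_RULES[name](test_input)


if __name__ == "__main__":
    examples = [
        ([[1, 0], [2, 0]], [[0, 1], [0, 2]]),
        ([[3, 3, 0], [0, 4, 0]], [[0, 3, 3], [0, 4, 0]]),
    ]
    print(find_rule(examples))
    print(solve(examples, [[5, 0, 0], [0, 6, 0]]))
```

The real puzzles are hard for machines precisely because the rule is not drawn from a short, known list like this one; it has to be worked out fresh from a handful of examples.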
For years, these puzzles proved to be nearly impossible for artificial intelligence, including chatbots like ChatGPT.
A.I. systems typically learned their skills by analyzing enormous amounts of data from across the internet. That meant they could generate sentences by repeating concepts they had seen a thousand times before. But they could not necessarily solve new logic puzzles after seeing only a few examples.
That is, until recently. In December, OpenAI said that its latest A.I. system, called OpenAI o3, had surpassed human performance on Mr. Chollet’s test. Unlike the original version of ChatGPT, o3 was able to spend time considering several possibilities before responding.
Some saw it as proof that A.I. systems were approaching artificial general intelligence, or A.G.I., which describes a machine that is as smart as a human. Mr. Chollet had created his puzzles as a way of showing that machines were still a long way from this ambitious goal.
But the news also exposed the weaknesses in benchmark tests like ARC, short for Abstraction and Reasoning Corpus. For decades, researchers have set milestones to track the progress of A.I. But once those milestones were reached, they were exposed as insufficient measures of true intelligence.
Arvind Narayanan, a Princeton computer science professor and co-author of the book “AI Snake Oil,” said that any claim that the ARC test measured progress toward A.G.I. was “very uncertain.”
Still, Mr. Narayanan acknowledged that OpenAI’s technology demonstrated impressive skills in passing the ARC test. Some of the puzzles are not as easy as the one you just tried.
The one below is a little harder, and it, too, was solved correctly by OpenAI’s new A.I. system:
A puzzle like this shows that OpenAI’s technology is getting better at working through logic problems. But the average person can solve puzzles like this one in a few seconds. OpenAI’s technology consumed significant computing resources to pass the test.
Last June, Mr. Chollet teamed up with Mike Knoop, co-founder of the software company Zapier, to create what they called the ARC Prize. The pair financed a contest that promised $1 million to anyone who built an A.I. system that exceeded human performance on the benchmark, which they renamed “ARC-AGI.”
Companies and researchers submitted over 1,400 A.I. systems, but no one won the prize. All scored below 85 percent, which marked the performance of a “smart” human.
OpenAI’s o3 system correctly answered 87.5 percent of the puzzles. But the company ran afoul of the competition rules because it spent nearly $1.5 million in electricity and computing costs to complete the test, according to pricing estimates.
OpenAI was also ineligible for the ARC Prize because it was not willing to publicly share the technology behind its A.I. system through a practice called open sourcing. Separately, OpenAI ran a “high-efficiency” variant of o3 that scored 75.7 percent on the test and cost less than $10,000.
“Intelligence is efficiency. And with these models, they are very far from human efficiency,” Mr. Chollet said.
(The New York Times sued OpenAI and its partner, Microsoft, in December for copyright infringement of news content related to A.I. systems.)
On Monday, the ARC Prize introduced a new benchmark, ARC-AGI-2, with hundreds of additional tasks. The puzzles are in the same colorful, grid-like game format as the original benchmark, but they are harder.
“It will be more difficult for humans, still very feasible,” Mr. Chollet said. “It will be much, much more difficult for A.I. O3 will not solve ARC-AGI-2.”
Here is a puzzle from the new ARC-AGI-2 benchmark that OpenAI’s system tried and failed to solve. Remember, the same pattern applies to all the examples.
Now try to fill in the grid below according to the pattern you found in the examples:
This shows that although A.I. systems are better at dealing with problems they have never seen before, they still struggle.
Here are a few additional puzzles from ARC-AGI-2, which focus on problems that require multiple steps of reasoning:
As OpenAI and other companies continue to improve their technology, they may pass the new version of ARC. But that does not mean that A.G.I. will be achieved.
Judging intelligence is subjective. There are countless intangible signs of intelligence, from composing works of art to navigating moral dilemmas to intuiting emotions.
Companies like OpenAI have built chatbots that can answer questions, write poetry and even solve logic puzzles. In some ways, they have already exceeded the powers of the brain. OpenAI’s technology has outperformed its chief scientist, Jakub Pachocki, on a competitive programming test.
But these systems still make mistakes that the average person would never make. And they struggle to do simple things that humans can handle.
“You are loading the dishwasher, and your dog comes over and starts licking the dishes. What do you do?” said Melanie Mitchell, a professor of A.I. at the Santa Fe Institute. “In a certain sense we know how to do it, because we know everything about dogs and dishes and everything else. But would a dishwashing robot know how to do it?”
For Mr. Chollet, the ability to efficiently acquire new skills is something that comes naturally to humans but is still lacking in A.I. technology. And that is what he has been targeting with the ARC-AGI benchmarks.
In January, the ARC Prize became a nonprofit foundation that serves as a “north star for A.G.I.” The ARC Prize team expects ARC-AGI-2 to last for about two years before it is solved by A.I. technology, though they would not be surprised if it happened sooner.
They have already started working on ARC-AGI-3, which they hope to debut in 2026. An early mock-up hints at a puzzle that involves interacting with a dynamic, grid-based game.
The artificial intelligence researcher François Chollet designed a puzzle game meant to be easy for humans but hard for machines.
Kelsey McClellan for The New York Times
An early mock-up for ARC-AGI-3, a benchmark that could involve interacting with a dynamic, grid-based game.
ARC Prize Foundation
This is a step closer to what people deal with in the real world, a place filled with movement. It does not stand still like the puzzles you tried above.
Even this, however, will go only part of the way toward showing when machines have surpassed the brain. Humans navigate the physical world, not just the digital one. The goal posts will continue to shift as A.I. advances.
“If it is no longer possible for people like me to produce benchmarks that measure things that are easy for humans but impossible for A.I.,” Mr. Chollet said, “then you have A.G.I.”