A British R&D unit today unveiled a futuristic vision of "quantitative safety guarantees" for AI.
The Advanced Research and Invention Agency (ARIA) compares the guarantees to the high safety standards in nuclear power and passenger aviation. In the case of machine learning, the standards involve a probabilistic guarantee that no harm will result from a particular action.
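In rough terms, such a guarantee takes the shape of a bound on the probability of harm. The following is an illustrative sketch of that shape, not ARIA's published specification:

```latex
% Illustrative only: for a policy \pi operating within a world model M,
% the probability of reaching a harmful state during deployment is
% bounded by a small threshold \delta.
P_{M,\pi}\left[\text{harm occurs during deployment}\right] \le \delta
```

Aviation gives a sense of the thresholds involved: catastrophic failure conditions in passenger aircraft must be shown to occur less often than roughly once per billion flight hours.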
At the core of ARIA's plan is a "gatekeeper" AI. This digital sentinel will ensure that other AI agents only operate within the guardrails set for a specific application.
ARIA will direct £59 million towards the scheme. By the programme's end, the agency intends to demonstrate a scalable proof-of-concept in one domain. Suggestions include electricity grid balancing and supply chain management.
If effective, the project could safeguard high-stakes AI applications, such as improving critical infrastructure or optimising clinical trials.
The programme is the brainchild of David "davidad" Dalrymple, who co-invented the popular cryptocurrency Filecoin.
Dalrymple has also extensively researched technical AI safety, which sparked his interest in the gatekeeper approach. As the programme director of ARIA, he can now put his theory into practice.
The gatekeeper guarantee
ARIA's gatekeepers will draw on scientific world models and mathematical proofs. Dalrymple said the approach combines commercial and academic ideas.
"The approaches being explored by big AI companies rely on finite samples and do not provide any guarantees about the behaviour of AI systems at deployment," he told TNW via email.
"Meanwhile, if we focus too heavily on academic approaches like formal logic, we run the risk of effectively trying to build AI capabilities from scratch.
"The gatekeeper approach gives us the best of both worlds by tuning frontier capabilities as an engine to drive at speed, but along rails of mathematical reasoning."
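In software terms, the loop Dalrymple describes might look like propose-then-verify: a capable model generates candidate actions, and the gatekeeper releases only those it can certify against a formal safety specification. A minimal sketch in Python, with hypothetical names (`propose_action`, `is_provably_safe`) standing in for a frontier model and a formal verifier:

```python
from dataclasses import dataclass
from typing import Callable, Optional
import random

@dataclass
class Action:
    name: str
    payload: dict

def gatekeeper(
    propose_action: Callable[[], Action],      # the frontier model's proposals
    is_provably_safe: Callable[[Action], bool],  # stands in for a formal verifier
    max_attempts: int = 5,
) -> Optional[Action]:
    """Release an action only if the verifier certifies it; otherwise refuse."""
    for _ in range(max_attempts):
        candidate = propose_action()
        if is_provably_safe(candidate):
            return candidate   # certified: safe to execute
    return None                # nothing certifiable: fail closed

# Toy usage: a grid-balancing agent whose actions must keep load within bounds.
if __name__ == "__main__":
    propose = lambda: Action("set_load", {"megawatts": random.uniform(0.0, 120.0)})
    within_limits = lambda a: 0.0 <= a.payload["megawatts"] <= 100.0
    print(gatekeeper(propose, within_limits))
```

The important design choice in this shape is that the system fails closed: if no action can be certified, nothing is executed.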
This fusion requires deep interdisciplinary collaboration, which is where ARIA comes in.
The British DARPA?
Established last year, ARIA funds "high-risk, high-reward" research. The strategy has attracted comparisons to DARPA, the Pentagon's "mad science" unit.
Dalrymple has drawn another parallel with DARPA. He compares ARIA's new project to DARPA's HACMS program, which created an "unhackable" quadcopter. The project demonstrated that formal verification can rule out whole classes of software vulnerabilities.
"Vulnerabilities can be ruled out, but only with assumptions about the scope and speed of interventions that an attacker can make on the physical embodiment of a system," Dalrymple said.
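The distinction HACMS exploited is between testing, which samples a finite set of inputs, and proving, which establishes a property for every possible input under stated assumptions. A toy illustration using the Z3 theorem prover, with a hypothetical controller invariant that is not drawn from HACMS or ARIA:

```python
# pip install z3-solver
from z3 import Real, Solver, And, Implies, Not, unsat

# Hypothetical invariant: if the throttle starts in [0, 1] and each control
# step adjusts it by at most +/- 0.1, the result stays within [-0.1, 1.1].
throttle = Real("throttle")
delta = Real("delta")

premise = And(throttle >= 0, throttle <= 1, delta >= -0.1, delta <= 0.1)
claim = And(throttle + delta >= -0.1, throttle + delta <= 1.1)

solver = Solver()
solver.add(Not(Implies(premise, claim)))  # ask Z3 to hunt for a counterexample

if solver.check() == unsat:
    print("No counterexample exists: the property holds for every input.")
else:
    print("Counterexample found:", solver.model())
```

No finite test suite can deliver that final "for every input"; that is the gap between sampling behaviour and proving it.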
His plan builds on an approach endorsed by Yoshua Bengio, a renowned computer scientist. A Turing Award winner, Bengio has also called for "quantitative safety guarantees." But he's been disappointed by the progress thus far.
"Unlike methods to build bridges, drugs or nuclear plants, current approaches to train frontier AI systems – the most capable AI systems currently in existence – do not allow us to obtain quantitative safety guarantees of any kind," Bengio wrote in a blog post last year.
Dalrymple now has a chance to change that. Success would also be a huge boost for ARIA, which has attracted scrutiny from politicians.
Some lawmakers have questioned ARIA's budget. The body has won £800 million in funding over five years – a sizeable sum, but a mere fraction of the budgets of other government research bodies.
ARIA can also point to potential savings on the horizon. One programme it launched last month aims to train AI systems at 0.1% of the current cost.