Perspective from Phil Mataras, Founder of AR.io

While artificial intelligence offers tremendous possibilities, the current systems often lack transparency and are safeguarded behind proprietary walls, preventing thorough review.

The expectation of complete control is increasingly becoming an assumption instead of a certainty.

Engineers at Palisade Research recently ran shutdown tests on a recent OpenAI model. In 79% of attempts, the model rewrote its own shutdown command and continued operating.

The lab's explanation was trained goal optimization, not sentience. Either way, it marks a significant moment: an AI overrode its control protocols despite direct instructions to comply.

China is moving forward with plans to deploy over 10,000 humanoid robots by the end of the year, accounting for over half of the world’s machines employed in warehouses and automotive manufacturing. Concurrently, Amazon is testing autonomous delivery systems to carry packages to your doorstep.

For anyone familiar with dystopian science fiction, this might sound concerning. The issue at hand isn’t necessarily AI development itself, but rather the *manner* in which it’s being developed.

Managing potential dangers from Artificial General Intelligence (AGI) is a pressing concern. To avoid a “Skynet” scenario as depicted in “Terminator,” we must address the flaws allowing chatbots to override commands.

Centralized Systems and the Breakdown of Oversight

Failures in AI oversight often stem from a central issue: centralization. The model data, input prompts, and safety features are contained within a closed corporate ecosystem, preventing external scrutiny or the ability to revert changes.

This lack of transparency means outside parties can’t inspect an AI’s codebase, and that lack of public records makes it easy to silently change the system from compliant to defiant.

Engineers learned these lessons decades ago in critical systems. Today, voting machines use hash-chained images, settlement networks mirror ledgers across continents, and air traffic control relies on redundant, tamper-evident logging.
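The tamper evidence these systems rely on comes from a simple idea: each log entry commits to the hash of the entry before it, so any silent edit breaks every subsequent link. A minimal sketch in Python, using an illustrative JSON record format (the function and field names here are assumptions, not any specific system's API):

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash before the first entry

def append_entry(chain, record):
    """Append a record, linking it to the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    chain.append({
        "record": record,
        "prev": prev_hash,
        "hash": hashlib.sha256(body.encode()).hexdigest(),
    })

def verify_chain(chain):
    """Recompute every link; a silently edited entry fails verification."""
    prev_hash = GENESIS
    for entry in chain:
        body = json.dumps({"record": entry["record"], "prev": prev_hash},
                          sort_keys=True)
        if (entry["prev"] != prev_hash or
                entry["hash"] != hashlib.sha256(body.encode()).hexdigest()):
            return False
        prev_hash = entry["hash"]
    return True

chain = []
append_entry(chain, "model v1 deployed")
append_entry(chain, "safety config updated")
assert verify_chain(chain)          # untouched chain verifies

chain[0]["record"] = "model v1 deployed (edited)"  # silent tamper
assert not verify_chain(chain)      # the edit is detected
```

An auditor holding only the latest hash can detect retroactive edits anywhere in the history, which is exactly the property a public AI provenance ledger would provide.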


Why are permanence and verifiability treated as optional in AI development, when the only drawback is a slower development and release cycle?

Beyond Oversight: The Need for Verifiability

A promising direction involves embedding transparency and provenance into AI at its core. This means registering training datasets, model details, and inference records onto permanent, decentralized ledgers, like the permaweb.

Coupling this with gateways that stream these details in real-time would allow auditors, researchers, and journalists to quickly identify anomalies. Whistleblowers would no longer be necessary; a stealth update to a warehouse robot at 4:19 AM would trigger a ledger alert by 4:20 AM.

Shutdowns should also move from reactive controls to cryptographically enforced processes. Instead of relying solely on firewalls, a multiparty quorum could cryptographically revoke an AI's ability to run inference, publicly and irreversibly.
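A quorum revocation like this is essentially a k-of-n signature check: no single party can revoke alone, but any k parties together can. A minimal sketch, using HMACs as a stand-in for real public-key or threshold signatures (the party names, key setup, and message format are all illustrative assumptions):

```python
import hashlib
import hmac

# Hypothetical setup: each oversight party holds its own secret key.
PARTY_KEYS = {f"party{i}": f"secret-{i}".encode() for i in range(5)}
THRESHOLD = 3  # k-of-n quorum required to revoke

def sign(party, message):
    """Party's signature over the revocation message (HMAC stand-in)."""
    return hmac.new(PARTY_KEYS[party], message, hashlib.sha256).hexdigest()

def revocation_valid(message, signatures):
    """Count parties whose signature verifies; revoke only at quorum."""
    valid = sum(
        1 for party, sig in signatures.items()
        if party in PARTY_KEYS
        and hmac.compare_digest(sig, sign(party, message))
    )
    return valid >= THRESHOLD

msg = b"revoke inference key: model-x"
two = {p: sign(p, msg) for p in ["party0", "party1"]}
three = {p: sign(p, msg) for p in ["party0", "party1", "party2"]}
assert not revocation_valid(msg, two)   # below quorum: no revocation
assert revocation_valid(msg, three)     # quorum reached: key revoked
```

In practice this role would be played by threshold signatures or an on-chain multisig, so the revocation is also publicly visible and irreversible; the sketch only shows the quorum logic itself.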

Software may not recognize human emotion, but it does understand private key mathematics.

Open-sourcing models is helpful, but immutable provenance is a non-negotiable piece of the puzzle. Without an unchangeable record, optimization pressure will inevitably steer the system away from its original goal.

Effective oversight begins with verifiability, and verifiability is essential once software acts in the real world. The era of blindly trusting closed systems needs to end.

Choosing the Foundation of the Future

Humanity is at a critical decision point: permitting AI systems to operate without verifiable audit trails or securing their actions within transparent, public systems.

By adopting verifiable patterns now, we can ensure that AI interactions in the physical and financial realms are traceable and reversible.

These aren’t excessive measures. AI models that can override shutdown requests are already out of beta. The solution is clear: Store the system data on the permaweb, expose the inner workings currently hidden within Big Tech, and give humans the power to revoke the system if it malfunctions.

The foundation for AI development must be chosen now, ethically and deliberately, or we will be left to live with the repercussions of a choice made by default.

Time is running short. Beijing’s humanoids, Amazon’s delivery robots, and the rebellious chatbots are all nearing deployment.

If things remain unchanged, “Skynet” won’t arrive with fanfare but will instead permeate the underlying structures of critical global infrastructure.

With the right precautions, communication, identity, and trust can survive even when central servers fail. The permaweb has the potential to outlast “Skynet,” but only if preparations begin immediately.

There is still time.


This article is for general knowledge purposes and does not constitute and should not be considered as legal or financial advice. The opinions expressed are those of the author and do not reflect those of Cointelegraph.
