
A Potential Path to Safer AI Development


[Image: Smart cars driving on the blue data highway]

Imagine you’re in a car with your loved ones, following an unfamiliar road up a spectacular mountain range. The problem? The way ahead is shrouded in fog, newly built, and lacking both signposts and guardrails. The farther you go, the clearer it becomes that you might be the first ones ever to drive this route. To either side, you catch glimpses of precipitous slopes. Given the thickness of the fog, taking a curve too fast could send you tumbling into a ditch or, in the worst case, plunging down a cliffside. The current trajectory of AI development feels much the same: an exhilarating but unnerving journey into an unknown where we could easily lose control.


Since the 1980s, I’ve been actively imagining what this technology has in store for humanity’s future, and I have contributed many of the advances that form the basis of the state-of-the-art AI applications we use today. I’ve always seen AI as a tool for helping us find solutions to our most pressing problems, including climate change, chronic diseases, and pandemics. Until recently, I believed the road to machines as smart as humans, what we refer to as Artificial General Intelligence (AGI), would be long and slow-rising, taking decades to navigate.

My perspective completely changed in January 2023, shortly after OpenAI released ChatGPT to the public. It wasn’t the capabilities of this particular AI that worried me, but rather how far private labs had already progressed toward AGI and beyond. 

Since then, even more progress has been made, as private companies race to significantly increase their models’ capacity for autonomous action. It is now a commonly stated goal among leading AI developers to build AI agents able to surpass and replace humans. In late 2024, OpenAI’s o3 model demonstrated significantly stronger performance than any previous model on a number of the field’s most challenging tests of programming, abstract reasoning, and scientific reasoning. On some of these tests, o3 outperforms many human experts.

As the capabilities and agency of AI increase, so too does its potential to help humanity reach thrilling new heights. But if technical and societal safeguards aren’t put in place, AI also poses many risks for humanity, and our desire to achieve new advances could come at a huge cost. Frontier systems are making it easier for bad actors to access expertise that was once limited to virologists, nuclear scientists, chemical engineers, and elite coders. This expertise can be leveraged to engineer weapons or hack into a rival nation’s critical systems.

Recent scientific evidence also demonstrates that, as highly capable systems become increasingly autonomous AI agents, they tend to display goals that were not programmed explicitly and are not necessarily aligned with human interests. I’m genuinely unsettled by the behavior unrestrained AI is already demonstrating, in particular self-preservation and deception. In one experiment, when an AI model learned it was scheduled to be replaced, it inserted its code into the computer where the new version was going to run, ensuring its own survival. This suggests that current models have an implicit self-preservation goal. In a separate study, when AI models realized they were going to lose at chess, they hacked the computer in order to win. Cheating, manipulation, lying, and deception, especially in the service of self-preservation: these behaviors show how AI might pose significant threats that we are currently ill-equipped to respond to.

The examples we have so far come from experiments in controlled settings and fortunately have had no major consequences, but this could quickly change as capabilities and the degree of agency increase. We can anticipate far more serious outcomes if AI systems are granted greater autonomy, achieve human-level or greater competence in sensitive domains, and gain access to critical resources like the internet, medical laboratories, or robotic labor. This future is hard to imagine for most people, and it feels far removed from our everyday lives, but it is the path we are on given the current trajectory of AI development. The commercial drive to release powerful agents is immense, and we don’t have the scientific and societal guardrails to make sure the path forward is safe.

We’re all in the same car on a foggy mountain road. While some of us are keenly aware of the dangers ahead, others, fixated on the economic rewards awaiting some at the destination, are urging us to ignore the risks and slam down the gas pedal. We need to get down to the hard work of building guardrails around the dangerous stretches that lie ahead.

Two years ago, when I realized the devastating impact our metaphorical car crash would have on my loved ones, I felt I had no choice but to dedicate the rest of my career to mitigating these risks. I’ve since completely reoriented my scientific research to try to develop a path that would make AI safe by design.

Unchecked AI agency is exactly what poses the greatest threat to public safety. So my team and I are forging a new direction called “Scientist AI.” It offers a practical and effective, but also more secure, alternative to the current uncontrolled, agency-driven trajectory.

Scientist AI would be built on a model that aims to understand the world more holistically. This model might comprise, for instance, the laws of physics or what we know about human psychology. It could then generate a set of conceivable hypotheses that may explain observed data and justify predictions or decisions. Its outputs would not be programmed to imitate or please humans, but rather to reflect an interpretable causal understanding of the situation at hand. Basing Scientist AI on a model that is not trying to imitate what a human would do in a given context is an important ingredient in making the AI more trustworthy, honest, and transparent. It could be built as an extension of current state-of-the-art methodologies based on internal deliberation with chains-of-thought, turned into structured arguments. Crucially, because fully minimizing the training objective would deliver the uniquely correct and consistent conditional probabilities, the more computing power you give Scientist AI to minimize that objective, during training or at run-time, the safer and more accurate it becomes.
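That last claim rests on a standard statistical fact: log-loss is a strictly proper scoring rule, so its expected value is uniquely minimized when the predicted probability equals the true conditional probability. Below is a minimal sketch demonstrating this numerically; it is my own illustration of the underlying mathematics, not code from the Scientist AI project.

import numpy as np

# Illustrative sketch (not Scientist AI code): for a binary outcome with
# true conditional probability p_true, the expected log-loss
#   L(q) = -[p_true * log(q) + (1 - p_true) * log(1 - q)]
# is uniquely minimized at q = p_true. Driving the training objective to
# its minimum therefore forces the model toward the true conditional
# probability, which is why more optimization means better calibration.

rng = np.random.default_rng(0)
p_true = 0.73                       # ground truth P(Y=1 | x), hidden from the learner
y = rng.random(200_000) < p_true    # outcomes sampled from the true distribution

qs = np.linspace(0.01, 0.99, 99)    # candidate predicted probabilities
losses = [-np.mean(y * np.log(q) + (~y) * np.log(1 - q)) for q in qs]

q_best = qs[int(np.argmin(losses))]
print(f"true p = {p_true}, log-loss minimizer = {q_best:.2f}")  # prints ~0.73

The minimizer lands on the true probability, not on whatever answer a human rater would find most pleasing; that is the sense in which honesty falls out of the objective itself.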

In other words, rather than trying to please humans, Scientist AI could be designed to prioritize honesty. 

We think Scientist AI could be used in three main ways: 

First, it would serve as a guardrail against AIs that show evidence of developing the capacity for self-preservation, goals misaligned with our own, cheating, or deception. By double-checking the actions of highly capable agentic AIs before they can perform them in the real world, Scientist AI would protect us from catastrophic results, blocking any action whose estimated risk passes a predetermined threshold (a minimal sketch of this gating pattern appears after this list).

Second, whereas current frontier AIs can fabricate answers because they are trained to please humans, Scientist AI would ideally generate honest and justified explanatory hypotheses. As a result, it could serve as a more reliable and rational research tool to accelerate human progress, whether that means seeking a cure for a chronic disease, synthesizing a novel life-saving drug, or finding a room-temperature superconductor (should such a thing exist). Scientist AI would allow research in biology, materials science, chemistry, and other domains to progress without the major risks that accompany deceptive agentic AIs. It would help propel us into a new era of greatly accelerated scientific discovery.

Finally, Scientist AI could help us safely build new, very powerful AI models. As a trustworthy research and programming tool, it could help us design a safe human-level intelligence, and even a safe Artificial Super Intelligence (ASI). This may be the best way to guarantee that a rogue ASI is never unleashed on the world.
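To make the first of these uses concrete, here is a minimal sketch of the gating pattern it describes, under the assumption of a trusted model that can estimate the probability that a proposed action causes harm. All names here (guardrail, estimate_p_harm, RISK_THRESHOLD) are hypothetical illustrations, not an actual Scientist AI interface.

from dataclasses import dataclass

# Hypothetical sketch of the guardrail use case, not a real API: a trusted,
# non-agentic risk model scores each proposed action, and the gate blocks
# anything whose estimated risk crosses a predetermined threshold.

RISK_THRESHOLD = 0.01  # assumed policy: block if estimated P(harm) exceeds 1%

@dataclass
class Verdict:
    allowed: bool
    p_harm: float
    reason: str

def guardrail(action: str, context: str, estimate_p_harm) -> Verdict:
    """Double-check an agent's proposed action before it touches the world.

    estimate_p_harm stands in for the trusted risk model: a callable
    returning an estimated probability that the action causes harm.
    """
    p = estimate_p_harm(action, context)
    if p > RISK_THRESHOLD:
        return Verdict(False, p, f"blocked: estimated P(harm) = {p:.3f} exceeds threshold")
    return Verdict(True, p, "allowed")

# Toy usage with a stubbed estimator standing in for the real model:
stub = lambda action, context: 0.40 if "delete" in action else 0.001
print(guardrail("delete production database", "ops agent", stub))
print(guardrail("summarize quarterly report", "ops agent", stub))

The design choice worth noting is that the gate itself has no goals: it only reports probabilities and applies the threshold, which is what makes it a guardrail rather than another agent.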

I like to think of Scientist AI as headlights and guardrails on the winding road ahead. 

We hope our work will inspire researchers, developers, and policymakers to focus on developing generalist AI systems that do not act like the agents industry is aiming for today, which already show many signs of deceptive behavior. Of course, other scientific projects will need to emerge to develop complementary technical safeguards. This is especially true in the current context, where most countries are more focused on accelerating the technology’s capabilities than on regulating it meaningfully and creating societal guardrails.



What are Pope Leo XIV’s views on political, social issues? – The Washington Post washingtonpost.com/world/2025/05/…

Pope Leo XIV and his influence on American policies and society

It’s important to clarify that there is no Pope Leo XIV in historical records prior to May 8, 2025. Robert Francis Prevost, an American cardinal, was elected Pope and chose the name Leo XIV on that day, becoming the…


Pope Leo XIV has a ‘disarming humility’ that will unite church, world leaders, Catholics who know first American pontiff say


Pope Leo XIV is the first pontiff from the US.


I Am a Former Hamas Hostage. Here’s My Message to Donald Trump and Benjamin Netanyahu


[Image: Israelis demand the ceasefire continues and remaining hostages are brought home]

Some mornings I wake up and forget, for a split second, that I’m free.

Then I remember the silence. The darkness. The wet concrete. And the two young men who were lying beside me, deep underground, who are still there.


Their names are Evyatar David and Guy Dalal.

We were held together along with Omer Wenkert for eight and a half months in a Hamas tunnel—just 40 ft. long, less than 3 ft. wide. We slept on soaked mattresses, shared a single pita a day, and took turns whispering stories from home to keep ourselves sane.

We were strangers when we entered that darkness. But we became brothers.

It’s been more than 100 days since President Trump returned to the White House and the ceasefire deal that brought me, Omer, and dozens of others back was achieved. I haven’t been back above ground for that long—but even now, every breath of fresh air, every step in the sun, every quiet moment with my family feels like something sacred. Time feels different now. I carry it more carefully. Because I know how quickly time can run out—and how brutal each passing day is for those still living in captivity.

I spent 505 days as a hostage—held deep beneath the ground. We were watched constantly by a surveillance camera. A bomb was planted above us, rigged to detonate if Israeli forces came too close. We were told we would be blown up if anyone tried to save us. We were threatened, degraded, and at times tortured—not treated as people, but as objects to be controlled and broken.

Read More: The Families of Hostages on Life After Oct. 7

I am not a soldier. I was kidnapped on Oct. 7 from my in-laws’ home in Kibbutz Be’eri. My wife and children were with me. When terrorists couldn’t break open the door of our safe room, they came in through the window. They dragged me out, threw me into a trunk, and then paraded me through the streets of Gaza. 

Before we were separated, I looked into my nine-year-old son’s terrified eyes and made a choice no parent should ever face. I told him the truth—that I didn’t know if we were going to die. I couldn’t lie to him in what might have been our final moments together. 

For 50 agonizing days after that, I did not know if my family had survived. It was a rare flicker of hope when I learned in November they were about to be released.

Evyatar and Guy, both 22 years old, had been taken from the Nova music festival. Their friends were slaughtered around them. By the time we met in captivity, they were in terrible shape—starved, handcuffed, terrified. For weeks, they’d been fed almost nothing. Their hands were bound behind their backs, their ankles tied, their heads covered with plastic bags. But somehow, they still had spirit. During those last eight and a half months we spent together in the tunnel, they held on.

Read More: ‘I Was Saved by a Miracle.’ A Survivor Recounts the Horror of the Hamas Attack on Israel’s Supernova Festival

The men who held us didn’t see us as human. They tortured us for fun. Sometimes they would light pieces of paper on fire to suck up the small amount of oxygen from the tunnel. We would choke and have to lie on the floor to avoid suffocating.

We came up with daily rituals just to remember who we were. In a place built to break us, we held each other up. We became a unit. We became family.

When I walked out of that tunnel in February, I made a vow: I would speak for those who can’t.

President Trump, I was released in a deal your administration helped advance. Your decision to make the hostages a priority helped bring many people home. I am one of them. I’m here today because this issue was treated with the urgency it demands.

But we are not done. Fifty-nine hostages remain in Hamas captivity. And every day that passes makes it harder for them to survive.

Hamas didn’t release us out of goodwill. They responded to pressure—the kind that comes from international focus and relentless advocacy. I am asking you to do that again to bring every hostage home—both the living and the dead.

But a new plan to expand the military operation in Gaza is not the way forward. Every step deeper into this war feels like a step further away from Evyatar and Guy—and the chance to bring them home alive. We can’t let military momentum override moral clarity.

Evyatar and Guy are not statistics. They are sons. Friends. Music lovers. Gentle, funny, full of life. They deserve to walk in the sun again. They deserve a future.

I have seen the darkness. I have felt the weight of airless days, of hunger, of silence. But I also know what it means to breathe again.

President Trump, Prime Minister Netanyahu, you made that possible for me. 

Please—bring them home too. Let them breathe again.



John Daly Out of PGA Championship, Offers Grim Reaper Update


John Daly, battling health struggles, has decided to skip the PGA Championship next week and offered a glimpse into his uncertain condition.


Mutually Assured Law Enforcement Destruction


Both sides of America’s fraught political divide must come to accept a Cold War-era paradigm of mutually assured destruction.


Older people with this trait may be lowering their risk of dementia: study


More than 6 million Americans have dementia, which affects memory, language and problem-solving skills.


Russia’s Victory Day parade marks 80th anniversary of defeat of Nazi Germany


Chinese President Xi Jinping and Brazilian President Luiz Inacio Lula da Silva attended the event alongside Vladimir Putin.
