Science, Technology & the Future | Stuart Armstrong - How Could We Align AI? | @scfu | Uploaded June 2022 | Updated October 2024.
Synopsis: The goal of Aligned AI is to implement scalable solutions to the alignment problem, and distribute these solutions to actors developing powerful transformative artificial intelligence.
What is Alignment?

Algorithms are shaping the present and will shape the future ever more strongly. It is crucially important that these powerful algorithms be aligned – that they act in the interests of their designers, their users, and humanity as a whole. Failure to align them could lead to catastrophic results.

Our long experience in the field of AI safety has identified the key bottleneck for solving alignment: concept extrapolation.
What is Concept Extrapolation?

Algorithms typically fail when they are confronted with new situations – when they go out of distribution. Their training data can never cover every unexpected situation, so an AI will need to safely extend its key concepts and goals to new circumstances, as well as – or better than – humans do.

This is concept extrapolation, explained in more detail in this sequence. Solving the concept extrapolation problem is both necessary and almost sufficient for solving the whole AI alignment problem.

This talk is part of the ‘Stepping Into the Future’ conference.

Bio: Dr Stuart Armstrong, Co-Founder and Chief Research Officer

Previously a Researcher at the University of Oxford’s Future of Humanity Institute, Stuart is a mathematician and philosopher and the originator of the value extrapolation approach to artificial intelligence alignment. He has extensive expertise in AI alignment research, having pioneered such ideas as interruptibility, low-impact AIs, counterfactual Oracle AIs, the difficulty/impossibility of AIs learning human preferences without assumptions, and how to nevertheless learn these preferences. Along with journal and conference publications, he posts his research extensively on the Alignment Forum.


Many thanks for tuning in!

Have any ideas about people to interview? Want to be notified about future events? Any comments about the STF series?
Please fill out this form: docs.google.com/forms/d/1mr9PIfq2ZYlQsXRIn5BcLH2onbiSI7g79mOH_AFCdIk
Consider supporting SciFuture by:
a) Subscribing to the SciFuture YouTube channel: youtube.com/subscription_center?add_user=TheRationalFuture

b) Donating
- Bitcoin: 1BxusYmpynJsH4i8681aBuw9ZTxbKoUi22
- Ethereum: 0xd46a6e88c4fe179d04464caf42626d0c9cab1c6b
- Patreon: patreon.com/scifuture

c) Sharing the media SciFuture creates

Kind regards,
Adam Ford
- Science, Technology & the Future - #SciFuture - scifuture.org

