A paper referenced in the talk, "Deep learning generalizes because the parameter-function map is biased towards simple functions": arxiv.org/abs/1805.08522
This was filmed as part of Redwood Research's Machine Learning for Alignment Bootcamp.

6: How to Build a Safe Advanced AGI?: Evan Hubinger 2023
AI Safety Talks | 2023-05-13
Part 6 of a series of talks in which researcher Evan Hubinger explores the problems of safety for artificial general intelligence.
This was recorded as part of the SERI ML Alignment Theory Scholars Program: serimats.org

5: Predictive Models: Evan Hubinger 2023
AI Safety Talks | 2023-05-13
Part 5 of a series of talks in which researcher Evan Hubinger explores the problems of safety for artificial general intelligence.
This was recorded as part of the SERI ML Alignment Theory Scholars Program: serimats.org

4: How Do We Become Confident in the Safety of an ML System?: Evan Hubinger 2023
AI Safety Talks | 2023-05-13
Part 4 of a series of talks in which researcher Evan Hubinger explores the problems of safety for artificial general intelligence.
This was recorded as part of the SERI ML Alignment Theory Scholars Program: serimats.org

3: How Likely is Deceptive Alignment?: Evan Hubinger 2023
AI Safety Talks | 2023-05-13
Part 3 of a series of talks from researcher Evan Hubinger.
This was recorded as part of the SERI ML Alignment Theory Scholars Program: serimats.org

1: AGI Safety: Evan Hubinger 2023
AI Safety Talks | 2023-05-13
Part 1 of a series of talks in which researcher Evan Hubinger explores the problems of safety for artificial general intelligence.
This was recorded as part of the SERI ML Alignment Theory Scholars Program: serimats.org

Concrete Open Problems in Mechanistic Interpretability: Neel Nanda at SERI MATS
AI Safety Talks | 2023-05-05
How can we look inside neural networks and figure out how they do what they do? This is likely to be very important for alignment and safety, but the research is at an early stage, with lots of opportunities for great work. Researcher Neel Nanda talks about some of them in this talk.