Open Problems in Mechanistic Interpretability: A Whirlwind Tour  @GoogleTechTalks
Google TechTalks | Open Problems in Mechanistic Interpretability: A Whirlwind Tour @GoogleTechTalks | Uploaded June 2023 | Updated October 2024
A Google TechTalk, presented by Neel Nanda, 2023/06/20
Google Algorithms Seminar - ABSTRACT: Mechanistic Interpretability is the study of reverse engineering the learned algorithms in a trained neural network, in the hopes of applying this understanding to make powerful systems safer and more steerable. In this talk Neel will give an overview of the field, summarise some key works, and outline what he sees as the most promising areas of future work and open problems. This will touch on techniques in causal abstraction and mediation analysis, understanding superposition and distributed representations, model editing, and studying individual circuits and neurons.

About the Speaker: Neel works on the mechanistic interpretability team at Google DeepMind. He previously worked with Chris Olah at Anthropic on the transformer circuits agenda, and has done independent work on reverse-engineering modular addition and using this to understand grokking.

