The Data Minimization Principle in Machine Learning  @GoogleTechTalks
Google TechTalks | Uploaded May 2024 | Updated October 2024
A Google TechTalk, presented by Ferdinando Fioretto, 2024-04-10
ABSTRACT: The principle of data minimization aims to reduce the amount of data collected and retained to minimize the potential for misuse, unauthorized access, or data breaches. While endorsed by various global data protection regulations, its practical implementation in machine learning remains elusive due to the lack of a clear formulation.

We begin the talk by reviewing the principle of data minimization as presented in several data protection regulations and examining the challenges in formalizing this principle for machine learning tasks. We then propose an optimization-based formalization that attempts to closely follow the legal language of this principle. However, our empirical analysis reveals a potentially overlooked gap between the privacy expectations and actual benefits of data minimization, highlighting the need for approaches that address privacy in a more holistic framework.
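The abstract does not spell out the optimization-based formalization. As a hedged sketch only, one way to write "keep as little data as possible without hurting the model by more than a tolerance" is shown below. All notation is assumed for illustration: X and Y are the training features and labels, B is a binary mask over the entries of X, J is the training loss, and α is the permitted loss increase.

```latex
\begin{aligned}
\min_{B \in \{0,1\}^{n \times d}} \quad & \|B\|_1 \\
\text{s.t.} \quad & J(\hat{\theta}; X, Y) - J(\theta^{\star}; X, Y) \le \alpha, \\
& \hat{\theta} = \arg\min_{\theta} J(\theta; X \odot B, Y), \\
& \theta^{\star} = \arg\min_{\theta} J(\theta; X, Y).
\end{aligned}
```

Here the objective counts the retained entries, and the constraint bounds how much worse the model trained on the minimized data may perform on the full data.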

Next, we shift gears and discuss the application of data minimization in inference tasks. In high-stakes domains such as law, recruitment, and healthcare, learning models frequently rely on sensitive user data for inference and typically require the complete set of features. This not only poses significant privacy risks for individuals but also demands substantial human effort from organizations to verify the accuracy of the information. We ask whether a model really needs all input features to produce accurate or nearly accurate predictions during inference. We present a sequential algorithm to identify the minimal set of attributes that each individual should reveal, along with an empirical assessment showing that individuals often need to disclose only a very small subset of their features without compromising decision-making accuracy.
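The abstract does not describe the sequential algorithm itself, so the following is only a minimal sketch of the general idea, not the speaker's method. It assumes a linear classifier with known weights and known value ranges for every feature. The function name `minimal_disclosure` and the greedy "reveal the most uncertain feature next" rule are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def minimal_disclosure(
    weights: Sequence[float],
    bias: float,
    x: Sequence[float],
    bounds: Sequence[Tuple[float, float]],
) -> Tuple[List[int], int]:
    """Reveal features one at a time until the linear decision is fixed.

    The decision is 1 if weights . x + bias >= 0, else 0. Hidden features
    are only known to lie in their (low, high) bounds. Illustrative
    assumption: a greedy order, not the algorithm from the talk.
    """
    hidden = list(range(len(x)))   # indices not yet disclosed
    revealed: List[int] = []       # indices disclosed so far, in order
    score = bias                   # exact contribution of revealed features

    while True:
        # Worst-case and best-case score over all possible hidden values.
        low = score + sum(
            min(weights[i] * bounds[i][0], weights[i] * bounds[i][1])
            for i in hidden
        )
        high = score + sum(
            max(weights[i] * bounds[i][0], weights[i] * bounds[i][1])
            for i in hidden
        )
        if low >= 0:
            return revealed, 1     # decision is 1 whatever remains hidden
        if high < 0:
            return revealed, 0     # decision is 0 whatever remains hidden

        # Reveal the hidden feature that contributes the most uncertainty.
        j = max(
            hidden,
            key=lambda i: abs(weights[i]) * (bounds[i][1] - bounds[i][0]),
        )
        hidden.remove(j)
        revealed.append(j)
        score += weights[j] * x[j]


if __name__ == "__main__":
    w = [5.0, 1.0, 0.5]
    ranges = [(0.0, 1.0)] * 3
    print(minimal_disclosure(w, -1.0, [1.0, 0.2, 0.9], ranges))
```

In this sketch the loop always terminates: once every feature is revealed, the lower and upper scores coincide and one of the two stopping conditions holds. The returned decision always matches the decision made with the full feature vector, because the function stops only when no value of the hidden features could change the outcome.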

Finally, I will conclude with a call to action and collaboration, seeking additional efforts to formalize legal privacy principles in a way that makes them actionable and deployable.

Speaker: Ferdinando Fioretto (University of Virginia)