Measuring the Unmeasured: New Threats to Machine Learning Systems
Machine learning (ML) is at the core of many Internet services and operates on users’ personal information. The deciding metric for deploying ML models is often test performance, which measures whether a model has learned the given task well. Test performance, however, does not capture other important properties of ML models, such as security vulnerabilities, privacy leakage, and compliance with regulations.
In this talk, I will first give an overview of threats and issues in current ML systems, including data poisoning attacks, backdoors in ML, adversarial examples, and training data leakage. I will then introduce overlearning, a phenomenon where deep learning models “accidentally” learn representations that enable inference of sensitive attributes uncorrelated with the training objective. I will demonstrate the threats posed by overlearning and discuss why solutions such as censoring do not appear to work.
Congzheng Song is a Computer Science Ph.D. candidate at Cornell University working with Prof. Vitaly Shmatikov. His research interests are in security and privacy in machine learning. His past work identified a range of privacy leaks, both intentional and unintentional, that arise when training and deploying machine learning models on sensitive user data. As a DLI fellow, Congzheng will continue exploring and identifying security and privacy threats in real-world AI-assisted systems.