MAPS: Scaling Privacy Compliance Analysis to a Million Apps

Authors: Sebastian Zimmeck (Department of Mathematics and Computer Science, Wesleyan University), Peter Story (School of Computer Science, Carnegie Mellon University), Daniel Smullen (School of Computer Science, Carnegie Mellon University), Abhilasha Ravichander (School of Computer Science, Carnegie Mellon University), Ziqi Wang (School of Computer Science, Carnegie Mellon University), Joel Reidenberg (School of Law, Fordham University), N. Cameron Russell (School of Law, Fordham University), Norman Sadeh (School of Computer Science, Carnegie Mellon University)

Volume: 2019
Issue: 3
Pages: 66–86

Abstract: The app economy is largely reliant on data collection as its primary revenue model. To comply with legal requirements, app developers are often obligated to notify users of their privacy practices in privacy policies. However, prior research has suggested that many developers are not accurately disclosing their apps’ privacy practices. Evaluating discrepancies between apps’ code and privacy policies enables the identification of potential compliance issues. In this study, we introduce the Mobile App Privacy System (MAPS) for conducting an extensive privacy census of Android apps. We designed a pipeline for retrieving and analyzing large app populations based on code analysis and machine learning techniques. In its first application, we conduct a privacy evaluation for a set of 1,035,853 Android apps from the Google Play Store. We find broad evidence of potential non-compliance. Many apps do not have a privacy policy to begin with. Policies that do exist are often silent on the practices performed by apps. For example, 12.1% of apps have at least one location-related potential compliance issue. We hope that our extensive analysis will motivate app stores, government regulators, and app developers to more effectively review apps for potential compliance issues.

Copyright in PoPETs articles is held by their authors. This article is published under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 license.