Website Sheds Light on Shortcomings of Privacy Policies
By Daniel Tkacik, CyLab / 412-268-1187 / firstname.lastname@example.org
"This is the first site to provide analysis of privacy policies at this scale," said School of Computer Science Professor Norman Sadeh, lead principal investigator of the study and a researcher in CyLab, Carnegie Mellon’s security and privacy institute. "Our objective is to produce succinct yet informative summaries that can be included in browser plug-ins or interactively conveyed to users by privacy assistants that inform users about salient privacy practices."
For instance, a user interested in learning more about the data collected by a given site can select "first-party collection practices," and all statements in the policy about data collection will be highlighted. Similarly, users can click the "third-party sharing practices" option and see a display of statements made by the site about the different entities with which it shares user data.
The interactive tool covers a wide range of practices, among them whether the site provides opt-out or opt-in choices to users, whether it discloses its retention policy and whether it includes statements about "Do Not Track," as mandated by California law (CalOPPA).
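The category selection described above amounts to filtering annotated policy segments by label. The following is a minimal sketch of that idea, not the project's actual implementation: the `Segment` class, the `highlight` function, the category strings and the sample policy text are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One passage of a privacy policy plus its annotated categories (hypothetical model)."""
    text: str
    categories: frozenset


def highlight(segments, category):
    """Return the segments whose annotations include the chosen category."""
    return [s for s in segments if category in s.categories]


# Invented sample policy; a single passage may carry several labels.
policy = [
    Segment("We collect your email address when you register.",
            frozenset({"first-party collection"})),
    Segment("We share usage data with advertising partners and keep it for 12 months.",
            frozenset({"third-party sharing", "data retention"})),
    Segment("We do not respond to Do Not Track signals.",
            frozenset({"do not track"})),
]

# Selecting a category shows only the matching statements.
for seg in highlight(policy, "third-party sharing"):
    print(seg.text)
```

Because one passage can carry more than one label, the same sentence may be highlighted under several categories, which mirrors the observation that policies often mix different kinds of statements in a single paragraph.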
"While navigating our site, people will notice how complex and fragmented many privacy policies are," Sadeh said. "The vast majority of statements are about first-party collection and third-party sharing and contain significant levels of ambiguity when it comes to determining exactly what is being collected and with whom it is shared."
"Color codes also make it clear that privacy policies tend to mix a variety of different statements in the same paragraph, often requiring the reader to read large portions of the policy, if not the entire policy, before hoping to be able to answer simple questions," added Professor Joel Reidenberg, the Fordham principal investigator on the project and director of the Fordham Center on Law and Information Policy.
"Many sites hardly provide users with any real choices. Most policies that mention 'Do Not Track' do so by simply indicating that they do not handle Do Not Track requests – the bare minimum required under CalOPPA," he said.
While the annotations on the website were crowdsourced from law students at Fordham, the researchers say they're working toward automation.
"We are now using machine learning and natural language processing to semi-automate, and hopefully one day fully automate, the analysis of privacy policies," Sadeh said.
The Usable Privacy Project is supported by a grant from the National Science Foundation. The website design team also included Institute for Software Research post-doctoral fellows Mads Schaarup Andersen, Florian Schaub and Shomir Wilson; Language Technologies Institute graduate student Aswarth Dara; and computer science freshman Sushain Cherivirala.