2009 OntheTradeoffBetweenPrivacyandU

Subject Headings:


Cited By


Author Keywords

Privacy, Anonymity, Data Publishing, Data Mining


In data publishing, anonymization techniques such as generalization and bucketization have been designed to provide privacy protection. At the same time, they reduce the utility of the data. It is important to consider the tradeoff between privacy and utility. In a paper that appeared in KDD-2008, Brickell and Shmatikov proposed an evaluation methodology that compares the privacy gain with the utility gain resulting from anonymizing the data, and concluded that "even modest privacy gains require almost complete destruction of the data-mining utility". This conclusion seems to undermine existing work on data anonymization. In this paper, we analyze the fundamental characteristics of privacy and utility, and show that it is inappropriate to directly compare privacy and utility. We then observe that the privacy-utility tradeoff in data publishing is similar to the risk-return tradeoff in financial investment, and propose an integrated framework for considering the privacy-utility tradeoff, borrowing concepts from Modern Portfolio Theory for financial investment. Finally, we evaluate our methodology on the Adult dataset from the UCI machine learning repository. Our results clarify several common misconceptions about data utility and provide data publishers with useful guidelines on choosing the right tradeoff between privacy and utility.
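
The risk-return analogy in the abstract can be illustrated with a small sketch. The code below is not the paper's framework. It only shows the general idea of keeping anonymization candidates that are not dominated on both axes, loosely analogous to an efficient frontier in portfolio theory. The candidate names, the `pareto_frontier` helper, and all privacy-loss and utility-loss values are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

# Each candidate is (name, privacy_loss, utility_loss).
# All names and numbers are invented for illustration; lower is better on both axes.
Candidate = Tuple[str, float, float]


def pareto_frontier(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    A candidate is dominated if another one is at least as good on both
    axes and strictly better on at least one.
    """
    frontier = []
    for name, p, u in candidates:
        dominated = any(
            (p2 <= p and u2 <= u) and (p2 < p or u2 < u)
            for _, p2, u2 in candidates
        )
        if not dominated:
            frontier.append((name, p, u))
    # Sort by privacy loss so the tradeoff curve reads left to right.
    return sorted(frontier, key=lambda c: c[1])


candidates = [
    ("A", 0.9, 0.1),
    ("B", 0.5, 0.3),
    ("C", 0.6, 0.4),  # dominated by B
    ("D", 0.2, 0.7),
    ("E", 0.2, 0.9),  # dominated by D
]

frontier = pareto_frontier(candidates)
for name, p, u in frontier:
    print(f"{name}: privacy_loss={p}, utility_loss={u}")
```

Under these assumptions, a data publisher would choose among the remaining candidates according to how much privacy loss is acceptable, rather than comparing privacy and utility directly on a single scale.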



Author: Tiancheng Li, Ninghui Li
Title: On the Tradeoff Between Privacy and Utility in Data Publishing
Journal: KDD-2009 Proceedings
DOI: 10.1145/1557019.1557079
Year: 2009