- (Li et al., 2009) ⇒ Tiancheng Li, and Ninghui Li. (2009). “On the Tradeoff Between Privacy and Utility in Data Publishing.” In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2009). doi:10.1145/1557019.1557079
- Categories and Subject Descriptors: H.2.7 Database Administration: Security, Integrity, and Protection; H.2.8 Database Applications: Data mining.
- General Terms: Algorithms, Experimentation, Security, Theory.
In data publishing, anonymization techniques such as generalization and bucketization have been designed to provide privacy protection. At the same time, they reduce the utility of the data. It is important to consider the tradeoff between privacy and utility. In a paper that appeared in KDD-2008, Brickell and Shmatikov proposed an evaluation methodology that compares the privacy gain with the utility gain resulting from anonymizing the data, and concluded that "even modest privacy gains require almost complete destruction of the data-mining utility". This conclusion seems to undermine existing work on data anonymization. In this paper, we analyze the fundamental characteristics of privacy and utility, and show that it is inappropriate to directly compare privacy and utility. We then observe that the privacy-utility tradeoff in data publishing is similar to the risk-return tradeoff in financial investment, and propose an integrated framework for considering the privacy-utility tradeoff, borrowing concepts from Modern Portfolio Theory for financial investment. Finally, we evaluate our methodology on the Adult dataset from the UCI machine learning repository. Our results clarify several common misconceptions about data utility and provide data publishers with useful guidelines on choosing the right tradeoff between privacy and utility.
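The risk-return analogy in the abstract can be illustrated with a small sketch. It assumes each candidate anonymization is summarized by two scores, a privacy loss and a utility loss (lower is better for both), and keeps only the candidates that no other candidate beats on both axes, loosely analogous to an efficient frontier in portfolio theory. The function name `efficient_frontier`, the labels, and all numeric values are hypothetical and chosen for illustration; they are not the paper's metrics or results.

```python
from typing import List, Tuple

# (label, privacy_loss, utility_loss); lower is better on both axes.
Point = Tuple[str, float, float]


def efficient_frontier(points: List[Point]) -> List[Point]:
    """Return the candidates that are not dominated by any other candidate.

    A candidate is dominated if another one is at least as good on both
    axes and strictly better on at least one.
    """
    frontier = []
    for label, p, u in points:
        dominated = any(
            (p2 <= p and u2 <= u) and (p2 < p or u2 < u)
            for _, p2, u2 in points
        )
        if not dominated:
            frontier.append((label, p, u))
    # Sort by privacy loss so the tradeoff curve reads left to right.
    return sorted(frontier, key=lambda t: t[1])


# Hypothetical scores for five anonymized versions of one dataset.
candidates = [
    ("A", 0.10, 0.60),
    ("B", 0.20, 0.35),
    ("C", 0.25, 0.50),  # worse than B on both axes
    ("D", 0.40, 0.15),
    ("E", 0.45, 0.30),  # worse than D on both axes
]

frontier_labels = [label for label, _, _ in efficient_frontier(candidates)]
print(frontier_labels)
```

Under these assumed scores, a data publisher would choose among the non-dominated candidates according to how much privacy loss is acceptable, rather than comparing a privacy number directly against a utility number.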
|title||On the Tradeoff Between Privacy and Utility in Data Publishing|
|Author||Tiancheng Li and Ninghui Li|
|journal||Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2009)|
|doi||10.1145/1557019.1557079|
|year||2009|