Trading Privacy for Information Loss in the Blink of an Eye

Alexandra Pilalidou, Panos Vassiliadis

Dept. of Computer Science, University of Ioannina, Ioannina, Hellas

24th International Conference on Scientific and Statistical Database Management (SSDBM 2012), 25-27 June 2012, Chania, Crete, Hellas

The publishing of data with privacy guarantees is a task typically performed by a data curator, who is expected to provide quantitative guarantees for the published data via a privacy criterion (e.g., k-anonymity, l-diversity). The anonymization of data is typically performed off-line. In this paper, we provide algorithmic tools that facilitate the negotiation for the anonymization scheme of a data set in user time. Our method takes as input a set of user constraints for (i) suppression, (ii) generalization and (iii) a privacy criterion (k-anonymity, l-diversity) and returns either (a) an anonymization scheme that fulfils these constraints or (b) three approximations to the user request, each obtained by keeping two of the three values of the user input fixed and finding the closest possible approximation for the third parameter. The proposed algorithm involves precomputing suitable histograms for all the different anonymization schemes that a global recoding method can follow. This allows computing exact answers extremely fast (on the order of a few milliseconds).
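
To make the idea of the abstract concrete, the following is a minimal Python sketch of histogram-based negotiation. It is an illustration under stated assumptions and not the algorithm of the paper. The toy rows, the two generalization hierarchies, and the function names (precompute, suppressed, best_k, negotiate) are hypothetical. Only k-anonymity is covered, and breaking ties in favour of the least generalized scheme is an assumption of this sketch.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: (age, zip code) quasi-identifier pairs.
ROWS = [(23, "45110"), (27, "45112"), (25, "45221"), (41, "45221"),
        (45, "45332"), (48, "45332"), (62, "45110"), (29, "45225")]

# Hypothetical hierarchies: level 0 is the raw value, higher levels are coarser.
AGE_LEVELS = [lambda a: a, lambda a: a // 10 * 10, lambda a: "*"]
ZIP_LEVELS = [lambda z: z, lambda z: z[:3] + "**", lambda z: "*"]
HIERARCHIES = [AGE_LEVELS, ZIP_LEVELS]


def precompute(rows):
    """Off-line step: per scheme, a histogram {group size: number of groups}."""
    histograms = {}
    for scheme in product(*(range(len(h)) for h in HIERARCHIES)):
        groups = Counter(
            tuple(HIERARCHIES[i][lvl](row[i]) for i, lvl in enumerate(scheme))
            for row in rows)
        histograms[scheme] = Counter(groups.values())
    return histograms


def suppressed(hist, k):
    """Tuples to suppress so that every remaining group has at least k tuples."""
    return sum(size * n for size, n in hist.items() if size < k)


def best_k(hist, max_supp):
    """Largest k reachable without exceeding the suppression budget."""
    return max(size for size in hist if suppressed(hist, size) <= max_supp)


def height(scheme):
    """Sort key: prefer the least generalized scheme, then lexicographic order."""
    return (sum(scheme), scheme)


def negotiate(histograms, k, max_supp, max_levels):
    """On-line step: an exact answer, or three relaxations of the request."""
    allowed = [s for s in histograms
               if all(lvl <= m for lvl, m in zip(s, max_levels))]
    exact = [s for s in allowed if suppressed(histograms[s], k) <= max_supp]
    if exact:
        return {"exact": min(exact, key=height)}
    # Keep k and the level bounds fixed, relax the suppression budget.
    s1 = min(allowed, key=lambda s: (suppressed(histograms[s], k), height(s)))
    # Keep the suppression budget and the level bounds fixed, relax k.
    s2 = min(allowed, key=lambda s: (-best_k(histograms[s], max_supp), height(s)))
    # Keep k and the suppression budget fixed, relax the level bounds.
    beyond = [s for s in histograms if suppressed(histograms[s], k) <= max_supp]
    return {
        "suppression": (s1, suppressed(histograms[s1], k)),
        "k": (s2, best_k(histograms[s2], max_supp)),
        "generalization": min(beyond, key=height) if beyond else None,
    }


if __name__ == "__main__":
    H = precompute(ROWS)
    print(negotiate(H, k=2, max_supp=2, max_levels=(1, 1)))
    print(negotiate(H, k=2, max_supp=1, max_levels=(1, 1)))
```

In this sketch the on-line step never touches the data itself: every answer is read off the precomputed histograms, which is why a request can be evaluated quickly once the off-line step has run.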

Writings

  • The SSDBM'12 paper (PDF)

The following MSc thesis of A. Pilalidou contains several results not found in the SSDBM'12 paper (detailed experimental findings, theoretical foundations of the proposed algorithms, and extensions of the fundamental method that reduce the off-line preprocessing):

  • Alexandra Pilalidou. On-line negotiation for privacy preserving data publishing. MSc Thesis. MT 2010-15, Dept. of Computer Science, Univ. of Ioannina, 2010. Also available at the Technical Reports of the Dept. of Comp. Science, Univ. Ioannina.

Presentations

  • A fairly lengthy presentation of our results (PDF)
  • A short presentation at SSDBM 2012 (PDF)

Data

The data sets that we have experimented with are found as a MySQL dump here and include