Alexandra Pilalidou, Panos Vassiliadis
Dept. of Computer Science, University of Ioannina, Ioannina, Hellas
24th International Conference on Scientific and Statistical Database Management (SSDBM 2012), 25-27 June 2012, Chania, Crete, Hellas
The publishing of data with privacy guarantees is a task typically performed by a data curator who is expected to provide guarantees for the data he publishes in a quantitative fashion, via a privacy criterion (e.g., k-anonymity, l-diversity). The anonymization of data is typically performed off-line. In this paper, we provide algorithmic tools that facilitate the negotiation for the anonymization scheme of a data set in user time. Our method takes as input a set of user constraints for (i) suppression, (ii) generalization and (iii) a privacy criterion (k-anonymity, l-diversity) and returns either (a) an anonymization scheme that fulfils these constraints or (b) three approximations to the user request, each obtained by keeping two of the three values of the user input fixed and finding the closest possible approximation for the third parameter. The proposed algorithm involves precomputing suitable histograms for all the different anonymization schemes that a global recoding method can follow. This allows computing exact answers extremely fast (on the order of a few milliseconds).
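To make the idea of precomputed histograms concrete, the following is a minimal Python sketch for the k-anonymity case only. It is an illustration under stated assumptions, not the algorithm of the paper: the toy rows, the two generalization hierarchies (`gen_age`, `gen_zip`), and the function names are all hypothetical, the generalization constraint is assumed to be a maximum level per attribute, and the fallback that returns three approximations is not covered.

```python
from collections import Counter
from itertools import product

# Hypothetical toy table: (age, zip code) act as the quasi-identifiers.
ROWS = [(23, "45110"), (27, "45115"), (25, "45110"),
        (41, "45221"), (45, "45228"), (62, "45221")]


def gen_age(age, level):
    """Assumed hierarchy: 0 = exact age, 1 = decade range, 2 = fully masked."""
    if level == 0:
        return str(age)
    if level == 1:
        low = age // 10 * 10
        return f"{low}-{low + 9}"
    return "*"


def gen_zip(zip_code, level):
    """Assumed hierarchy: 0 = exact zip, 1 = first three digits, 2 = fully masked."""
    if level == 0:
        return zip_code
    if level == 1:
        return zip_code[:3] + "**"
    return "*"


# One (generalization function, maximum level) pair per quasi-identifier.
HIERARCHIES = [(gen_age, 2), (gen_zip, 2)]


def precompute_histograms(rows):
    """Off-line step: for every combination of levels (one global recoding
    scheme each), store a histogram mapping group size -> number of groups."""
    histograms = {}
    level_ranges = [range(max_level + 1) for _, max_level in HIERARCHIES]
    for levels in product(*level_ranges):
        groups = Counter(
            tuple(fn(value, level)
                  for (fn, _), value, level in zip(HIERARCHIES, row, levels))
            for row in rows
        )
        histograms[levels] = Counter(groups.values())
    return histograms


def tuples_to_suppress(histogram, k):
    """Tuples in groups smaller than k must be suppressed to reach k-anonymity."""
    return sum(size * count for size, count in histogram.items() if size < k)


def find_schemes(histograms, k, max_suppressed, max_levels):
    """On-line step: scan the histograms only, never the raw data."""
    return sorted(
        levels for levels, histogram in histograms.items()
        if all(level <= limit for level, limit in zip(levels, max_levels))
        and tuples_to_suppress(histogram, k) <= max_suppressed
    )


histograms = precompute_histograms(ROWS)
# Request: 2-anonymity, at most 1 suppressed tuple, age generalized at most 1 level.
print(find_schemes(histograms, k=2, max_suppressed=1, max_levels=(1, 2)))
```

In this sketch the on-line check touches only the small histograms, which is the property that makes interactive answers plausible; the cost is moved to the off-line pass over every combination of hierarchy levels.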
The following is the MSc thesis of A. Pilalidou, which contains several results not found in the SSDBM 2012 paper (detailed experimental findings, theoretical foundations of the proposed algorithms, and extensions of the fundamental method that reduce the off-line preprocessing):
The data sets that we experimented with are available as a MySQL dump here and include