
Semantic microaggregation for the anonymization of query logs using the open directory project

  • Authors: Arnau Erola, Jordi Castellà Roca, Guillermo Navarro-Arribas, Vicenç Torra Reventós
  • Published in: Sort: Statistics and Operations Research Transactions, ISSN 1696-2281, Vol. 35, No. Extra 1, 2011, pp. 41-58
  • Language: English
  • Abstract
    • Web search engines gather information from the queries performed by the user in the form of query logs. These logs are extremely useful for research, marketing, or profiling, but at the same time they are a great threat to the user's privacy. We provide a novel approach to anonymize query logs so they ensure user k-anonymity, by extending a common method used in statistical disclosure control: microaggregation. Furthermore, our microaggregation approach takes into account the semantics of the queries by relying on the Open Directory Project. We have tested our proposal with real data from AOL query logs.

  • References
    • Adar, E. (2007). User 4xxxxx9: Anonymizing query logs. In Query Logs workshop.
    • Barbaro, M. and Zeller, T. (2006). A face is exposed for AOL searcher no. 4417749. The New York Times.
    • Cooper, A. (2008). A survey of query log privacy-enhancing techniques from a policy perspective. ACM Transactions on the Web, 2.
    • Defays, D. and Nanopoulos, P. (1993). Panels of enterprises and confidentiality: the small aggregates method. In Proceedings of 92 Symposium...
    • Domingo-Ferrer, J. and Mateo-Sanz, J.M. (2002). Practical data-oriented microaggregation for statistical disclosure control. IEEE Transactions...
    • Domingo-Ferrer, J. and Torra, V. (2005). Ordinal, continuous and heterogeneous k-anonymity through microaggregation. Data Mining and Knowledge...
    • Domingo-Ferrer, J. and Solanas, A. (2009). Erratum: Erratum to “a measure of variance for hierarchical nominal attributes”. Information Sciences,...
    • EFF. (2009). AOL’s massive data leak. Electronic Frontier Foundation. http://w2.eff.org/Privacy/AOL/.
    • Erola, A., Castellà-Roca, J., Navarro-Arribas, G. and Torra, V. (2010). Semantic microaggregation for the anonymization of query logs. In Proceedings...
    • Frankowski, D., Cosley, D., Sen, S., Terveen, L. and Riedl, J. (2006). You are what you say: privacy risks of public mentions. In Annual ACM...
    • Gauch, S. and Speretta, M. (2004). Personalized search based on user search histories. In Proceedings of International Conference of Knowledge...
    • Google (2008). 2008 annual report. http://investor.google.com/order.html.
    • Hansell, S. (2006). Increasingly, Internet’s data trail leads to court. The New York Times.
    • He, Y. and Naughton, J. (2009). Anonymization of set-valued data via top-down, local generalization. Proceedings of the VLDB Endowment, 2,...
    • Hong, Y., He, X., Vaidya, J., Adam, N. and Atluri, V. (2009). Effective anonymization of query logs. In CIKM’09: Proceedings of the 18th ACM...
    • Korolova, A., Kenthapadi, K., Mishra, N. and Ntoulas, A. (2009). Releasing search queries and clicks privately. In WWW’09: Proceedings of the...
    • Kumar, R., Novak, J., Pang, B. and Tomkins, A. (2007). On anonymizing query logs via token-based hashing. In Proceedings of the 16th international...
    • Miller, G. (2009). WordNet-about us. WordNet. Princeton University. http://wordnet.princeton.edu.
    • Mills, E. (2006). AOL sued over web search data release. CNET News. http://news.cnet.com/8301-107843-6119218-7.html.
    • Navarro-Arribas, G. and Torra, V. (2009). Tree-based microaggregation for the anonymization of search logs. In WI-IAT’09: Proceedings of the...
    • Navarro-Arribas, G., Torra, V., Erola, A. and Castellà-Roca, J. (in press, 2011). User k-anonymity for privacy preserving data mining of...
    • ODP. (2010). Open directory project. http://www.dmoz.org.
    • Oganian, A. and Domingo-Ferrer, J. (2001). On the complexity of optimal microaggregation for statistical disclosure control. Statistical Journal...
    • Poblete, B., Spiliopoulou, M. and Baeza-Yates, R. (2008). Website privacy preservation for query log publishing. In First International Workshop...
    • Samarati, P. (2001). Protecting respondents identities in microdata release. IEEE Transactions on Knowledge and Data Engineering, 13, 1010–1027.
    • SearchEngineWatch. (2009). Global search market share, july 2009 vs. july 2008. http://searchenginewatch.com/3634922.
    • SearchEngineWatch. (2010). Top search providers for september 2010. http://searchenginewatch.com/3641456.
    • Soghoian, C. (2007). The problem of anonymous vanity searches. I/S: A Journal of Law and Policy for the Information Society, 3.
    • Summers, N. (2009). Walking the cyberbeat. Newsweek. http://www.newsweek.com/id/195621.
    • Sweeney, L. (2002). k-anonymity: a model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems,...
    • Torra, V. (2004). Microaggregation for categorical variables: a median based approach. In Proceedings Privacy in Statistical Databases (PSD...
    • Torra, V. (2008). Constrained microaggregation: adding constraints for data editing. Transactions on Data Privacy, 1, 86–104.
    • Ward, J.H. (1963). Hierarchical Grouping to optimize an objective function. Journal of the American Statistical Association, 58, 236–244.
    • Zetter, K. (2009). Yahoo issues takedown notice for spying price list. Wired. http://www.wired.com/threatlevel/2009/12/yahoo-spy-prices/#more-11725.
