Abstract
Anonymizing the large volumes of information collected in big data scenarios is a much-debated challenge. It becomes even harder when inferences drawn from the data can serve as additional adversary knowledge. This is the case for geo-located data, where Points of Interest (POIs) may carry additional information that can be used to link them to a user’s real identity. However, in most cases, publishing a model of the raw data instead of the data itself protects the privacy of the data subjects to some extent, because it minimizes the published information. In this paper, we measure the privacy obtained by minimizing the published POIs when applying the Mobility Markov Chain (MMC) model, which extracts the most important POIs of an individual. We consider the gender inferences that an adversary may draw when the MMC model is published together with additional information, such as the gender or age distribution of each POI or the aggregated gender distribution of all the POIs visited by a data subject. We measure the unicity obtained after applying the MMC model, and the probability that an adversary who knows some POIs in the data before processing can link them with the POIs published after the MMC model is applied. Finally, we measure the anonymity lost when the gender attribute is added to the side knowledge of an adversary who has access to the MMC model. We test our algorithms on a real transaction database.
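As an illustration of the unicity measure mentioned above, the following minimal Python sketch computes the fraction of users whose set of published POIs is shared with no other user. It assumes that unicity is evaluated over each user's full published POI set; the paper's exact definition may differ. The function name `unicity`, the dictionary layout, and the toy POI labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def unicity(published_pois):
    """Fraction of users whose published POI set is shared with no other user.

    published_pois: dict mapping a user id to an iterable of POI labels
    (hypothetical layout, chosen only for this illustration).
    """
    if not published_pois:
        return 0.0
    # Treat each user's published POIs as an unordered set.
    signatures = {user: frozenset(pois) for user, pois in published_pois.items()}
    # Count how many users share each POI set.
    counts = Counter(signatures.values())
    unique_users = sum(1 for sig in signatures.values() if counts[sig] == 1)
    return unique_users / len(signatures)


# Hypothetical toy data: user id -> POIs kept after model extraction.
toy = {
    "u1": ["home_A", "work_X"],
    "u2": ["home_A", "work_X"],
    "u3": ["home_B", "work_X"],
    "u4": ["home_C", "gym_Y"],
}
print(unicity(toy))  # u3 and u4 are unique, u1 and u2 collide -> 0.5
```

Under this reading, a lower value means more users hide within groups that share the same published POIs, which is the effect expected from minimizing the published information.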
Notes
1. VISA Merchant Category Classification (MCC) codes directory.
Acknowledgement
This work was supported by the Spanish Government, in part under Grant RTI2018-095094-B-C22 “CONSENT”, and in part under Grant TIN2014-57364-C2-2-R “SMARTGLACIS.”
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Salas, J., Nunez-del-Prado, M. (2020). Privacy Preservation and Inference with Minimal Mobility Information. In: Lossio-Ventura, J.A., Condori-Fernandez, N., Valverde-Rebaza, J.C. (eds) Information Management and Big Data. SIMBig 2019. Communications in Computer and Information Science, vol 1070. Springer, Cham. https://doi.org/10.1007/978-3-030-46140-9_13
DOI: https://doi.org/10.1007/978-3-030-46140-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46139-3
Online ISBN: 978-3-030-46140-9
eBook Packages: Computer Science (R0)