Discovering User Behavioral Features to Enhance Information Search on Big Data

Published: 29 July 2017

Abstract

The emerging Big Data paradigm, driven by the increasing availability of intelligent services that are easily accessible to a large number of users (e.g., social networks), makes traditional data management techniques inadequate in many real-life scenarios. In particular, the availability of huge amounts of data about user social interactions, preferences, and opinions calls for advanced analysis strategies that can reveal potentially interesting social dynamics. Furthermore, the heterogeneity and high speed of user-generated data require suitable data storage and management tools to be designed from scratch. This article presents a framework tailored for analyzing how users interact with intelligent systems while they seek domain-specific information (e.g., choosing a good restaurant in a visited area). The framework enhances a user's quest for information by exploiting previous knowledge about their social environment, the extent of influence the user is potentially subject to, and the influence the user may exert on other users. The spread of user influence across the network is also computed dynamically, which improves the user's search strategy by providing specific suggestions, represented as tailored faceted features. These features result from a data exchange activity (called data posting) that enriches information sources with additional background information and knowledge derived from the experiences and behavioral properties of domain experts and users. The approach is tested in an important application scenario, tourist recommendation, but it can be profitably exploited in several other contexts, for example, viral marketing and food education.
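
The abstract does not state which diffusion model the framework uses to compute influence spread. As a minimal sketch only, assuming the independent cascade model (a common choice in influence-maximization work) and a hypothetical toy network with made-up influence probabilities, the expected spread of a user could be estimated by Monte Carlo simulation. The names `simulate_cascade`, `estimate_spread`, and `toy_network` are illustrative and do not come from the article.

```python
import random
from collections import deque


def simulate_cascade(graph, seeds, rng):
    """Run one independent-cascade diffusion; return the set of activated users.

    graph maps each user to a dict {neighbor: influence probability}.
    Each newly activated user gets exactly one chance to activate each
    of its still-inactive neighbors.
    """
    active = set(seeds)
    frontier = deque(active)
    while frontier:
        user = frontier.popleft()
        for neighbor, prob in graph.get(user, {}).items():
            if neighbor not in active and rng.random() < prob:
                active.add(neighbor)
                frontier.append(neighbor)
    return active


def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Estimate the expected number of activated users by averaging many runs."""
    rng = random.Random(seed)
    total = sum(len(simulate_cascade(graph, seeds, rng)) for _ in range(runs))
    return total / runs


# Hypothetical network: edge weights are assumed influence probabilities.
toy_network = {
    "alice": {"bob": 0.6, "carol": 0.3},
    "bob": {"dave": 0.5},
    "carol": {"dave": 0.2, "erin": 0.4},
}

if __name__ == "__main__":
    print(estimate_spread(toy_network, ["alice"]))
```

A recommender built on such an estimate could, for instance, rank the opinions of users with a higher estimated spread more prominently, though how the article's framework actually uses influence is described only in the full text.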

    Published In

    ACM Transactions on Interactive Intelligent Systems  Volume 7, Issue 2
    June 2017
    87 pages
    ISSN:2160-6455
    EISSN:2160-6463
    DOI:10.1145/3129288
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 29 July 2017
    Accepted: 01 June 2016
    Revised: 01 June 2016
    Received: 01 August 2015
    Published in TIIS Volume 7, Issue 2

    Author Tags

    1. NoSQL databases
    2. information extraction
    3. intelligent recommendation
    4. personal big data
    5. user behavior

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • MIUR (the Italian Ministry for Research and University)
    • PON (national operative program) Project PON04a2_D “INMOTO: INformation and MObility for TOurism”
    • Regional Operating Program (POR)

    Article Metrics

    • Downloads (Last 12 months): 22
    • Downloads (Last 6 weeks): 4
    Reflects downloads up to 06 Jan 2025

    Cited By

    • (2024) Explicating the mapping between big data and knowledge management: a systematic literature review and future directions. Benchmarking: An International Journal. DOI: 10.1108/BIJ-09-2022-0550. Online publication date: 29-Mar-2024.
    • (2024) A Multimodal conceptual framework to achieve automated software evolution for context-rich intelligent applications. Innovations in Systems and Software Engineering. DOI: 10.1007/s11334-024-00591-0. Online publication date: 10-Nov-2024.
    • (2024) The COVID-19 Crisis Effect on Railways’ Digital Branding: Risk Management Applications Utilizing Big Data. Computational and Strategic Business Modelling (57-67). DOI: 10.1007/978-3-031-41371-1_6. Online publication date: 14-Feb-2024.
    • (2023) Modeling, Evaluating, and Applying the eWoM Power of Reddit Posts. Big Data and Cognitive Computing 7:1 (47). DOI: 10.3390/bdcc7010047. Online publication date: 9-Mar-2023.
    • (2022) Extraction and analysis of text patterns from NSFW adult content in Reddit. Data & Knowledge Engineering 138:C. DOI: 10.1016/j.datak.2022.101979. Online publication date: 1-Mar-2022.
    • (2022) Big data in the food supply chain: a literature review. Journal of Data, Information and Management 4:1 (33-47). DOI: 10.1007/s42488-021-00064-0. Online publication date: 24-Jan-2022.
    • (2022) Extracting time patterns from the lifespans of TikTok challenges to characterize non-dangerous and dangerous ones. Social Network Analysis and Mining 12:1. DOI: 10.1007/s13278-022-00893-w. Online publication date: 16-Jun-2022.
    • (2021) New Media User Behaviour Research Based on Big Data Analysis. Journal of Physics: Conference Series 1802:4 (042037). DOI: 10.1088/1742-6596/1802/4/042037. Online publication date: 1-Mar-2021.
    • (2021) Investigating the phenomenon of NSFW posts in Reddit. Information Sciences: an International Journal 566:C (140-164). DOI: 10.1016/j.ins.2021.01.062. Online publication date: 1-Aug-2021.
    • (2020) Research on Weibo user behavior system for subjective perception and big data mining technology. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 38:2 (1225-1234). DOI: 10.3233/JIFS-179484. Online publication date: 1-Jan-2020.
