DOI: 10.1145/3486622.3494008

Interpretable Mining of Influential Patterns from Sparse Web

Published: 13 April 2022

Abstract

Big data are everywhere, and the World Wide Web is one example. It has become a vast platform for data production and consumption, on which threads of data evolve from multiple devices, through different human interactions, across worldwide locations, and under divergent distributed settings. Embedded in these big web data are implicit, previously unknown, and potentially useful information and knowledge waiting to be discovered. This calls for web intelligence solutions, which make good use of data science and data mining (especially web mining) to discover useful knowledge and important information from the web. As one web mining task, web structure mining examines incoming and outgoing links on web pages and recommends frequently referenced web pages to web surfers. As another web mining task, web usage mining examines web surfer patterns and recommends frequently visited pages to web surfers. Although the web is huge, the connections among its pages may be sparse. In other words, although the number of vertices (i.e., web pages) on the web is huge, the number of directed edges (i.e., incoming and outgoing hyperlinks between web pages) may be comparatively small. This leads to a sparse web. In this paper, we present a solution for interpretable mining of influential patterns from the sparse web. In particular, we represent web structure and usage information as bitmaps that capture connections to web pages. Because the web is sparse, we compress the bitmaps and use them to mine influential patterns (e.g., popular web pages). To keep the mining process explainable, we ensure that the compressed bitmaps remain interpretable. Evaluation on real-life web data demonstrates the effectiveness, interpretability, and practicality of our solution.
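The idea of compressing sparse link bitmaps while keeping them readable can be illustrated with a minimal sketch. The abstract does not specify the compression scheme, so plain run-length encoding is used here as a stand-in, and the function names (`compress`, `count_ones`, `popular_pages`), the toy pages, and the `min_links` threshold are all hypothetical. Each page gets one bitmap row whose i-th bit is 1 when page i links to it; the row is stored as (bit, run length) pairs, which a person can still read directly.

```python
from itertools import groupby


def compress(bits):
    """Run-length encode a list of 0/1 values as (bit, run_length) pairs."""
    return [(bit, sum(1 for _ in run)) for bit, run in groupby(bits)]


def count_ones(runs):
    """Count set bits directly from the runs, without decompressing."""
    return sum(length for bit, length in runs if bit == 1)


def popular_pages(in_links, num_pages, min_links):
    """Return pages with at least min_links incoming links.

    in_links maps a page name to the set of page indices that link to it
    (an assumed input format, chosen only for this illustration).
    """
    result = {}
    for page, sources in in_links.items():
        # One bitmap row per page: bit i is 1 when page i links to this page.
        bitmap = [1 if i in sources else 0 for i in range(num_pages)]
        runs = compress(bitmap)
        if count_ones(runs) >= min_links:
            result[page] = runs
    return result


# Hypothetical toy web of 8 pages.
in_links = {"A": {1, 2, 3}, "B": {6}, "C": set()}
popular = popular_pages(in_links, num_pages=8, min_links=2)
print(popular)
```

In this toy run only page A passes the threshold, and its compressed row reads as "one 0, three 1s, four 0s", that is, pages 1 to 3 link to A. Long runs of zeros, which dominate a sparse web, collapse into a single pair.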

Published In

WI-IAT '21: IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
December 2021
698 pages
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Bitmap
  2. Data analytics
  3. Data compression
  4. Data mining
  5. Data science
  6. Explainability
  7. Frequent pattern mining
  8. Intelligent agent technology
  9. Interpretability
  10. Recommendation
  11. Web intelligence
  12. Web mining
  13. Web of data
  14. Web structure mining
  15. Web usage mining
  16. World wide web

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

WI-IAT '21: IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
December 14-17, 2021
Melbourne, VIC, Australia

Article Metrics

  • Downloads (Last 12 months): 11
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 13 Dec 2024

Cited By

  • (2024) Developing a Big Data Science Based Model Linked to Meteorological Data for Enhanced Applicability of Transportation Analytics. International Journal of Professional Studies 17:1 (256-266). DOI: 10.37648/ijps.v17i01.020. Online publication date: 2024
  • (2022) Web Mining from Interpretable Compressed Representation of Sparse Web. 2022 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) (620-627). DOI: 10.1109/WI-IAT55865.2022.00097. Online publication date: Nov-2022
  • (2022) Towards Trustworthy Artificial Intelligence in Healthcare. 2022 IEEE 10th International Conference on Healthcare Informatics (ICHI) (626-632). DOI: 10.1109/ICHI54592.2022.00127. Online publication date: Jun-2022
  • (2022) A Deep Learning Based Predictive Model for Healthcare Analytics. 2022 IEEE 10th International Conference on Healthcare Informatics (ICHI) (547-549). DOI: 10.1109/ICHI54592.2022.00106. Online publication date: Jun-2022
  • (2022) Trustworthy Explanations for Knowledge Discovered from E-Health Records. 2022 IEEE International Conference on E-health Networking, Application & Services (HealthCom) (246-251). DOI: 10.1109/HealthCom54947.2022.9982786. Online publication date: 17-Oct-2022
  • (2022) A Data Science Solution for Mining Weather Data and Transportation Data for Smart Cities. 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC) (1672-1677). DOI: 10.1109/COMPSAC54236.2022.00266. Online publication date: Jun-2022
  • (2022) Mining Popular Topics from the Media. 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC) (1262-1267). DOI: 10.1109/COMPSAC54236.2022.00199. Online publication date: Jun-2022
  • (2022) Predictive Analytics for Supporting Environmental Sustainability and Disaster Management. 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC) (1256-1261). DOI: 10.1109/COMPSAC54236.2022.00198. Online publication date: Jun-2022
  • (2022) A Big Data Science Solution for Transportation Analytics with Meteorological Data. 2022 IEEE 16th International Conference on Big Data Science and Engineering (BigDataSE) (21-28). DOI: 10.1109/BigDataSE56411.2022.00013. Online publication date: Dec-2022
