DOI: 10.1145/3459637.3481967
short-paper

DORA THE EXPLORER: Exploring Very Large Data With Interactive Deep Reinforcement Learning

Published: 30 October 2021

Abstract

We demonstrate DORA THE EXPLORER, a system that guides users in finding items of interest in a very large data set. DORA THE EXPLORER provides users with the full spectrum of exploration modes and is driven by Data Familiarity or Curiosity, as well as User Interventions. DORA THE EXPLORER is able to handle data and search scenario complexity, i.e., the difficulty of finding scattered or clustered individual records in the data set, and the user's ability to express what they need. DORA THE EXPLORER relies on Deep Reinforcement Learning that combines intrinsic (curiosity) and extrinsic (familiarity) rewards. DORA's main goal is to support scientific discovery from data. We describe the system architecture and illustrate it with three demonstration scenarios on 2.6 million galaxies from SDSS, a large sky survey data set. A video of DORA THE EXPLORER is available at https://bit.ly/dora-demo, the code at https://github.com/apersonnaz/rl-guided-galaxy-exploration, and the application at https://bit.ly/dora-application.
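
To make the idea of combining intrinsic and extrinsic rewards concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the DORA code base: the linear weighting, the count-based curiosity bonus, and the names `combined_reward`, `curiosity_bonus`, and `weight` are all assumptions chosen for the example.

```python
from collections import Counter


def curiosity_bonus(visit_counts, state):
    """Hypothetical intrinsic reward: states seen less often score higher."""
    return 1.0 / (1.0 + visit_counts[state])


def combined_reward(familiarity, curiosity, weight=0.5):
    """Hypothetical blend of an extrinsic (familiarity) and an intrinsic
    (curiosity) reward. weight=1.0 uses familiarity only, weight=0.0
    uses curiosity only."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * familiarity + (1.0 - weight) * curiosity


# Example: an agent revisits the same set of galaxies twice.
visits = Counter()
rewards = []
for state in ["set_a", "set_a"]:
    # The familiarity score is fixed at 1.0 here purely for illustration.
    r = combined_reward(1.0, curiosity_bonus(visits, state), weight=0.5)
    rewards.append(r)
    visits[state] += 1
```

In this sketch the second visit to `set_a` earns a lower reward than the first, because the curiosity term shrinks as the visit count grows while the familiarity term stays constant.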

Supplementary Material

MP4 File (Dora_conf_demo_with_sub.mp4)
Dora demonstration - 20 minutes - with subtitles

    Published In

    CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
    October 2021
    4966 pages
    ISBN:9781450384469
    DOI:10.1145/3459637
    © 2021 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Badges

    • Best Demo

    Author Tags

    1. data exploration
    2. reinforcement learning
    3. sdss datasets
    4. user feedback

    Qualifiers

    • Short-paper

    Funding Sources

    • European Union's Horizon 2020

    Conference

    CIKM '21

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Cited By

    • (2024) Interestingness Measures for Exploratory Data Analysis: a Survey. New Trends in Database and Information Systems, 14-24. DOI: 10.1007/978-3-031-70421-5_2. Online publication date: 14-Nov-2024
    • (2023) SHEVA: A Visual Analytics System for Statistical Hypothesis Exploration. Proceedings of the VLDB Endowment 16:12, 4102-4105. DOI: 10.14778/3611540.3611631. Online publication date: 12-Sep-2023
    • (2023) Learn to Explore: on Bootstrapping Interactive Data Exploration with Meta-learning. 2023 IEEE 39th International Conference on Data Engineering (ICDE), 1720-1733. DOI: 10.1109/ICDE55515.2023.00135. Online publication date: Apr-2023
    • (2022) EDA4SUM. Proceedings of the VLDB Endowment 15:12, 3590-3593. DOI: 10.14778/3554821.3554851. Online publication date: 1-Aug-2022
    • (2022) Guided Text-based Item Exploration. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 3410-3420. DOI: 10.1145/3511808.3557141. Online publication date: 17-Oct-2022
    • (2022) BETZE: Benchmarking Data Exploration Tools with (Almost) Zero Effort. 2022 IEEE 38th International Conference on Data Engineering (ICDE), 2385-2398. DOI: 10.1109/ICDE53745.2022.00224. Online publication date: May-2022
