research-article

Subjective databases

Authors:

Wang-Chiew Tan

Proceedings of the VLDB Endowment, Volume 12, Issue 11

Pages 1330 - 1343

https://doi.org/10.14778/3342263.3342271

Published: 01 July 2019

Abstract

Online users are constantly seeking experiences, such as a hotel with clean rooms and a lively bar, or a restaurant for a romantic rendezvous. However, e-commerce search engines only support queries involving objective attributes such as location, price, and cuisine, and any experiential data is relegated to text reviews.

To support experiential queries, a database system needs to model subjective data. Users should be able to pose queries that specify subjective experiences in their own words, in addition to conditions on the usual objective attributes. This paper introduces OpineDB, a subjective database system that addresses these challenges. We introduce a data model for subjective databases. We describe how OpineDB translates subjective queries against the subjective database schema by matching the user's query phrases to the underlying schema. We also show how the experiential conditions specified by the user can be combined and the results aggregated and ranked. Through experiments with real hotel and restaurant review data, we demonstrate that subjective databases satisfy user needs more effectively and accurately than alternative techniques.
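As a rough illustration of the two steps named in the abstract (matching query phrases to schema attributes, then combining conditions and ranking), the following Python sketch may help. It is not OpineDB's implementation: the attribute names, marker phrases, word-overlap matcher, and product-based combination are all assumptions made for this example, and the per-hotel degrees are invented numbers standing in for scores that would be derived from reviews.

```python
from typing import Dict, List, Tuple

# Hypothetical subjective attributes, each described by a few marker phrases.
SCHEMA: Dict[str, List[str]] = {
    "room_cleanliness": ["clean rooms", "spotless room", "dirty room"],
    "bar_atmosphere": ["lively bar", "quiet bar", "fun bar"],
}


def match_attribute(phrase: str) -> str:
    """Return the attribute whose marker phrases best overlap the query phrase.

    Word-level Jaccard overlap is an assumed stand-in for whatever
    phrase-to-schema matching a real system would use.
    """
    words = set(phrase.lower().split())

    def best_overlap(attr: str) -> float:
        best = 0.0
        for marker in SCHEMA[attr]:
            marker_words = set(marker.split())
            jaccard = len(words & marker_words) / len(words | marker_words)
            best = max(best, jaccard)
        return best

    return max(SCHEMA, key=best_overlap)


def rank(
    entities: Dict[str, Dict[str, float]], phrases: List[str], k: int
) -> List[Tuple[str, float]]:
    """Combine per-condition degrees in [0, 1] by product and return the top k.

    The product acts as a fuzzy-logic style conjunction (an assumption);
    an attribute missing for an entity is treated as degree 0.0.
    """
    attrs = [match_attribute(p) for p in phrases]
    scored = []
    for name, degrees in entities.items():
        score = 1.0
        for attr in attrs:
            score *= degrees.get(attr, 0.0)
        scored.append((name, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Invented example data: degree to which each hotel satisfies each attribute.
hotels = {
    "Hotel A": {"room_cleanliness": 0.9, "bar_atmosphere": 0.4},
    "Hotel B": {"room_cleanliness": 0.7, "bar_atmosphere": 0.8},
    "Hotel C": {"room_cleanliness": 0.95},
}
top = rank(hotels, ["clean rooms", "a lively bar"], k=2)
```

With the product as the conjunction, an entity must do reasonably well on every condition to rank highly; swapping in `min` would be another common choice and would change only the line that updates `score`.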

Subjective databases
1. Information systems
  1. Data management systems
    1. Database management system engines
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory

Published In

Proceedings of the VLDB Endowment, Volume 12, Issue 11

July 2019

543 pages

ISSN: 2150-8097

Editors: Lei Chen, Fatma Özcan

Publisher

VLDB Endowment