
Less Is More: Rejecting Unreliable Reviews for Product Question Answering

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12459)

Abstract

Answering product questions promptly and accurately is important for e-commerce applications. Answering them manually (e.g. on community question answering platforms) is slow and does not scale. Recent studies show that product reviews are a good source for real-time, automatic product question answering (PQA). In the literature, PQA is formulated as a retrieval problem whose goal is to find the reviews most relevant to a given product question. In this paper, we focus on answerability and answer reliability for PQA using reviews. Our investigation is based on the intuition that many questions may not be answerable with a finite set of reviews. When a question is not answerable, a system should return nil answers rather than a list of irrelevant reviews, which can have a significant negative impact on user experience. Moreover, for answerable questions, only the most relevant reviews that answer the question should be included in the result. We propose a framework based on conformal prediction to improve the reliability of PQA systems: we reject unreliable answers so that the returned results answer the product question more concisely and accurately, and we return nil answers for unanswerable questions. Experiments on a widely used Amazon dataset show encouraging results for the proposed framework. More broadly, our results demonstrate a novel and effective application of conformal methods to a retrieval task.
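
To illustrate the rejection idea described in the abstract, the following is a minimal sketch of an inductive conformal predictor used as a review filter. It is not the authors' implementation: the nonconformity measure (one minus the predicted relevance probability), the function names, the calibration set of scores for reviews known to be relevant, and the significance level `epsilon` are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def conformal_p_value(calibration_scores: Sequence[float], test_score: float) -> float:
    """p-value for the hypothesis "this review is relevant".

    calibration_scores: model probabilities for held-out reviews that are
    known to be relevant (hypothetical calibration set).
    test_score: model probability for the candidate review.
    """
    # Assumed nonconformity measure: 1 - predicted relevance probability.
    test_alpha = 1.0 - test_score
    # Count calibration examples that look at least as "strange" as the candidate.
    at_least_as_strange = sum(1 for s in calibration_scores if 1.0 - s >= test_alpha)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def filter_reviews(
    calibration_scores: Sequence[float],
    scored_reviews: Sequence[Tuple[str, float]],
    epsilon: float = 0.2,
) -> List[str]:
    """Keep reviews whose p-value exceeds epsilon, ranked by model score.

    An empty list plays the role of a nil answer for an unanswerable question.
    """
    kept = [
        (review, score)
        for review, score in scored_reviews
        if conformal_p_value(calibration_scores, score) > epsilon
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [review for review, _ in kept]


if __name__ == "__main__":
    calibration = [0.9, 0.8, 0.7, 0.6]  # made-up scores for illustration
    candidates = [("fits a 15-inch laptop", 0.95), ("arrived late", 0.1)]
    print(filter_reviews(calibration, candidates, epsilon=0.2))
```

In this sketch a larger `epsilon` rejects more reviews, so the result list gets shorter and a nil answer becomes more likely; how the significance level is chosen in the paper is not stated in the abstract.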

Notes

  1. https://www.amazon.com/

  2. https://world.taobao.com/

  3. https://www.mturk.com/

  4. https://github.com/zswvivi/ecml_pqa

  5. https://cseweb.ucsd.edu/~jmcauley

  6. https://github.com/zswvivi/icdm_pqa

  7. The original implementation uses a softmax activation function to compute P(r|q), so the probabilities of all reviews sum to one. We replace the softmax with a sigmoid function, so that each review yields a valid probability distribution over the positive and negative classes.

  8. Following the original papers, a “review” is technically a “review sentence” rather than the full review.

  9. To control for quality, we insert a control question with a known answer (from the QA pair) every three questions. Workers who consistently give low scores to these control questions are filtered out.

  10. This step is only needed for moqa, as bertqa and fltr already produce probabilities. For moqa, we convert the review score into a probability by applying a sigmoid function to the log score.
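
The difference between the softmax and sigmoid scoring described in notes 7 and 10 can be sketched as follows. This is an illustrative example only, not code from the linked repositories: the function names are hypothetical, and `score_to_probability` assumes one reading of note 10, namely that the raw review score is positive and the sigmoid is applied to its logarithm.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Normalise scores jointly: the outputs over all reviews sum to one."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    """Map one score to (0, 1) independently of the other reviews."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def score_to_probability(score: float) -> float:
    """Assumed reading of note 10: sigmoid of the log of a positive score.

    Algebraically, sigmoid(log(s)) = s / (1 + s) for s > 0.
    """
    return sigmoid(math.log(score))


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0]  # made-up review scores
    print(sum(softmax(logits)))          # joint distribution over reviews
    print([sigmoid(x) for x in logits])  # one independent probability per review
    print(score_to_probability(3.0))
```

With a softmax, a review's probability depends on the other candidates, so even the best review for an unanswerable question can receive a high value; with a sigmoid, every review can receive a low probability at the same time, which is what a per-review reject decision needs.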

References

  1. McAuley, J., Yang, A.: Addressing complex and subjective product-related queries with customer reviews. In: WWW (2016)

  2. Zhao, J., Guan, Z., Sun, H.: Riker: mining rich keyword representations for interpretable product question answering. In: SIGKDD (2019)

  3. Zhang, S., Lau, J.H., Zhang, X., Chan, J., Paris, C.: Discovering Relevant Reviews for Answering Product-related Queries. In: ICDM (2019)

  4. Gao, S., Ren, Z., et al.: Product-aware answer generation in e-commerce question-answering. In: WSDM (2019)

  5. Chen, S., Li, C., et al.: Driven answer generation for product-related questions in e-commerce. In: WSDM (2019)

  6. Rajpurkar, P., Jia, R., Liang, P.: Know what you don’t know: unanswerable questions for SQuAD. In: ACL (2018)

  7. Herbei, R., Wegkamp, M.H.: Classification with reject option. The Canadian Journal of Statistics/La Revue Canadienne de Statistique (2006)

  8. Gammerman, A.: Conformal Predictors for Reliable Pattern Recognition. In: Computer Data Analysis and Modeling: Stochastics and Data Science (2019)

  9. Vovk, V., Gammerman, A., Shafer, G.: Algorithmic Learning in a Random World. Springer, New York (2005)

  10. Shafer, G., Vovk, V.: A tutorial on conformal prediction. J. Mach. Learn. Res. 9, 371–421 (2008)

  11. Toccaceli, P., Gammerman, A.: Combination of inductive mondrian conformal predictors. Mach. Learn. 108(3), 489–510 (2018). https://doi.org/10.1007/s10994-018-5754-9

  12. Carlsson, L., Bendtsen, C., Ahlberg, E.: Comparing performance of different inductive and transductive conformal predictors relevant to drug discovery. In: Conformal and Probabilistic Prediction and Applications (2017)

  13. Cortes-Ciriano, I., Bender, A.: Reliable prediction errors for deep neural networks using test-time dropout. J. Chem. Inf. Model. 59(7), 3330–3339 (2019)

  14. Jacobs, R.A., Jordan, M.I., Nowlan, S.J., Hinton, G.E.: Adaptive mixtures of local experts. Neural Comput. 3(1), 79–87 (1991)

  15. Devlin, J., Chang, M.W., et al.: BERT: pre-training of deep bidirectional transformers for language understanding. In: NAACL (2019)

  16. Gupta, M., Kulkarni, N., Chanda, R., et al.: AmazonQA: a review-based question answering task. In: IJCAI (2019)

  17. Hu, M., Wei, F., Peng, Y., et al.: Read+ verify: machine reading comprehension with unanswerable questions. In: AAAI (2019)

  18. Sun, F., Li, L., et al.: U-net: machine reading comprehension with unanswerable questions (2018)

  19. Godin, F., Kumar, A., Mittal, A.: Learning when not to answer: a ternary reward structure for reinforcement learning based question answering. In: NAACL-HLT (2019)

  20. Huang, K., Tang, Y., Huang, J., He, X., Zhou, B.: Relation module for non-answerable predictions on reading comprehension. In: CoNLL (2019)

  21. Joshi, M., Choi, E., Weld, D.S., Zettlemoyer, L.: TriviaQA: a large scale distantly supervised challenge dataset for reading comprehension. In: ACL (2017)

  22. Dunn, M., Sagun, L., Higgins, M., Guney, V.U., Cirik, V., Cho, K.: Searchqa: a new qa dataset augmented with context from a search engine (2017)

  23. Su, L., Guo, J., Fan, Y., Lan, Y., Cheng, X.: Controlling risk of web question answering. In: SIGIR (2019)

  24. Sun, J., Carlsson, L., Ahlberg, E., et al.: Applying mondrian cross-conformal prediction to estimate prediction confidence on large imbalanced bioactivity data sets. J. Chem. Inf. Model. 57(7), 1591–1598 (2017)

  25. Card, D., Zhang, M., Smith, N.A.: Deep weighted averaging classifiers. In: Proceedings of the Conference on Fairness, Accountability, and Transparency (2019)

  26. Gal, Y., Ghahramani, Z.: Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In: ICML (2016)

  27. Liu, F., Moffat, A., Baldwin, T., Zhang, X.: Quit while ahead: Evaluating truncated rankings. In: SIGIR (2016)

  28. Järvelin, K., Kekäläinen, J.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. (TOIS) 20(4), 422–446 (2002)

  29. Kubat, M., Holte, R.C., Matwin, S.: Machine learning for the detection of oil spills in satellite radar images. Mach. Learn. 30, 195–215 (1998). https://doi.org/10.1023/A:1007452223027

Acknowledgement

Shiwei Zhang is supported by the RMIT University and CSIRO Data61 Scholarships.

Author information

Corresponding author

Correspondence to Xiuzhen Zhang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, S., Zhang, X., Lau, J.H., Chan, J., Paris, C. (2021). Less Is More: Rejecting Unreliable Reviews for Product Question Answering. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12459. Springer, Cham. https://doi.org/10.1007/978-3-030-67664-3_34

  • DOI: https://doi.org/10.1007/978-3-030-67664-3_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67663-6

  • Online ISBN: 978-3-030-67664-3

  • eBook Packages: Computer Science, Computer Science (R0)
