
Bravely Say I Don’t Know: Relational Question-Schema Graph for Text-to-SQL Answerability Classification

Published: 25 March 2023

Abstract

The Text-to-SQL task has recently received much attention, and many sophisticated neural models have been proposed that achieve strong results. Most existing work assumes that every input is legal and that the model should generate an SQL query for any input. In real scenarios, however, users may enter arbitrary text that cannot be answered by an SQL query. In this article, we focus on answerability classification for Text-to-SQL systems, which aims to determine whether a question can be answered given a database schema. Existing methods concatenate the question and the database schema into a single sequence and fine-tune a pre-trained language model on the answerability classification task. Treating the database schema as sequential text, however, ignores its intrinsic structural relationships, and the attention that captures the correlation between question tokens and schema items is not well designed. To this end, we propose a relational Question-Schema graph framework that effectively models the attention and relations between the question and the schema. In addition, a conditional layer normalization mechanism is employed to modulate the pre-trained language model so that it generates better question representations. Experiments demonstrate that the proposed framework outperforms all existing models by large margins, achieving a new state of the art on the TRIAGESQL benchmark. Specifically, the model attains 88.41% Precision, 78.24% Recall, and 75.98% F1, outperforming the baseline by approximately 4.05%, 6.96%, and 6.01%, respectively.
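The conditional layer normalization mechanism mentioned above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function name and the projection matrices `W_gamma` and `W_beta` are assumed for the example. The idea is that a conditioning vector (e.g., a schema summary embedding) generates the scale and shift applied after standard layer normalization, so the schema modulates the question representation:

```python
import numpy as np

def conditional_layer_norm(x, cond, W_gamma, W_beta, eps=1e-5):
    """Layer normalization whose scale/shift are generated from a condition.

    x:       (batch, d) hidden states to normalize
    cond:    (batch, c) conditioning vector (e.g., schema embedding)
    W_gamma: (c, d) projection producing the modulated scale
    W_beta:  (c, d) projection producing the modulated shift
    """
    # Standard layer norm over the feature dimension.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mu) / np.sqrt(var + eps)
    # Condition-dependent affine parameters; scale is centered at 1 so
    # zero weights recover plain layer normalization.
    gamma = 1.0 + cond @ W_gamma
    beta = cond @ W_beta
    return gamma * x_hat + beta
```

With zero projection weights this reduces to ordinary layer normalization; nonzero weights let the schema embedding reshape each question token's normalized representation.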



Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 4
April 2023, 682 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3588902

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 March 2023
Online AM: 04 January 2023
Accepted: 19 December 2022
Revised: 17 October 2022
Received: 19 September 2021
Published in TALLIP Volume 22, Issue 4


Author Tags

  1. Text-to-SQL
  2. answerability classification
  3. relational graph

Qualifiers

  • Research-article

Funding Sources

  • Natural Science Key Project of Sichuan Minzu College
