Abstract
Understanding the complexity of translating Natural Language (NL) sentences into SQL queries is an essential part of the resolution process. Most proposed models either focus on simple queries or struggle when exposed to unseen domains or new schema structures. This is understandable, as most solutions are based on limited datasets or treat the problem from an end-to-end perspective. Our previously proposed model, SQLSketch, which provides an intelligent method for handling complex queries, outperformed all the state-of-the-art models on the GreatSQL dataset. This paper addresses the problem of translating NL sentences into SQL queries effectively by extending SQLSketch with a type-aware layer, a value classification method, and a compatibility-based module that together enhance the quality of the predicted items (SQLSketch-TVC). We evaluate the new model using the Component Matching and Exact Matching metrics. The results show that SQLSketch-TVC outperforms the other models on all SQL components and provides a novel way of inferring values from the input question.
Acknowledgements
We thank Sara Slila for helping with and participating in this work. We also thank everyone, near or far, who provided feedback and took part in the promising discussions around this project.
Cite this article
Ahkouk, K., Machkour, M. SQLSketch-TVC: Type, value and compatibility based approach for SQL queries. Appl Intell 53, 3889–3898 (2023). https://doi.org/10.1007/s10489-022-03587-0
DOI: https://doi.org/10.1007/s10489-022-03587-0