Enhancing Table Retrieval with Dual Graph Representations

Tianyun Liu, Xinghua Zhang, Zhenyu Zhang, Yubin Wang, Quangang Li, Shuai Zhang & Tingwen Liu

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14172)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Table retrieval aims to rank candidate tables for answering a natural language query, where the most critical problem is how to learn informative representations for structured tables. Most previous methods simply flatten the table and feed it into a sequence encoder, ignoring the structural information of tables and the semantic interaction between table cells and contexts. In this paper, we propose a dual-graph-based method to capture the semantics and structure of tables, so as to better support the downstream table retrieval task. Inspired by human cognition, we first decouple a table into a row view and a column view, then build dual graphs from these two views while taking table contexts into account. Afterward, intra-graph and inter-graph interactions are iteratively performed to aggregate and exchange local row- and column-oriented features, respectively, and an adaptive fusion strategy is then applied to produce the final table representations. In this way, both table structure and semantic information are captured through dual-graph modeling. Consequently, the input query can be matched against candidate tables based on these table representations, yielding more accurate ranking results. Extensive experiments verify the superiority of our dual graphs over strong baselines on two table retrieval datasets, WikiTables and WebQueryTable. Further analyses also confirm the adaptability to row- and column-oriented tables, and show the rationality and generalization of dual graphs. The source code is available at https://github.com/ty33123/DualG.
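
To make the row-view and column-view idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that cells sharing a row are linked in one graph, cells sharing a column are linked in the other, and a single context node (for example a caption) is linked to every cell in both. The function names, the scalar stand-in features, the mean aggregation, and the fixed fusion weight `gate` are all hypothetical; the inter-graph exchange and any learned parameters are omitted.

```python
from itertools import combinations


def cell_id(r, c, n_cols):
    """Map a (row, column) position to a node id (hypothetical numbering)."""
    return r * n_cols + c


def build_dual_graphs(n_rows, n_cols):
    """Return (row_edges, col_edges) over cell nodes plus one context node.

    The context node gets id n_rows * n_cols and is linked to every cell
    in both graphs. This wiring is an assumption made for illustration.
    """
    context = n_rows * n_cols
    row_edges, col_edges = set(), set()
    for r in range(n_rows):
        for a, b in combinations(range(n_cols), 2):
            row_edges.add((cell_id(r, a, n_cols), cell_id(r, b, n_cols)))
    for c in range(n_cols):
        for a, b in combinations(range(n_rows), 2):
            col_edges.add((cell_id(a, c, n_cols), cell_id(b, c, n_cols)))
    for node in range(context):
        row_edges.add((node, context))
        col_edges.add((node, context))
    return row_edges, col_edges


def aggregate(features, edges):
    """One round of mean aggregation over each node and its neighbours."""
    neighbours = {i: [i] for i in range(len(features))}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    return [
        sum(features[j] for j in neighbours[i]) / len(neighbours[i])
        for i in range(len(features))
    ]


def fuse(row_feats, col_feats, gate=0.5):
    """Blend the two views with a fixed weight; a learned gate could replace it."""
    return [gate * r + (1.0 - gate) * c for r, c in zip(row_feats, col_feats)]


# 2x2 table with scalar stand-in features; the last entry is the context node.
features = [1.0, 2.0, 3.0, 4.0, 0.0]
row_edges, col_edges = build_dual_graphs(2, 2)
row_view = aggregate(features, row_edges)
col_view = aggregate(features, col_edges)
table_repr = fuse(row_view, col_view)
```

In this toy setting the two views give different features for the same cell, because each cell averages over its row neighbours in one graph and its column neighbours in the other. That difference is what a fusion step has to reconcile.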

Acknowledgment

This work is supported by the National Key Research and Development Program of China (Grant No. 2021YFB3100600), the Strategic Priority Research Program of Chinese Academy of Sciences (Grant No. XDC02040400) and the Youth Innovation Promotion Association of CAS (Grant No. 2021153).

Author information

Authors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Tianyun Liu, Xinghua Zhang, Zhenyu Zhang, Yubin Wang, Quangang Li, Shuai Zhang & Tingwen Liu
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Xinghua Zhang, Zhenyu Zhang, Yubin Wang, Quangang Li, Shuai Zhang & Tingwen Liu

Corresponding author

Correspondence to Tingwen Liu.

Editor information

Editors and Affiliations

University of Michigan, Ann Arbor, MI, USA
Danai Koutra
University of Vienna, Vienna, Austria
Claudia Plant
Max Planck Institute for Software Systems, Kaiserslautern, Germany
Manuel Gomez Rodriguez
Politecnico di Torino, Turin, Italy
Elena Baralis
CENTAI, Turin, Italy
Francesco Bonchi

Ethics declarations

Ethics Statement

I understand that using technology can have ethical implications, especially in the collection, processing, and privacy of form retrieval data. I acknowledge and recognize the importance of complying with ethical standards and the hazards of potential risks.

In data collection and processing, my training data comes from two publicly available tabular search datasets. Although we do not collect or store any sensitive information, we should strictly restrict the retrieval text of users and ensure that it does not contain any dangerous information.

In addition, when the model is used in police- or military-related applications, we should pay special attention to its use in these areas, which must be conducted in a more responsible manner. To prevent models from providing inaccurate search results for police or military personnel, users are responsible for ensuring that they comply with ethical principles and laws and regulations when using model outputs, and for screening search results.

In summary, I strive to ensure that the model outputs search results in an ethical and responsible manner, and I urge my users to do the same. I will continue to adhere to ethical standards and stay abreast of emerging ethical issues in the fields of machine learning and data mining.

About this paper

Cite this paper

Liu, T. et al. (2023). Enhancing Table Retrieval with Dual Graph Representations. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14172. Springer, Cham. https://doi.org/10.1007/978-3-031-43421-1_7

DOI: https://doi.org/10.1007/978-3-031-43421-1_7
Published: 18 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43420-4
Online ISBN: 978-3-031-43421-1
eBook Packages: Computer Science, Computer Science (R0)
