Abstract
Recent studies have shown that neural networks are effective for Chinese word segmentation. However, these models rely on large-scale data and are less effective on low-resource datasets, where training data is insufficient. We propose a transfer learning method that improves low-resource word segmentation by leveraging high-resource corpora. First, we train a teacher model on high-resource corpora and use the learned knowledge to initialize a student model. Second, we propose a weighted data similarity method to train the student model on low-resource data. Experimental results show that our method significantly improves performance on low-resource datasets, by 2.3% and 1.5% F-score on the PKU and CTB datasets, respectively. Furthermore, it achieves state-of-the-art results of 96.1% and 96.2% F-score on the PKU and CTB datasets, respectively.
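The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how parameters are transferred or how the similarity weights are computed, so the parameter dictionary, the function names `init_student_from_teacher` and `weighted_nll`, and the per-sample weights below are all hypothetical and chosen only for illustration.

```python
import copy
import math


def init_student_from_teacher(teacher_params):
    """Step 1 (assumed form): start the student from a copy of the
    teacher's learned parameters instead of a random initialization."""
    return copy.deepcopy(teacher_params)


def weighted_nll(gold_probs, weights):
    """Step 2 (assumed form): weighted average negative log-likelihood.

    gold_probs[i] is the probability the model assigns to the gold
    segmentation of sample i; weights[i] is a hypothetical per-sample
    weight, for example a data similarity score.
    """
    total = sum(w * -math.log(p) for p, w in zip(gold_probs, weights))
    return total / sum(weights)


# Toy teacher parameters (made-up values, not from the paper).
teacher = {"embedding": [[0.1, 0.2], [0.3, 0.4]], "output": [0.5, -0.5]}
student = init_student_from_teacher(teacher)

# Updating the student must leave the teacher untouched.
student["output"][0] = 0.9

# Two training samples; the second gets half the weight of the first.
loss = weighted_nll([0.9, 0.6], [1.0, 0.5])
```

A deep copy is used so that fine-tuning the student on low-resource data does not alter the teacher. Normalizing by the sum of the weights keeps the loss on the same scale regardless of how the weights are chosen.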
Acknowledgments
We thank the anonymous reviewers for their valuable comments. This work was supported in part by the National High Technology Research and Development Program of China (863 Program, No. 2015AA015404) and the National Natural Science Foundation of China (No. 61673028).
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Xu, J., Ma, S., Zhang, Y., Wei, B., Cai, X., Sun, X. (2018). Transfer Deep Learning for Low-Resource Chinese Word Segmentation with a Novel Neural Network. In: Huang, X., Jiang, J., Zhao, D., Feng, Y., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2017. Lecture Notes in Computer Science, vol 10619. Springer, Cham. https://doi.org/10.1007/978-3-319-73618-1_62
DOI: https://doi.org/10.1007/978-3-319-73618-1_62
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73617-4
Online ISBN: 978-3-319-73618-1
eBook Packages: Computer Science, Computer Science (R0)