A Regularization-based Framework for Bilingual Grammar Induction

Abstract

Grammar induction aims to discover syntactic structures from unannotated sentences. In this paper, we propose a framework in which the learning process of the grammar model of one language is influenced by knowledge from the model of another language. Unlike previous work on multilingual grammar induction, our approach does not rely on any external resource, such as parallel corpora, word alignments or linguistic phylogenetic trees. We propose three regularization methods that encourage similarity between model parameters, dependency edge scores, and parse trees respectively. We deploy our methods on a state-of-the-art unsupervised discriminative parser and evaluate it on both transfer grammar induction and bilingual grammar induction. Empirical results on multiple languages show that our methods outperform strong baselines.

Publication
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Related