Files
Abstract
This thesis addresses the challenge of efficient search within the exponential space of parse trees generated by weighted context-free grammars. We introduce four novel probabilistic methods that prioritize and prune the search space, independent of grammar structure or linguistic annotations. Each method leverages lexical cues from the input sentence to guide search effectively. Empirical evaluations demonstrate their integration across multiple parsing architectures, including CKY dynamic programming, best-first graph-based search, and pruned beam search. Combined, these methods achieve state-of-the-art parsing speeds—over 1,500 words per second for English, Chinese, and German—without loss in labeled F1 accuracy. Observed runtime complexity improves from O(N3)O(N^3)O(N3) to O(N1.5)O(N^{1.5})O(N1.5), outperforming traditional coarse-to-fine pruning by more than an order of magnitude.