Incremental segmentation and annotation strategies for real-time natural language processing applications

Yarmohammadi, Mahsa

doi:10.6083/M4M61JBZ

2016

Abstract

The input to real-time natural language processing applications, such as simultaneous speech-to-speech translation systems, often arrives as a continuous stream without clear boundaries. To process it in real time, the application must split this stream into appropriate segments. In this thesis, we introduce "hedge parsing," a fast incremental parsing method that enables syntax-aware segmentation of input streams. Unlike full syntactic parsing, which requires complete data, hedge parsing can work with incomplete information, making it suitable for real-time scenarios. It provides a complete hierarchical structure rather than just bracketing information, enhancing processing performance.

Details

Title

Incremental segmentation and annotation strategies for real-time natural language processing applications

Creator

Yarmohammadi, Mahsa : Oregon Health and Science University : (0000-0002-5020-4410)

Contributor

Roark, Brian : Advisor : Google Inc.
Bedrick, Steven : Committee member : Oregon Health and Science University
Sproat, Richard : Committee member : Google Inc.
Bangalore, Srinivas : Committee member : Interactions Labs
Heeman, Peter : Committee member : Oregon Health and Science University

Date

2016-09-01

Subjects

Linguistics
Speech
Natural Language Processing

DOI

https://doi.org/10.6083/M4M61JBZ

Content Type

Dissertation

Degree Type

Ph.D.

Academic Program

Computer Science & Engineering