Liu et al., 2006 - Google Patents

A study in machine learning from imbalanced data for sentence boundary detection in speech

Document ID: 13217462085754328976
Author: Liu Y; Chawla N; Harper M; Shriberg E; Stolcke A
Publication year: 2006
Publication venue: Computer Speech & Language

Snippet

Enriching speech recognition output with sentence boundaries improves its human readability and enables further processing by downstream language processing modules. We have constructed a hidden Markov model (HMM) system to detect sentence boundaries …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass

Similar Documents

Publication	Publication Date	Title
Liu et al.	2006	A study in machine learning from imbalanced data for sentence boundary detection in speech
Liu et al.	2006	Enriching speech recognition with automatic detection of sentence boundaries and disfluencies
Lee et al.	2005	Spoken document understanding and organization
Wu et al.	2010	Emotion recognition of affective speech based on multiple classifiers using acoustic-prosodic information and semantic labels
Ostendorf et al.	2005	Human language technology: Opportunities and challenges
EP1447792B1 (en)	2008-04-09	Method and apparatus for modeling a speech recognition system and for predicting word error rates from text
US20040024598A1 (en)	2004-02-05	Thematic segmentation of speech
Aksënova et al.	2021	How might we create better benchmarks for speech recognition?
Liu et al.	2005	Structural metadata research in the EARS program
Ostendorf et al.	2008	Speech segmentation and spoken document processing
Sunkara et al.	2020	Robust prediction of punctuation and truecasing for medical ASR
Sridhar et al.	2009	Combining lexical, syntactic and prosodic cues for improved online dialog act tagging
Furui	2005	Recent progress in corpus-based spontaneous speech recognition
Xu et al.	2018	A bidirectional LSTM approach with word embeddings for sentence boundary detection
Sharma et al.	2022	A comprehensive empirical review of modern voice activity detection approaches for movies and TV shows
Liu et al.	2004	Comparing and combining generative and posterior probability models: Some advances in sentence boundary detection in speech
Errattahi et al.	2019	System-independent ASR error detection and classification using recurrent neural network
Augustyniak et al.	2020	Punctuation prediction in spontaneous conversations: Can we mitigate ASR errors with retrofitted word embeddings?
Shafran et al.	2005	A comparison of classifiers for detecting emotion from speech
Harper et al.	2004	Multimodal model integration for sentence unit detection
Kahn et al.	2012	Joint reranking of parsing and word recognition with automatic segmentation
Liu et al.	2014	Paraphrastic language models
NithyaKalyani et al.	2019	Speech summarization for Tamil language
Ghannay et al.	2020	A study of continuous space word and sentence representations applied to ASR error detection
Furui et al.	2008	Transcription and distillation of spontaneous speech