Liu et al., 2006 - Google Patents
A study in machine learning from imbalanced data for sentence boundary detection in speechLiu et al., 2006
View PDF- Document ID
- 13217462085754328976
- Author
- Liu Y
- Chawla N
- Harper M
- Shriberg E
- Stolcke A
- Publication year
- Publication venue
- Computer Speech & Language
External Links
Snippet
Enriching speech recognition output with sentence boundaries improves its human readability and enables further processing by downstream language processing modules. We have constructed a hidden Markov model (HMM) system to detect sentence boundaries …
- 238000001514 detection method 0 title abstract description 63
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liu et al. | A study in machine learning from imbalanced data for sentence boundary detection in speech | |
Liu et al. | Enriching speech recognition with automatic detection of sentence boundaries and disfluencies | |
Lee et al. | Spoken document understanding and organization | |
Wu et al. | Emotion recognition of affective speech based on multiple classifiers using acoustic-prosodic information and semantic labels | |
Ostendorf et al. | Human language technology: Opportunities and challenges | |
EP1447792B1 (en) | Method and apparatus for modeling a speech recognition system and for predicting word error rates from text | |
US20040024598A1 (en) | Thematic segmentation of speech | |
Aksënova et al. | How might we create better benchmarks for speech recognition? | |
Liu et al. | Structural metadata research in the EARS program | |
Ostendorf et al. | Speech segmentation and spoken document processing | |
Sunkara et al. | Robust prediction of punctuation and truecasing for medical asr | |
Sridhar et al. | Combining lexical, syntactic and prosodic cues for improved online dialog act tagging | |
Furui | Recent progress in corpus-based spontaneous speech recognition | |
Xu et al. | A bidirectional lstm approach with word embeddings for sentence boundary detection | |
Sharma et al. | A comprehensive empirical review of modern voice activity detection approaches for movies and TV shows | |
Liu et al. | Comparing and combining generative and posterior probability models: Some advances in sentence boundary detection in speech | |
Errattahi et al. | System-independent asr error detection and classification using recurrent neural network | |
Augustyniak et al. | Punctuation prediction in spontaneous conversations: Can we mitigate asr errors with retrofitted word embeddings? | |
Shafran et al. | A comparison of classifiers for detecting emotion from speech | |
Harper et al. | Multimodal model integration for sentence unit detection | |
Kahn et al. | Joint reranking of parsing and word recognition with automatic segmentation | |
Liu et al. | Paraphrastic language models | |
NithyaKalyani et al. | Speech summarization for tamil language | |
Ghannay et al. | A study of continuous space word and sentence representations applied to ASR error detection | |
Furui et al. | Transcription and distillation of spontaneous speech |