Togashi et al., 2008 - Google Patents

A browsing system for classroom lecture speech.

Document ID: 16763916897272170432
Author: Togashi S; Nakagawa S
Publication year: 2008
Publication venue: INTERSPEECH

Snippet

Developing technologies to summarize and retrieve huge quantities of spoken documents recorded during classroom lectures, for the purpose of e-Learning or self-learning, is important. In this paper, we describe an adaptation method of a language model to …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/102—Programmed access in sequence to addressed parts of tracks of operating record carriers
- G11B27/105—Programmed access in sequence to addressed parts of tracks of operating record carriers of operating discs
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/06—Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
- G09B5/065—Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems

Similar Documents

Publication	Publication Date	Title
Glass et al.	2007	Recent progress in the MIT spoken lecture processing project.
Glass et al.	2004	Analysis and processing of lecture audio data: Preliminary investigations
Foote	1999	An overview of audio information retrieval
Larson et al.	2012	Spoken content retrieval: A survey of techniques and technologies
Van Thong et al.	2002	Speechbot: an experimental speech-based search engine for multimedia content on the web
Goh et al.	2020	The Auditory English Lexicon Project: A multi-talker, multi-region psycholinguistic database of 10,170 spoken words and nonwords
Chen et al.	2002	Discriminating capabilities of syllable-based features and approaches of utilizing them for voice retrieval of speech information in Mandarin Chinese
Furui	2005	Recent progress in corpus-based spontaneous speech recognition
Chen et al.	2000	Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics
Lee et al.	2014	Spoken knowledge organization by semantic structuring and a prototype course lecture system for personalized learning
Nouza et al.	2012	Making Czech historical radio archive accessible and searchable for wide public
Nouza et al.	2012	Voice technology to enable sophisticated access to historical audio archive of the Czech radio
Hirschberg et al.	1999	Finding information in audio: A new paradigm for audio browsing and retrieval
Galibert et al.	2005	Ritel: an open-domain, human-computer dialog system.
Smaïli et al.	2019	Summarizing videos into a target language: Methodology, architectures and evaluation
Glass et al.	2005	The MIT spoken lecture processing project
Togashi et al.	2008	A browsing system for classroom lecture speech.
Crestani	1999	Vocal access to a newspaper archive: design issues and preliminary investigations
González et al.	2013	An illustrated methodology for evaluating ASR systems
Togashi et al.	2006	Summarization of spoken lectures based on linguistic surface and prosodic information
Adell Mercado et al.	2012	Buceador, a multi-language search engine for digital libraries
Jones et al.	1995	Video mail retrieval using voice: an overview of the Stage 2 system
Viswanathan et al.	2000	Multimedia document retrieval using speech and speaker recognition
Eskevich	2014	Towards effective retrieval of spontaneous conversational spoken content
Lin et al.	2022	Fast task-specific adaptation in spoken language assessment with meta-learning