Haertel et al., 2010 - Google Patents

Automatic diacritization for low-resource languages using a hybrid word and consonant CMM

Document ID
1258018722655072411
Author
Haertel R
McClanahan P
Ringger E
Publication year
2010
Publication venue
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Snippet

We are interested in diacritizing Semitic languages, especially Syriac, using only diacritized texts. Previous methods have required the use of tools such as part-of-speech taggers, segmenters, morphological analyzers, and linguistic rules to produce state-of-the-art results …
Full text: aclanthology.org (PDF)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G06F17/277 Lexical analysis, e.g. tokenisation, collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G06F17/271 Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G06F17/2715 Statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2863 Processing of non-latin text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2872 Rule based translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G06F17/2827 Example based machine translation; Alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/289 Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/274 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS

Similar Documents

Publication | Title
Mori | Word-based partial annotation for efficient corpus construction
KR20190065665A (en) | Apparatus and method for recognizing Korean named entity using deep-learning
Micher | Improving coverage of an Inuktitut morphological analyzer using a segmental recurrent neural network
Abate et al. | Development of Amharic morphological analyzer using memory-based learning
EP1503295A1 (en) | Text generation method and text generation device
Tennage et al. | Transliteration and byte pair encoding to improve Tamil to Sinhala neural machine translation
JP5565827B2 (en) | A sentence separator training device for language independent word segmentation for statistical machine translation, a computer program therefor and a computer readable medium.
Haertel et al. | Automatic diacritization for low-resource languages using a hybrid word and consonant CMM
Mekki et al. | Sentence boundary detection of various forms of Tunisian Arabic
Kumar et al. | Morphological analysis of the Dravidian language family
Ovi et al. | BaNeP: An End-to-End Neural Network Based Model for Bangla Parts-of-Speech Tagging
Mammadov et al. | Part-of-speech tagging for Azerbaijani language
KR100202292B1 (en) | Text analyzer
JP5500636B2 (en) | Phrase table generator and computer program therefor
KR100487716B1 (en) | Method for machine translation using word-level statistical information and apparatus thereof
JP5454763B2 (en) | Device for associating words in a sentence pair and computer program therefor
Thi Xuan Huong et al. | Using large n-gram for Vietnamese spell checking
KR20230011220A (en) | Apparatus for pre-learning for deep learning language model capable of understanding and generating language and method using the same
Khoufi et al. | Chunking Arabic texts using conditional random fields
Priva | Constructing typing-time corpora: A new way to answer old questions
KR20080028655A (en) | Method and apparatus for part-of-speech tagging
Slayden et al. | Thai sentence-breaking for large-scale SMT
Myint | A hybrid approach for part-of-speech tagging of Burmese texts
Alnajjar et al. | Automated prediction of medieval Arabic diacritics
Sarma et al. | A Comprehensive Survey of Noun Phrase Chunking in Natural Languages