Haertel et al., 2010 - Google Patents
Automatic diacritization for low-resource languages using a hybrid word and consonant CMMHaertel et al., 2010
View PDF- Document ID
- 1258018722655072411
- Author
- Haertel R
- McClanahan P
- Ringger E
- Publication year
- Publication venue
- Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
External Links
Snippet
We are interested in diacritizing Semitic languages, especially Syriac, using only diacritized texts. Previous methods have required the use of tools such as part-of-speech taggers, segmenters, morphological analyzers, and linguistic rules to produce state-of-the-art results …
- 230000000877 morphologic 0 abstract description 16
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Mori | Word-based partial annotation for efficient corpus construction | |
KR20190065665A (en) | Apparatus and method for recognizing Korean named entity using deep-learning | |
Micher | Improving coverage of an Inuktitut morphological analyzer using a segmental recurrent neural network | |
Abate et al. | Development of Amharic morphological analyzer using memory-based learning | |
EP1503295A1 (en) | Text generation method and text generation device | |
Tennage et al. | Transliteration and byte pair encoding to improve tamil to sinhala neural machine translation | |
JP5565827B2 (en) | A sentence separator training device for language independent word segmentation for statistical machine translation, a computer program therefor and a computer readable medium. | |
Haertel et al. | Automatic diacritization for low-resource languages using a hybrid word and consonant CMM | |
Mekki et al. | Sentence boundary detection of various forms of Tunisian Arabic | |
Kumar et al. | Morphological analysis of the Dravidian language family | |
Ovi et al. | BaNeP: An End-to-End Neural Network Based Model for Bangla Parts-of-Speech Tagging | |
Mammadov et al. | Part-of-speech tagging for azerbaijani language | |
KR100202292B1 (en) | Text analyzer | |
JP5500636B2 (en) | Phrase table generator and computer program therefor | |
KR100487716B1 (en) | Method for machine translation using word-level statistical information and apparatus thereof | |
JP5454763B2 (en) | Device for associating words in a sentence pair and computer program therefor | |
Thi Xuan Huong et al. | Using large n-gram for Vietnamese spell checking | |
KR20230011220A (en) | Apparatus for pre-learning for deep learning language model capable of understanding and generating language and method using the same | |
Khoufi et al. | Chunking Arabic texts using conditional random fields | |
Priva | Constructing typing-time corpora: A new way to answer old questions | |
KR20080028655A (en) | Method and apparatus for part-of-speech tagging | |
Slayden et al. | Thai sentence-breaking for large-scale SMT | |
Myint | A hybrid approach for part-of-speech tagging of Burmese texts | |
Alnajjar et al. | Automated prediction of medieval Arabic diacritics | |
Sarma et al. | A Comprehensive Survey of Noun Phrase Chunking in Natural Languages |