Encoding transliteration variation through dimensionality reduction: FIRE Shared Task on Transliterated Search
Gupta et al., 2013
- Document ID
- 15503199251877184702
- Author
- Gupta P
- Rosso P
- Banchs R
- Publication year
- 2013
- Publication venue
- Fifth forum for information retrieval evaluation
Snippet
There exists a large amount of user-generated Web content in Roman script for languages that are normally written in indigenous scripts, for various reasons. In light of this phenomenon, search engines face a non-trivial problem of matching queries and …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/211—Formatting, i.e. changing of presentation of document
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
Similar Documents
Publication | Title |
---|---|
Jung | Semantic vector learning for natural language understanding |
Plank et al. | Embedding semantic similarity in tree kernels for domain adaptation of relation extraction |
Klementiev et al. | Inducing crosslingual distributed representations of words |
Sun et al. | Fast online training with frequency-adaptive learning rates for Chinese word segmentation and new word detection |
Prettenhofer et al. | Cross-language text classification using structural correspondence learning |
Gupta et al. | Query expansion for mixed-script information retrieval |
Baniata et al. | A neural machine translation model for Arabic dialects that utilises multitask learning (MTL) |
Torisawa | Inducing gazetteers for named entity recognition by large-scale clustering of dependency relations |
Darwish et al. | Arabic POS tagging: Don’t abandon feature engineering just yet |
CN114254653A (en) | Scientific and technological project text semantic extraction and representation analysis method |
Almarwani et al. | Arabic textual entailment with word embeddings |
Antony et al. | Machine transliteration for Indian languages: A literature survey |
Prabhakar et al. | Machine transliteration and transliterated text retrieval: a survey |
Shu et al. | Word segmentation in Chinese language processing |
Habib et al. | An exploratory approach to find a novel metric based optimum language model for automatic Bangla word prediction |
Li et al. | MarkBERT: Marking word boundaries improves Chinese BERT |
Nguyen et al. | Sub-character neural language modelling in Japanese |
Nooralahzadeh et al. | Part of speech tagging for French social media data |
Mi et al. | Toward better loanword identification in Uyghur using cross-lingual word embeddings |
Gupta et al. | Encoding transliteration variation through dimensionality reduction: FIRE Shared Task on Transliterated Search |
Atmakuri et al. | A comparison of features for POS tagging in Kannada |
Tolmachev et al. | Shrinking Japanese morphological analyzers with neural networks and semi-supervised learning |
Lu et al. | An automatic spelling correction method for classical Mongolian |
Ligozat | Question classification transfer |
Das et al. | Language identification of Bengali-English code-mixed data using character & phonetic based LSTM models |