Goweder et al., 2008 - Google Patents
A hybrid method for stemming Arabic textGoweder et al., 2008
View PDF- Document ID
- 7128852800162434733
- Author
- Goweder A
- Alhami H
- Rashed T
- Al-Musrati A
- Publication year
- Publication venue
- Journal of computer Science, URL: http://eref. uqu. edu. sa/files/eref2/folder6/f181. pdf
External Links
Snippet
There are several stemming approaches that are applied to Arabic language, yet no a complete stemmer for this language is available. The existing stem-based stemmers for stemming Arabic text have a poor performance in terms of accuracy and error rates. In order …
- 238000000034 method 0 abstract description 56
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/2775—Phrasal analysis, e.g. finite state techniques, chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/3071—Clustering or classification including class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/273—Orthographic correction, e.g. spelling checkers, vowelisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kannan et al. | Preprocessing techniques for text mining | |
US8341520B2 (en) | Method and system for spell checking | |
Kadri et al. | Effective stemming for Arabic information retrieval | |
Shao et al. | Mining new word translations from comparable corpora | |
Pettersson et al. | A multilingual evaluation of three spelling normalisation methods for historical text | |
Gupta et al. | A survey of common stemming techniques and existing stemmers for indian languages | |
Inoue et al. | Morphosyntactic tagging with pre-trained language models for Arabic and its dialects | |
Zhikov et al. | An efficient algorithm for unsupervised word segmentation with branching entropy and MDL | |
Dahab et al. | A comparative study on Arabic stemmers | |
Mann et al. | Multi-field information extraction and cross-document fusion | |
Stallard et al. | Unsupervised morphology rivals supervised morphology for arabic mt | |
Fodil et al. | Theme classification of Arabic text: A statistical approach | |
Goweder et al. | A hybrid method for stemming Arabic text | |
Hmeidi et al. | A novel approach to the extraction of roots from Arabic words using bigrams | |
Maheswari et al. | Rule based morphological variation removable stemming algorithm | |
Atwan et al. | Impact of stemmer on arabic text retrieval | |
Islam et al. | Automatic authorship detection from Bengali text using stylometric approach | |
Nwesri et al. | Capturing out-of-vocabulary words in Arabic text | |
Kranig | Evaluation of language identification methods | |
Basnayake et al. | Plagiarism detection in Sinhala language: A software approach | |
Nejja et al. | Context's impact on the automatic spelling correction | |
Rajalingam | A rule based iterative affix stripping stemming algorithm for Tamil | |
Hijjawi et al. | An application of pattern matching stemmer in arabic dialogue system | |
Ndebvu et al. | Hybrid Approach for Automatic Stemming of Chichewa Words Using Clustering with 4-grams and Affix Stripping | |
Al-Shammari | A novel algorithm for normalizing noisy Arabic text |