Syllable-based compression for XML documents (Chernik et al., 2006)
- Document ID
- 6478084070453057687
- Author
- Chernik K
- Lánský J
- Galamboš L
- Publication year
- 2006
- Publication venue
- Snášel, V., Richta, K., and Pokorný, J.: Proceedings of the Dateso 2006 Annual International Workshop on Databases, Texts, Specifications and Objects. CEUR-WS
Snippet
Syllable-based compression achieves sufficiently good results on text documents of a medium size. Since the majority of XML documents are of that size, we suppose that the syllable-based method can give good results on XML documents, especially on documents …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2264—Transformation
- G06F17/227—Tree transformation for tree-structured or markup documents, e.g. eXtensible Stylesheet Language Transformation (XSL-T) stylesheets, Omnimark, Balise
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2247—Tree structured documents; Markup, e.g. Standard Generalized Markup Language [SGML], Document Type Definition [DTD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2205—Storage facilities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30908—Information retrieval; Database structures therefor; File system structures therefor of semistructured data, the underlying structure being taken into account, e.g. mark-up language structure data
- G06F17/30914—Mapping or conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/211—Formatting, i.e. changing of presentation of document
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/42—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory
Similar Documents
Publication | Title |
---|---|
KR100614677B1 (en) | Method for compressing/decompressing a structured document |
Liefke et al. | XMill: an efficient compressor for XML data |
US9208256B2 (en) | Methods of coding and decoding, by referencing, values in a structured document, and associated systems |
US20040003343A1 (en) | Method and system for encoding a mark-up language document |
WO2007075690A2 (en) | A compressed schema representation object and method for metadata processing |
KR100803285B1 (en) | Method for a Queriable XML Compression using the Reverse Arithmetic Encoding and the Type Inference Engine |
US20040225754A1 (en) | Method of compressing XML data and method of decompressing compressed XML data |
Skibiński et al. | Revisiting dictionary‐based compression |
EP1803225A1 (en) | Adaptive compression scheme |
Skibiński et al. | Effective asymmetric XML compression |
Chernik et al. | Syllable-based compression for XML documents |
Harrusi et al. | XML syntax conscious compression |
Cheney | An Empirical Evaluation of Simple DTD-Conscious Compression Techniques. |
Toman | Syntactical compression of XML data |
Spiesser et al. | Optimization of html automatically generated by wysiwyg programs |
Skibiński et al. | Combining efficient XML compression with query processing |
League et al. | Schema-Based Compression of XML Data with Relax NG. |
Skibinski | Improving HTML compression |
Oroumchian et al. | Experiments with persian text compression for web |
Skibinski et al. | Fast transform for effective XML compression |
Galambos et al. | Compression of Semistructured Documents |
Brisaboa et al. | A compressed self-indexed representation of XML documents |
Plantinga | An asymmetric, semi-adaptive text compression algorithm |
Nair | XML compression techniques: A survey |
Skibiński et al. | A highly efficient XML compression scheme for the web |