A syntactic approach to twig-query matching on XML streams

Published: 01 June 2011

Abstract

Matching queries against XML streams efficiently is challenging when the volume of streamed data is huge and the data arrive continuously. In this paper, we propose Syntactic Twig-Query Matching (STQM), a method that processes queries on an XML stream and returns the query results continuously and immediately. STQM matches twig queries on the XML stream in a syntactic manner, using a lexical analyzer and a parser that are built by our lexical-rules and grammar-rules generators according to the user's queries and the document schema, respectively. During query matching, the lexical analyzer scans the incoming XML stream and the parser recognizes XML structures in order to retrieve every twig-query result from the stream. Moreover, STQM obtains query results without a post-processing phase for excluding false positives, which are common in many streaming query methods. Our experimental results show that STQM matches twig queries efficiently and scales well with both the size of the queried data and the branch degree of the twig query. The proposed method also takes less execution time than a sequence-based approach, which is widely accepted as a suitable solution for querying XML streams.
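The scan-then-recognize pipeline described in the abstract can be illustrated with a minimal sketch. This is not the STQM implementation: a hand-written regular-expression tokenizer and an explicit stack stand in for the generated lexical analyzer and parser, and the names `tokens`, `match_twig`, and the sample `book[title][author]` query are hypothetical. The sketch assumes well-formed input, tags without attributes, and root elements that do not nest.

```python
import re
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Token = Tuple[str, str]  # (kind, value); kind is "OPEN", "CLOSE" or "TEXT"

# Simplified token pattern (an assumption of this sketch): plain open tags,
# close tags, and text. Attributes, comments and CDATA are not handled.
TOKEN_RE = re.compile(r"<(/?)([A-Za-z_][\w.-]*)>|([^<]+)")


def tokens(chunks: Iterable[str]) -> Iterator[Token]:
    """Lexical step: scan the stream chunk by chunk and yield tokens."""
    buf = ""
    for chunk in chunks:
        buf += chunk
        pos = 0
        while True:
            m = TOKEN_RE.match(buf, pos)
            if m is None:
                break  # empty buffer or incomplete tag: wait for more input
            if m.group(3) is not None and m.end() == len(buf):
                break  # text may continue in the next chunk
            pos = m.end()
            if m.group(3) is not None:
                text = m.group(3).strip()
                if text:
                    yield ("TEXT", text)
            else:
                yield ("CLOSE" if m.group(1) else "OPEN", m.group(2))
        buf = buf[pos:]


def match_twig(toks: Iterable[Token], root: str,
               branches: Set[str]) -> Iterator[Dict[str, str]]:
    """Structural step: for each `root` element that has every required
    child in `branches`, yield the text of those children as soon as the
    root element closes."""
    stack: List[str] = []      # names of currently open elements
    found: Dict[str, str] = {}  # branch name -> text for the current root
    for kind, value in toks:
        if kind == "OPEN":
            if value == root:
                found = {}
            elif stack and stack[-1] == root and value in branches:
                found[value] = ""
            stack.append(value)
        elif kind == "TEXT":
            if len(stack) >= 2 and stack[-2] == root and stack[-1] in found:
                found[stack[-1]] += value
        else:  # CLOSE
            stack.pop()
            if value == root and branches.issubset(found):
                yield dict(found)


# The first chunk ends in the middle of a closing tag on purpose.
stream = ["<lib><book><title>XML</ti", "tle><author>Ann</author></book>",
          "<book><title>Lex</title></book></lib>"]
results = list(match_twig(tokens(stream), "book", {"title", "author"}))
print(results)
```

Because a match is emitted only when the root element closes with all of its branches seen, the second `book` (which has no `author`) never appears in the output, so no separate filtering pass is needed in this toy setting.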

Published In

Journal of Systems and Software, Volume 84, Issue 6
June, 2011
187 pages

Publisher

Elsevier Science Inc.

United States

Author Tags

  1. Stream query
  2. Syntactic pattern recognition
  3. Twig query processing
  4. XML

Qualifiers

  • Article
