Abstract
WebVAT is an open-source, platform-independent visualization tool designed to facilitate Web page analysis. Built on top of the Mozilla Web browser, the tool exposes the Frame Tree, Mozilla’s internal representation of a Web page, which reflects HTML rendering information. Compared to HTML DOM analyzers, WebVAT provides access to a cleaner, fuller, and more accurate data structure that contains layout information and reflects changes made by CSS and some types of dynamic content. WebVAT provides a framework for experimenting with and evaluating algorithms over the Frame Tree. It also captures user interaction with the browser and can be used for data collection. WebVAT is a working tool actively used in the HearSay [10] project. This paper describes the architecture, design, and some of the applications of WebVAT.
References
Chakrabarti, S.: Integrating the document object model with hyperlinks for enhanced topic distillation and information extraction. In: WWW’01 (2001)
Mahmud, J., Borodin, Y., Das, D., Ramakrishnan, I.: Combating information overload in non-visual web access using context. In: IUI, Short paper (2007)
Mahmud, J., Borodin, Y., Ramakrishnan, I.: Csurf: A context-driven non-visual web-browser. In: Proceedings of WWW (to appear)
Mukherjee, S., Yang, G., Ramakrishnan, I.: Automatic annotation of content-rich html documents: Structural and semantic analysis. In: Fensel, D., Sycara, K.P., Mylopoulos, J. (eds.) ISWC 2003. LNCS, vol. 2870, Springer, Heidelberg (2003)
Ramakrishnan, I., Stent, A., Yang, G.: Hearsay: Enabling audio browsing on hypertext content. In: WWW (2004)
Sun, Z., Mahmud, J., Mukherjee, S., Ramakrishnan, I.V.: Model-directed web transactions under constrained modalities. In: WWW ’06. Proceedings of the 15th international conference on World Wide Web, pp. 447–456 (2006)
Yu, S., Cai, D., Wen, J.-R., Ma, W.-Y.: Improving pseudo-relevance feedback in web information retrieval using web page segmentation. In: WWW (2003)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Borodin, Y., Mahmud, J., Ahmed, A., Ramakrishnan, I.V. (2007). WebVAT: Web Page Visualization and Analysis Tool. In: Baresi, L., Fraternali, P., Houben, GJ. (eds) Web Engineering. ICWE 2007. Lecture Notes in Computer Science, vol 4607. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73597-7_47
DOI: https://doi.org/10.1007/978-3-540-73597-7_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73596-0
Online ISBN: 978-3-540-73597-7
eBook Packages: Computer Science, Computer Science (R0)