Abstract
Information retrieval has evolved from searching references, to abstracts, to full documents. Search on the Web involves search engines that promise to parse full text and other files: audio, video, and multimedia. With the indexable Web at 320 million pages and growing, difficulties with locating relevant information have become apparent. The most prevalent means of information retrieval relies on syntax-based methods: keywords or strings of characters are presented to a search engine, and it returns all the matches in the available documents. This method is satisfactory and easy to implement, but it has some inherent limitations that make it unsuitable for many tasks. Instead of looking for syntactic patterns, the user is often interested in the meaning of a keyword or in the location of a particular word in a title or header. This paper describes some precise search approaches in the environmental domain that locate information according to syntactic criteria, augmented by the use of contextual information. The main emphasis of this paper lies in the treatment of structured knowledge, where essential aspects of the topic of interest are encoded not only by the individual items, but also by the relationships among them. Examples of such structured knowledge are hypertext documents, diagrams, and logical and chemical formulae. Benefits of this approach are enhanced precision and approximate search in an already focused, context-specific search engine for the environment: EnviroDaemon.
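The contrast drawn above between a plain keyword match and a match restricted to a structural location (a title or header) can be illustrated with a minimal sketch. This is not EnviroDaemon's implementation, which the abstract does not specify; the names `ContextIndexer`, `keyword_in_context`, `keyword_anywhere`, and the choice of `CONTEXT_TAGS` are assumptions made only for illustration, using Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

# Tags treated as "structural context" in this sketch (an assumption,
# not a list taken from the paper).
CONTEXT_TAGS = {"title", "h1", "h2", "h3"}


class ContextIndexer(HTMLParser):
    """Collects the text that appears inside each context tag."""

    def __init__(self):
        super().__init__()
        self._open = []          # stack of currently open context tags
        self.text_by_tag = {}    # tag name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag in CONTEXT_TAGS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if self._open and self._open[-1] == tag:
            self._open.pop()

    def handle_data(self, data):
        # Only keep text that sits inside an open context tag.
        if self._open and data.strip():
            self.text_by_tag.setdefault(self._open[-1], []).append(data.strip())


def keyword_in_context(html, keyword, tags=CONTEXT_TAGS):
    """Return the context tags whose text contains `keyword` (case-insensitive)."""
    indexer = ContextIndexer()
    indexer.feed(html)
    indexer.close()
    needle = keyword.lower()
    return sorted(
        tag for tag, parts in indexer.text_by_tag.items()
        if tag in tags and any(needle in part.lower() for part in parts)
    )


def keyword_anywhere(html, keyword):
    """Plain syntax-based match: substring search over the whole page."""
    return keyword.lower() in html.lower()
```

On a page whose title mentions "ozone" but whose body only mentions "mercury" in a paragraph, `keyword_anywhere` matches both keywords, while `keyword_in_context` reports a hit only for "ozone". Restricting matches to structural positions in this way is one simple route to the higher precision the abstract refers to.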
Cite this article
Chang, G., Samtani, G., Healey, M. et al. Precise Environmental Searches: Integrating Hierarchical Information Search with EnviroDaemon. Journal of Systems Integration 10, 253–267 (2001). https://doi.org/10.1023/A:1011206028302
DOI: https://doi.org/10.1023/A:1011206028302