Research article · DOI: 10.1145/1374596.1374603

Searching and navigating petabyte-scale file systems based on facets

Published: 11 November 2007

Abstract

As users interact with file systems of ever-increasing size, it becomes harder for them to familiarize themselves with the entire contents of the file system. In petabyte-scale systems, users must navigate a pool of billions of shared files to find the information they are looking for. One way to alleviate this problem is to integrate navigation and search into a common framework.
One such method is faceted search. This method originated within the information retrieval community, and has proved popular for navigating large repositories, such as those in e-commerce sites and digital libraries. This paper introduces faceted search and outlines several current research directions in adapting faceted search techniques to petabyte-scale file systems.
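
To make the idea concrete, here is a minimal sketch of one faceted navigation step over file metadata. It is not the paper's implementation: the sample records, the facet names (`owner`, `type`, `project`), and the helper functions `refine` and `facet_counts` are all hypothetical, and an in-memory list stands in for what would be an index at petabyte scale.

```python
from collections import Counter, defaultdict

# Hypothetical file metadata records; paths and facet names are illustrative only.
FILES = [
    {"path": "/proj/climate/run1.nc", "owner": "alice", "type": "netcdf", "project": "climate"},
    {"path": "/proj/climate/run2.nc", "owner": "bob", "type": "netcdf", "project": "climate"},
    {"path": "/proj/climate/notes.txt", "owner": "alice", "type": "text", "project": "climate"},
    {"path": "/proj/genome/reads.fq", "owner": "bob", "type": "fastq", "project": "genome"},
]

FACETS = ("owner", "type", "project")


def refine(files, selections):
    """Keep only the files whose metadata matches every selected facet value."""
    return [f for f in files if all(f.get(k) == v for k, v in selections.items())]


def facet_counts(files, facets=FACETS):
    """For each facet, count how many of the given files carry each value."""
    counts = defaultdict(Counter)
    for f in files:
        for facet in facets:
            if facet in f:
                counts[facet][f[facet]] += 1
    return counts


# One navigation step: select project=climate, then inspect the remaining choices.
results = refine(FILES, {"project": "climate"})
counts = facet_counts(results)
print(len(results), dict(counts["owner"]), dict(counts["type"]))
```

The counts returned after each refinement are what let search and navigation share one framework: a query narrows the result set, and the per-facet counts show which further refinements are available, much like listing the subdirectories of a virtual directory.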

Published In

PDSW '07: Proceedings of the 2nd international workshop on Petascale data storage: held in conjunction with Supercomputing '07
November 2007
72 pages
ISBN: 9781595938992
DOI: 10.1145/1374596
Conference Chair: Garth A. Gibson

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. enterprise search
  2. faceted search
  3. information retrieval
  4. metadata
  5. petabyte-scale storage
  6. semantic file system
  7. virtual directory

Conference

SC '07
Acceptance Rates

Overall Acceptance Rate 17 of 41 submissions, 41%

Cited By

  • (2017) A cross-layer optimized storage system for workflow applications. Future Generation Computer Systems 75 (423-437). DOI: 10.1016/j.future.2017.02.038. Online publication date: Oct-2017
  • (2015) Facet-value extraction scheme from textual contents in XML data. International Journal of Web Information Systems 11:3 (270-290). DOI: 10.1108/IJWIS-04-2015-0012. Online publication date: 17-Aug-2015
  • (2014) Extracting Facets from Textual Contents for Faceted Search over XML Data. Proceedings of the 16th International Conference on Information Integration and Web-based Applications & Services (420-429). DOI: 10.1145/2684200.2684294. Online publication date: 4-Dec-2014
  • (2014) The case for sampling on very large file systems. 2014 30th Symposium on Mass Storage Systems and Technologies (MSST) (1-11). DOI: 10.1109/MSST.2014.6855542. Online publication date: Jun-2014
  • (2012) Minersoft. ACM Transactions on Internet Technology 12:1 (1-34). DOI: 10.1145/2220352.2220354. Online publication date: 5-Jul-2012
  • (2012) Faceted navigation framework for XML data. International Journal of Web Information Systems 8:4 (348-370). DOI: 10.1108/17440081211282865. Online publication date: 16-Nov-2012
  • (2011) A framework of faceted navigation for XML data. Proceedings of the 13th International Conference on Information Integration and Web-based Applications and Services (28-35). DOI: 10.1145/2095536.2095544. Online publication date: 5-Dec-2011
  • (2010) Supporting multiple paths to objects in information hierarchies. Information Processing and Management: an International Journal 46:1 (22-43). DOI: 10.1016/j.ipm.2009.06.007. Online publication date: 1-Jan-2010
  • (2010) Searching for Software on the EGEE Infrastructure. Journal of Grid Computing 8:2 (281-304). DOI: 10.1007/s10723-010-9155-y. Online publication date: 23-Mar-2010
  • (2009) Faceted Search. Synthesis Lectures on Information Concepts, Retrieval, and Services 1:1 (1-80). DOI: 10.2200/S00190ED1V01Y200904ICR005. Online publication date: Jan-2009
