Abstract
As modern computer systems face the challenge of large data, filesystems must handle a large number of files, which makes metadata operations as much of a concern as data operations. Most filesystems manage file metadata by constructing in-memory data structures such as the directory entry (dentry) and the inode. We found inefficiencies in how existing filesystems manage metadata, for example in the path traversal mechanism. In this article, we optimize metadata operations in two ways. (1) We look up the dentry cache (dcache) hash table in a backward manner; to support this backward lookup, we devise a rename mechanism and a permission-granted mechanism. (2) We compact metadata into dentry structures to improve in-memory space efficiency. We evaluate our optimized metadata management mechanisms with several benchmarks, including a real-world workload. These optimizations reduce dcache lookup latency by up to 40% and improve overall throughput by up to 72% in a real-world benchmark.
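The abstract does not spell out the backward lookup, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a toy cache keyed by full path strings and compares a conventional component-by-component walk with a lookup that probes the longest path first and falls back to shorter prefixes. The class `ToyDcache`, its methods, and the probe counter are hypothetical names introduced for illustration. The sketch also omits the rename and permission-granted mechanisms that the abstract says the backward approach requires.

```python
from typing import Dict, List, Optional, Tuple


def split_path(path: str) -> List[str]:
    """Split an absolute path into its non-empty components."""
    return [part for part in path.split("/") if part]


class ToyDcache:
    """Hypothetical path cache keyed by full path; not the kernel dcache."""

    def __init__(self) -> None:
        self.table: Dict[str, int] = {}  # full path -> inode number
        self.probes = 0                  # number of hash-table probes so far

    def insert(self, path: str, ino: int) -> None:
        self.table[path] = ino

    def _probe(self, path: str) -> Optional[int]:
        self.probes += 1
        return self.table.get(path)

    def lookup_forward(self, path: str) -> Optional[int]:
        """Conventional walk: one probe per component, starting at the root."""
        ino = self._probe("/")
        current = ""
        for part in split_path(path):
            if ino is None:
                return None  # an ancestor is not cached
            current += "/" + part
            ino = self._probe(current)
        return ino

    def lookup_backward(self, path: str) -> Tuple[Optional[int], str]:
        """Probe the full path first, then progressively shorter prefixes.

        Returns the inode of the deepest cached prefix and that prefix.
        A caller would resolve any remaining components from there.
        """
        parts = split_path(path)
        for end in range(len(parts), -1, -1):
            prefix = "/" + "/".join(parts[:end])
            ino = self._probe(prefix)
            if ino is not None:
                return ino, prefix
        return None, ""


cache = ToyDcache()
for ino, cached_path in enumerate(["/", "/a", "/a/b", "/a/b/c.txt"], start=1):
    cache.insert(cached_path, ino)

cache.probes = 0
forward_ino = cache.lookup_forward("/a/b/c.txt")
forward_probes = cache.probes

cache.probes = 0
backward_ino, matched = cache.lookup_backward("/a/b/c.txt")
backward_probes = cache.probes
```

In this toy setup a fully cached path costs one probe per component plus the root in the forward walk, and a single probe in the backward lookup. Keying on the full path is also why rename and permission checks need separate handling: renaming a directory changes the keys of every cached descendant, and skipping intermediate components skips their per-directory permission checks.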
Acknowledgments
This research was supported by the National Research Foundation of Korea (NRF) (2015M3C4A7065645, 2015M3C4A7065646, 2016R1D1A1B03934393).
Cite this article
Song, N.Y., Kim, H., Han, H. et al. Optimizing of metadata management in large-scale file systems. Cluster Comput 21, 1865–1879 (2018). https://doi.org/10.1007/s10586-018-2814-7
DOI: https://doi.org/10.1007/s10586-018-2814-7