
Scalable multi-access flash store for big data analytics

Published: 26 February 2014 · DOI: 10.1145/2554688.2554789

Abstract

For many "Big Data" applications, the limiting factor in performance is often the transportation of large amount of data from hard disks to where it can be processed, i.e. DRAM. In this paper we examine an architecture for a scalable distributed flash store which aims to overcome this limitation in two ways. First, the architecture provides a high-performance, high-capacity, scalable random-access storage. It achieves high-throughput by sharing large numbers of flash chips across a low-latency, chip-to-chip backplane network managed by the flash controllers. The additional latency for remote data access via this network is negligible as compared to flash access time. Second, it permits some computation near the data via a FPGA-based programmable flash controller. The controller is located in the datapath between the storage and the host, and provides hardware acceleration for applications without any additional latency. We have constructed a small-scale prototype whose network bandwidth scales directly with the number of nodes, and where average latency for user software to access flash store is less than 70mus, including 3.5mus of network overhead.
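
The abstract's key architectural idea is that flash chips on many nodes are pooled behind a controller-managed backplane network, so a client sees one large store rather than per-node devices. As a loose illustration of how such a pooled store can resolve addresses, the following C sketch splits a global page address into a node identifier and a local page offset; the page size, field widths, and type names are assumptions made for this example and are not taken from the paper.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical global flash address layout: high bits pick the node,
 * low bits pick the page within that node.  All constants are assumed. */
#define PAGE_SIZE  8192u   /* assumed flash page size in bytes */
#define NODE_BITS  4u      /* assumed: up to 16 nodes          */
#define PAGE_BITS  28u     /* assumed: pages per node          */

typedef struct {
    uint32_t node;         /* which flash node holds the page  */
    uint64_t local_page;   /* page index within that node      */
} flash_location;

/* Map a global byte address to a (node, local page) pair. */
static flash_location map_global_address(uint64_t global_addr)
{
    uint64_t page = global_addr / PAGE_SIZE;
    flash_location loc = {
        .node       = (uint32_t)((page >> PAGE_BITS) & ((1u << NODE_BITS) - 1)),
        .local_page = page & ((1ull << PAGE_BITS) - 1),
    };
    return loc;
}

int main(void)
{
    flash_location loc = map_global_address(0x123456789ull);
    printf("node %u, local page %llu\n",
           loc.node, (unsigned long long)loc.local_page);
    return 0;
}

With remote access costing only a few microseconds over the backplane (3.5 μs in the prototype), it matters little which node such a mapping selects, which is what makes this kind of flat, location-transparent addressing attractive.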




Reviews

Ioannis Koltsidas

A storage architecture for big data environments, where high-throughput, low-latency access to the data is required, is the focus of this paper. The authors propose a system based on all-flash storage that follows a fully distributed, scalable architecture of interconnected storage nodes. Nodes are equipped with storage resources and a field-programmable gate array (FPGA), and are interconnected by a low-latency, high-bandwidth network. Both the flash controller and the network controller are implemented on the same FPGA and are tightly coupled, enabling low-latency data transfers from flash across the network.

The node FPGA may implement application-specific accelerators, allowing the system to move computational capabilities to where the storage is. The accelerator exposes an interface to the file system, which applications can use to parameterize the computation that the controller should perform on the data. For instance, by applying a predicate to data tuples within the storage controller, the system can filter out data that is not relevant to a query. As a result, less data needs to be transferred to the host, which lowers both latency and bandwidth consumption. The network of storage nodes exposes a single address space to the users, and by means of a two-level tagging mechanism, the completion of a request on a remote node is transparent to the issuer.

The authors present interesting experimental results. According to the paper, the end-to-end latency when accessing “remote storage is much less than the sum of storage and network latencies accounted for separately.” In addition, latency scales linearly with the number of network hops: one could build a network with “dozens of nodes before the network latency becomes a significant portion of the storage [access] latency.” The system is therefore expected to maintain good performance at larger scale. Overall, the proposed architecture is a promising approach to building flash-based distributed storage for high-performance data processing.

Online Computing Reviews Service
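
The review highlights predicate pushdown: the controller can apply a query's filter to tuples as they stream out of flash, so only matching tuples cross the network. The C sketch below is a hypothetical software rendering of that filtering step; the tuple layout, the value >= threshold predicate, and the function name filter_page are illustrative assumptions, and in the actual system the equivalent logic would run in the FPGA datapath rather than on the host.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical fixed-size tuple as it might stream out of flash. */
typedef struct {
    uint64_t key;
    int64_t  value;
} tuple_t;

/* Scan a page of tuples and keep only those matching "value >= threshold",
 * mimicking the in-datapath filtering so that fewer bytes reach the host.
 * Returns the number of tuples kept. */
static size_t filter_page(const tuple_t *in, size_t n_in,
                          tuple_t *out, int64_t threshold)
{
    size_t n_out = 0;
    for (size_t i = 0; i < n_in; i++) {
        if (in[i].value >= threshold)
            out[n_out++] = in[i];
    }
    return n_out;
}

int main(void)
{
    tuple_t page[4] = {{1, 10}, {2, 55}, {3, 7}, {4, 90}};
    tuple_t kept[4];
    size_t n = filter_page(page, 4, kept, 50);
    printf("%zu tuples pass the predicate\n", n);  /* prints 2 */
    return 0;
}

The benefit is exactly what the review describes: if only a small fraction of tuples satisfy the predicate, the host receives, and the network carries, only that fraction of the data.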



Published In

FPGA '14: Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
February 2014
272 pages
ISBN: 978-1-4503-2671-1
DOI: 10.1145/2554688
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 February 2014


Author Tags

  1. big data
  2. flash
  3. fpga networks
  4. ssd
  5. storage system

Qualifiers

  • Research-article

Conference

FPGA '14

Acceptance Rates

FPGA '14 paper acceptance rate: 30 of 110 submissions (27%).
Overall acceptance rate: 125 of 627 submissions (20%).


Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 3
Reflects downloads up to 13 Dec 2024


Cited By

  • (2024) DONGLE 2.0: Direct FPGA-Orchestrated NVMe Storage for HLS. ACM Transactions on Reconfigurable Technology and Systems, 17(3):1-32. DOI: 10.1145/3650038. Online publication date: 5-Mar-2024.
  • (2022) When FPGA Meets Cloud: A First Look at Performance. IEEE Transactions on Cloud Computing, 10(2):1344-1357. DOI: 10.1109/TCC.2020.2992548. Online publication date: 1-Apr-2022.
  • (2019) The Impact of Adopting Computational Storage in Heterogeneous Computing Systems. 2019 International Conference on ReConFigurable Computing and FPGAs (ReConFig), 1-8. DOI: 10.1109/ReConFig48160.2019.8994767. Online publication date: Dec-2019.
  • (2019) Catalina: In-Storage Processing Acceleration for Scalable Big Data Analytics. 2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), 430-437. DOI: 10.1109/EMPDP.2019.8671589. Online publication date: Feb-2019.
  • (2018) A novel non-volatile memory storage system for I/O-intensive applications. Frontiers of Information Technology & Electronic Engineering, 19(10):1291-1302. DOI: 10.1631/FITEE.1700061. Online publication date: 28-Nov-2018.
  • (2018) CompStor: An In-storage Computation Platform for Scalable Distributed Processing. 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 1260-1267. DOI: 10.1109/IPDPSW.2018.00195. Online publication date: May-2018.
  • (2017) Multi-Mode Low-Latency Software-Defined Error Correction for Data Centers. 2017 26th International Conference on Computer Communication and Networks (ICCCN), 1-8. DOI: 10.1109/ICCCN.2017.8038467. Online publication date: Jul-2017.
  • (2017) Terabyte Sort on FPGA-Accelerated Flash Storage. 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 17-24. DOI: 10.1109/FCCM.2017.53. Online publication date: Apr-2017.
  • (2016) Bluecache. Proceedings of the VLDB Endowment, 10(4):301-312. DOI: 10.14778/3025111.3025113. Online publication date: 1-Nov-2016.
  • (2016) BlueDBM. ACM Transactions on Computer Systems, 34(3):1-31. DOI: 10.1145/2898996. Online publication date: 30-Jun-2016.
