
Scalable multi-access flash store for big data analytics

Published: 26 February 2014 · DOI: 10.1145/2554688.2554789

Abstract

For many "Big Data" applications, the limiting factor in performance is often the transportation of large amount of data from hard disks to where it can be processed, i.e. DRAM. In this paper we examine an architecture for a scalable distributed flash store which aims to overcome this limitation in two ways. First, the architecture provides a high-performance, high-capacity, scalable random-access storage. It achieves high-throughput by sharing large numbers of flash chips across a low-latency, chip-to-chip backplane network managed by the flash controllers. The additional latency for remote data access via this network is negligible as compared to flash access time. Second, it permits some computation near the data via a FPGA-based programmable flash controller. The controller is located in the datapath between the storage and the host, and provides hardware acceleration for applications without any additional latency. We have constructed a small-scale prototype whose network bandwidth scales directly with the number of nodes, and where average latency for user software to access flash store is less than 70mus, including 3.5mus of network overhead.
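
The abstract's key architectural idea is that flash chips on many nodes are pooled behind a controller-managed backplane network, so a client sees one large store rather than per-node devices. As a loose illustration of how such a pooled store can resolve addresses, the following C sketch splits a global page address into a node identifier and a local page offset; the page size, field widths, and type names are assumptions made for this example and are not taken from the paper.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical global flash address layout: high bits pick the node,
 * low bits pick the page within that node.  All constants are assumed. */
#define PAGE_SIZE  8192u   /* assumed flash page size in bytes */
#define NODE_BITS  4u      /* assumed: up to 16 nodes          */
#define PAGE_BITS  28u     /* assumed: pages per node          */

typedef struct {
    uint32_t node;         /* which flash node holds the page  */
    uint64_t local_page;   /* page index within that node      */
} flash_location;

/* Map a global byte address to a (node, local page) pair. */
static flash_location map_global_address(uint64_t global_addr)
{
    uint64_t page = global_addr / PAGE_SIZE;
    flash_location loc = {
        .node       = (uint32_t)((page >> PAGE_BITS) & ((1u << NODE_BITS) - 1)),
        .local_page = page & ((1ull << PAGE_BITS) - 1),
    };
    return loc;
}

int main(void)
{
    flash_location loc = map_global_address(0x123456789ull);
    printf("node %u, local page %llu\n",
           loc.node, (unsigned long long)loc.local_page);
    return 0;
}

With remote access costing only a few microseconds over the backplane (3.5 μs in the prototype), it matters little which node such a mapping selects, which is what makes this kind of flat, location-transparent addressing attractive.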




Reviews

Ioannis Koltsidas

A storage architecture for big data environments, where high-throughput, low-latency access to the data is required, is the focus of this paper. The authors propose a system based on all-flash storage that follows a fully distributed, scalable architecture of interconnected storage nodes. Nodes are equipped with storage resources and a field-programmable gate array (FPGA), and are interconnected by a low-latency, high-bandwidth network. Both the flash controller and the network controller are implemented on the same FPGA and are tightly coupled, enabling low-latency data transfers from flash across the network.

The node FPGA may implement application-specific accelerators, allowing the system to move computational capabilities to where the storage is. The accelerator exposes an interface to the file system, which applications can use to parameterize the computation that the controller should perform on the data. For instance, by applying a predicate to data tuples within the storage controller, the system can filter out data that is not relevant to a query. As a result, less data needs to be transferred to the host, which lowers both latency and bandwidth consumption. The network of storage nodes exposes a single address space to the users, and by means of a two-level tagging mechanism, the completion of a request on a remote node is transparent to the issuer.

The authors present interesting experimental results. According to the paper, the end-to-end latency when accessing “remote storage is much less than the sum of storage and network latencies accounted for separately.” In addition, latency scales linearly with the number of network hops: one could build a network with “dozens of nodes before the network latency becomes a significant portion of the storage [access] latency.” The system is therefore expected to maintain good performance at larger scale. Overall, the proposed architecture is a promising approach to building flash-based distributed storage for high-performance data processing.

Online Computing Reviews Service
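
The review highlights predicate pushdown: the controller can apply a query's filter to tuples as they stream out of flash, so only matching tuples cross the network. The C sketch below is a hypothetical software rendering of that filtering step; the tuple layout, the value >= threshold predicate, and the function name filter_page are illustrative assumptions, and in the actual system the equivalent logic would run in the FPGA datapath rather than on the host.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical fixed-size tuple as it might stream out of flash. */
typedef struct {
    uint64_t key;
    int64_t  value;
} tuple_t;

/* Scan a page of tuples and keep only those matching "value >= threshold",
 * mimicking the in-datapath filtering so that fewer bytes reach the host.
 * Returns the number of tuples kept. */
static size_t filter_page(const tuple_t *in, size_t n_in,
                          tuple_t *out, int64_t threshold)
{
    size_t n_out = 0;
    for (size_t i = 0; i < n_in; i++) {
        if (in[i].value >= threshold)
            out[n_out++] = in[i];
    }
    return n_out;
}

int main(void)
{
    tuple_t page[4] = {{1, 10}, {2, 55}, {3, 7}, {4, 90}};
    tuple_t kept[4];
    size_t n = filter_page(page, 4, kept, 50);
    printf("%zu tuples pass the predicate\n", n);  /* prints 2 */
    return 0;
}

The benefit is exactly what the review describes: if only a small fraction of tuples satisfy the predicate, the host receives, and the network carries, only that fraction of the data.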



Published In

FPGA '14: Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
February 2014
272 pages
ISBN: 978-1-4503-2671-1
DOI: 10.1145/2554688
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 February 2014


Author Tags

  1. big data
  2. flash
  3. fpga networks
  4. ssd
  5. storage system

Qualifiers

  • Research-article

Conference

FPGA '14

Acceptance Rates

FPGA '14 paper acceptance rate: 30 of 110 submissions (27%).
Overall acceptance rate: 125 of 627 submissions (20%).


Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 3
Reflects downloads up to 13 Dec 2024


Cited By

  • (2024) DONGLE 2.0: Direct FPGA-Orchestrated NVMe Storage for HLS. ACM Transactions on Reconfigurable Technology and Systems, 17(3):1-32. DOI: 10.1145/3650038. Online publication date: 5-Mar-2024.
  • (2022) When FPGA Meets Cloud: A First Look at Performance. IEEE Transactions on Cloud Computing, 10(2):1344-1357. DOI: 10.1109/TCC.2020.2992548. Online publication date: 1-Apr-2022.
  • (2019) The Impact of Adopting Computational Storage in Heterogeneous Computing Systems. 2019 International Conference on ReConFigurable Computing and FPGAs (ReConFig), 1-8. DOI: 10.1109/ReConFig48160.2019.8994767. Online publication date: Dec-2019.
  • (2019) Catalina: In-Storage Processing Acceleration for Scalable Big Data Analytics. 2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), 430-437. DOI: 10.1109/EMPDP.2019.8671589. Online publication date: Feb-2019.
  • (2018) A novel non-volatile memory storage system for I/O-intensive applications. Frontiers of Information Technology & Electronic Engineering, 19(10):1291-1302. DOI: 10.1631/FITEE.1700061. Online publication date: 28-Nov-2018.
  • (2018) CompStor: An In-storage Computation Platform for Scalable Distributed Processing. 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 1260-1267. DOI: 10.1109/IPDPSW.2018.00195. Online publication date: May-2018.
  • (2017) Multi-Mode Low-Latency Software-Defined Error Correction for Data Centers. 2017 26th International Conference on Computer Communication and Networks (ICCCN), 1-8. DOI: 10.1109/ICCCN.2017.8038467. Online publication date: Jul-2017.
  • (2017) Terabyte Sort on FPGA-Accelerated Flash Storage. 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 17-24. DOI: 10.1109/FCCM.2017.53. Online publication date: Apr-2017.
  • (2016) Bluecache. Proceedings of the VLDB Endowment, 10(4):301-312. DOI: 10.14778/3025111.3025113. Online publication date: 1-Nov-2016.
  • (2016) BlueDBM. ACM Transactions on Computer Systems, 34(3):1-31. DOI: 10.1145/2898996. Online publication date: 30-Jun-2016.
