DOI: 10.1145/3503222.3507702 · ASPLOS Conference Proceedings
research-article

GenStore: a high-performance in-storage processing system for genome sequence analysis

Published: 22 February 2022

Abstract

Read mapping is a fundamental step in many genomics applications. It identifies potential matches and differences between fragments (called reads) of a sequenced genome and an already known genome (called a reference genome). Read mapping is costly because it must perform approximate string matching (ASM) on large amounts of data. To address the computational challenges in genome analysis, prior works propose approaches such as efficient heuristics, hardware acceleration, and accurate filters that select the reads within a dataset of genomic reads (called a read set) that must undergo expensive computation. While effective at reducing the amount of expensive computation, all such approaches still require the costly movement of a large amount of data from storage to the rest of the system, which can significantly lower the end-to-end performance of read mapping in conventional and emerging genomics systems.
We propose GenStore, the first in-storage processing system designed for genome sequence analysis. GenStore greatly reduces both the data movement and the computational overheads of genome sequence analysis by exploiting low-cost and accurate in-storage filters. GenStore leverages hardware/software co-design to address the challenges of in-storage processing, supporting reads with 1) different properties such as read lengths and error rates, which depend heavily on the sequencing technology, and 2) different degrees of genetic variation compared to the reference genome, which depend heavily on the genomes being compared. Through rigorous analysis of the read mapping process for reads with different properties and degrees of genetic variation, we design low-cost hardware accelerators and data/computation flows inside a NAND flash-based solid-state drive (SSD). Our evaluation using a wide range of real genomic datasets shows that GenStore, when implemented in three modern NAND flash-based SSDs, improves the read mapping performance of state-of-the-art software (hardware) baselines by 2.07-6.05× (1.52-3.32×) for read sets with high similarity to the reference genome and by 1.45-33.63× (2.70-19.2×) for read sets with low similarity to the reference genome.
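
The filter-then-align idea described in the abstract can be illustrated with a minimal host-side sketch. This is not GenStore's design: the abstract does not specify how its in-storage filters work, so the exact-match filter, the fixed-length window search, and every function name below are assumptions chosen only to show why a cheap filter reduces the number of reads that reach approximate string matching.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def best_approximate_match(read: str, reference: str) -> tuple[int, int]:
    """Return (position, distance) of the closest same-length reference window.

    Simplification (assumption): only windows of exactly len(read) are tried,
    and ties are broken in favor of the smallest position.
    """
    n = len(read)
    dist, pos = min(
        (edit_distance(read, reference[p:p + n]), p)
        for p in range(len(reference) - n + 1)
    )
    return pos, dist


def map_reads(reads: list[str], reference: str):
    """Split reads into exact matches (filtered cheaply) and reads needing ASM."""
    exact, needs_asm = {}, {}
    for read in reads:
        pos = reference.find(read)  # hypothetical cheap filter: exact substring match
        if pos != -1:
            exact[read] = pos       # no expensive computation needed
        else:
            needs_asm[read] = best_approximate_match(read, reference)
    return exact, needs_asm


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"
    exact, needs_asm = map_reads(["CGTT", "TAGG"], reference)
    print("exact:", exact)
    print("approximate:", needs_asm)
```

In this toy setting, every read caught by the filter skips the quadratic-time dynamic program entirely. The abstract's argument is that running such filtering inside the SSD additionally avoids moving the filtered reads out of storage.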

References

[1]
Michelle M Clark, Amber Hildreth, Sergey Batalov, Yan Ding, Shimul Chowdhury, et al. Diagnosis of Genetic Diseases in Seriously Ill Children by Rapid Whole-genome Sequencing and Automated Phenotyping and Interpretation. Science Translational Medicine, 2019.
[2]
Lauge Farnaes, Amber Hildreth, Nathaly M Sweeney, Michelle M Clark, Shimul Chowdhury, et al. Rapid Whole-genome Sequencing Decreases Infant Morbidity and Cost of Hospitalization. NPJ Genomic Medicine, 2018.
[3]
Nathaly M Sweeney, Shareef A Nahas, Shimul Chowdhury, Sergey Batalov, Michelle Clark, et al. Rapid Whole Genome Sequencing Impacts Care and Resource Utilization in Infants with Congenital Heart Disease. NPJ Genomic Medicine, 2021.
[4]
Can Alkan, Jeffrey M Kidd, Tomas Marques-Bonet, Gozde Aksay, Francesca Antonacci, et al. Personalized Copy Number and Segmental Duplication Maps Using Next-Generation Sequencing. Nature Genetics, 2009.
[5]
Mauricio Flores, Gustavo Glusman, Kristin Brogaard, Nathan D Price, and Leroy Hood. P4 Medicine: How Systems Medicine Will Transform the Healthcare Sector and Society. Personalized Medicine, 2013.
[6]
Geoffrey S Ginsburg and Huntington F Willard. Genomic and Personalized Medicine: Foundations and Applications. Translational Research, 2009.
[7]
Lynda Chin, Jannik N Andersen, and P Andrew Futreal. Cancer Genomics: From Discovery Science to Personalized Medicine. Nature Medicine, 2011.
[8]
Euan A Ashley. Towards Precision Medicine. Nature Reviews Genetics, 2016.
[9]
Joshua S Bloom, Laila Sathe, Chetan Munugala, Eric M Jones, Molly Gasperini, et al. Massively Scaled-up Testing for SARS-CoV-2 RNA via Next-generation Sequencing of Pooled and Barcoded Nasal and Saliva Samples. Nature Biomedical Engineering, 2021.
[10]
Ramesh Yelagandula, Aleksandr Bykov, Alexander Vogt, Robert Heinen, Ezgi Özkan, et al. Multiplexed Detection of SARS-CoV-2 and Other Respiratory Infections in High Throughput by SARSeq. Nature Communications, 2021.
[11]
Vien Thi Minh Le and Binh An Diep. Selected Insights from Application of Whole Genome Sequencing for Outbreak Investigations. Current Opinion in Critical Care, 2013.
[12]
Vlad Nikolayevskyy, Katharina Kranzer, Stefan Niemann, and Francis Drobniewski. Whole Genome Sequencing of Mycobacterium Tuberculosis for Detection of Recent Transmission and Tracing Outbreaks: A Systematic Review. Tuberculosis, 2016.
[13]
Shaofu Qiu, Peng Li, Hongbo Liu, Yong Wang, Nan Liu, et al. Whole-genome Sequencing for Tracing the Transmission Link between Two ARD Outbreaks Caused by A Novel HAdV Serotype 7 Variant, China. Scientific Reports, 2015.
[14]
Carol A Gilchrist, Stephen D Turner, Margaret F Riley, William A Petri, and Erik L Hewlett. Whole-genome Sequencing in Outbreak Analysis. Clinical Microbiology Reviews, 2015.
[15]
Sean Hoban, Joanna L Kelley, Katie E Lotterhos, Michael F Antolin, Gideon Bradburd, et al. Finding the Genomic Basis of Local Adaptation: Pitfalls, Practical Solutions, and Future Directions. The American Naturalist, 2016.
[16]
J Romiguier, Philippe Gayral, Marion Ballenghien, Arnaud Bernard, Vincent Cahais, et al. Comparative Population Genomics in Animals Uncovers the Determinants of Genetic Diversity. Nature, 2014.
[17]
Hans Ellegren and Nicolas Galtier. Determinants of Genetic Diversity. Nature Reviews Genetics, 2016.
[18]
Ana Prohaska, Fernando Racimo, Andrew J Schork, Martin Sikora, Aaron J Stern, et al. Human Disease Variation in the Light of Population Genomics. Cell, 2019.
[19]
Hans Ellegren. Genome Sequencing and Population Genomics in Non-Model Organisms. Trends in Ecology & Evolution, 2014.
[20]
Javier Prado-Martinez, Peter H. Sudmant, Jeffrey M. Kidd, Heng Li, Joanna L. Kelley, et al. Great Ape Genetic Diversity and Population History. Nature, 2013.
[21]
Ana Prohaska, Fernando Racimo, Andrew J Schork, Martin Sikora, Aaron J Stern, et al. Human Disease Variation in the Light of Population Genomics. Cell, 2019.
[22]
Jason A Reuter, Damek V Spacek, and Michael P Snyder. High-Throughput Sequencing Technologies. Molecular Cell, 2015.
[23]
Erwin L van Dijk, Hélène Auger, Yan Jaszczyszyn, and Claude Thermes. Ten Years of Next-Generation Sequencing Technology. Trends in Genetics, 2014.
[24]
Travis C Glenn. Field Guide to Next-Generation DNA Sequencers. Molecular Ecology Resources, 2011.
[25]
Sara Goodwin, John D McPherson, and W Richard McCombie. Coming of Age: Ten Years of Next-generation Sequencing Technologies. Nature Reviews Genetics, 2016.
[26]
Michael A Quail, Miriam Smith, Paul Coupland, Thomas D Otto, Simon R Harris, et al. A Tale of Three Next Generation Sequencing Platforms: Comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq Sequencers. BMC Genomics, 2012.
[27]
Mehdi Kchouk, Jean-Francois Gibrat, and Mourad Elloumi. Generations of Sequencing Technologies: from First to Next Generation. Biology and Medicine, 2017.
[28]
Franziska Pfeiffer, Carsten Gröber, Michael Blank, Kristian Händler, Marc Beyer, et al. Systematic Evaluation of Error Rates and Causes in Short Samples in Next-generation Sequencing. Scientific Reports, 2018.
[29]
Shanika L Amarasinghe, Shian Su, Xueyi Dong, Luke Zappia, Matthew E Ritchie, and Quentin Gouil. Opportunities and Challenges in Long-read Sequencing Data Analysis. Genome Biology, 2020.
[30]
Damla Senol Cali, Jeremie S Kim, Saugata Ghose, Can Alkan, and Onur Mutlu. Nanopore Sequencing Technology and Tools for Genome Assembly: Computational Analysis of the Current State, Bottlenecks and Future Directions. Briefings in Bioinformatics, 2018.
[31]
Simon Ardui, Adam Ameur, Joris R Vermeesch, and Matthew S Hestand. Single Molecule Real-Time (SMRT) Sequencing Comes of Age: Applications and Utilities for Medical Diagnostics. Nucleic Acids Research, 2018.
[32]
Jason L Weirather, Mariateresa de Cesare, Yunhao Wang, Paolo Piazza, Vittorio Sebastiano, et al. Comprehensive Comparison of Pacific Biosciences and Oxford Nanopore Technologies and Their Applications to Transcriptome Analysis. F1000Research, 2017.
[33]
Erwin L van Dijk, Yan Jaszczyszyn, Delphine Naquin, and Claude Thermes. The Third Revolution in Sequencing Technology. Trends in Genetics, 2018.
[34]
Yunhao Wang, Yue Zhao, Audrey Bollas, Yuru Wang, and Kin Fai Au. Nanopore Sequencing Technology, Bioinformatics and Applications. Nature Biotechnology, 2021.
[35]
Mohammed Alser, Zülal Bingöl, Damla Senol Cali, Jeremie Kim, Saugata Ghose, et al. Accelerating Genome Analysis: A Primer on An Ongoing Journey. IEEE Micro, 2020.
[36]
Mohammed Alser, Jeremy Rotman, Kodi Taraszka, Huwenbo Shi, Pelin Icer Baykal, et al. Technology Dictates Algorithms: Recent Developments in Read Alignment. Genome Biology, 2021.
[37]
Hongyi Xin, Donghyuk Lee, Farhad Hormozdiari, Samihan Yedkar, Onur Mutlu, and Can Alkan. Accelerating Read Mapping with FastHASH. BMC Genomics, 2013.
[38]
Hongyi Xin, John Greth, John Emmons, Gennady Pekhimenko, Carl Kingsford, et al. Shifted Hamming Distance: A Fast and Accurate SIMD-friendly Filter to Accelerate Alignment Verification in Read Mapping. Bioinformatics, 2015.
[39]
Yatish Turakhia, Gill Bejerano, and William J Dally. Darwin: A Genomics Co-processor Provides up to 15,000 x Acceleration on Long Read Assembly. In ASPLOS, 2018.
[40]
Damla Senol Cali, Gupreet Kalsi, Zulal Bingöl, Lavanya Subramanian, Can Firtina, et al. GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis. In MICRO, 2020.
[41]
Mohammed Alser, Taha Shahroodi, Juan Gómez-Luna, Can Alkan, and Onur Mutlu. SneakySnake: A Fast and Accurate Universal Genome Pre-alignment Filter for CPUs, GPUs and FPGAs. Bioinformatics, 2020.
[42]
Mohammed Alser, Hasan Hassan, Hongyi Xin, Oğuz Ergin, Onur Mutlu, and Can Alkan. GateKeeper: A New Hardware Architecture for Accelerating Pre-alignment in DNA Short Read Mapping. Bioinformatics, 2017.
[43]
Anirban Nag, CN Ramachandra, Rajeev Balasubramonian, Ryan Stutsman, Edouard Giacomin, et al. GenCache: Leveraging In-cache Operators for Efficient Sequence Alignment. In MICRO, 2019.
[44]
Jeremie S Kim, Can Firtina, Meryem Banu Cavlak, Damla Senol Cali, Mohammed Alser, et al. AirLift: A Fast and Comprehensive Technique for Remapping Alignments between Reference Genomes. arXiv, 2019.
[45]
Jeremie S Kim, Damla Senol Cali, Hongyi Xin, Donghyuk Lee, Saugata Ghose, et al. GRIM-Filter: Fast Seed Location Filtering in DNA Read Mapping Using Processing-in-memory Technologies. BMC Genomics, 2018.
[46]
Can Firtina, Jeremie S Kim, Mohammed Alser, Damla Senol Cali, A Ercument Cicek, et al. Apollo: A Sequencing-technology-independent, Scalable and Accurate Assembly Polishing Algorithm. Bioinformatics, 2020.
[47]
Martin Šošić and Mile Šikić. Edlib: A C/C++ Library for Fast, Exact Sequence Alignment Using Edit Distance. Bioinformatics, 2017.
[48]
Mohammed Alser, Onur Mutlu, and Can Alkan. MAGNET: Understanding and Improving the Accuracy of Genome Pre-alignment Filtering. arXiv, 2017.
[49]
Mohammed Alser, Hasan Hassan, Akash Kumar, Onur Mutlu, and Can Alkan. Shouji: A Fast and Efficient Pre-alignment Filter for Sequence Alignment. Bioinformatics, 2019.
[50]
Saul B Needleman and Christian D Wunsch. A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins. Journal of Molecular Biology, 1970.
[51]
Temple F Smith, Michael S Waterman, et al. Identification of Common Molecular Subsequences. Journal of Molecular Biology, 1981.
[52]
Osamu Gotoh. An Improved Algorithm for Matching Biological Sequences. Journal of Molecular Biology, 1982.
[53]
Shawn E Levy and Richard M Myers. Advancements in Next-generation Sequencing. Annual Review of Genomics and Human Genetics, 2016.
[54]
Taishan Hu, Nilesh Chitnis, Dimitri Monos, and Anh Dinh. Next-Generation Sequencing Technologies: An Overview. Human Immunology, 2021.
[55]
1000 Genomes Project Consortium et al. A Global Reference for Human Genetic Variation. Nature, 2015.
[56]
Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller. A Greedy Algorithm for Aligning DNA Sequences. Journal of Computational Biology, 2000.
[57]
Guy St C Slater and Ewan Birney. Automated Generation of Heuristics for Biological Sequence Comparison. BMC Bioinformatics, 2005.
[58]
Heng Li. Minimap2: Pairwise Alignment for Nucleotide Sequences. Bioinformatics, 2018.
[59]
Gene Myers. A Fast Bit-vector Algorithm for Approximate String Matching Based on Dynamic Programming. JACM, 1999.
[60]
Santiago Marco-Sola, Juan Carlos Moure, Miquel Moreto, and Antonio Espinosa. Fast Gap-affine Pairwise Alignment Using the Wavefront Algorithm. Bioinformatics, 2021.
[61]
Wenqin Huangfu, Shuangchen Li, Xing Hu, and Yuan Xie. RADAR: A 3D-ReRAM based DNA Alignment Accelerator Architecture. In DAC, 2018.
[62]
Daichi Fujiki, Arun Subramaniyan, Tianjun Zhang, Yu Zeng, Reetuparna Das, et al. Genax: A Genome Sequencing Accelerator. In ISCA, 2018.
[63]
Daichi Fujiki, Shunhao Wu, Nathan Ozog, Kush Goliya, David Blaauw, et al. SeedEx: A Genome Sequencing Accelerator for Optimal Alignments in Subminimal Space. In MICRO, 2020.
[64]
Subho Sankar Banerjee, Mohamed El-Hadedy, Jong Bin Lim, Zbigniew T Kalbarczyk, Deming Chen, et al. ASAP: Accelerated Short-read Alignment on Programmable Hardware. IEEE Transactions on Computers, 2019.
[65]
S Karen Khatamifard, Zamshed Chowdhury, Nakul Pande, Meisam Razaviyayn, Chris H Kim, and Ulya R Karpuzcu. GeNVoM: Read Mapping Near Non-Volatile Memory. TCBB, 2021.
[66]
Saransh Gupta, Mohsen Imani, Behnam Khaleghi, Venkatesh Kumar, and Tajana Rosing. RAPID: A ReRAM Processing In-memory Architecture for DNA Sequence Alignment. In ISLPED, 2019.
[67]
Xue-Qi Li, Guang-Ming Tan, and Ning-Hui Sun. PIM-Align: A Processing-in-Memory Architecture for FM-Index Search Algorithm. Journal of Computer Science and Technology, 2021.
[68]
Shaahin Angizi, Jiao Sun, Wei Zhang, and Deliang Fan. Aligns: A Processing-in-memory Accelerator for DNA Short Read Alignment Leveraging SOT-MRAM. In DAC, 2019.
[69]
Farzaneh Zokaee, Hamid R Zarandi, and Lei Jiang. Aligner: A Process-in-memory Architecture for Short Read Alignment in ReRAMs. IEEE Computer Architecture Letters, 2018.
[70]
Advait Madhavan, Timothy Sherwood, and Dmitri Strukov. Race Logic: A Hardware Acceleration for Dynamic Programming Algorithms. ACM SIGARCH Computer Architecture News, 2014.
[71]
Haoyu Cheng, Yong Zhang, and Yun Xu. Bitmapper2: A GPU-accelerated All-mapper Based on The Sparse Q-gram Index. TCBB, 2018.
[72]
Ernst Joachim Houtgast, Vlad-Mihai Sima, Koen Bertels, and Zaid Al-Ars. Hardware Acceleration of BWA-MEM Genomic Short Read Mapping for Longer Read Lengths. Computational Biology and Chemistry, 2018.
[73]
Ernst Joachim Houtgast, VladMihai Sima, Koen Bertels, and Zaid AlArs. An Efficient GPU-accelerated Implementation of Genomic Short Read Mapping with BWA-MEM. ACM SIGARCH Computer Architecture News, 2017.
[74]
Amit Goyal, Hyuk Jung Kwon, Kichan Lee, Reena Garg, Seon Young Yun, et al. Ultra-fast Next Generation Human Genome Sequencing Data Processing Using DRAGENTM Bio-IT Processor for Precision Medicine. Open Journal of Genetics, 2017.
[75]
Yu-Ting Chen, Jason Cong, Zhenman Fang, Jie Lei, and Peng Wei. When Spark Meets FPGAs: A Case Study for Next-Generation DNA Sequencing Acceleration. In USENIX HotCloud, 2016.
[76]
Peng Chen, Chao Wang, Xi Li, and Xuehai Zhou. Accelerating the Next Generation Long Read Mapping with the FPGA-based System. TCBB, 2014.
[77]
Yen-Lung Chen, Bo-Yi Chang, Chia-Hsiang Yang, and Tzi-Dar Chiueh. A High-Throughput FPGA Accelerator for Short-Read Mapping of the Whole Human Genome. IEEE TPDS, 2021.
[78]
Alberto Zeni, Giulia Guidi, Marquita Ellis, Nan Ding, Marco D Santambrogio, et al. Logan: High-performance GPU-based X-drop Long-read Alignment. In IPDPS, 2020.
[79]
Nauman Ahmed, Jonathan Lévy, Shanshan Ren, Hamid Mushtaq, Koen Bertels, and Zaid Al-Ars. GASAL2: A GPU Accelerated Sequence Alignment Library for High-Throughput NGS Data. BMC Bioinformatics, 2019.
[80]
Takahiro Nishimura, Jacir L Bordim, Yasuaki Ito, and Koji Nakano. Accelerating the Smith-waterman Algorithm Using Bitwise Parallel Bulk Computation Technique on GPU. In IPDPSW), 2017.
[81]
Edans Flavius de Oliveira Sandes, Guillermo Miranda, Xavier Martorell, Eduard Ayguade, George Teodoro, and Alba Cristina Magalhaes Melo. CUDAlign 4.0: Incremental Speculative Traceback for Exact Chromosome-wide Alignment in GPU Clusters. IEEE TPDS, 2016.
[82]
Yongchao Liu and Bertil Schmidt. GSWABE: Faster GPU-accelerated Sequence Alignment with Optimal Alignment Retrieval for Short DNA Sequences. Concurrency and Computation: Practice and Experience, 2015.
[83]
Yongchao Liu, Adrianto Wirawan, and Bertil Schmidt. CUDASW++ 3.0: Accelerating Smith-Waterman Protein Database Search by Coupling CPU and GPU SIMD Instructions. BMC Bioinformatics, 2013.
[84]
Richard Wilton, Tamas Budavari, Ben Langmead, Sarah J Wheelan, Steven L Salzberg, and Alexander S Szalay. Arioc: High-throughput Read Alignment with GPU-accelerated Exploration of The Seed-and-extend Search Space. PeerJ, 2015.
[85]
Xia Fei, Zou Dan, Lu Lina, Man Xin, and Zhang Chunlei. FPGASW: Accelerating Large-scale Smith–Waterman Sequence Alignment Application with Backtracking on FPGA Linear Systolic Array. Interdisciplinary Sciences: Computational Life Sciences, 2018.
[86]
Hasitha Muthumala Waidyasooriya and Masanori Hariyama. Hardware-acceleration of Short-read Alignment Based on the Burrows-wheeler Transform. TPDS, 2015.
[87]
Yu-Ting Chen, Jason Cong, Jie Lei, and Peng Wei. A Novel High-throughput Acceleration Engine for Read Alignment. In FCCM, 2015.
[88]
Enzo Rucci, Carlos Garcia, Guillermo Botella, Armando De Giusti, Marcelo Naiouf, and Manuel Prieto-Matias. SWIFOLD: Smith-Waterman Implementation on FPGA with OpenCL for Long DNA Sequences. BMC Systems Biology, 2018.
[89]
Abbas Haghi, Santiago Marco-Sola, Lluc Alvarez, Dionysios Diamantopoulos, Christoph Hagleitner, and Miquel Moreto. An FPGA Accelerator of the Wavefront Algorithm for Genomics Pairwise Alignment. In FPL, 2021.
[90]
Luyi Li, Jun Lin, and Zhongfeng Wang. PipeBSW: A Two-Stage Pipeline Structure for Banded Smith-Waterman Algorithm on FPGA. In ISVLSI, 2021.
[91]
Ham, Tae Jun and Bruns-Smith, David and Sweeney, Brendan and Lee, Yejin and Seo, Seong Hoon and Song, U Gyeong and Oh, Young H and Asanovic, Krste and Lee, Jae W and Wills, Lisa Wu. Genesis: A Hardware Acceleration Framework for Genomic Data Analysis. In ISCA, 2020.
[92]
Tae Jun Ham, Yejin Lee, Seong Hoon Seo, U Gyeong Song, Jae W Lee, et al. Accelerating Genomic Data Analytics With Composable Hardware Acceleration Framework. IEEE Micro, 2021.
[93]
Lisa Wu, David Bruns-Smith, Frank A Nothaft, Qijing Huang, Sagar Karandikar, et al. FPGA Accelerated Indel Realignment in the Cloud. In HPCA, 2019.
[94]
Gagandeep Singh, Mohammed Alser, Damla Senol Cali, Dionysios Diamantopoulos, Juan Gómez-Luna, et al. FPGA-based Near-Memory Acceleration of Modern Data-Intensive Applications. IEEE Micro, 2021.
[95]
Jung-Sik Kim, Chi Sung Oh, Hocheol Lee, Donghyuk Lee, Hyong-Ryol Hwang, et al. A 1.2 v 12.8 gb/s 2gb Mobile Wide-i/o Dram with 4× 128 i/os Using TSV-based Stacking. In ISSCC, 2011.
[96]
Zülal Bingöl, Mohammed Alser, Onur Mutlu, Ozcan Ozturk, and Can Alkan. Gatekeeper-gpu: Fast and accurate pre-alignment filtering in short read mapping. IPDPSW, 2021.
[97]
Fazal Hameed, Asif Ali Khan, and Jeronimo Castrillon. ALPHA: A Novel Algorithm-Hardware Co-design for Accelerating DNA Seed Location Filtering. ITETC, 2021.
[98]
Licheng Guo, Jason Lau, Zhenyuan Ruan, Peng Wei, and Jason Cong. Hardware Acceleration of Long Read Pairwise Overlapping in Genome Sequencing: A Race between FPGA and GPU. In FCCM, 2019.
[99]
UK10K consortium et al. The UK10K Project Identifies Rare Variants in Health and Disease. Nature, 2015.
[100]
Rahul C Bhoyar, Abhinav Jain, Paras Sehgal, Mohit Kumar Divakar, Disha Sharma, et al. High Throughput Detection and Genetic Epidemiology of SARS-CoV-2 Using COVIDSeq Next-generation Sequencing. PloS One, 2021.
[101]
Yoongu Kim, Weikun Yang, and Onur Mutlu. Ramulator: A Fast and Extensible DRAM Simulator. CAL, 2015.
[102]
Arash Tavakkol, Juan Gómez-Luna, Mohammad Sadrosadati, Saugata Ghose, and Onur Mutlu. MQSim: A Framework for Enabling Realistic Studies of Modern Multi-queue SSD Devices. In FAST, 2018.
[103]
Fraz Syed, Haiying Grunenwald, and Nicholas Caruccio. Next-generation Sequencing Library Preparation: Simultaneous Fragmentation and Tagging Using in Vitro Transposition. Nature Methods, 2009.
[104]
Jay Shendure and Hanlee Ji. Next-generation DNA Sequencing. Nature Biotechnology, 2008.
[105]
Hasindu Gamaarachchi, Hiruna Samarakoon, Sasha P Jenner, James M Ferguson, Timothy G Amos, et al. Fast Nanopore Sequencing Data Analysis with SLOW5. Nature Biotechnology, 2022.
[106]
Oxford Nanopore Technologies. MinION Mk1B IT Requirements. https://community.nanoporetech.com/requirements_documents/minion-it-reqs.pdf, 2021.
[107]
Rasko Leinonen, Hideaki Sugawara, Martin Shumway, and International Nucleotide Sequence Database Collaboration. The Sequence Read Archive. Nucleic Acids Research, 2010.
[108]
Stephan Köstlbacher, Astrid Collingro, Tamara Halter, Frederik Schulz, Sean P Jungbluth, and Matthias Horn. Pangenomics Reveals Alternative Environmental Lifestyles among Chlamydiae. Nature Communications, 2021.
[109]
Lucy van Dorp, Mislav Acman, Damien Richard, Liam P Shaw, Charlotte E Ford, et al. Emergence of Genomic Diversity and Recurrent Mutations in SARS-CoV-2. Infection, Genetics and Evolution, 2020.
[110]
Nathan LaPierre, Mohammed Alser, Eleazar Eskin, David Koslicki, and Serghei Mangul. Metalign: Efficient Alignment-based Metagenomic Profiling Via Containment Min Hash. Genome Biology, 2020.
[111]
F Meyer, A Fritz, Z-L Deng, D Koslicki, A Gurevich, et al. Critical Assessment of Metagenome Interpretation-the second round of challenges. bioRxiv, 2021.
[112]
Michael M Khayat, Sayed Mohammad Ebrahim Sahraeian, Samantha Zarate, Andrew Carroll, Huixiao Hong, et al. Hidden Biases in Germline Structural Variant Detection. Genome Biology, 2021.
[113]
Ruibang Luo, Yiu-Lun Wong, Wai-Chun Law, Lap-Kei Lee, Jeanno Cheung, et al. BALSA: Integrated Secondary Analysis for Whole-genome and Whole-exome Sequencing, Accelerated by GPU. PeerJ, 2014.
[114]
Hongyi Xin, Sunny Nahar, Richard Zhu, John Emmons, Gennady Pekhimenko, et al. Optimal Seed Solver: Optimizing Seed Selection in Read Mapping. Bioinformatics, 2016.
[115]
Stephen F Altschul, Warren Gish, Webb Miller, Eugene W Myers, and David J Lipman. Basic Local Alignment Search Tool. Journal of Molecular Biology, 1990.
[116]
Derrick E Wood, Jennifer Lu, and Ben Langmead. Improved Metagenomic Analysis with Kraken 2. Genome Biology, 2019.
[117]
Saul Schleimer, Daniel S Wilkerson, and Alex Aiken. Winnowing: Local Algorithms for Document Fingerprinting. In ACM SIGMOD, 2003.
[118]
Michael Roberts, Wayne Hayes, Brian R Hunt, Stephen M Mount, and James A Yorke. Reducing Storage Requirements for Biological Sequence Comparison. Bioinformatics, 2004.
[119]
Guillaume Marçais, David Pellow, Daniel Bork, Yaron Orenstein, Ron Shamir, and Carl Kingsford. Improving the Performance of Minimizers and Winnowing Schemes. Bioinformatics, 2017.
[120]
Heng Li. Minimap and Miniasm: Fast Mapping and De Novo Assembly for Noisy Long Sequences. Bioinformatics, 2016.
[121]
Samsung. Samsung SSD 860 PRO. https://www.samsung.com/semiconductor/minisite/ssd/product/consumer/860pro/, 2018.
[122]
Jisung Park, Myungsuk Kim, Sungjin Lee, and Jihong Kim. Improving I/O Performance of Large-page Flash Storage Systems Using Subpage-parallel Reads. In NVMSA, 2018.
[123]
Myungsuk Kim, Jaehoon Lee, Sungjin Lee, Jisung Park, Youngsun Song, and Jihong Kim. Improving Performance and Lifetime of Large-page NAND Storages Using Erase-free Subpage Programming. In DAC, 2017.
[124]
Intel. Intel SSD DC S4500 Series. https://ark.intel.com/content/www/us/en/ark/products/120521/intel-ssd-dc-s4500-series-480gb-2-5in-sata-6gbs-3d1-tlc.html, 2017.
[125]
Samsung. Samsung SSD PM1735. https://www.samsung.com/semiconductor/ssd/enterprise-ssd/MZPLJ3T2HBJR-00007/, 2020.
[126]
AnandTech. New Enterprise SSD Controllers. https://www.anandtech.com/show/16275/new-enterprise-ssd-controllers-from-silicon-motion-phison-fadu.
[127]
Arash Tavakkol, Mohammad Sadrosadati, Saugata Ghose, Jeremie Kim, Yixin Luo, et al. FLIN: Enabling Fairness and Enhancing Performance in Modern NVMe Solid State Drives. In ISCA, 2018.
[128]
Myungsuk Kim, Jisung Park, Genhee Cho, Yoona Kim, Lois Orosa, et al. Evanesco: Architectural Support for Efficient Data Sanitization in Modern Flash-Based Storage Systems. In ASPLOS, 2020.
[129]
Yu Cai, Saugata Ghose, Erich F Haratsch, Yixin Luo, and Onur Mutlu. Error Characterization, Mitigation, and Recovery in Flash-Memory-Based Solid-state Drives. In Proc. IEEE, 2017.
[130]
Jisung Park, Youngdon Jung, Jonghoon Won, Minji Kang, Sungjin Lee, and Jihong Kim. RansomeBlocker: a Low-Overhead Ransomware-Proof SSD. In DAC, 2019.
[131]
Jisung Park, Jaeyong Jeong, Sungjin Lee, Youngsun Song, and Jihong Kim. Improving Performance and Lifetime of NAND Storage Systems Using Relaxed Program Sequence. In DAC, 2016.
[132]
Li-Pin Chang. On Efficient Wear Leveling for Large-scale Flash-memory Storage Systems. In ACM SAC, 2007.
[133]
Serial ATA International Organization. SATA revision 3.0 specifications. https://www.sata-io.org.
[134]
Samsung. Samsung SSD 980 PRO. https://www.samsung.com/semiconductor/minisite/ssd/product/consumer/980pro/, 2020.
[135]
PCI-SIG. PCI Express Base Specification Revision 3.0. https://pcisig.com/specifications.
[136]
PCI-SIG. PCI Express Base Specification Revision 4.0, Version 1.0. https://pcisig.com/specifications.
[137]
AMD. AMD^ EPYC^ 7742 CPU. https://www.amd.com/en/products/cpu/amd-epyc-7742.
[138]
Micron. Micron 9300 SSD. https://www.micron.com/products/ssd/product-lines/9300, 2019.
[139]
Western Digital. WD Blue SATA Internal SSD Hard Drive. https://www.westerndigital.com/en-ca/products/internal-drives/wd-blue-sata-2-5-ssd#WDS400T2B0A.
[140]
Ann Franchesca Laguna, Hasindu Gamaarachchi, Xunzhao Yin, Michael Niemier, Sri Parameswaran, and X Sharon Hu. Seed-and-vote Based In-memory Accelerator for DNA Read Mapping. In ICCAD, 2020.
[141]
Roman Kaplan, Leonid Yavits, and Ran Ginosar. RASSA: Resistive Prealignment Accelerator for Approximate DNA Long Read Mapping. IEEE Micro, 2018.
[142]
Gunjae Koo, Kiran Kumar Matam, I Te, HV Krishna Giri Narra, Jing Li, et al. Summarizer: Trading Communication with Computing Near Storage. In MICRO, 2017.
[143]
Vikram Sharma Mailthody, Zaid Qureshi, Weixin Liang, Ziyan Feng, Simon Garcia De Gonzalo, et al. Deepstore: In-storage Acceleration for Intelligent Queries. In MICRO, 2019.
[144]
Valerie A Schneider, Tina Graves-Lindsay, Kerstin Howe, Nathan Bouk, Hsiu-Chuan Chen, et al. Evaluation of GRCh38 and De Novo Haploid Genome Assemblies Demonstrates the Enduring Quality of the Reference Assembly. Genome Research, 2017.
[145]
Quynh Dang. Secure Hash Standard., 2015.
[146]
Ronald Rivest. RFC1321: The MD5 Message-digest Algorithm. https://datatracker.ietf.org/doc/rfc1321/, 1992.
[147]
Nika Mansouri Ghiasi, Jisung Park, Harun Mustafa, Jeremie Kim, Ataberk Olgun, et al. GenStore: A High-Performance and Energy-Efficient In-Storage Computing System for Genome Sequence Analysis. In arXiv, 2022.
[148]
David Sims, Ian Sudbery, Nicholas E Ilott, Andreas Heger, and Chris P Ponting. Sequencing Depth and Coverage: Key Considerations in Genomic Analyses. Nature Reviews Genetics, 2014.
[149]
Illumina. NovaSeq 6000 System Specifications. https://emea.illumina.com/systems/sequencing-platforms/novaseq/specifications.html, 2020.
[150]
Lenovo. ThinkPad T470p. https://www.lenovo.com/ch/en/laptops/thinkpad/t-series/ThinkPad-T470p/p/22TP2TT470P, 2016.
[151]
Wooseong Cheong, Chanho Yoon, Seonghoon Woo, Kyuwook Han, Daehyun Kim, et al. A Flash Memory Controller for 15s Ultra-Low-Latency SSD Using High-Speed 3D NAND Flash with 3s Read Time. In ISSCC, 2018.
[152]
Jisung Park, Myungsuk Kim, Myoungjun Chun, Lois Orosa, Jihong Kim, and Onur Mutlu. Reducing Solid-State Drive Read Latency by Optimizing Read-Retry. In ASPLOS, 2021.
[153]
Florian P Breitwieser, Mihaela Pertea, Aleksey V Zimin, and Steven L Salzberg. Human Contamination in Bacterial Genomes has Created Thousands of Spurious Proteins. Genome Research, 2019.
[154]
Human Microbiome Project Consortium et al. Structure, Function and Diversity of the Healthy Human Microbiome. Nature, 2012.
[155]
David Danko, Daniela Bezdan, Evan E Afshin, Sofia Ahsanuddin, Chandrima Bhattacharya, et al. A Global Metagenomic Map of Urban Microbiomes and Antimicrobial Resistance. Cell, 2021.
[156]
Rob Knight, Alison Vrbanac, Bryn C Taylor, Alexander Aksenov, Chris Callewaert, et al. Best Practices for Analysing Microbiomes. Nature Reviews Microbiology, 2018.
[157]
Eric W Sayers, Jeffrey Beck, Evan E Bolton, Devon Bourexis, James R Brister, et al. Database Resources of the National Center for Biotechnology Information. Nucleic Acids Research, 2021.
[158]
Justin M Zook, David Catoe, Jennifer McDaniel, Lindsay Vang, Noah Spies, et al. Extensive Sequencing of Seven Human Genomes to Characterize Benchmark Reference Materials. Scientific data, 2016.
[159]
Karen Clark, Ilene Karsch-Mizrachi, David J Lipman, James Ostell, and Eric W Sayers. GenBank. Nucleic Acids Research, 2016.
[160]
Fan Wu, Su Zhao, Bin Yu, Yan-Mei Chen, Wen Wang, et al. A New Coronavirus Associated with Human Respiratory Disease in China. Nature, 2020.
[161]
Heike Sichtig, Timothy Minogue, Yi Yan, Christopher Stefan, Adrienne Hall, et al. FDA-ARGOS is A Database with Public Quality-controlled Reference Genomes for Diagnostic Use and Regulatory Science. Nature Communications, 2019.
[162]
Stacia R Engel, Fred S Dietrich, Dianna G Fisk, Gail Binkley, Rama Balakrishnan, et al. The Reference Genome Sequence of Saccharomyces Cerevisiae: Then and Now. G3: Genes, Genomes, Genetics, 2014.
[163]
Tanya Z Berardini, Leonore Reiser, Donghui Li, Yarik Mezheritsky, Robert Muller, et al. The Arabidopsis Information Resource: Making and Mining the “Gold Standard” Annotated Reference Plant Genome. Genesis, 2015.
[164]
Aoife Larkin, Steven J Marygold, Giulia Antonazzo, Helen Attrill, Gilberto dos Santos, et al. FlyBase: Updates to the Drosophila Melanogaster Knowledge Base. Nucleic Acids Research, 2020.
[165]
Deanna M Church, Leo Goodstadt, LaDeana W Hillier, Michael C Zody, Steve Goldstein, et al. Lineage-specific Biology Revealed by a Finished Genome Assembly of the Mouse. PLoS Biology, 2009.
[166]
Thomas Wang. Integer Hash Function. http://web.archive.org/web/20071223173210/http://www.concentric.net/~Ttwang/tech/inthash.htm, 2007.
[167]
Alexander Dobin, Carrie A. Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, et al. STAR: Ultrafast Universal RNA-seq Aligner. Bioinformatics, 2012.
[168]
Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixin Luo, and Onur Mutlu. Reliability Issues in Flash-memory-based Solid-state Drives: Experimental Analysis, Mitigation, Recovery. In Inside Solid State Drives. Springer, 2018.
[169]
Yu Cai, Saugata Ghose, Yixin Luo, Ken Mai, Onur Mutlu, and Erich F. Haratsch. Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques. In HPCA, 2017.
[170]
Yixin Luo, Saugata Ghose, Yu Cai, Erich F Haratsch, and Onur Mutlu. Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation. ACM POMACS, 2018.
[171]
Yixin Luo, Saugata Ghose, Yu Cai, Erich F. Haratsch, and Onur Mutlu. HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-recovery and Temperature Awareness. In HPCA, 2018.
[172]
Yu Cai, Yixin Luo, Saugata Ghose, and Onur Mutlu. Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery. In IEEE/IFIP DSN, 2015.
[173]
Yu Cai, Gulay Yalcin, Onur Mutlu, Erich F Haratsch, Adrian Cristal, et al. Error Analysis and Management for MLC NAND Flash Memory. Intel Technology Journal, 2013.
[174]
Yu Cai, Gulay Yalcin, Onur Mutlu, Erich F Haratsch, Adrian Cristal, et al. Flash Correct-and-refresh: Retention-aware Error Management for Increased Flash Memory Lifetime. In ICCD, 2012.
[175]
Keonsoo Ha, Jaeyong Jeong, and Jihong Kim. An Integrated Approach for Managing Read Disturbs in High-density NAND Flash Memory. IEEE TCAD, 2015.
[176]
Yixin Luo, Yu Cai, Saugata Ghose, Jongmoo Choi, and Onur Mutlu. WARM: Improving NAND Flash Memory Lifetime with Write-Hotness Aware Retention Management. In MSST, 2015.
[177]
Micron. Product Flyer: Micron 3D NAND Flash Memory. https://www.micron.com/-/media/client/global/documents/products/product-flyer/3d_nand_flyer.pdf?la=en, 2016.
[178]
Synopsys, Inc. Design Compiler. https://www.synopsys.com/implementation-and-signoff/rtl-synthesis-test/design-compiler-graphical.html.
[179]
Micron Technology Inc. 4Gb: x4, x8, x16 DDR4 SDRAM Data Sheet, 2016.
[180]
Saugata Ghose, Tianshi Li, Nastaran Hajinazar, Damla Senol Cali, and Onur Mutlu. Demystifying Complex Workload-DRAM Interactions: An Experimental Study. ACM POMACS, 2019.
[181]
Saugata Ghose, Abdullah Giray Yaglikçi, Raghav Gupta, Donghyuk Lee, Kais Kudrolli, et al. What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study. ACM POMACS, 2018.
[182]
Ramulator Source Code. https://github.com/CMU-SAFARI/ramulator.
[183]
Advanced Micro Devices. AMD μProf. https://developer.amd.com/amd-uprof/, 2021.
[184]
M. Holtgrewe. Mason - A Read Simulator for Second Generation Sequencing Data. Technical Report FU Berlin, 2010.
[185]
WikiChip. Cascade Lake SP - Intel. https://en.wikichip.org/wiki/intel/cores/cascade_lake_sp.
[186]
Aaron Stillmaker and Bevan Baas. Scaling Equations for the Accurate Prediction of CMOS Device Performance from 180 nm to 7 nm. Integration, 2017.
[187]
ARM Holdings. Cortex-R4. https://developer.arm.com/ip-products/processors/cortex-r/cortex-r4, 2011.
[188]
Yongchao Liu, Douglas L Maskell, and Bertil Schmidt. CUDASW++: Optimizing Smith-Waterman Sequence Database Searches for CUDA-enabled Graphics Processing Units. BMC Research Notes, 2009.
[189]
Yongchao Liu, Bertil Schmidt, and Douglas L Maskell. CUDASW++ 2.0: Enhanced Smith-Waterman Protein Database Search on CUDA-enabled GPUs Based on SIMT and Virtualized SIMD Abstractions. BMC Research Notes, 2010.
[190]
Shuyi Pei, Jing Yang, and Qing Yang. REGISTOR: A Platform for Unstructured Data Processing inside SSD Storage. ACM TOS, 2019.
[191]
Sang-Woo Jun, Andy Wright, Sizhuo Zhang, Shuotao Xu, et al. GraFBoost: Using Accelerated Flash Storage for External Graph Analytics. In ISCA, 2018.
[192]
Jaeyoung Do, Yang-Suk Kee, Jignesh M Patel, Chanik Park, Kwanghyun Park, and David J DeWitt. Query Processing on Smart SSDs: Opportunities and Challenges. In ACM SIGMOD, 2013.
[193]
Sudharsan Seshadri, Mark Gahagan, Sundaram Bhaskaran, Trevor Bunker, Arup De, et al. Willow: A User-Programmable SSD. In USENIX OSDI, 2014.
[194]
Sungchan Kim, Hyunok Oh, Chanik Park, Sangyeun Cho, Sang-Won Lee, and Bongki Moon. In-storage Processing of Database Scans and Joins. Information Sciences, 2016.
[195]
Erik Riedel, Christos Faloutsos, Garth A Gibson, and David Nagle. Active Disks for Large-Scale Data Processing. Computer, 2001.
[196]
Erik Riedel, Garth Gibson, and Christos Faloutsos. Active Storage for Large-Scale Data Mining and Multimedia Applications. VLDB, 1998.
[197]
Boncheol Gu, Andre S Yoon, Duck-Ho Bae, Insoon Jo, Jinyoung Lee, et al. Biscuit: A Framework for Near-data Processing of Big Data Workloads. In ISCA, 2016.
[198]
Yangwook Kang, Yang-suk Kee, Ethan L Miller, and Chanik Park. Enabling Cost-effective Data Processing with Smart SSD. In MSST, 2013.
[199]
Xiaohao Wang, Yifan Yuan, You Zhou, Chance C Coats, and Jian Huang. Project Almanac: A Time-traveling Solid-state Drive. In EuroSys, 2019.
[200]
Anurag Acharya, Mustafa Uysal, and Joel Saltz. Active Disks: Programming Model, Algorithms and Evaluation. ASPLOS, 1998.
[201]
Kimberly Keeton, David A Patterson, and Joseph M Hellerstein. A Case for Intelligent Disks (IDISKs). SIGMOD Record, 1998.
[202]
Sang-Woo Jun, Ming Liu, Sungjin Lee, Jamey Hicks, John Ankcorn, et al. BlueDBM: An Appliance for Big Data Analytics. In ISCA, 2015.
[203]
Sang-Woo Jun, Ming Liu, Sungjin Lee, Jamey Hicks, John Ankcorn, et al. BlueDBM: Distributed Flash Storage for Big Data Analytics. ACM TOCS, 2016.
[204]
Mahdi Torabzadehkashi, Siavash Rezaei, Ali Heydarigorji, Hosein Bobarshad, Vladimir Alves, and Nader Bagherzadeh. Catalina: In-storage Processing Acceleration for Scalable Big Data Analytics. In Euromicro PDP, 2019.
[205]
Joo Hwan Lee, Hui Zhang, Veronica Lagrange, Praveen Krishnamoorthy, Xiaodong Zhao, and Yang Seok Ki. SmartSSD: FPGA Accelerated Near-Storage Data Analytics on SSD. IEEE Computer Architecture Letters, 2020.
[206]
Mohammadamin Ajdari, Pyeongsu Park, Joonsung Kim, Dongup Kwon, and Jangwoo Kim. CIDR: A Cost-effective In-line Data Reduction System for Terabit-per-second Scale SSD Arrays. In HPCA, 2019.
[207]
Benjamin Y Cho, Won Seob Jeong, Doohwan Oh, and Won Woo Ro. XSD: Accelerating MapReduce by Harnessing the GPU inside an SSD. In Near-Data Processing, 2013.
[208]
Won Seob Jeong, Changmin Lee, Keunsoo Kim, Myung Kuk Yoon, Won Jeon, et al. REACT: Scalable and High-performance Regular Expression Pattern Matching Accelerator for In-storage Processing. IEEE TPDS, 2019.
[209]
Sang-Woo Jun, Huy T Nguyen, Vijay Gadepally, et al. In-storage Embedded Accelerator for Sparse Pattern Processing. In HPEC, 2016.
[210]
Gretchen A Morrison, Jianmin Fu, Grace C Lee, Nathan P Wiederhold, Connie F Cañete-Gibas, et al. Nanopore Sequencing of the Fungal Intergenic Spacer Sequence as a Potential Rapid Diagnostic Assay. Journal of Clinical Microbiology, 2020.
[211]
Oxford Nanopore Technologies. R10.3: the Newest Nanopore for High Accuracy Nanopore Sequencing – Now Available in Store. https://nanoporetech.com/about-us/news/r103-newest-nanopore-high-accuracy-nanopore-sequencing-now-available-store/, 2020.
[212]
Michael A Quail, Iwanka Kozarewa, Frances Smith, Aylwyn Scally, Philip J Stephens, et al. A Large Genome Center’s Improvements to the Illumina Sequencing System. Nature Methods, 2008.
[213]
PacBio Sequencing. Pacific Biosciences Closes Acquisition of Omniome and Establishes San Diego Presence. https://www.pacb.com/press_releases/pacific-biosciences-closes-acquisition-of-omniome-and-establishes-san-diego-presence/, 2021.
[214]
Tim Dunn, Harisankar Sadasivan, Jack Wadden, Kush Goliya, Kuan-Yu Chen, et al. SquiggleFilter: An Accelerator for Portable Virus Detection. In MICRO, 2021.
[215]
Tobias P Loka, Simon H Tausch, and Bernhard Y Renard. Reliable Variant Calling during Runtime of Illumina Sequencing. Scientific Reports, 2019.
[216]
Haowen Zhang, Haoran Li, Chirag Jain, Haoyu Cheng, Kin Fai Au, et al. Real-time Mapping of Nanopore Raw Signals. Bioinformatics, 2021.
[217]
Sam Kovaka, Yunfan Fan, Bohan Ni, Winston Timp, and Michael C Schatz. Targeted Nanopore Sequencing by Real-time Mapping of Raw Electrical Signal with UNCALLED. Nature Biotechnology, 2020.
[218]
Ebrahim Afshinnekoo, Cem Meydan, Shanin Chowdhury, Dyala Jaroudi, Collin Boyer, et al. Geospatial Resolution of Human and Bacterial Diversity with City-scale Metagenomics. Cell Systems, 2015.
[219]
Tiffany Hsu, Regina Joice, Jose Vallarino, Galeb Abu-Ali, Erica M Hartmann, et al. Urban Transit System Microbial Communities Differ by Surface Type and Interaction with Humans and the Environment. mSystems, 2016.
[220]
Rachel M Sherman, Juliet Forman, Valentin Antonescu, Daniela Puiu, Michelle Daya, et al. Assembly of A Pan-genome from Deep Sequencing of 910 Humans of African Descent. Nature Genetics, 2019.
[221]
Qiuhui Li, Shilin Tian, Bin Yan, Chi Man Liu, Tak-Wah Lam, et al. Building a Chinese Pan-genome of 486 Individuals. Communications Biology, 2021.
[222]
Karen H Miga and Ting Wang. The Need for A Human Pangenome Reference Sequence. Annual Review of Genomics and Human Genetics, 2021.
[223]
Haowen Zhang, Chirag Jain, and Srinivas Aluru. A Comprehensive Evaluation of Long Read Error Correction Methods. BMC Genomics, 2020.
[224]
Karen H Miga, Sergey Koren, Arang Rhie, Mitchell R Vollger, Ariel Gershman, et al. Telomere-to-telomere Assembly of A Complete Human X Chromosome. Nature, 2020.
[225]
Glennis A Logsdon, Mitchell R Vollger, PingHsun Hsieh, Yafei Mao, Mikhail A Liskovykh, et al. The Structure, Function and Evolution of A Complete Human Chromosome 8. Nature, 2021.
[226]
Peter Perešíni, Vladimír Boža, Broňa Brejová, and Tomáš Vinař. Nanopore Base Calling on the Edge. Bioinformatics, 2021.
[227]
Omar Ahmed, Massimiliano Rossi, Sam Kovaka, Michael C Schatz, Travis Gagie, et al. Pan-genomic Matching Statistics for Targeted Nanopore Sequencing. iScience, 2021.
[228]
Aaron Pomerantz, Nicolás Peñafiel, Alejandro Arteaga, Lucas Bustamante, Frank Pichardo, et al. Real-time DNA Barcoding in a Rainforest Using Nanopore Sequencing: Opportunities for Rapid Biodiversity Assessments and Local Capacity Building. GigaScience, 2018.
[229]
Shinichi Sunagawa, Luis Pedro Coelho, Samuel Chaffron, Jens Roat Kultima, Karine Labadie, et al. Structure and Function of the Global Ocean Microbiome. Science, 2015.
[230]
Simon Lax, Daniel P Smith, Jarrad Hampton-Marcell, Sarah M Owens, Kim M Handley, et al. Longitudinal Analysis of Microbial Interaction between Humans and the Indoor Environment. Science, 2014.

Published In

ASPLOS '22: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
February 2022
1164 pages
ISBN: 9781450392051
DOI: 10.1145/3503222
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 February 2022

Author Tags

  1. Filtering
  2. Genomics
  3. Near-Data Processing
  4. Read Mapping
  5. Storage

Qualifiers

  • Research-article

Conference

ASPLOS '22

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 366
  • Downloads (Last 6 weeks): 64
Reflects downloads up to 11 Dec 2024

Cited By

  • (2024) Rethinking Page Table Structure for Fast Address Translation in GPUs: A Fixed-Size Hashed Page Table. Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, pp. 325-337. DOI: 10.1145/3656019.3676900. Online publication date: 14-Oct-2024.
  • (2024) ISP Agent: A Generalized In-storage-processing Workload Offloading Framework by Providing Multiple Optimization Opportunities. ACM Transactions on Architecture and Code Optimization, 21:1, pp. 1-24. DOI: 10.1145/3632951. Online publication date: 19-Jan-2024.
  • (2024) AERO: Adaptive Erase Operation for Improving Lifetime and Performance of Modern NAND Flash-Based SSDs. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pp. 101-118. DOI: 10.1145/3620666.3651341. Online publication date: 27-Apr-2024.
  • (2024) Cambricon-LLM: A Chiplet-Based Hybrid Architecture for On-Device Inference of 70B LLM. 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 1474-1488. DOI: 10.1109/MICRO61859.2024.00108. Online publication date: 2-Nov-2024.
  • (2024) Flagger: Cooperative Acceleration for Large-Scale Cross-Silo Federated Learning Aggregation. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pp. 915-930. DOI: 10.1109/ISCA59077.2024.00071. Online publication date: 29-Jun-2024.
  • (2024) QUETZAL: Vector Acceleration Framework for Modern Genome Sequence Analysis Algorithms. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pp. 597-612. DOI: 10.1109/ISCA59077.2024.00050. Online publication date: 29-Jun-2024.
  • (2024) PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pp. 340-353. DOI: 10.1109/ISCA59077.2024.00033. Online publication date: 29-Jun-2024.
  • (2024) Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 345-360. DOI: 10.1109/HPCA57654.2024.00034. Online publication date: 2-Mar-2024.
  • (2024) BeaconGNN: Large-Scale GNN Acceleration with Out-of-Order Streaming In-Storage Computing. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 330-344. DOI: 10.1109/HPCA57654.2024.00033. Online publication date: 2-Mar-2024.
  • (2023) ABCD Analysis of Industries Using High-Performance Computing. International Journal of Case Studies in Business, IT, and Education, pp. 448-465. DOI: 10.47992/IJCSBE.2581.6942.0282. Online publication date: 30-Jun-2023.