David W. Nellans
2020 – today
- 2023
- [c34] Vijay Kandiah, Daniel Lustig, Oreste Villa, David W. Nellans, Nikos Hardavellas:
  Parsimony: Enabling SIMD/Vector Programming in Standard Compiler Flows. CGO 2023: 186-198
- [c33] Harini Muthukrishnan, Daniel Lustig, Oreste Villa, Thomas F. Wenisch, David W. Nellans:
  FinePack: Transparently Improving the Efficiency of Fine-Grained Transfers in Multi-GPU Systems. HPCA 2023: 516-529
- [c32] Aninda Manocha, Zi Yan, Esin Tureci, Juan L. Aragón, David W. Nellans, Margaret Martonosi:
  Architectural Support for Optimizing Huge Page Selection Within the OS. MICRO 2023: 1213-1226
- [c31] Yaosheng Fu, Evgeny Bolotin, Aamer Jaleel, Gal Dalal, Shie Mannor, Jacob Subag, Noam Korem, Michael Behar, David W. Nellans:
  AutoScratch: ML-Optimized Cache Management for Inference-Oriented GPUs. MLSys 2023
- 2022
- [j5] Yaosheng Fu, Evgeny Bolotin, Niladrish Chatterjee, David W. Nellans, Stephen W. Keckler:
  GPU Domain Specialization via Composable On-Package Architecture. ACM Trans. Archit. Code Optim. 19(1): 4:1-4:23 (2022)
- [c30] Aninda Manocha, Zi Yan, Esin Tureci, Juan L. Aragón, David W. Nellans, Margaret Martonosi:
  The Implications of Page Size Management on Graph Analytics. IISWC 2022: 199-214
- [d1] Aninda Manocha, Zi Yan, Esin Tureci, Juan Luis Aragón, David W. Nellans, Margaret Martonosi:
  The Implications of Page Size Management on Graph Analytics. Zenodo, 2022
- 2021
- [c29] Oreste Villa, Daniel Lustig, Zi Yan, Evgeny Bolotin, Yaosheng Fu, Niladrish Chatterjee, Nan Jiang, David W. Nellans:
  Need for Speed: Experiences Building a Trustworthy System-Level GPU Simulator. HPCA 2021: 868-880
- [c28] Harini Muthukrishnan, David W. Nellans, Daniel Lustig, Jeffrey A. Fessler, Thomas F. Wenisch:
  Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers. ISCA 2021: 139-152
- [c27] Harini Muthukrishnan, Daniel Lustig, David W. Nellans, Thomas F. Wenisch:
  GPS: A Global Publish-Subscribe Model for Multi-GPU Memory Management. MICRO 2021: 46-58
- [i4] Yaosheng Fu, Evgeny Bolotin, Niladrish Chatterjee, David W. Nellans, Stephen W. Keckler:
  GPU Domain Specialization via Composable On-Package Architecture. CoRR abs/2104.02188 (2021)
- 2020
- [c26] Xiaowei Ren, Daniel Lustig, Evgeny Bolotin, Aamer Jaleel, Oreste Villa, David W. Nellans:
  HMG: Extending Cache Coherence Protocols Across Modern Hierarchical Multi-GPU Systems. HPCA 2020: 582-595
- [c25] Esha Choukse, Michael B. Sullivan, Mike O'Connor, Mattan Erez, Jeff Pool, David W. Nellans, Stephen W. Keckler:
  Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs. ISCA 2020: 926-939
- [c24] Mahmoud Khairy, Vadim Nikiforov, David W. Nellans, Timothy G. Rogers:
  Locality-Centric Data and Threadblock Management for Massive GPUs. MICRO 2020: 1022-1036
- [i3] Ahmet Fatih Inci, Evgeny Bolotin, Yaosheng Fu, Gal Dalal, Shie Mannor, David W. Nellans, Diana Marculescu:
  The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems. CoRR abs/2012.04210 (2020)
2010 – 2019
- 2019
- [j4] Saptadeep Pal, Eiman Ebrahimi, Arslan Zulfiqar, Yaosheng Fu, Victor Zhang, Szymon Migacz, David W. Nellans, Puneet Gupta:
  Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training. IEEE Micro 39(5): 91-101 (2019)
- [c23] Zi Yan, Daniel Lustig, David W. Nellans, Abhishek Bhattacharjee:
  Nimble Page Management for Tiered Memory Systems. ASPLOS 2019: 331-345
- [c22] Akhil Arunkumar, Evgeny Bolotin, David W. Nellans, Carole-Jean Wu:
  Understanding the Future of Energy Efficiency in Multi-Module GPUs. HPCA 2019: 519-532
- [c21] Zi Yan, Daniel Lustig, David W. Nellans, Abhishek Bhattacharjee:
  Translation ranger: operating system support for contiguity-aware TLBs. ISCA 2019: 698-710
- [c20] Oreste Villa, Mark Stephenson, David W. Nellans, Stephen W. Keckler:
  NVBit: A Dynamic Binary Instrumentation Framework for NVIDIA GPUs. MICRO 2019: 372-383
- [i2] Esha Choukse, Michael B. Sullivan, Mike O'Connor, Mattan Erez, Jeff Pool, David W. Nellans, Stephen W. Keckler:
  Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs. CoRR abs/1903.02596 (2019)
- [i1] Saptadeep Pal, Eiman Ebrahimi, Arslan Zulfiqar, Yaosheng Fu, Victor Zhang, Szymon Migacz, David W. Nellans, Puneet Gupta:
  Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training. CoRR abs/1907.13257 (2019)
- 2018
- [c19] Vinson Young, Aamer Jaleel, Evgeny Bolotin, Eiman Ebrahimi, David W. Nellans, Oreste Villa:
  Combining HW/SW Mechanisms to Improve NUMA Performance of Multi-GPU Systems. MICRO 2018: 339-351
- 2017
- [c18] Akhil Arunkumar, Evgeny Bolotin, Benjamin Y. Cho, Ugljesa Milic, Eiman Ebrahimi, Oreste Villa, Aamer Jaleel, Carole-Jean Wu, David W. Nellans:
  MCM-GPU: Multi-Chip-Module GPUs for Continued Performance Scalability. ISCA 2017: 320-332
- [c17] Ugljesa Milic, Oreste Villa, Evgeny Bolotin, Akhil Arunkumar, Eiman Ebrahimi, Aamer Jaleel, Alex Ramírez, David W. Nellans:
  Beyond the socket: NUMA-aware GPUs. MICRO 2017: 123-135
- 2016
- [c16] Tianhao Zheng, David W. Nellans, Arslan Zulfiqar, Mark Stephenson, Stephen W. Keckler:
  Towards high performance paged memory for GPUs. HPCA 2016: 345-357
- [c15] Neha Agarwal, David W. Nellans, Eiman Ebrahimi, Thomas F. Wenisch, John Danskin, Stephen W. Keckler:
  Selective GPU caches to eliminate CPU-GPU HW cache coherence. HPCA 2016: 494-506
- 2015
- [j3] Evgeny Bolotin, David W. Nellans, Oreste Villa, Mike O'Connor, Alex Ramírez, Stephen W. Keckler:
  Designing Efficient Heterogeneous Memory Architectures. IEEE Micro 35(4): 60-68 (2015)
- [c14] Neha Agarwal, David W. Nellans, Mark Stephenson, Mike O'Connor, Stephen W. Keckler:
  Page Placement Strategies for GPUs within Heterogeneous Memory Systems. ASPLOS 2015: 607-618
- [c13] Neha Agarwal, David W. Nellans, Mike O'Connor, Stephen W. Keckler, Thomas F. Wenisch:
  Unlocking bandwidth for GPUs in CC-NUMA systems. HPCA 2015: 354-365
- [c12] Mark Stephenson, Siva Kumar Sastry Hari, Yunsup Lee, Eiman Ebrahimi, Daniel R. Johnson, David W. Nellans, Mike O'Connor, Stephen W. Keckler:
  Flexible software profiling of GPU architectures. ISCA 2015: 185-197
- 2014
- [b1] David W. Nellans:
  Improving Operating System and Hardware Interactions Through Co-Design. University of Utah, USA, 2014
- [c11] Oreste Villa, Daniel R. Johnson, Mike O'Connor, Evgeny Bolotin, David W. Nellans, Justin Luitjens, Nikolai Sakharnykh, Peng Wang, Paulius Micikevicius, Anthony Scudiero, Stephen W. Keckler, William J. Dally:
  Scaling the Power Wall: A Path to Exascale. SC 2014: 830-841
- 2013
- [c10] Anirudh Badam, Vivek S. Pai, David W. Nellans:
  Better flash access via shape-shifting virtual memory pages. TRIOS@SOSP 2013: 3:1-3:14
- [c9] Matias Bjørling, Jens Axboe, David W. Nellans, Philippe Bonnet:
  Linux block IO: introducing multi-queue SSD access on multi-core systems. SYSTOR 2013: 22:1-22:10
- 2012
- [j2] Manu Awasthi, David W. Nellans, Kshitij Sudan, Rajeev Balasubramonian, Al Davis:
  Managing Data Placement in Memory Systems with Multiple Memory Controllers. Int. J. Parallel Program. 40(1): 57-83 (2012)
- 2011
- [c8] Manu Awasthi, David W. Nellans, Rajeev Balasubramonian, Al Davis:
  Prediction Based DRAM Row-Buffer Management in the Many-Core Era. PACT 2011: 183-184
- [c7] Xiangyong Ouyang, David W. Nellans, Robert Wipfel, David Flynn, Dhabaleswar K. Panda:
  Beyond block I/O: Rethinking traditional storage primitives. HPCA 2011: 301-311
- 2010
- [c6] Manu Awasthi, David W. Nellans, Kshitij Sudan, Rajeev Balasubramonian, Al Davis:
  Handling the problems and opportunities posed by multiple on-chip memory controllers. PACT 2010: 319-330
- [c5] Seth H. Pugsley, Josef B. Spjut, David W. Nellans, Rajeev Balasubramonian:
  SWEL: hardware cache coherence protocols to map shared data onto shared caches. PACT 2010: 465-476
- [c4] Kshitij Sudan, Niladrish Chatterjee, David W. Nellans, Manu Awasthi, Rajeev Balasubramonian, Al Davis:
  Micro-pages: increasing DRAM efficiency with locality-aware data placement. ASPLOS 2010: 219-230
- [c3] David W. Nellans, Kshitij Sudan, Erik Brunvand, Rajeev Balasubramonian:
  Improving Server Performance on Multi-cores via Selective Off-Loading of OS Functionality. ISCA Workshops 2010: 275-292
- [c2] David W. Nellans, Kshitij Sudan, Rajeev Balasubramonian, Erik Brunvand:
  Hardware prediction of OS run-length for fine-grained resource customization. ISPASS 2010: 111-112
2000 – 2009
- 2009
- [j1] David W. Nellans, Rajeev Balasubramonian, Erik Brunvand:
  OS execution on multi-cores: is out-sourcing worthwhile? ACM SIGOPS Oper. Syst. Rev. 43(2): 104-105 (2009)
- 2004
- [c1] David W. Nellans, Vamshi Krishna Kadaru, Erik Brunvand:
  ARCS: an architectural level communication driven simulator. ACM Great Lakes Symposium on VLSI 2004: 73-77
last updated on 2024-10-24 20:32 CEST by the dblp team
all metadata released as open data under CC0 1.0 license