DOI:
10.5555/1331699
MICRO 40: Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
2007 Proceeding
Publisher:
  • IEEE Computer Society
  • 1730 Massachusetts Ave., NW Washington, DC
  • United States
Conference:
Micro-40: The 40th Annual IEEE/ACM International Symposium on Microarchitecture, December 1 - 5, 2007
ISBN:
978-0-7695-3047-5
Published:
01 December 2007
Abstract

No abstract available.

Article
Message from the General Chairs
Page viii
Article
Message from the Program Chairs
Article
Organizing Committee
Article
Reviewers
Article
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0

A significant part of future microprocessor real estate will be dedicated to L2 or L3 caches. These on-chip caches will heavily impact processor performance, power dissipation, and thermal management strategies. There are a number of interconnect ...

Article
Process Variation Tolerant 3T1D-Based Cache Architectures

Process variations will greatly impact the stability, leakage power consumption, and performance of future microprocessors. These variations are especially detrimental to 6T SRAM (6-transistor static memory) structures and will become critical with ...

Article
Mitigating Parameter Variation with Dynamic Fine-Grain Body Biasing

Parameter variation is detrimental to a processor's frequency and leakage power. One proposed technique to mitigate it is Fine-Grain Body Biasing (FGBB), where different parts of the processor chip are given a voltage bias that changes the speed and ...

Article
Optimal versus Heuristic Global Code Scheduling
Pages 43–55

We present a global instruction scheduler based on integer linear programming (ILP) that was implemented experimentally in the Intel Itanium® product compiler. It features virtually the full scale of known EPIC scheduling optimizations, more than ...

Article
Global Multi-Threaded Instruction Scheduling

Recently, the microprocessor industry has moved toward chip multiprocessor (CMP) designs as a means of utilizing the increasing transistor counts in the face of physical and micro-architectural limitations. Despite this move, CMPs do not directly ...

Article
Revisiting the Sequential Programming Model for Multi-Core

Single-threaded programming is already considered a complicated task. The move to multi-threaded programming only increases the complexity and cost involved in software development due to rewriting legacy code, training of the programmer, increased ...

Article
Penelope: The NBTI-Aware Processor
Pages 85–96

Transistors consist of lower number of atoms with every technology generation. Such atoms may be displaced due to the stress caused by high temperature, frequency and current, leading to failures. NBTI (negative bias temperature instability) is one of ...

Article
Software-Based Online Detection of Hardware Defects: Mechanisms, Architectural Support, and Evaluation

As silicon process technology scales deeper into the nanometer regime, hardware defects are becoming more common. Such defects are bound to hinder the correct operation of future processor systems, unless new online techniques become available to ...

Article
Self-calibrating Online Wearout Detection

Technology scaling, characterized by decreasing feature size, thinning gate oxide, and non-ideal voltage scaling, will become a major hindrance to microprocessor reliability in future technology generations. Physical analysis of device failure ...

Article
Implementing Signatures for Transactional Memory
Pages 123–133

Transactional Memory (TM) systems must track the read and write sets--items read and written during a transaction--to detect conflicts among concurrent transactions. Several TMs use signatures, which summarize unbounded read/write sets in bounded ...

Article
Smart Refresh: An Enhanced Memory Controller Design for Reducing Energy in Conventional and 3D Die-Stacked DRAMs

DRAMs require periodic refresh for preserving data stored in them. The refresh interval for DRAMs depends on the vendor and the design technology they use. For each refresh in a DRAM row, the stored information in each cell is read out and then ...

Article
Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors

DRAM memory is a major resource shared among cores in a chip multiprocessor (CMP) system. Memory requests from different threads can interfere with each other. Existing memory access scheduling techniques try to optimize the overall data throughput ...

Article
Impact of Cache Coherence Protocols on the Processing of Network Traffic
Pages 161–171

Since the introduction of the 10GbE standard in 2002, the ability of general purpose processors to efficiently process network traffic with common protocols such as TCP/IP has been revisited and critically evaluated. However, recent commercially available processors such as Intel® ...

Article
Flattened Butterfly Topology for On-Chip Networks

With the trend towards increasing number of cores in chip multiprocessors, the on-chip interconnect that connects the cores needs to scale efficiently. In this work, we propose the use of high-radix networks in on-chip interconnection networks and ...

Article
Using Address Independent Seed Encryption and Bonsai Merkle Trees to Make Secure Processors OS- and Performance-Friendly

In today's digital world, computer security issues have become increasingly important. In particular, researchers have proposed designs for secure processors which utilize hardware-based memory encryption and integrity verification to protect the ...

Article
Multi-bit Error Tolerant Caches Using Two-Dimensional Error Coding
Pages 197–209

In deep sub-micron ICs, growing amounts of on-die memory and scaling effects make embedded memories increasingly vulnerable to reliability and yield problems. As scaling progresses, soft and hard errors in the memory system will increase and single ...

Article
Argus: Low-Cost, Comprehensive Error Detection in Simple Cores

We have developed Argus, a novel approach for providing low-cost, comprehensive error detection for simple cores. The key to Argus is that the operation of a von Neumann core consists of four fundamental tasks--control flow, dataflow, computation, and ...

Article
Leveraging 3D Technology for Improved Reliability
Pages 223–235

Aggressive technology scaling over the years has helped improve processor performance but has caused a reduction in processor reliability. Shrinking transistor sizes and lower supply voltages have increased the vulnerability of computer systems ...

Article
Effective Optimistic-Checker Tandem Core Design through Architectural Pruning

Design complexity is rapidly becoming a limiting factor in the design of modern, high-performance microprocessors. This paper introduces an optimization technique to improve the efficiency of complex processors. Using a new metric (µUtilization) ...

Article
FPGA-Accelerated Simulation Technologies (FAST): Fast, Full-System, Cycle-Accurate Simulators
Pages 249–261

This paper describes FAST, a novel simulation methodology that can produce simulators that (i) are orders of magnitude faster than comparable simulators, (ii) are cycle-accurate, (iii) model the entire system running unmodified applications and ...

Article
Microarchitectural Design Space Exploration Using an Architecture-Centric Approach

The microarchitectural design space of a new processor is too large for an architect to evaluate in its entirety. Even with the use of statistical simulation, evaluation of a single configuration can take excessive time due to the need to run a set of ...

Article
Informed Microarchitecture Design Space Exploration Using Workload Dynamics

Program runtime characteristics exhibit significant variation. As microprocessor architectures become more complex, their efficiency depends on the capability of adapting with workload dynamics. Moreover, with the approaching billion-transistor ...

Article
Time Interpolation: So Many Metrics, So Few Registers

The performance of computer systems varies over the course of their execution. A system may perform well during some parts of its execution and poorly during others. To understand why a system behaves in this way performance analysts need to study its ...

Article
Low-Cost Epoch-Based Correlation Prefetching for Commercial Applications

The performance of many important commercial workloads, such as on-line transaction processing, is limited by the frequent stalls due to off-chip instruction and data accesses. These applications are characterized by irregular control flow and complex ...

Article
A Framework for Coarse-Grain Optimizations in the On-Chip Memory Hierarchy

Current on-chip block-centric memory hierarchies exploit access patterns at the fine-grain scale of small blocks. Several recently proposed techniques for coherence traffic reduction and prefetching suggest that further useful patterns emerge with a ...

Article
Uncorq: Unconstrained Snoop Request Delivery in Embedded-Ring Multiprocessors

Snoopy cache coherence can be implemented in any physical network topology by embedding a logical unidirectional ring in the network. Control messages are forwarded using the ring, while other messages can use any path. While the resulting coherence ...

Acceptance Rates

MICRO 40 Paper Acceptance Rate 35 of 166 submissions, 21%;
Overall Acceptance Rate 484 of 2,242 submissions, 22%
Year       Submitted  Accepted  Rate
MICRO-48   283        61        22%
MICRO-47   279        53        19%
MICRO-46   239        39        16%
MICRO 41   210        40        19%
MICRO 40   166        35        21%
MICRO 39   174        42        24%
MICRO 38   147        29        20%
MICRO 37   158        29        18%
MICRO 36   134        35        26%
MICRO 33   110        31        28%
MICRO 32   131        27        21%
MICRO 31   108        28        26%
MICRO 30   103        35        34%
Overall    2,242      484       22%