hyxie2023/Hw_RAS-2025.github.io

HW_RAS-tutorial-HPCA-2025

As cloud services become increasingly integral to modern computing, the reliability of the underlying hardware components is critical to maintaining service continuity and performance. This tutorial covers advanced methods for predicting hardware failures in order to improve cloud service reliability.

We begin with an overview of hardware failures in data centers, followed by machine learning-based failure prediction methods designed for different hardware components. Key topics include hardware metrics for predictive analysis, feature engineering, algorithm selection, evaluation, and production deployment. In addition to knowledge sharing, the tutorial offers a hands-on Memory Failure Prediction competition, where participants apply their techniques to real-world problems. The tutorial is aimed at researchers and engineers from both academia and industry, and provides the tools and knowledge needed to improve hardware resilience and cloud service reliability.
