TIDE: Telemetry Informed Delay Testing for Silent Data Corruption

Description

A major concern for computing systems today is the rising prevalence of silent data corruption (SDC). SDCs are undetected errors that yield incorrect results without triggering system alerts or error logs. They pose substantial risks to system integrity and reliability, particularly within large-scale computing infrastructures. Among the contributors to SDC, voltage droop has emerged as one of the most critical, often resulting in timing violations and system failures. Existing test methodologies are inadequate for capturing the dynamic voltage fluctuations that occur under realistic workload conditions, which limits their effectiveness in detecting SDCs. To address these limitations, we introduce Telemetry-Informed Delay Testing (TIDE), a novel methodology that enhances SDC detection by leveraging telemetry sensors to monitor voltage fluctuations and their impact on timing integrity. Designed for integration into test flows based on commercial tools, TIDE requires minimal modification to existing infrastructure. By incorporating dynamic, workload-aware test generation, the proposed framework overcomes key limitations of traditional approaches and facilitates early detection of SDCs. The effectiveness of TIDE is demonstrated through case studies on two RISC-V-based SoCs and multiple workloads, in which we evaluate its ability to detect voltage-droop-induced errors under realistic operating conditions. Experimental results show that TIDE achieves a considerable improvement in SDC detection compared to state-of-the-art techniques, thereby advancing SoC reliability and computational robustness in advanced semiconductor technologies.

Downloads

Public access restricted until 2027-05-01.

Details

Contributors
Date Created
2025
Embargo Release Date
2027-05-01
Topical Subject
Language
  • en
Note
  • Partial requirement for: M.S., Arizona State University, 2025
  • Field of study: Electrical Engineering
Additional Information
English
Extent
  • 37 pages
Open Access
Peer-reviewed