UnSync: a soft error resilient redundant CMP architecture
Reducing device dimensions, increasing transistor densities, and smaller timing windows, expose the vulnerability of processors to soft errors induced by charge carrying particles. Since these factors are inevitable in the advancement of processor technology, the industry has been forced to improve reliability on general purpose Chip Multiprocessors (CMPs). With the availability of increased hardware resources, redundancy based techniques are the most promising methods to eradicate soft error failures in CMP systems. This work proposes a novel customizable and redundant CMP architecture (UnSync) that utilizes hardware based detection mechanisms (most of which are readily available in the processor), to reduce overheads during error free executions. In the presence of errors (which are infrequent), the always forward execution enabled recovery mechanism provides for resilience in the system. The inherent nature of UnSync architecture framework supports customization of the redundancy, and thereby provides means to achieve possible performance-reliability trade-offs in many-core systems. This work designs a detailed RTL model of UnSync architecture and performs hardware synthesis to compare the hardware (power/area) overheads incurred. It then compares the same with those of the Reunion technique, a state-of-the-art redundant multi-core architecture. This work also performs cycle-accurate simulations over a wide range of SPEC2000, and MiBench benchmarks to evaluate the performance efficiency achieved over that of the Reunion architecture. Experimental results show that, UnSync architecture reduces power consumption by 34.5% and improves performance by up to 20% with 13.3% less area overhead, when compared to Reunion architecture for the same level of reliability achieved.