Author Meixner, Albert
Title Low-cost methods for error detection in multi-core systems
ISBN 9780549485063
Description 152 p.
Note Source: Dissertation Abstracts International, Volume: 69-02, Section: B, page: 1112
Adviser: Daniel Sorin
Thesis (Ph.D.)--Duke University, 2008
There is broad consensus among academic and industrial researchers in computer architecture that hardware faults, both transient and permanent, will become significantly more frequent as CMOS feature sizes continue to shrink. Circuit-level techniques alone are insufficient to overcome this problem, and system designers have therefore begun to add fault tolerance features to processor micro-architectures and memory systems. Many of the techniques used today were developed at a time when fault coverage was the primary optimization target; hardware, power, and performance costs were only secondary concerns. These priorities do not accurately reflect the needs of today's commodity systems, which are very sensitive to manufacturing and performance costs and can trade off some fault coverage to reduce these costs.
In my dissertation work I have developed novel error detection techniques with significantly lower area and performance costs than those traditionally used in high-availability designs. These savings were made possible by a guiding principle of verifying high-level system tasks rather than checking correct operation of specific low-level components. This high-level, end-to-end approach to error detection has distinct advantages over checking low-level components in terms of applicability to a wide range of systems, coverage of complex component interactions, and implementation cost. The major challenge in developing end-to-end checkers is to find high-level tasks that are both relevant and verifiable at runtime. I approached this problem by decomposing system-level tasks into sub-tasks that are more easily verifiable and, when combined, are sufficient to ensure correctness of a high-level task. Such a decomposition is a step back from a full end-to-end design and requires additional assumptions about the underlying system, but I found that the resulting cost and complexity benefits outweigh the accompanying loss in flexibility.
I have applied the ideas of task decomposition and high-level checking to processor cores, memory systems, and the I/O system in order to develop low-cost checkers for each of these subsystems. The resulting checking mechanisms are highly effective in detecting errors and incur lower hardware and performance costs than previously proposed mechanisms with comparable error coverage.
School code: 0066
DDC
Host Item Dissertation Abstracts International 69-02B
Subject Computer Science
0984
Alt Author Duke University. Computer Science