GSRC Platform Viability Theme

Current solutions for ensuring the viability of our platforms, i.e., that manufactured platforms indeed work correctly, have either been pushed to their limits or have proven cost-ineffective or inadequate in the face of enormous complexity, parametric variations, environmental variations, and aging. We need fundamental breakthroughs in design, verification, validation, and test technologies to continue to produce and maintain working platforms at an affordable cost. Addressing these immensely complex challenges requires collaborative research in all areas of system validation, software and hardware verification, post-silicon validation, manufacturing testing, and post-deployment resiliency. Two themes, Platform Viability and Resilient Systems, will jointly address these challenges. The Platform Viability theme will target quality assurance from design specification to shipment, and will explore shared solutions jointly with the Resilient Systems theme, which will focus exclusively on post-deployment and lifetime resiliency.


The objective of the Platform Viability theme is to deliver low-cost solutions that can guarantee the design and production of working platforms. Our overall goals are (i) to develop solutions that collectively achieve coverage greater than 99% for all relevant error/fault models used in software and hardware verification, silicon debug, and manufacturing testing, while incurring less than 5% area, performance, and power overhead in meeting these targets, (ii) to deliver formal verification capabilities that ensure the real-time correctness of concurrent hardware-software for heterogeneous many-core platforms with 100+ nodes, and (iii) to investigate new solutions for testing and verifying power consumption and power management (in contrast to existing objectives, which target functionality and speed).

To support the infrastructure and mobile segments, the modeling and verification technologies must address the growing concern over concurrency-related bugs that result from the sophisticated interactions between concurrent software and hardware, as well as between language-level concurrency abstractions and hardware-level abstractions. For future many-core designs, we must develop scalable verification solutions to support architectural and micro-architectural exploration and to ensure that the uncore (non-processor) components work properly in the face of functional/electrical bugs and manufacturing/reliability defects. Post-silicon validation is another critical area demanding focused attention. The test solutions we develop will be embedded and self-test in nature, and can thus support both packaged-chip and bare-die testing. As a result, they will also support known-good-die for 3D integration.

To achieve overall cost reduction and quality improvement, we need to carefully investigate the possibility of hardware resource sharing and joint optimization among all post-silicon and deployment quality assurance functions, including validation, calibration, manufacturing testing, adaptation, diagnosis, and post-deployment testing. Several tasks within the theme, as well as collaborations with the Resilient Systems theme, have been planned with this objective in mind.