Reliability testing and Quality Assurance
The silicon detector systems used in the LHC experiments, as well as in
many other high energy physics (HEP) experiments, must have a
high degree of reliability.
Owing to the difficult access, the stressful environments and the
expected long duration of the experiments, these detector systems must
be able to perform their functions over their expected lifetime.
In addition, the complex quality assurance (QA) planning needed to
achieve high reliability within a strict budget and a limited timescale has
been identified as a major challenge for all future detectors. Our
laboratory was founded at CERN to address this challenge.
The laboratory provides several reliability tests: Accelerated Life
Testing, Environmental Stress Screening, the Shear Test and the Wire Pull Test.
Accelerated Life Testing (ALT), Environmental Stress Screening (ESS)
and Burn-in are sometimes confused in the literature because
they share the same goal: minimizing the occurrence of early field failures.
So, what is the difference?
ALT is a quantitative testing method that quantifies the life
characteristics of the product, component or system.
ESS is a qualitative testing method which is conducted under accelerated
test conditions.
Burn-in is a qualitative testing method which is conducted under ambient
test conditions.
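Because ALT is quantitative, its results are usually expressed through a life-stress model. The following is a minimal sketch, assuming that temperature is the accelerating stress and that the Arrhenius relationship applies. Neither the model nor the numbers come from this page: the 0.7 eV activation energy, the temperatures and the function names are illustrative assumptions only.

```python
import math

# Boltzmann constant in eV/K
BOLTZMANN_EV_PER_K = 8.617333262e-5


def arrhenius_acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Ratio of expected life at use temperature to life at stress temperature.

    Assumes a single thermally activated failure mechanism (Arrhenius model).
    """
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)


def equivalent_use_hours(test_hours, acceleration_factor):
    """Hours at use conditions represented by test_hours at stress conditions."""
    return test_hours * acceleration_factor


# Illustrative values only (assumed, not measured): 0.7 eV, 25 C use, 85 C stress.
af = arrhenius_acceleration_factor(0.7, use_temp_c=25.0, stress_temp_c=85.0)
print(f"acceleration factor: {af:.1f}")
print(f"1000 test hours ~ {equivalent_use_hours(1000.0, af):.0f} use hours")
```

In practice the activation energy depends on the failure mechanism under study, so it has to be taken from measurements or literature for that mechanism rather than assumed.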
All of these reliability tests make it possible to identify the failure
mode and the failure mechanism of the object under test.
The failure mode is the manner in which the failure is observed. The
failure mechanism is the physical, chemical or other process that leads to
failure.
Some ...
Mechanical Failure Mechanisms:
- Fatigue
- Corrosion
- Wear
Failure Mechanisms in Electronics:
- Electromigration
- Expansion-Contraction Fatigue
- Thermal Fatigue
- Corrosion
Example failure modes and failure mechanisms in a silicon device:
(Source: Sony Semiconductor Network Comp.)
Failed part | Failure Mode | Failure Mechanism |
Chip | OPEN | Aluminium wiring disconnection (corrosion, migration); Bonding pad corrosion; Bond peeling (formation of Au-Al and other intermetallic compounds) |
Chip | SHORTS or LEAKS | Oxide film breakdown or pinhole; Electrostatic breakdown; Chip cracking; PN junction |
Chip | IC FUNCTION FAILURE | Hot electron or hot carrier injection to the oxide film; Surface inversion; Crystal defect; Contamination within the process degradation; Moisture absorption |
Package | OPEN | Bonding wire disconnection or peeling; Lead dirtiness, stress due to oxidation or absorbed moisture + heat stress |
Package | Package cracking, shorts or leaks | Delamination between the chip or lead frame and mold resin; Wire touching (contact between the wires, chip and lead frame); Electrochemical migration between leads |
Package | Soldering defect | Surface contamination or oxidation; Creep; Flux residues |
In planning a reliability program, one of the first tasks is to decide
which stress stimuli should be employed to examine the reliability of
a given product.
Relationship between stress stimuli and the product flaws they typically
precipitate:
(Source: Accel. Stress Testing Handbook, H.A. Chan, P.J. Englert)
Stress Stimulus | Reason for Utilization |
High Temperature | Diffusion processes on silicon; Oxidation of fractures; Poor timing margins |
Low Temperature | Component and design margin problems not initiated at elevated temperatures |
Temperature Cycling | Interconnection (solder joints, wire and ball bonds); Poor timing margins |
Power Cycling | Enhanced rate of temperature ramp-up and ramp-down during temperature cycling |
Vibration and Mechanical Shock | Cabling and connector problems; Poorly secured large mass components; Structure cracking |
Elevated Humidity | Isolation breakdown; Corrosion |
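When choosing stress stimuli for a test plan, the table above can be read in reverse: start from a suspected flaw and look up which stimuli tend to precipitate it. The sketch below shows one hypothetical way to encode that lookup. The dictionary entries are copied from the table; the names `STRESS_TO_FLAWS` and `stimuli_for` are illustrative and are not part of any laboratory tool.

```python
# Stress stimulus -> flaws it typically precipitates (entries copied from the table).
STRESS_TO_FLAWS = {
    "High Temperature": [
        "Diffusion processes on silicon",
        "Oxidation of fractures",
        "Poor timing margins",
    ],
    "Low Temperature": [
        "Component and design margin problems not initiated at elevated temperatures",
    ],
    "Temperature Cycling": [
        "Interconnection (solder joints, wire and ball bonds)",
        "Poor timing margins",
    ],
    "Power Cycling": [
        "Enhanced rate of temperature ramp-up and ramp-down during temperature cycling",
    ],
    "Vibration and Mechanical Shock": [
        "Cabling and connector problems",
        "Poorly secured large mass components",
        "Structure cracking",
    ],
    "Elevated Humidity": [
        "Isolation breakdown",
        "Corrosion",
    ],
}


def stimuli_for(keyword):
    """Return the stimuli whose listed flaws mention the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [
        stimulus
        for stimulus, flaws in STRESS_TO_FLAWS.items()
        if any(keyword in flaw.lower() for flaw in flaws)
    ]


print(stimuli_for("timing"))
print(stimuli_for("solder"))
```

A plain substring match is enough for a table this small; a real test plan would also weigh cost, available equipment and the risk of overstressing the product.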
CERN CH-1211 Genève 23 Suisse