164 9781728113883/18/$31.00 ©2018 IEEE
Semiconductor Performance in Terms of Distributions, Bath Tub Curves and Similarity Index
R. Ross IWO Zweerslaan 46 6711GG Ede, The Netherlands TU Delft Mekelweg 4, 2628CD Delft, The Netherlands r.ross@iwo.nl, rob.ross@tudelft.nl
Abstract
Semiconductors are indispensable in modern grids employing HVDC connections. With their growing use the forensic techniques for conventional grids and components are increasingly applied to semiconductor devices as well. A review of techniques is provided with special attention to the use of distributions and to life cycle bathtub models (especially for mixed populations). Next, a method for determining the similarity between distributions is presented.
1. Introduction
The role of high-power electronics in the electric power supply is increasing with the ongoing transition to sustainable energy and European cross-border electricity trade. TenneT TSO is an international electric power transmission utility operating in the Netherlands and a large part of Germany. TenneT employs a High Voltage (HV) grid with AC voltages ranging from 110 kV up to 380 kV and DC voltages up to 900 kV. The 580 km long NorNed submarine cable connects Norway and the Netherlands. Other HVDC submarine cables from the Netherlands are the BritNed cable to the UK and the Cobra cable to Denmark. The NorNed cable connects the hydropower plants of Norway with the wind power generation in the German North Sea. The hydropower lakes can also function as energy storage. Converter stations at the coast connect the offshore DC and the onshore AC systems.
Fig.1. Type of semiconductor that is used for the NorNed HVDC interconnection.
Other HVDC systems are directly related to wind power generation. At distances of 32 km to 205 km off the German coast, twelve wind parks have been built in the North Sea. Their rated power ranges from 62 to 916 MW each, with an average of 583 MW (i.e. a total of about 7000 MW). These wind parks consist of sectors with wind turbines and a converter station that collects the power from the wind turbines and feeds it into a submarine cable to shore. Nine of these connections are longer than 80 km and require HVDC, because the capacitive currents with AC would become too large. The integration of the wind turbine power, the filtering of harmonics and the synthesis of an appropriate current for the submarine cable connection require high-power electronics at sea and on shore for connecting the DC and AC systems.
Within utility Asset Management various statistical methods exist to support decision-making. In the semiconductor industry reliability statistics are also applied for quality control and for describing ageing phenomena.
This paper aims at connecting these two worlds, which are expected to interact increasingly in the coming decade. The electrical reliability of HV systems and that of semiconductor devices both depend on materials that are able to withstand an applied electric field. The breakdown field strength of the materials governs the electrical performance. It is therefore no surprise that similar distributions and techniques are used. On the other hand, semiconductor devices also have failure modes that are uncommon in passive HV components and vice versa. This paper attempts to bridge the gap between passive and active HV components. The subjects covered in the subsequent sections are: the use of distributions and their purpose; life cycle stages as described by bath tub curves; and finally the evaluation of the similarity between distributions.
2. Distributions in use and their application
Semiconductors and the power systems they are applied in endure various degradation mechanisms, and the collective failure distribution can be very complicated. However, usually a single or very few mechanisms are dominant. As a consequence the overall distribution can be approximated with much simpler models consisting of very few distributions, if not only one. Different families of fundamental distributions are in use, four of which are discussed below (see Table 1). Models with combined distributions follow in the section on life cycle models.
As the distribution fields of application often remain unacknowledged, these are described here.
Table 1. Four fundamental distributions
Distribution    Purpose                          Range
Normal          Mean and scatter of x            x ∈ (−∞, ∞)
Lognormal       Mean and scatter of x = ln t     t ∈ [0, ∞)
Weibull         Smallest extreme of t            t ∈ [0, ∞)
Exponential     Smallest extreme of t            t ∈ [0, ∞)
The Normal or Gauss distribution density $f_N$ is given as:
$$f_N(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$ (Eq. 1)
The distribution parameters are the mean $\mu$ and standard deviation $\sigma$. There is no analytical expression for the cumulative distribution, but it can be approximated with a range of methods [1]. The foundation is the central limit theorem. This states that for large data sets drawn from an arbitrary but identical distribution, the distributions of their sum and mean asymptotically approach the Normal distribution. At least two facts are noteworthy. Firstly, the distribution is typically suitable for describing the distribution of a mean (and a sum), but not per se for describing the extremities (the tails) of the distribution. Secondly, the variable ranges from $-\infty$ to $+\infty$. It is therefore in principle not suitable for lifetimes, which are ≥0, but it may be suitable for the difference between lifetimes. The Lognormal distribution density
$f_L$ is given as:
$$f_L(t;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln t-\mu}{\sigma}\right)^2}$$ (Eq. 2)
This distribution is obtained by substituting $\ln(t)$ for $x$ in Eq. 1, so Eq. 2 is the density with respect to $\ln t$; expressed as a density with respect to $t$ itself it carries an additional factor $1/t$. The distribution parameters are the mean $\mu$ and standard deviation $\sigma$ of $\ln(t)$, not of $t$. For instance, the mean of $t$ is $\exp(\mu + \sigma^2/2)$. The Lognormal distribution basically has the same foundation as the Normal distribution. It is suitable for variables that range from 0 to $+\infty$ of which the sum or the mean is taken after the logarithmic conversion. The logarithms of lifetimes fall into this category. Like the Normal distribution, the Lognormal distribution is not designed for describing the extremities (the tails) of the distribution, but rather for the means and their accuracy. The cumulative Weibull distribution
$F_W$ is analytically expressed as:
$$F_W(t;\alpha,\beta) = 1 - e^{-\left(\frac{t}{\alpha}\right)^{\beta}}$$ (Eq. 3)
The distribution parameters are the scale parameter $\alpha$ and the shape parameter $\beta$. Both the cumulative distribution and the distribution density have analytical expressions. The distribution is asymptotically suitable for weakest-link data from a long chain. As objects fail at their weakest path, failure behavior can often be described very well by a Weibull distribution. The variable ranges from 0 to $+\infty$.
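The three expressions above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names, the parameter values and the name `alpha` for the Weibull scale parameter are assumptions made for illustration. It integrates the Lognormal density (written with respect to t, hence the factor 1/t) to confirm the stated mean exp(μ + σ²/2), and evaluates the Weibull distribution at t equal to the scale parameter.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density as in Eq. 1."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def lognormal_pdf(t, mu, sigma):
    """Lognormal density with respect to t: Eq. 2 times the factor 1/t."""
    if t <= 0.0:
        return 0.0
    return normal_pdf(math.log(t), mu, sigma) / t


def weibull_cdf(t, alpha, beta):
    """Cumulative Weibull distribution as in Eq. 3 (alpha: scale, beta: shape)."""
    if t <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((t / alpha) ** beta))


def integrate(func, a, b, steps=200_000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h


# Illustrative parameter values only; they are not taken from the paper.
MU, SIGMA = 0.0, 0.5

# Mean of t under the Lognormal density, integrated numerically.
# The upper limit 60 is far in the tail for these parameter values.
numeric_mean = integrate(lambda t: t * lognormal_pdf(t, MU, SIGMA), 0.0, 60.0)
closed_form_mean = math.exp(MU + SIGMA ** 2 / 2.0)

# At t = alpha the Weibull distribution equals 1 - 1/e for any shape beta.
f_at_scale = weibull_cdf(1.13, 1.13, 1.7)

print(numeric_mean, closed_form_mean, f_at_scale)
```

The trapezoidal rule is used only to keep the sketch free of dependencies; any numerical quadrature would serve the same purpose.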
An often made mistake is to test objects, determine the mean breakdown time and subsequently characterize the object failure probability by the times at $\pm\sigma$, $\pm 2\sigma$, etc., as if the underlying distribution were a Normal distribution. As mentioned above, the Normal distribution is the asymptotic distribution for means and sums of large series of data. So, if multiple series of $N$ objects are tested, then multiple averages of those $N$-sized series can be determined. With increasingly large series size $N$, the distribution of these averages approaches a Normal distribution, but the distribution of individual failure times does not. Fig.2 and 3 show an example of a wearing-out population following a Weibull distribution with $\beta = 1.7$ and $\alpha = 1.13$. Repeatedly testing series of 48 objects yields a distribution of their averages that approaches a Normal distribution with $\mu \approx 1$ and $\sigma \approx 0.076$. This Normal distribution can be translated into a distribution of single objects by multiplying $\sigma$ by $\sqrt{48}$. The resulting distribution is not the same as the true distribution, particularly at the distribution tails. Also note the noticeable probability of negative failure times. For averages the Normal (or Lognormal) distribution is appropriate; for describing the failure of individual objects it is better to assess the true distribution.
Fig.2. Distribution densities: true (Weibull), for means of 48 test objects (Normal), and the Normal distribution with $\sigma\sqrt{48}$.
Fig.3. Cumulative distribution: true (Weibull), for means of 48 test objects (Normal), and the Normal distribution with $\sigma\sqrt{48}$.
The last fundamental distribution described here is the Exponential distribution
$F_E$, which is analytically expressed as:
$$F_E(t;\theta) = 1 - e^{-t/\theta}$$ (Eq. 4)
The distribution parameter is the characteristic lifetime $\theta$. Both the cumulative distribution and the distribution density have analytical expressions. By definition, the distribution is the same as a Weibull distribution with $\beta = 1$. The hazard rate $h_E$ is a constant, namely $1/\theta$. This feature means that the danger of failure remains constant no matter the age, i.e. the failure is random. This is of interest for systems (e.g. a grid or a converter station) where suspicious or failed parts are replaced or periodic maintenance is carried out: the (e.g. sawtooth-shaped) hazard rate then hovers about an average hazard rate (see Fig.4). Because of this approximately constant hazard rate, the resulting system distribution is (almost) Exponential.
Fig.4. An arbitrary distribution whose hazard rate hovers about an average due to periodic maintenance approaches a constant hazard rate, and thus an Exponential distribution.
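The earlier point about averages versus individual failure times can be explored with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not the procedure used for Fig.2 and 3: it takes the quoted Weibull shape 1.7 and scale 1.13, uses series of 48 objects, and picks an arbitrary number of simulated series and an arbitrary random seed. It compares the simulated averages with the theoretical Weibull mean and standard deviation, and then computes the probability of a negative failure time implied by a Normal distribution whose standard deviation is rescaled by √48. The numbers it prints depend on these assumptions and are not asserted to reproduce the values read from the figures.

```python
import math
import random
import statistics

ALPHA, BETA, N = 1.13, 1.7, 48   # scale, shape and series size quoted in the text
SERIES = 20_000                  # number of simulated test series (assumption)


def weibull_mean_sd(alpha, beta):
    """Theoretical mean and standard deviation of a Weibull distribution."""
    mean = alpha * math.gamma(1.0 + 1.0 / beta)
    variance = alpha ** 2 * math.gamma(1.0 + 2.0 / beta) - mean ** 2
    return mean, math.sqrt(variance)


def normal_cdf(x, mu, sigma):
    """Cumulative Normal distribution expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


rng = random.Random(2018)        # fixed seed so the run is repeatable

# One average per simulated series of N individual failure times.
averages = [
    statistics.fmean(rng.weibullvariate(ALPHA, BETA) for _ in range(N))
    for _ in range(SERIES)
]

true_mean, true_sd = weibull_mean_sd(ALPHA, BETA)
mean_of_averages = statistics.fmean(averages)
sd_of_averages = statistics.stdev(averages)

# Normal distribution "translated back" to single objects: sigma times sqrt(N).
sd_single = sd_of_averages * math.sqrt(N)
p_negative = normal_cdf(0.0, mean_of_averages, sd_single)

print("mean of averages:", mean_of_averages, "theory:", true_mean)
print("sd of averages:", sd_of_averages, "theory:", true_sd / math.sqrt(N))
print("P(t < 0) under the rescaled Normal:", p_negative)
```

The true Weibull distribution assigns zero probability to negative times, so a non-zero `p_negative` makes the mismatch in the lower tail visible.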
3. Bath tub models for the life cycle
Semiconductors consist of various layers and components and can suffer from a range of degradation mechanisms. The failure distributions can therefore be quite complicated. However, if devices are generally well developed but a batch turns out less reliable, then often one or very few mechanisms dominate the failure behavior instead of many. This simplifies the statistical behavior. If not, then a mix of failures can compose the ultimate distribution. The failure behavior is often subdivided into three categories: initial or early failure (also called child disease or infant mortality), random failure and wear-out failure. These processes are often associated with Weibull distributions of which the shape parameter $\beta$ is respectively <1, =1 or >1. The hazard rate plot takes the shape of a bathtub curve (see Fig.5).
Fig.5. Hazard plot of conventional bath tub model.
This model consists of the sum of three (or more) hazard curves. It applies when all test objects suffer from all processes simultaneously. For instance, during wafer processing various defects can be caused in the devices. Some are so serious that the device as a whole will not function, while others cause failure shortly after manufacturing; some of such defects take more time, and at some point such failures are no longer dominant. This is the curve belonging to 'early defect' (Fig.5). Competing processes may exist. After that period a failure type becomes dominant that is not related to ageing time; e.g. cosmic radiation and other external impacts may cause this random failure. This region is generally characterized by Exponential behavior. The period is usually referred to as the useful life.
The initial failures give the devices a bad reputation, and strategies have been developed to get rid of devices that fail early and to ship only devices that are dominated by random failure. For a short time enhanced stresses are applied to accelerate ageing such that (most) problem devices are taken out quickly. It depends on the identified early defects which laws for acceleration (e.g. power law) apply and which enhanced stresses are most successful. This type of quality control is called screening or the burn-in process. From then on random failure dominates.
The final stage is wear-out. This consists of processes that degrade the materials and the compositions of layers, doping etc. Dielectric ageing and breakdown is an example. Although the association of the three stages with the three Weibull shape parameter
$\beta$ ranges (<1, =1, >1) is often useful, not all early failures have a distribution with a decaying hazard rate. This is particularly true if a batch of devices consists of different subpopulations. E.g. in a minority of devices an otherwise intact dielectric layer may be thinner, or leads may have sharper edges, than anticipated. As a result that minority suffers from enhanced electric fields compared to the design, and accelerated ageing takes place. This is fast wear-out and can be characterized by a Weibull distribution with $\beta > 1$, but with a (too) low scale parameter $\alpha$. This leads to a different type of bathtub model (Fig.6).
Fig.6. Hazard plot of mixed population bath tub model.
On the left of this bathtub curve is a peak. At the end of the left hazard rate peak (here at about 3 a.u.) the weak subpopulation is depleted. If the hazard rate of only this subpopulation were drawn, it would be increasing. When the hazard rate is drawn for the mixed population as a whole, it seemingly decays at 3 a.u., but this is merely due to depletion of the weak subpopulation, not because its ageing slows down. The combined hazard rate
$h$ of subpopulations with fractions $p_i$ of the total population is [2]:
$$h = \frac{f}{R} = \frac{\sum_i p_i f_i}{\sum_i p_i R_i}$$ (Eq. 5)
Here $f_i$ and $R_i$ are the distribution density and the reliability function of the respective subpopulation. Likewise, $f$ and $R$ are the distribution density and the reliability function of the complete population.
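The behavior of the mixed-population hazard rate in Eq. 5 can be illustrated with two Weibull subpopulations that both wear out (shape parameter above 1). The sketch below is a hedged example: the fractions, scale and shape values are hypothetical and chosen only to produce a visible early peak, and they are not the values behind Fig.6. It evaluates the combined hazard rate as the ratio of the mixed density to the mixed reliability function.

```python
import math


def weibull_pdf(t, alpha, beta):
    """Weibull distribution density (alpha: scale, beta: shape), for t > 0."""
    return (beta / alpha) * (t / alpha) ** (beta - 1.0) * math.exp(-((t / alpha) ** beta))


def weibull_reliability(t, alpha, beta):
    """Weibull reliability function R(t) = 1 - F(t)."""
    return math.exp(-((t / alpha) ** beta))


def mixed_hazard(t, subpopulations):
    """Combined hazard rate of Eq. 5: sum(p_i f_i) / sum(p_i R_i).

    subpopulations is a list of (fraction, alpha, beta) tuples.
    """
    density = sum(p * weibull_pdf(t, a, b) for p, a, b in subpopulations)
    reliability = sum(p * weibull_reliability(t, a, b) for p, a, b in subpopulations)
    return density / reliability


# Hypothetical mix: 5 % weak devices (low scale), 95 % normal devices.
WEAK = (0.05, 2.0, 4.0)
STRONG = (0.95, 20.0, 4.0)
MIX = [WEAK, STRONG]

h_peak = mixed_hazard(2.0, MIX)     # inside the early peak
h_valley = mixed_hazard(4.0, MIX)   # weak subpopulation largely depleted
h_late = mixed_hazard(15.0, MIX)    # wear-out of the main population

# Hazard rate of the weak subpopulation on its own keeps increasing.
h_weak_2 = mixed_hazard(2.0, [(1.0, 2.0, 4.0)])
h_weak_3 = mixed_hazard(3.0, [(1.0, 2.0, 4.0)])

print(h_peak, h_valley, h_late, h_weak_2, h_weak_3)
```

With these assumed values the combined hazard rate rises, drops after the weak subpopulation is depleted and rises again, while the hazard rate of the weak subpopulation alone only increases.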
4. Similarity of two distributions
If devices fail earlier or at a different rate than expected, it may be tested whether the observed failure behavior matches a reference behavior. The reference may be a distribution (or a combination thereof) that is known from earlier tests or experience. The question is to what extent the observed distribution $f(t)$ is similar to a reference $g(t)$. Two methods are discussed here: Goodness of Fit (GoF) tests and a new method, the Similarity Index ($S_{fg}$). The GoF test considered here is the Chi-Square test, which is described in various textbooks on statistics such as [3]. With the Chi-Square method the data range is subdivided into
$K$ intervals $I_i$ ($i = 1, \ldots, K$), such that each interval contains at least 5 values of the observed sample with sample size $n$. Determine the number of observations $b_i$ in each interval. Based on the reference distribution, calculate the expected number $e_i$ of observations per interval $I_i$. Calculate the deviation $\chi_o^2$:
$$\chi_o^2 = \sum_{i=1}^{K} \frac{(b_i - e_i)^2}{e_i}$$ (Eq. 6)
Choose a significance level $\alpha$ and determine the critical value $c$ for the Chi-Square distribution with $K-1$ degrees of freedom such that $P(\chi^2 \le c) = 1 - \alpha$. Reject the hypothesis that the observations stem from the reference distribution if $\chi_o^2 > c$. A prerequisite is that the deviations are Normally distributed. The Similarity Index
$S_{fg}$ was designed to rate similarity on a scale between 0 and 1. It is based on group theory. The $S_{fg}$ of two distribution densities $f$ and $g$ is defined as [4]:
$$S_{fg} = \frac{\langle f \cdot g \rangle}{\langle f \cdot f \rangle + \langle g \cdot g \rangle - \langle f \cdot g \rangle}$$ (Eq. 7)
Here $\langle f \cdot f \rangle$, $\langle g \cdot g \rangle$ and $\langle f \cdot g \rangle$ are inner products of continuous or discrete functions. For the continuous case it is:
$$S_{fg}(t_1,t_2) = \frac{\int_{t_1}^{t_2} f(t)\,g(t)\,dt}{\int_{t_1}^{t_2} f^2(t)\,dt + \int_{t_1}^{t_2} g^2(t)\,dt - \int_{t_1}^{t_2} f(t)\,g(t)\,dt}$$ (Eq. 8)
The similarity works with known distributions (which may belong to different families) and can be calculated not only over the full range, but also over a restricted range (e.g. from $t_1 = 0$ to $t_2$, the present time of evaluation). Fig.7 shows an example where a best fit was determined based on 6 observations. This distribution deviated significantly from the reference distribution, but the observed data all fell within the confidence intervals of the reference distribution, so the deviation could be a coincidence. How similar are $f$ and $g$? The similarity was quantified with Eq. (8) from 0 up to shortly after the last failure, which yielded $S(0,40) = 0.3$. This is closer to 0 than to 1. Extrapolating the fit for the observations yielded a more dramatic picture, indicating that the mean performance would also be almost a decade lower than the reference. The similarity over the full range is $S(0,\infty) = 0.09$.
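A numerical evaluation of the continuous Similarity Index can be sketched as follows. This is an illustration only: the two Weibull densities, their parameter values, the integration limits and the trapezoidal integration are assumptions, and the example does not reproduce the case of Fig.7. The inner products are approximated on a fixed grid and combined as the ratio of the cross inner product to the sum of the two self inner products minus the cross inner product.

```python
import math


def weibull_pdf(t, alpha, beta):
    """Weibull distribution density (alpha: scale, beta: shape)."""
    if t <= 0.0:
        return 0.0
    return (beta / alpha) * (t / alpha) ** (beta - 1.0) * math.exp(-((t / alpha) ** beta))


def inner_product(f, g, t1, t2, steps=20_000):
    """Trapezoidal approximation of the integral of f(t) * g(t) over [t1, t2]."""
    h = (t2 - t1) / steps
    total = 0.5 * (f(t1) * g(t1) + f(t2) * g(t2))
    for k in range(1, steps):
        t = t1 + k * h
        total += f(t) * g(t)
    return total * h


def similarity_index(f, g, t1, t2):
    """Similarity Index over [t1, t2]: <f.g> / (<f.f> + <g.g> - <f.g>)."""
    fg = inner_product(f, g, t1, t2)
    ff = inner_product(f, f, t1, t2)
    gg = inner_product(g, g, t1, t2)
    return fg / (ff + gg - fg)


# Hypothetical reference and observed densities (not the data of Fig.7).
def reference(t):
    return weibull_pdf(t, 100.0, 2.0)


def observed(t):
    return weibull_pdf(t, 30.0, 2.0)


s_restricted = similarity_index(observed, reference, 0.0, 40.0)
s_full = similarity_index(observed, reference, 0.0, 400.0)   # 400 stands in for infinity
s_identical = similarity_index(reference, reference, 0.0, 400.0)

print(s_restricted, s_full, s_identical)
```

Two identical densities give an index of 1, and the index is symmetric in its two arguments. A finite upper limit replaces the infinite range in this sketch, which is adequate only when both densities have decayed to negligible values there.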
As with the GoF test, a significance can be calculated by a Monte Carlo exercise that ranks the similarity among the similarities between the reference and fits of generated data. The Similarity Index is a recent technique that will be further explored for comparing failure distributions and other types of fits. The Similarity Index can compare different functions and also combined functions.
Fig.7. Example of the Similarity Index of a reference with an observed ($S(0,40)$) and an extrapolated distribution ($S(0,\infty)$).
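The Chi-Square procedure described earlier in this section can also be written out step by step. The sketch below is a hedged illustration: the sample of 24 failure times is invented for the example, the reference is assumed to be an Exponential distribution with a hypothetical characteristic lifetime of 10, and the intervals are assumed to be the quartiles of that reference. The critical value is taken from a standard Chi-Square table for 3 degrees of freedom at a significance level of 0.05 rather than computed.

```python
import math

THETA = 10.0            # hypothetical characteristic lifetime of the reference
CRITICAL_VALUE = 7.815  # tabulated Chi-Square value, 3 degrees of freedom, level 0.05

# Invented failure times, for illustration only.
SAMPLE = [
    0.4, 0.9, 1.3, 1.8, 2.2, 2.6, 2.8,
    3.1, 3.9, 4.4, 5.2, 6.0, 6.7,
    7.5, 8.8, 10.1, 11.6, 13.0,
    14.9, 17.2, 21.5, 26.0, 33.4, 41.8,
]


def exponential_cdf(t, theta):
    """Cumulative Exponential distribution."""
    return 1.0 - math.exp(-t / theta)


def chi_square_statistic(sample, edges, cdf):
    """Return the Chi-Square deviation and the observed counts per interval.

    Intervals are half-open, [lower, upper), and defined by consecutive edges.
    """
    n = len(sample)
    statistic = 0.0
    observed = []
    for lower, upper in zip(edges[:-1], edges[1:]):
        b = sum(1 for t in sample if lower <= t < upper)   # observed count
        e = n * (cdf(upper) - cdf(lower))                  # expected count
        observed.append(b)
        statistic += (b - e) ** 2 / e
    return statistic, observed


# K = 4 intervals bounded by the quartiles of the reference distribution.
EDGES = [0.0] + [-THETA * math.log(1.0 - q) for q in (0.25, 0.5, 0.75)] + [math.inf]

statistic, counts = chi_square_statistic(
    SAMPLE, EDGES, lambda t: exponential_cdf(t, THETA)
)
reject = statistic > CRITICAL_VALUE

print("observed counts:", counts)
print("deviation:", statistic, "reject reference:", reject)
```

Choosing the interval edges at quantiles of the reference makes the expected counts equal, which is a convenience of this sketch and not a requirement of the method.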
Discussion and conclusions
The present paper discussed a number of reliability matters encountered in failure investigations and developments. Semiconductor devices keep growing in power and voltage. For maritime applications as well as onshore HVDC connections they are indispensable in the high-power converter. In the analysis of data four fundamental distributions are used, each with its own specific purpose. The Normal and Lognormal distributions are suitable for describing means with their accuracy. The Weibull distribution is suitable for ageing behavior, particularly in solid insulation materials. The Exponential distribution is particularly useful for random failure, but also for the behavior of systems that are maintained and/or in which parts are replaced in case of failure. For the life cycle of devices the bathtub curve is a standard model, and the Weibull shape parameter is often used to distinguish early failure, random failure and wear-out failure. It is shown that this does not always hold. Mixed populations in particular may feature a fast-wearing subpopulation. This leads to an alternative bathtub model. Finally, the Similarity Index is explained as a method to rate the similarity between distributions. Its roots are in group theory. A similarity of 1 means complete resemblance and 0 means the distributions have nothing in common. The method is quite recent, but has been very useful in evaluating deviations and in forensic studies after failure. The work is still in progress.
References
1. National Bureau of Standards, Handbook of Mathematical Functions (ed. M. Abramowitz and I.A. Stegun), Dover Publications (New York, 1970), pp. 931-933.
2. Ross, R., Reliability Analysis for Asset Management of Electric Power Grids, Wiley (Hoboken, NJ, 2018), pp. 45-49, ISBN 9781119125174.
3. Kreyszig, E., Introductory Mathematical Statistics, John Wiley & Sons (New York, 1970), pp. 248-257, LCCCN 70107583.
4. Ross, R., Reliability Analysis for Asset Management of Electric Power Grids, Wiley (Hoboken, NJ, 2018), pp. 82-96, ISBN 9781119125174.