### DIGITAL SYSTEMS LABORATORY

## STANFORD ELECTRONICS LABORATORIES DEPARTMENT OF ELECTRICAL ENGINEERING STANFORD UNIVERSITY. STANFORD, CA 94305



SEL-78-026

# DETECTION OF INTERMITTENT FAULTS IN SEQUENTIAL CIRCUITS

Jacob Savir

Technical Report No. 120

March 1978

This work was supported in part by an IBM Post-Doctoral Fellowship, and in part by the National Science Foundation under NSF Grant No. MCS 76-05327.

SEL-78-026



#### DETECTION OF INTERMITTENT FAULTS IN SEQUENTIAL CIRCUITS

Jacob Savi r

Technical Report No. 120

March 1978

Center for Reliable Computing
Digital Systems Laboratory
Departments of Electrical Engineering and Computer Science
Stanford University
Stanford, California

This work was supported in part by an IBM Post-Doctoral Fellowship, and in part by the National Science Foundation under NSF Grant No. MCS 76-05327.

#### DETECTION OF INTERMITTENT FAULTS IN SEQUENTIAL CIRCUITS

Jacob Savir

Technical Report No. 120

March 1978

Center for Reliable Computing
Digital Systems Laboratory
Departments of Electrical Engineering and Computer Science
Stanford University
Stanford. California 94305

#### ABSTRACT

Testing for intermittent faults in digital circuits has been given significant attention in the past few years. However, very little theoretical work was done regarding their detection in sequential circuits.

This paper shows that the testing properties of intermittent faults in sequential circuits can be studied by means of a probabilistic automaton. The evaluation and derivation of optimal intermittent fault detection experiments in sequential circuits is done by creating a product state table from the faulty and fault-free versions of the circuit under test. Both deterministic and random test procedures are discussed. The underlying optimality criterion maximizes the probability of fault detection.

INDEX TERMS: Sequential circuit, state table, intermittent fault, error latency, Markov chain, deterministic and random testing, input probability.

|  | • |  |
|--|---|--|
|  |   |  |
|  |   |  |
|  |   |  |
|  |   |  |
|  |   |  |
|  |   |  |
|  |   |  |
|  |   |  |
|  |   |  |
|  |   |  |
|  |   |  |

#### 1. INTRODUCTION

The problem of testing for intermittent faults (I.F.'s) in digital circuits has been given considerable attention in recent years. It has long been known that the major portion of the down-time in several computers and other digital systems is due to the lack of efficient testing and diagnosis techniques which can handle I.F.'s. [Ball and Hardie, 1969] reported that approximately 90% of field failures in several computer systems are intermittent in nature. Several computer and digital systems manufacturers still believe this figure to be true today.

Most of the literature dealing with I.F.'sconcern their testing in combinational circuits. [Breuer, 1973] proposed a two state Markov model to account for the effects of the I.F. The Markov process could have changed from one state to the other with given probabilities. [Kamal and Page, 1974] used a special case of Breuer's model to treat the problem of deterministic testing of single I.F.'sin combinational circuits. [Savir, 1977 and 1978] discussed the random testing properties of I.F.'sin combinational circuits and proposed an algorithm for their detection. However, little attempt was made to deal with the problem of testing for I.F.'s in sequential circuits, which is an important problem because all digital systems have sequential circuits built into them. This is the task of this paper.

Only <u>stationary</u> I.F.'s are considered in this paper, namely faults which produce the same symptom each time they appear. An I.F. due to loose connections is an example of a stationary fault, while a transient fault due to electromagnetic interferences is not considered stationary because, in general, different portions of the circuit may be affected by such an interference at different times. This paper discusses both deterministic

and random test procedures for detecting I.F.'s in sequential circuits. When discussing random testing it is useful to distinguish between time variant random testing and time invariant random testing. Time invariant random testing refers to a situation where the input vectors are applied with a constant probability distribution during the entire test procedure, while time variant random testing refers to a strategy which allows the probability distribution to change as-the test proceeds. Unless otherwise stated, only time invariant random testing will be considered in this paper.

The sequential circuits considered in this paper are those that can be represented by a state table with transition-assigned outputs. The circuits are assumed to be strongly connected, namely a path(not necessarily of length 1) exists from any state to any other state. It is also assumed that the circuit has an asynchronous preset or clear inputs which will henceforth be called a reset input. Most practical sequential circuits meet these assumptions. The following additional assumptions are used throughout the paper:

<u>Assumption 1:</u> The I.F.'s do not affect the reset line; thus it is possible to synchronize the circuit to a known state before the test procedure is initiated.

<u>Assumption 2</u>: The I.F.'s are well behaved [Breuer, 1973] namely, during an application of an input vector the circuit behaves either as being fault-free or else some I.F. is active.

<u>Assumption 3:</u> The applied input vectors are independent of the activity periods of the I.F.'s.

Assumption 1 enables us to derive exact expressions for the test

efficiency (i.e. the probability of fault detection) of any proposed test procedure. The assumption is not essential, in the sense that if we allow the I.F.'s to affect the reset input the calculated test efficiency will be only a lower bound to the actual value. Assumption 2 is quite well approximated since the input vectors are usually applied by a fast machine. Assumption 3 is certainly justified since the activity periods of an I.F. can not be predicted.

The test procedure goes as follows: both the circuit under test (C.U.T.) and a reference unit are first synchronized to their reset state by the reset signal. The reference unit is either a "gold unit" meaning a very reliable copy of the C.U.T. or a logic simulator which calculates the correct response for every applied input vector. After the initialization step, input vectors are applied in parallel to both the C.U.T. and the reference unit and then the outputs are compared. If a discrepancy is observed the C.U.T. is declared faulty and the experiment is stopped. Otherwise the test procedure is continued until a certain confidence that the circuit is fault-free is achieved. In this case the C.U.T. is declared as being fault-free. The input vectors are either deterministically applied or randomly selected, depending on which test strategy is being used.

Section 2 provides a statistical background to the rest of the paper. Section 3 derives the error latency of an I.F. in a sequential circuit. Section 4 discusses the problem of deterministic testing of one I.F. Section 5 repeats this discussion for random testing of one I.F. Section 6 generalizes sections 4 and 5 for the case of testing for a fault set, and section 7 concludes with a brief summary.

#### 2. STATISTICAL BACKGROUND

A probabilistic machine is a system described by the five-tuple

where

$$S_p = \langle I, Q, Z, P, \Phi \rangle$$

 $I = \{0,1,\ldots,J-1\}$  is the set of inputs

 $Q = \{q_1, q_2, \dots, q_s\}$  is the set of states

 $Z = \{ 0,1,...,K-1 \}$  is the set of outputs

$$\Phi = \{ \vec{\phi} = (\phi_1, \phi_2, \dots, \phi_S) \mid \phi_{i} \geq 0 \quad \forall_{i}, \sum_{i=1}^{S} \phi_{i} = 1 \}$$

is a set of all initial state probability distributions associated with the state set Q.

 $P = [Pr(z,q'|i,q) \ 1]$  is a matrix whose elements are the conditional probabilities of going from state q under input i to state q' and producing the output z.

Clearly the dimension of  ${\bf P}$  is JsXKs. Note that the state set Q can be embedded in  $\Phi$  by identifying  ${\bf q_i}$  with the probability vector which has 1 in the i-th place and 0 elsewhere. The conditional probabilities are subject to the constraints

$$\sum_{z \in \mathcal{I}} \sum_{q' \in Q} \Pr(z, q' | i, q) = 1$$
 (1)

and

$$0 \leq \Pr(z,q'|i,q) \leq 1 \text{ for all } i,q,z,q'$$
 (2)

#### 2. 1 RESPONSE OF A PROBABILISTIC MACHINE TO DETERMINISTIC INPUTS

It is useful to arrange the matrix **P** in standard form, namely



Each matrix  $P_i(z)$  is of dimension sXs and describes the probabilities that the output symbol will be z given that the input symbol is i. The (j,k) element of the matrix  $P_i(z)$  is the probability that the current output symbol will be z and the next state will be qk given that the current state is  $q_{ij}$  and the current input i.

Let

$$\vec{e}_{i} = (0,0,...,0,1,0,...,0)$$

i-th component

(4)

and let 
$$\vec{u} = (1,1,...,1)$$
 (5)

According to this notation, the probability of an output sequence  $z_1, z_2, \ldots, z_n$  given an input sequence  $i_1, i_2, \ldots, i_n$  is given by

$$\Pr(z_1, z_2, \dots, z_n | i_1, i_2, \dots, i_n) = \vec{\phi} \; \mathbf{P}_{i_1}(z_1) \mathbf{P}_{i_2}(z_2) \dots \; \mathbf{P}_{i_n}(z_n) \; \vec{u}^{tr}$$
 (6)

where  $\vec{u}^{tr}$  denotes the vector  $\vec{u}$  transposed. Denote now

$$\mathbf{P}_{\mathbf{i}} = \sum_{\mathbf{z} \in \mathbf{Z}} \mathbf{P}_{\mathbf{i}}(\mathbf{z}) \tag{7}$$

The (j,k) element of  $P_i$  corresponds to the probability that the next state will be  $q_k$  given that the present state is  $q_j$  and present input is i. Thus, the probability of being in state  $q_k$  when the input sequence  $i_1, i_2, \ldots, i_n$  is applied is given by

where  $(\cdot)_k$  denotes the k-th component of the vector. Note that both  ${f P}$  and  ${f P}_i$  , i=1,2,... ,J-lare Markov matrices.

#### 2. 2 RESPONSE OF A PROBABILISTIC MACHINE TO RANDOM INPUTS

When analyzing the response of a probabilistic machine to random inputs it is useful to use the matrix  $\mathbf{R}$ = [ Pr(z,q'|q)] instead of the matrix  $\mathbf{P}$ . Clearly there is a relation between Rand  $\mathbf{P}$ . In order to see this, let  $\pi$  be the set of all possible probability assignments to the input set, namely

$$\Pi = \{ \overrightarrow{\pi} = (\pi_0, \pi_1, \dots, \pi_{J-1}) | \pi_i \ge 0 \forall_i , \sum_{i=1}^{J-1} \pi_i = 1 \}$$
 (9)

then for a given input probability assignment

$$Pr(z,q'|q) = \sum_{i \in I} \pi_i Pr(z,q'|i,q)$$
 (10)

it is convenient to present  ${f R}$  in standard form also, namely

$$\mathbf{R} = \begin{bmatrix}
(0, q_1)(0, q_2) \dots (0, q_s) & (z, q_1)(z, q_2) \dots (z, q_s) & (K-1, q_1)(K-1, q_2) \dots (K-1, q_s) \\
q_1 & & & & & & & & & & & \\
\mathbf{R} = \vdots & & & & & & & & & & & & \\
\vdots & & & & & & & & & & & & & & \\
\mathbf{R} & & & & & & & & & & & & & \\
\vdots & & & & & & & & & & & & & \\
\mathbf{R} & & & & & & & & & & & & \\
\vdots & & & & & & & & & & & & \\
\vdots & & & & & & & & & & & \\
\mathbf{R} & & & & & & & & & & \\
\vdots & & & & & & & & & & \\
\vdots & & & & & & & & & & \\
\vdots & & & & & & & & & & \\
\vdots & & & & & & & & & \\
\vdots & & & & & & & & & \\
\vdots & & & & & & & & & \\
\vdots & & & & & & & & & \\
\vdots & & & & & & & & & \\
\vdots & & & & & & & & & \\
\vdots & & & & & & & & & \\
\vdots & & & & & & & & \\
\vdots & & & & & & & & \\
\vdots & & & & & & & & \\
\vdots & & & & & & & & \\
\vdots & & & & & & & & \\
\vdots & & & & & & & & \\
\vdots & & & & & & & & \\
\vdots & & & & & & & & \\
\vdots & & & & & & & & \\
\vdots & & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & & & & \\
\vdots & & & & &$$

According to this notation, the probability that an output sequence  $z_1, z_2, \ldots, z_n$  is the response to n randomly applied inputs is given by

$$Pr(z_1, z_2, ..., z_n | n \text{ random inputs}) = \overrightarrow{\phi} \mathbf{R}(z_1) \mathbf{R}(z_2) ... \mathbf{R}(z_n) \overrightarrow{u}^{tr}$$
 (12)

Let 
$$\mathbf{R}^* = \sum_{z \in \mathcal{I}} \mathbf{R}(z)$$
 (13)

The matrix  $\mathbf{R}^{\star}$  is a sXs matrix whose (j,k) element is the probability of going from state  $\mathbf{q}_{\mathbf{j}}$  to state  $\mathbf{q}_{\mathbf{k}}$  under randomly applied inputs. The probability that the machine will end up in state  $\mathbf{q}_{\mathbf{k}}$  after n randomly applied input vectors is given, therefore by

$$Pr(q_k | n \text{ random inputs}) = (\overrightarrow{\phi} (\mathbf{R}^*)^n)_k$$
 (14)

Note that both  $\boldsymbol{R}$  and  $\boldsymbol{R}^{\star}$  are Markov matrices.

#### 3. THE ERROR LATENCY OF AN INTERMITTENT FAULT

When a certain malfunction occurs in the circuit we say that a certain fault is <a href="mailto:present">present</a> in it. When the present fault is intermittent, it can be active sometimes and inactive at other times. We say that an I.F. is active at a certain time if it changes the function of any portion

of the circuit at that time. In this paper we use the model of the activity period of an I.F. described in [ Savir 1977 and 1978 ]. This model assumes that during an application of an input vector the fault is active with a constant probability, p, called the probability of activity.

$$p_i = Pr(fault f_i is active | f_i is present)$$
 (15)

In any digital circuit there is a delay between the time a fault becomes present and the first output error, called the error latency. It is convenient to measure the error latency by a discrete quantity rather than by time, as described in the following definition.

Definition 1:

The <u>error latency</u> [ Shedletsky and McCluskey, 1976 ],  $\mathrm{EL}_{\mathbf{i}}$ , of fault fi is the number of input vectors applied to the circuit while  $\mathrm{f}_{\mathbf{i}}$  is present until the first incorrect output due to fi is observed.

The error latency depends on the circuit, the fault and the applied input sequence. Since the error latency of an I.F. is a random variable even for deterministic inputs, it is possible to quantify it only statistically. For testing purposes, it is useful to describe its statistical behavior by its cumulative distribution function ( c. d. f. ). The c. d. f. of the error latency of fault fk under the input sequence  $i_1, i_2, \dots, i_n$  is defined as

$$F_{EL_k}(i_1, i_2, ..., i_n) = Pr(EL_k \le n | i_1, i_2, ..., i_n)$$
 (16)

namely, the c.d.f. of the error latency is the probability that the fault

will be detected at or before the application of the last input symbol of the sequence. Similarly, when the input vectors are randomly applied, the c.d.f. of the error latency is defined as

$$FEL_{k}(n) = Pr(EL_{k} \le n | n \text{ random inputs})$$
 (17)

Although not explicitly written, it is implicit that (16) and (17) are also conditioned on the event "fault fk is present."

There is a major difference between the testing properties of I.F.'s in sequential circuits and combinational circuits. In combinational circuits an output error can be observed only when the fault is active. In sequential circuits, on the other hand, the first occurrence of an active fault may cause the circuit to enter an incorrect state without producing an immediate output error, and this change in state may cause an output error at a later time, even when the fault has already become inactive. This is the reason [Ball and Hardie, 1969] observed that it is easier to detect I.F.'s in sequential circuits than in combinational circuits, when simulating I.F.'s in the Saturn V Launch Vehicle Aerospace Computer.

Figure 1 describes the notions of presence, activity and error latency.

#### Definition 2:

A deterministic I.F. detection experiment (deterministic I.F.D.E.) is a string of n symbols over the input vector space. A random I.F. detection experiment (random I.F.D.E.) generates the deterministic I.F.D.E.'s with a given probability distribution. The number n is called



FIGURE 1: The error latency of an I.F.in a sequential circuit. the length of the experiment.

Clearly, there are  $\textbf{J}^n$  deterministic I.F.D.E.'s of length n.

Since every I.F.D.E. must have finite length, there is no guarantee that at the end of any experiment a definite conclusion regarding the state of the circuit (faulty or fault-free) could be drawn. Once enough consecutive correct outputs are observed it is necessary to stop and decide with a certain confidence that the C.U.T. is fault-free. Such a stopping decision rule should be based on the confidence that the C.U.T. possesses no I.F.'s, which is a non-decreasing function of the number of consecutive correct outputs that are observed. We use, therefore, the escape probability as the criterion by which the experiment should be stopped.

#### Definition 3:

The <u>escape probability</u> [ Savir 1977 and 1978 ],  $\epsilon_i$ , of an I.F.f<sub>i</sub> is the conditional probability that the fault will go undetected during

an application of a sequence of n input vectors (deterministically or randomly selected), given that fi is present.

The escape probability depends on the circuit, the fault, the I.F.D.E. and is related to the error latency through

$$\varepsilon_{i}(\cdot) = 1 - F_{EL_{i}}(\cdot) \tag{18}$$

The c.d.f. of the error latency has the property of being a non-decreasing function of the experiment length, n, and of having a non-constant value. Thus, the escape probability is a non-increasing function of n. The I.F.D.E. can be stopped, therefore, when n is large enough such that  $\epsilon_{\mathbf{i}}$  is below a preassigned positive threshold, determined by the test designer.

Since testing for I.F.'s usually takes a long time, it is extremely important to try to optimize the test procedure so that the testing time will be as small as possible. The optimality criterion we use in this paper is maximizing the probability of fault detection, or equivalently minimizing the escape probability. This goal is achieved by finding the optimal deterministic I.F.D.E. in the case of deterministic testing, and the optimal random I.F.D.E. in the case of random testing.

In the following analysis we assume that the sequential circuit is described by a <u>state table</u>. A state table has J columns one for each input symbol, and s rows one for each present state. For each combination of present state and input symbol, the corresponding entry in the table specifies the next state to which the machine will go and the output that will be produced by the transition.

A potentially detectable I.F. transforms the state table of the fault-free machine, during the activity period of the fault, to one which differs from it by at least one entry. I.F.'s on redundant lines are not detectable and therefore are excluded from this discussion. Let  $\mathsf{M}_{\overline{\mathsf{A}}}$  denote the state table of the fault-free machine ( or the state table of the faulty machine during the inactivity periods of the fault), and MA the state table of the faulty machine during the activity periods. States in MA with the same internal state variable assignments as those in  $\mathsf{M}_{\overline{\mathsf{A}}}$  are given the same labels. The State table MA must have a row for every state in  $\mathsf{M}_{\overline{\mathsf{A}}}$ . If the internal state variables are not all used in  $\mathsf{M}_{\overline{\mathsf{A}}}$ , MA may have additional rows. Well known techniques [ McCluskey, 1965 ] may be used to reduce the number of states in  $\mathsf{M}_{\overline{\mathsf{A}}}$ . Note that the faulty machine undergoes transitions according to state tables  $\mathsf{M}_{\overline{\mathsf{A}}}$  and  $\mathsf{M}_{\overline{\mathsf{A}}}$  with probabilities p and 1-p respectively.

Example 1: State tables  $M_{\overline{A}}$  and  $M_{A}$  are shown with the sequential circuit of Figure 2. The I.F. being considered is line L intermittently stuck-at-one (L/l).



FIGURE 2: Example of obtaining state tables  $M_{\overline{A}}$  and MA in a synchronous sequential circuit subject to the I.F. L/l. PS=Present State NS=Next State.

In order to determine the error latency of an I.F. it is conveni ent to construct a <u>product state table</u>,  $M_p$ , [ Poage and McCluskey, 1964 ] from the state tables  $M_{\overline{A}}$  and  $M_{A}$ . The product state table provides a tool which allows simultaneous observation of the behavior of the faulty and fault-free machines. The states of the product machine correspond actually to pairs of states. The state  $\textbf{q}_{\textbf{i}\,\textbf{i}}$  indicates that the fault-free circuit should be in state qi while the faulty machine is actually in state  $q_{\mathbf{j}}$ . The product state table is divided into two subtables by a vertical line. The left subtable corresponds to the joint behavior of the fault-free machine and the faulty machine during the inactivity periods of the fault. The right subtable corresponds to the joint behavior of the fault-free machine and faulty machine during the activity periods of the fault. Each subtable lists the input set as column headings, thus there are 2J columns in the product state table. The set  $\{q_{ij} | i=1,2,...,s\}$  is called the <u>set of initial states</u>. Since, the fault-free machine behaves according to the state table  $M_{\overline{\Delta}}$ , it is sufficient to know MA and  $\textbf{M}_{\overline{\textbf{A}}}$  in order to construct  $\textbf{M}_{\textbf{p}}.$  To do this, start by listing all the initial states as row headings and filling out the corresponding entries by assisting tables  $\mathbf{M}_{\mathbf{\bar{A}}}$  and  $\mathbf{M}_{\mathbf{A}}.$  The output is defined as 1, if the corresponding output vectors of the faulty and faultfree machines differ. The next state entries with a corresponding 1 output are denoted by  $q_{ah}$ . Add new rows to the product state table listing all those new states which appeared as next states and which are not initial states. The next state entries of  $\textbf{q}_{\textbf{a}\textbf{b}}$  are all  $\textbf{q}_{\textbf{a}\textbf{b}}$  with a 0 output, making this state a sink state. A product state table is said to be in standard form if the reset initial state is listed as the first row, and

the sink state in the last row. In the rest of this paper we assume that the product state table is presented in standard form.

Example 2: Let  $q_1$  be the reset state of the sequential machine of Figure 2. The standard form of the product state table is derived from tables  $M_{\overline{A}}$  and MA and appears in Table 1.

TABLE 1: Product state table of the sequential circuit of Figure 2 with I.F. L/l.

|                 |                    | М <sub>Р</sub>     |                       |                    |
|-----------------|--------------------|--------------------|-----------------------|--------------------|
| DC              |                    |                    |                       |                    |
| PS              | Fault inactive     |                    | Fault active          |                    |
|                 | i =0               | i=1                | i=0                   | i=]                |
| q11             | 911,0              | q <sub>22</sub> ,0 | q <sub>11</sub> ,0    | q <sub>22</sub> ,0 |
| 9 <sub>22</sub> | q <sub>22</sub> ,0 | 9 <sub>11</sub> ,0 | q <sub>21</sub> ,0    | 911,0              |
| 9 <sub>2</sub>  | ,0                 | q <sub>ab</sub> ,1 | - q <sub>21</sub> ,0- | q <sub>ab</sub> '1 |
| q <sub>ab</sub> | q <sub>ab</sub> ,0 | q <sub>ab</sub> ,0 | q <sub>ab</sub> ,0    | q <sub>ab</sub> ,0 |

It is possible now to define a probabilistic machine, based on the information contained in the product state table, which will enable the testing properties of I.F.'sin sequential circuits to be evaluated. Let

$$Q = \{q_{ij} | q_{ij} \text{ is a state in } M_P, i,j=1,2,...\}U\{q_{ab}\}, |Q|=t$$

where |Q| is the cardinality of the set Q, and

$$Z = \{0,1\}$$

Let also I,  $\Phi$  and  $\mathbf{P}$  be as defined in section 2. Then,  $<\mathbf{I},\mathbf{Q},\mathbf{Z},\Phi$ ,  $\mathbf{P}>$  defines a probabilistic machine. Well known techniques [Booth, 1967] can be used to reduce the number of states in Q. To ease the explanation we assume that the probabilistic machine is already in its reduced form, and that the matrix  $\mathbf{P}$  is expressed in standard form, namely



$$\mathbf{P} = \begin{pmatrix} (0,q_{11}) & (0,q_{22}) & (0,q_{21}) & (0,q_{ab}) & (1,q_{11}) & (1,q_{22}) & (1,q_{21}) & (1,q_{ab}) \\ (0,q_{22}) & (0,q_{22}) & (0,q_{21}) & (0,0) & (0,0) & (0,0) & (0,0) \\ (0,q_{21}) & (0,q_{ab}) & ($$

where p is the probability of activity of the fault.

The product machine describes the behavior of the sequential circuit from the time the fault becomes present. The state  $q_{ab}$  is entered and a 1 output is produced when the first output error is observed. When the product machine enters  $q_{ab}$  it can never leave. Thus, the state  $q_{ab}$  is an <u>absorbing state</u>, and the error latency is the time to absorption. The c.d.f. of the error latency is therefore the probability that the product machine will be absorbed in  $q_{ab}$  at or before time n, namely

$$F_{EL}(\cdot) = Pr(product machine in state q_{ab} at time n|\cdot)$$
 (20)

where the period represents "n random inputs" in the case of random testing, and  $i_1, i_2, \ldots, i_n$  in the case of deterministic testing.

In order to evaluate the test quality of any proposed test procedure it is sufficient to know the matrices  $P_i$ ,  $i=0,1,\ldots,J-1$  since the probability of fault detection is the probability that the product machine will be absorbed in state  $q_{ab}$ . Therefore, the output z does not provide any further information beyond that already given by the matrices

 $\mathbf{P}_{\mathbf{i}}$ . This reduces, to 50%, the number of matrices that have to be considered in the evaluation process.

#### 4. DETERMINISTIC TESTING OF AN INTERMITTENT FAULT

Since, according to our assumption, an I.F. can not affect the reset line, the initial state probabilities of the product machine at the time the test begins is

$$\vec{\phi} = \vec{e}_1 = (1,0,\ldots,0)$$

As previously stated, the probability that the sequence  $i_1, i_2, \ldots, i_n$  will detect the **I.F.** is the c.d.f. of the error latency. Thus, the probability of fault detection is given by

$$F_{EL}(i_1, i_2, ..., i_n) = (\vec{e}_1 P_{i_1} P_{i_2} ... P_{i_n})_t$$
 (21)

Example 4: The probability that the I.F.L/l,in the circuit of Figure 2, will be detected by the sequence 1001 is

$$F_{EL}(1001) = (\vec{e}_1 P_1 P_0^2 P_1)_4$$

where

$$\mathbf{P}_{0} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 - p & p & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad \mathbf{P}_{1} = \begin{bmatrix} 0 & 10 & 0 \\ 10 & 0 & 0 \\ 0 & 0 & 01 \\ 0 & 0 & 01 \end{bmatrix}$$

Hence 
$$F_{F_1}(1001) = p(2-p)$$

#### 4.1 THE OPTIMAL DETERMINISTIC I.F.D.E.

The optimal deterministic I.F.D.E. is that sequence  $i_1, i_2, \dots, i_n$ which maximizes (21). One way of deriving this sequence is to evaluate the c.d.f. of the error latency for all possible  $\textbf{J}^{\textbf{n}}$  deterministic I.F.D.E.'s and then to choose one which attains the maximum value. Clearly, this is not a practical method because of the amount of computation involved. The amount of computation, however, can be greatly reduced since there are, in general, a considerable number of sequences which do not end up in state  $\boldsymbol{q}_{\mbox{\scriptsize ab}}$  and therefore have zero probability of detection. A more practical method can be derived by utilizing a testing graph. A testing graph of order n is a directed graph having n+1 levels where each level (except for the first few) has as vertices all the states of the product machine. There is a directed edge from state  $q._{\mathbf{j_1}k_1}$  in level  $\ell$  to state  $\mathsf{q}_{\mathsf{j}_2\mathsf{k}_2}$  in level  $\ell^+$  if there exists an input vector  $\mathsf{i}$  which when applied to the product machine currently in state  ${\bf q}_{{\bf j}_1{\bf k}_1}$  will transfer it to state  $q_{j_2k_2}$  with a non-zero probability. This edge is labeled by i/Pr( ${}_{j_2k_2}$ |i, ${}_{j_1k_1}$ ) to indicate that the corresponding transition occurs with probability  $\Pr(q.j_2k_2|i,q_{j_1k_1})$  when the input i is applied. These probabilities are the corresponding entries of the  $\boldsymbol{P_i}$  matrix. In level 0 there is only one vertex, namely  $q_{11}$ . The next few levels may be also degenerate depending on the possible successors of  $\mathfrak{q}_{11}$  in the product machine. The edge leaving  $\textbf{q}_{\mbox{\scriptsize ab}}$  is labeled -/1 meaning that once you enter state  $\mathbf{q}_{ab}$  you can never leave. The testing graph of order 4 of the product machine of Table 1 is illustrated in Figure 3.



FIGURE 3: Testing graph of order 4 for I.F. L/l in the circuit of Figure 2.

The problem of determining the optimal deterministic I.F.D.E.is therefore converted to the problem of tracing all the paths originating in  $q_{11}$  at level 0 and ending up in  $q_{ab}$  at level n, and then choosing that sequence which has the largest probability of detection. The probability of detection along a given path is the multiplication of all the probabilities labeled on its edges. It is important to note that different paths in the graph do not necessarily correspond to different input sequences. It is possible to have an input sequence which will have a non-zero probability of detecting the fault for more than one activity

pattern. This possibility corresponds to different paths originating in  $q_{11}$  at level 0 and ending up in  $q_{ab}$  at level n. Note also that the optimal deterministic I.F.D.E. is not unique, since there might be more than one sequence which attains the maximum value for  $F_{EL}(\cdot)$ . The use of the testing graph enables us to trace only the sequences which have a non-zero probability of detection and not spend time calculating those which will end up anyway with a zero value. These latter sequences are those which end up at level n in a vertex different from  $q_{ab}$ . Thus, only a subset of the testing graph has to be searched for any value of n. The procedure is described in the following algorithm.

#### Alaorithm 1:

<u>Step 1</u>: Find the product state table of the given sequential circuit and the prescribed fault.

<u>Step 2</u>: Derive the testing graph which corresponds to the product state table.

<u>Step 3</u>: Use the testing graph to obtain all the input sequences which have a non-zero probability of detecting the fault.

<u>Step 4</u>: Choose one input sequence which has a maximum value of detection probability. This is an optimal deterministic I.F.D.E.

<u>Example 5</u>: The optimal deterministic I.F.D.E.'s of length 3 and 4 for the circuit of Figure 2 and I.F. L/l with their associated test qualities are respectively

$$n = 3$$
 ,  $x = 101$  ,  $F_{EL}(x) = p$ 

$$n = 4$$
,  $x = 1001$ ,  $F_{FI}(x) = p(2-p)$ 

where x denotes the optimal input sequence.

In example 5 the optimal deterministic I.F.D.E. was independent of the value of p. This is not always true as seen in the following example.

Example 6: Consider a sequential circuit and an I.F. whose state tables  $M_{\overline{\Delta}}$  and  $M\!A$  are

| ™Ā             |                   |                        |  |  |
|----------------|-------------------|------------------------|--|--|
| PS             | NS,Z              |                        |  |  |
| ۲3             | i=0               | i=1<br>                |  |  |
| 97             | q <sub>1</sub> ,0 | q <sub>2</sub> ,0      |  |  |
| q <sub>2</sub> | q <sub>2</sub> ,0 | <br> q <sub>1</sub> ,1 |  |  |

| MA             |                   |                   |  |  |
|----------------|-------------------|-------------------|--|--|
| PS -           | · N               | NS,Z              |  |  |
|                | i =0              | i=1               |  |  |
| q1             | q <sub>2</sub> ,0 | q <sub>2</sub> ,0 |  |  |
| q <sub>2</sub> | q <sub>1</sub> ,0 | q <sub>1</sub> ,1 |  |  |

The optimal deterministic I.F.D.E.'s of length 3 are {001} for  $p \le .5$  and {010,011,101} for  $p \ge .5$  since

$$F_{EL}(001) = P(1-P)$$
 and  $F_{EL}(x) = p \text{ for } x \in \{010,011,101\}$ 

#### 5. RANDOM TESTING OF AN INTERMITTENT FAULT

As described in section 2, the matrices Rand  $\mathbf{R}^{\star}$  are used in order to evaluate the test quality of the random test procedure. The matrix  $\mathbf{R}$  can be found by using the matrix  $\mathbf{P}$  and relation (10).

<u>Example 7</u>: For the circuit of Figure 2, let the input probability assignment be

$$\pi_0 = Pr(i=0) = 1-\alpha$$
 ,  $\pi_1 = 1-\pi_0 = a$ 

The matrix  $\mathbf{R}$  is therefore

$$\mathbf{R} = \begin{bmatrix} (0,q_{11})(0,q_{22})(0,q_{21})(0,q_{ab})(1,q_{11})(1,q_{22})(1,q_{21})(1,q_{ab}) \\ -1-\alpha & \alpha & 0 & 0 & I & 0 & 0 & 0 \\ -1-\alpha & \alpha & 0 & 0 & I & 0 & 0 & 0 \\ -1-\alpha & \alpha & 0 & 0 & 0 & 0 & 0 \\ -1-\alpha & 0 & 0 & 0 & 0 & 0 & 0 \\ -1-\alpha & 0 & 0 & 0 & 0 & 0 & 0 \\ -1-\alpha & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

and the matrix  $\boldsymbol{R}^{\star}$  is

$$\mathbf{R}^{*} = \begin{bmatrix} q_{11} & 922 & q_{21} & q_{ab} \\ q_{11} & -1 & -a & \alpha & 0 & 0 \end{bmatrix}$$

$$\mathbf{Q}^{*} = \begin{bmatrix} q_{11} & a_{11} & q_{22} & q_{21} & q_{ab} \\ q_{21} & a_{11} & q_{22} & q_{21} & q_{23} \\ q_{21} & 0 & 0 & 1 & -a & a \\ q_{21} & 0 & 0 & 0 & 1 \end{bmatrix}$$

The c.d.f. of the error latency is the probability of being absorbed in  $\boldsymbol{q}_{\mbox{\scriptsize a}\,\mbox{\scriptsize b}}.$  Hence

$$F_{EL}(n) = (\vec{e}_1 (\mathbf{R}^*)^n)_t$$
 (22)

The matrix  $\mathbf{R}^{\star}$  is an absorbing Markov matrix. The stationary distribution [Karlin and Taylor, 1975] of this stochastic process is  $\vec{e}_t$  namely, the process will absorb in  $q_{ab}$  with probability 1. From a practical point of view this means that the escape probability can go below any positive threshold if enough random inputs are applied. Notice that the stochastic process described by  $\mathbf{R}^{\star}$ , is always aperiodic because the period of state  $q_{ab}$  is 1, and therefore the stationary distribution always exists. Although (22) involves matrix multiplication there is no real need for this since we are interested only in the last component of

 $\vec{e}_1(\mathbf{R}^*)^n$ . The following algorithm provides an efficient tool for obtaining  $F_{F_1}(n)$ .

#### Algorithm 2:

Step 1: Initialization:  $v_{01} = 1, v_{02} = \cdots = v_{0t} = 0$ 

Step 2: For k=1,2,...,t and  $i \ge 1$  do

$$v_{ik} = \sum_{j=1}^{t} v_{i-1,j} r_{jk}^{*}$$
, where  $\mathbf{R}^{*} = [r_{jk}^{*}]$ 

Step 3: If i < n go tostep2; otherwise  $F_{EL}(n) = v_{nt}$  and stop.

The idea behind Algorithm 2 is to evaluate  $F_{EL}(n)$  by successive multiplications of a resultant vector by a matrix and not by evaluating  $(\mathbf{R}^{\star})^n$  first, and then multiplying it from the left by  $\vec{e}_1$ . In other words, we evaluate  $F_{EL}(n)$  by performing  $(\dots((\vec{e}_1\,\mathbf{R}^{\star})\,\mathbf{R}^{\star})\,\mathbf{R}^{\star}\dots)\,\mathbf{R}^{\star}$ . Thus, in every step we multiply a vector by a matrix which is less complicated and time consuming than the multiplication of a matrix by itself. Example 8: The application of Algorithm 2 to the circuit of Figure 2 with I.F. L/l for evaluating  $F_{EL}(4)$  yields the following intermediate results:

Let 
$$\vec{v}_i = (v_{i1}, v_{i2}, \dots, v_{it})$$
 then  $\vec{v}_l = (1-\alpha, \alpha, 0, 0)$   $\vec{v}_2 = (1-2\alpha+2\alpha^2, (2-p)\alpha(1-\alpha), p\alpha(1-\alpha), 0)$   $\vec{v}_3 = ((1-\alpha)[1-2\alpha+(4-p)\alpha^2], \alpha[1-2\alpha+2\alpha^2+(1-p)(2-p)(1-\alpha)^2],$   $p(3-p)\alpha(1-\alpha)^2, p\alpha^2(1-\alpha))$   $F_{EL}(4) = v_{44} = p\alpha^2(1-\alpha)[4-p-\alpha(3-p)]$ 

Note that in order to derive  $F_{EL}(4) = v_{44}$  it is not necessary to find  $\vec{v}_4$  entirely, but only to use step 2 of the algorithm for k=t=4, i=n=4.

#### 5. 1 THE OPTIMAL RANDOM I.F.D.E.

The optimal random I.F.D.E. is specified by the optimal assignment of probabilities to the input set as described by the following definition Definition 4:

The <u>optimal assignment of input probabilities</u> is the input probability distribution which maximizes the probability of fault detection with the constraint that no more than n input vectors are applied.

The optimal assignment of input probabilities is, therefore, that distribution  $\overrightarrow{\pi}$  which maximizes the function  $\Psi$  given by

$$\Psi = F_{EL}(n) + \mu(\sum_{i=0}^{J-1} \pi_i - 1) , \quad \pi_i \ge 0 \quad \forall i$$
 (23)

where  $\mu$  is the Lagrange multiplier.

Example 9: The optimal assignment of input probabilities for the circuit of Figure 2 subject to the I.F. L/l and for experiment of length 4 is obtained by maximizing the function

$$F_{F1}(4) = p\alpha^{2}(1-\alpha)[4-p-\alpha(3-p)]$$
 ,  $\alpha \ge 0$ 

Notice that in this case the Langrange multiplier is not needed because  $F_{EL}(4)$  is expressed in terms of only one independent variable, namely  $\alpha$ . Maximization of  $F_{EL}(4)$  yields

$$\alpha_{\text{opt.}} = \frac{3(7-2p) - \sqrt{4p^2 - 28p + 57}}{8(3-p)}$$

The optimal assignment and its corresponding probability of detection is shown in Figure 4.



FIGURE 4: The optimal assignment and optimal c.d.f. of the error latency for the circuit of Figure 2 subject to I.F. L/l and experiment length n=4.

#### 6. TESTING FOR A FAULT SET

Sections 4 and 5 considered only the case of testing for one I.F. in a sequential circuit. Clearly, this is not a practical case, since digital circuits may suffer from a large variety of faults, especially with the Large Scale Integration (LSI) technology used today.

In this section we generalize the previous methods to the case of testing for single I.F.'sin sequential circuits. In fact, the class of multiple I.F.'s which can be described by the two state tables  $M_{\center{A}}$  and  $M_{\center{A}}$  are also covered by the following treatment. For example, multiple I.F.'sin fanout lines are members of this class. The general case of testing for multiple I.F.'sis more involved, because multiple faulty state tables must be considered to account for the fact that any component of a multiple I.F. can change from active to inactive, or vice versa , regardless of the other components.

Let  $F=\{f_1,f_2,\ldots,f_m\}$  be the fault set being considered, i.e. only one member of F can be present in the circuit when the circuit is faulty. We assume that F is known to the test designer from experience with circuits of the same kind. Let the <u>probability of presence</u> [ Savir, 1977 and 1978 ] of fault  $f_i$  be denoted by  $\omega_i$ ,  $i=1,2,\ldots,m$  then

$$\sum_{i=1}^{m} \omega_{i} = 1 , \text{ where } \omega_{i} = \Pr(\text{fi is present} | \text{circuit is faulty})$$
 (24)

The c.d.f. of the error latency of the fault set F,  $F_{EL}(\cdot)$ , is related to the individual c.d.f.'s,  $F_{EL_j}(\cdot)$  j=1,2,...,m , through

$$F_{EL}(\cdot) = \sum_{j=1}^{m} \omega_j F_{EL_j}(\cdot)$$
 (25)

The problem of evaluating the test quality of a given test procedure is solved, therefore, by evaluating  $F_{EL_j}(\cdot)$ ,  $j=1,2,\ldots$ , musing previous methods and then substituting to (25) to find the overall detection probability.

The problem of proposing an optimal I.F.D.E. (deterministic or random) is solved by similar considerations. In the case of deterministic testing, you list all the input sequences of length n which have a non-zero probability of detecting at least one member of F by using the testing graph. After this task has been fulfilled you choose one such sequence which maximizes (25), and this is an optimal deterministic I.F.D.E. In the case of random testing, the optimal assignment of input probabilities is obtained by maximizing the function

$$\Psi = \sum_{j=1}^{m} \omega_{j} F_{EL_{j}}(n) + \mu(\sum_{i=0}^{J-1} \pi_{i} - 1), \quad \pi_{i} \geq 0 \quad \forall i$$
 (26)

where  $F_{\text{EL}_{,j}}$  (n),  $j=1,2,\ldots$ , m are calculated from (22).

It is important to note that although we showed a method of obtaining the optimal time invariant random I.F.D.E., the test quality achieved by any random test procedure (time variant or invariant) is never better than the one achieved by the optimal deterministic I.F.D.E. This is described in the following theorem.

#### Theorem:

There always exists a deterministic I.F.D.E. which has at most the same escape probability as any proposed random I.F.D.E.

<u>Proof:</u> Similar to the proof described in [ Savir, 1978 ] for combinational circuits.

As a consequence, the random test procedure should be used only when the decrease in test quality is not substantial and when its engineering implementation is cheap and simple.

#### 7. SUMMARY AND CONCLUSIONS

The problem of detecting I.F.'sin sequential circuits is described in this paper. It is shown that the behavior of a sequential machine subject to an I.F. can be studied by means of the theory of probabilistic automata. One way of evaluating the test quality of any proposed test procedure is by defining a product machine from the faulty and faultfree versions of the sequential C.U.T., which has an output 1 whenever the first output error occurs and 0 otherwise. Since, for the purpose of testing, only the first output error is important, consideration of later output errors is avoided by sending the product machine to a sink state (absorbing state) after the first output error is observed. The problem of proposing an optimal deterministic I.F.D.E. is solved by using a testing graph to list all the input sequences which have a non-zero probability of detection, and then choosing one such sequence which attains the maximum value. The problem of proposing an optimal random I.F.D.E. is solved by finding the optimal assignment of input probabilities.

Although the proposal of any optimal I.F.D.E. is conceptually simple, its application to real size circuits is sometimes prohibited because of the amount of computation involved. It is necessary therefore to find other methods, probably not optimal, which will be easier to implement. Currently, the efficiency of compact testing of I.F,'sin sequential circuits is being studied.

#### REFERENCES

| [ Ball and Hardie, 1969 ]        | Ball, M. and F. Hardie, "Effects and Detection of Intermittent Failures in Digital Systems," 1969 Fall Joint Computer Conf., AFIPS Conf., Proc., Vol. 35, pp. 329-335, AFIPS Press, Montvale, New Jersey, 1969. |
|----------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [ Breuer, 1973 ]                 | Breuer, M.A., "Testing for Intermittent Faults in Digital Circuits," <u>IEEE</u> <u>Trans. on Computers</u> , Vol. C-22, pp. 241-246, March 1973.                                                               |
| [ Booth, 1967 ]                  | Booth, T.L., Sequential Machines and                                                                                                                                                                            |
| [ Kamal and Page, 1974 ]         | Automata Theory, New York: Wiley 1967. Kamal, S. and C.V. Page, "Intermittent Faults: A Model and Detection Procedure," IEEE Trans. on Computers, Vol. C-23, pp. 713-719, July 1974.                            |
| [ Karlin and Taylor, 1975 ]      | Karlin, S. and H. M. Taylor, <u>A First</u> <u>Course in Stochastic Processes</u> , Second <u>Edition</u> , Academic Press, New York, 1975.                                                                     |
| [ McCluskey, 1965 ]              | McCluskey, E.J., <u>Introduction to the Theory of Switching Circuits</u> , McGraw Hill, 1965.                                                                                                                   |
| [ Poage and McCluskey, 1964 ]    | Poage, J. F., and E. J. McCluskey, "Derivation of Optimum Test Sequences for Sequential Machines," in 5th Annu. Symposium on Switching Theory and Logical Design, 1964.                                         |
| [ Savir, 1977 ]                  | Savir, J., "Optimal Random Testing of Single Intermittent Failures in Combinational Circuits," Digest, 1977 Int'l Symposium on Fault Tolerant Computing pp. 180-185, Los Angeles, Californid, June 1977.        |
| [ Savi r, 1978 ]                 | Savir, J., "Model and Random-Testing<br>Properties of Intermittent Faults in<br>Combinational Circuits," to be published<br>in the <u>Journal of Design Automation</u>                                          |
| [ Shedletsky and McCluskey, 1976 | and Fault Tolerant Computing, July 1978.  ] Shedletsky, J.J., and E.J. McCluskey,  "The Error Latency of a Fault in Sequential Digital Circuit," IEEE Trans. on Computers, Vol. C-25, pp. 655-659, June 1976.   |