Techniques of Obtaining the Quality of the System LSI for the Engine Management ECU

Yasushi Tani  Tomohide Kasame  Kazuhiro Komatsu

Masahiko Fujimoto
1. Introduction
In recent years, automobiles have been equipped with various types of ECUs (electronic control units), which use a large number of ICs. The engine management ECU is no exception: its system LSI (large-scale integrated circuit) bears principal control. Accordingly, to install defect-free ICs into an ECU and ensure proper engine operation, attaining IC quality is extremely important.

For that reason, it is important not only to prevent the manufacture of defective ICs but also to reject defective products during IC final tests. We have focused on rejecting defective ICs in advance during final tests by building in quality from the design stage. However, the increasing complexity of testing that accompanies large-scale integration is a problem.

This report will explain actions that have been taken starting from the design stage to improve the quality of a million-transistor large-scale integrated circuit which we have recently developed.

2. Overview of system LSI for engine management ECU
As shown in Fig. 1, the system LSI for the engine management ECU (hereinafter referred to as "the subject IC") includes a 16-bit CPU, memory (ROM and RAM), standard resources, and user-specified modules for an engine management system.

The user-specified modules consist of CMOS digital circuits and high-accuracy CMOS analog circuits.

2.1 Current problems
As shown in Fig. 2, engine management ICs are growing in scale year by year. Since the number of corresponding checks is also increasing, testing is becoming more complicated. The scale of the subject IC is in the million-transistor range, which is about twenty times that of ICs developed three years ago.

And since the subject IC has a built-in micro-controller, it can be used in a variety of ways according to its software, making it difficult to implement tests that take into consideration all usage conditions.

Based on the aforementioned points, in order to prevent defective ICs from making their way into the market, it is important to confirm the coverage of testing for IC-related items that must be checked.

3. Fault coverage and defect level

The probability that an IC manufacturer will ship a defective IC is called the defect level. The defect level is normally expressed in terms of ppm (pieces per million or parts per million). The actual defect level, however, cannot be verified unless a certain amount of product is mass produced (such as several million units). Thus, to implement a measure from the design stage, the after-production defect level must be predicted ahead of time.

3.1 Fault coverage

One of the indicators used to predict the defect level is the fault coverage. The fault coverage numerically expresses the proportion of faults that can be detected by a function test, and it can be verified during the design stage. The term "fault," as used here, refers to the condition in which a terminal of a cell inside the IC is fixed at either "Hi" or "Low" (a stuck-at fault).
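
As a minimal sketch of how such a figure is computed, the following Python example enumerates the single stuck-at faults of a tiny, hypothetical two-gate circuit (not the subject IC) and counts how many of them a given set of patterns detects. The circuit, the node names, and the patterns are all illustrative assumptions.

```python
# Hypothetical circuit: y = (a AND b) OR c, with internal node n1 = a AND b.
NODES = ["a", "b", "c", "n1", "y"]

def simulate(pattern, fault=None):
    """Return output y for an input pattern (a, b, c).

    fault is None (fault-free) or a tuple (node, stuck_value).
    """
    def fix(node, value):
        # Force the node to its stuck value if it is the faulty one.
        if fault is not None and fault[0] == node:
            return fault[1]
        return value

    a = fix("a", pattern[0])
    b = fix("b", pattern[1])
    c = fix("c", pattern[2])
    n1 = fix("n1", a & b)
    return fix("y", n1 | c)

def fault_coverage(patterns):
    """Fraction of single stuck-at faults detected by the patterns."""
    faults = [(node, v) for node in NODES for v in (0, 1)]
    detected = [
        f for f in faults
        # A fault is detected if any pattern makes the output differ
        # from the fault-free output.
        if any(simulate(p, f) != simulate(p) for p in patterns)
    ]
    return len(detected) / len(faults)

weak = [(1, 1, 0)]
full = [(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]
print(f"{fault_coverage(weak):.0%}")   # 40%
print(f"{fault_coverage(full):.0%}")   # 100%
```

A fault counts as detected only when its effect reaches the output, which is why a single pattern covers few faults here.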

3.2 Relationship between defect level and fault coverage

The relationship between the defect level and the fault coverage is generally expressed by the following model formula. DL represents the defect level, Y the yield, and T the fault coverage.

\[ DL = 1 - Y^{(1 - T)} \]

Fig. 4 shows a graph of this model formula. If \( T = 0\% \), then \( DL = 1 - Y \); that is, the defect level equals the manufacturing defect rate. If \( T = 100\% \), the defect level is zero.

From Fig. 4, if the fault coverage is 95%, a yield of at least 99.98% would be needed to prevent defective products from being released during our annual production.
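
The model formula can be evaluated directly. The short Python sketch below computes the defect level for the values quoted above and also inverts the formula to give the yield required for a target defect level. The function names are illustrative, and the model is the uncorrected formula, so the figures are only indicative.

```python
def defect_level(yield_, coverage):
    """DL = 1 - Y^(1 - T); yield_ and coverage are fractions in [0, 1]."""
    return 1.0 - yield_ ** (1.0 - coverage)

def required_yield(target_dl, coverage):
    """Invert the model: Y = (1 - DL)^(1 / (1 - T)), valid for T < 1."""
    return (1.0 - target_dl) ** (1.0 / (1.0 - coverage))

dl = defect_level(0.9998, 0.95)
print(f"DL = {dl * 1e6:.1f} ppm")   # DL = 10.0 ppm
```

Under this model, a fault coverage of 95% with a yield of 99.98% corresponds to a defect level of about 10 ppm.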

3.3 Examination of model formula

The aforementioned model formula is based on two assumptions. The first assumption is that each fault occurs according to independent probabilities, and the second assumption is that each part has an equal probability of a fault occurring.

With the former assumption, only one fault is assumed to occur at a time. It does not assume multiple faults because an inappropriate signal caused by the fault is assumed to pass through another normal circuit and is output to an IC terminal.

With the latter assumption, the probability of a fault occurring is assumed to be equal across the various fabrication processes and independent of location on the IC chip. In practice, however, the usual cause of a fault is assumed to be dust that enters primarily during the fabrication process that creates the dielectric film between the aluminum layers. For this reason, the fault occurrence probability is not the same for each process. Furthermore, dust is thought to be a frequent cause of shorts between aluminum patterns, so faults other than stuck-at faults, such as short circuits between layers or patterns, are not predicted by the fault coverage. And compared to faults that occur within the area of a single transistor (minor faults in the gate oxide film or GDS region, for instance), broken aluminum patterns and shorts have a high potential to affect multiple elements. These points also bear on the former assumption. Moreover, faults that occur due to process-related problems are rejected by means other than test patterns (including process monitor elements), and the actual yield is higher than the product yield.

Based on the above explanation, a correction such as that shown below can be assumed to be necessary.

\[ DL = 1 - Y^{(1 - a T)} \]

However, the correction coefficients in this formula have not yet been calculated.

3.4 Target setting

Thus, a decision was made to set a target fault coverage based on the quality record of micro-controller ICs, a conventional product. This product is fabricated using the same design rule and process as the subject IC, so it was determined that the defect level of the newly developed IC could be predicted from that quality record.

4. Improvement of fault coverage

As previously explained, the fault coverage numerically expresses the proportion of faults that can be detected during a function test. Accordingly, the design of function test patterns is the principal activity for improving this rate.

4.1 Work flow

Fig. 5 shows a typical work flow. First, the fault coverage is calculated using test patterns that are based on specifications that are also used for function verification. Then, if the fault coverage does not achieve the target, undetected faults are analyzed and test patterns are again created. These tasks are then repeated.
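
The loop described above can be sketched in Python as follows. This is only an illustration under assumptions: the table `detects`, which maps each pattern to the faults it detects, stands in for the output of a fault simulator, and choosing from a fixed list of candidate patterns stands in for the manual work of analyzing undetected faults and writing new patterns. All names and data are hypothetical.

```python
def improve_coverage(all_faults, candidate_patterns, detects, target):
    """Add patterns greedily until the target fault coverage is met.

    detects maps a pattern name to the set of faults it detects
    (in practice this would come from a fault simulator).
    """
    selected, covered = [], set()
    remaining = list(candidate_patterns)
    while len(covered) / len(all_faults) < target and remaining:
        undetected = all_faults - covered          # analyze undetected faults
        best = max(remaining, key=lambda p: len(detects[p] & undetected))
        if not detects[best] & undetected:
            break                                  # no pattern helps any more
        selected.append(best)
        covered |= detects[best]
        remaining.remove(best)
    return selected, len(covered) / len(all_faults)

faults = set(range(10))
detects = {
    "spec_patterns": {0, 1, 2, 3, 4, 5},   # patterns reused from function verification
    "extra_1": {4, 5, 6, 7},
    "extra_2": {7, 8},
    "extra_3": {0, 1},
}
selected, coverage = improve_coverage(faults, list(detects), detects, target=0.9)
print(selected, coverage)   # ['spec_patterns', 'extra_1', 'extra_2'] 0.9
```

Each added pattern contributes fewer newly detected faults than the one before it, which illustrates why improvement slows as the coverage rises.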

4.2 Improvement in fault coverage

When the fault coverage was initially calculated, the subject IC did not achieve the target. Thus, activities were initiated; and after two months, improvements were made until the target was nearly achieved. As improvements in the fault coverage were made, however, it became more difficult to cover undetected faults. In fact, the rate of improvement made during the second month was only about one seventh of that achieved during the first month.

5. IDDQ test

Since the pace of fault coverage improvement had slowed, a decision was made to add the IDDQ test, which other companies have recently begun to adopt.
During an IDDQ test, operation is paused partway through test pattern input and the circuit's quiescent supply current (the "dark current") is measured. The subject IC uses a CMOS process, so almost no circuit current flows in the quiescent state; when there is an abnormality such as a fault in the IC, however, a large circuit current flows.
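
The pass/fail decision described above can be sketched as a simple threshold check. In the Python example below, the current values, the limit, and the function name are hypothetical; actual limits depend on the process and the device.

```python
def iddq_screen(measurements_ua, limit_ua):
    """Return (passed, failing_points) for one device.

    measurements_ua: quiescent supply current in microamps, one value per
    pause point in the test pattern. limit_ua: pass/fail limit (assumed).
    """
    failing = [i for i, current in enumerate(measurements_ua) if current > limit_ua]
    return (not failing, failing)

good_device = [0.8, 1.1, 0.9, 1.0]
faulty_device = [0.9, 1.0, 350.0, 1.2]   # fault activated at pause point 2
print(iddq_screen(good_device, limit_ua=10.0))     # (True, [])
print(iddq_screen(faulty_device, limit_ua=10.0))   # (False, [2])
```

A device is rejected if the current exceeds the limit at any pause point, because a given fault draws excess current only in the states that activate it.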

5.1 Faults that can be rejected

Signals input by the test pattern turn the IC's internal transistors on and off, establishing the electric potential of each node. When there is a fault, feed-through current flows. Fig. 6 shows examples of faults that can be rejected by the IDDQ test. Examples (a) and (b) are VCC and GND short faults, respectively, which are stuck-at faults. Example (c) is an open patterning fault. Example (d) is a short between patterns or layers, but detecting it as a fault requires setting the electric potential of the nodes.

5.2 Observation of IDDQ test

During a function test, the internal transistors are operated, and the resulting signals must then be propagated to an IC output terminal. In contrast, during an IDDQ test, the internal transistors are operated, the IC's dark current is measured, and a determination is made based on the amount of that dark current; thus, this test is superior with respect to observability.

A disadvantage of the IDDQ test, however, is that it takes longer to perform than a function test because the dark current is measured, which in turn affects the cost of such tests if many are conducted.

5.3 Application to the subject IC

As previously mentioned, the IDDQ test is also effective at rejecting stuck-at faults; thus, this test was implemented for undetected faults that could not be covered during a function test. Results showed that this test could cover 60% of the undetected faults.

6. Effective fault coverage

From the preceding description, it is clear that the effective fault coverage, which combines the subject IC's function test and IDDQ test, achieved the target.

Thus, the target defect level was achieved.
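
The combination can be written out as a short worked equation. This is a sketch under assumptions: \( T_f \) denotes the fault coverage of the function test alone, and \( r \) denotes the fraction of the faults left undetected by the function test that the IDDQ test covers (60% in Section 5.3). The symbols \( T_f \), \( r \), and \( T_{eff} \) do not appear in the original, and the value \( T_f = 95\% \) is purely illustrative.

\[ T_{eff} = T_f + r \, (1 - T_f) \]

\[ T_{eff} = 0.95 + 0.6 \times (1 - 0.95) = 0.98 \]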

7. User mode test

There were cases, however, in which the target defect level was not achieved even though the target fault coverage was achieved for other micro-controller ICs manufactured under the same fabrication process. These cases occurred as the result of certain special faults.

7.1 Special fault

In a certain case, an IC was mounted onto an ECU and the ECU then failed to operate normally. It became clear that, for some reason, the CPU inside the IC had lost control. Fig. 10 shows an analysis of the IC's internal operations at that time. The CPU exchanges data with each resource while executing branch instructions and interrupts according to the program stored in the ROM.

This IC, however, was not rejected during the normal final test. As shown in Fig. 11, the final test included a ROM test, a CPU test, and various resource tests, but no particular problem was detected. During the ROM test, the entire ROM is accessed in address order and the ROM data is read out sequentially. During the CPU test, commands are input directly to the CPU from the outside and then processed. During the resource tests, data is exchanged directly between the outside and each resource in order to test its functions.

Thus, although every block is tested individually, some fault modes occur only when certain command combinations are executed as part of an actual program. Because of this fact, some sort of corrective action had to be taken.

7.2 User mode test

Thus, it became necessary to conduct a final test that included the same operations that occur with an installed ECU. For this reason, input signals were added to reproduce the conditions found in a vehicle, and the test was conducted under actual operating conditions via internal ROM.

7.3 Effect

We retested micro-controller ICs that had been manufactured under the same design rule and fabrication process as the subject IC and that had been removed from ECUs returned as defective from our plant line or from the market.

Although they had passed the final test, 70% of them were determined to be defective when a user mode test was conducted. These results indicate that the target defect level of the subject IC can be expected to be achieved and that its defects from test omissions can be expected to be zero. This shows that the user mode test can be fully expected to be effective for the subject IC.

8. Conclusion

As previously described, through the addition of the function test and IDDQ test, the effective fault coverage can be expected to surpass the target. And with the addition of the user mode test, defective products arising from test omissions can be expected to be zero.

One issue for the future is the growing number of test design man-hours that accompanies improvement in the fault coverage. With future process refinement, introducing design techniques that simplify testing will have less impact on chip size. Such techniques include scan design, which allows a circuit's internal registers to be easily controlled and observed.
Profiles of Writers

Yasushi Tani
Joined company in 1992. Since that time, has engaged in development of automobile LSI circuits. Is currently involved in L3 project of LSI Circuit Development Department.

Tomohide Kasame
Joined company in 1985. Since that time, has engaged in development of automobile LSI circuits. Is currently involved in L3 project of LSI Circuit Development Department.

Kazuhiro Komatsu
Joined company in 1992. Since that time, has engaged in development of automobile LSI circuits. Is currently involved in L3 project of LSI Circuit Development Department.

Masahiko Fujimoto
Joined company in 1982. Since that time, has engaged in development of motoronics equipment. Is responsible for L3 project of LSI Circuit Development Department as a project manager.