
While possibly equivalent in theory, SoC and SiP differ greatly in which technologies are more easily integrated in package versus on chip, and each has very different test implications. Recent advancements in assembly and packaging technologies, coupled with the difficulty of optimizing a single wafer fabrication process for different core semiconductor technologies, have provided a lot of momentum for SiP, causing some to forecast that SiP will be dominant. Wafer fabrication process improvements and design/DFT needs could yet push SoC to the forefront, or hybrids of the two could emerge. One thing is clear: integration is a trend that will continue. The only questions are how fast and in what forms. The next two sections discuss the test challenges and implications associated with SoC and SiP, respectively.

=== 1.1.1    System on a Chip ===

An SoC design mainly consists of multiple IP cores, each an individual design block whose design, embedded test solution, and interface to other IP cores are encapsulated in a design database. There are various types of IP cores (logic, memory, analog, high-speed IO interfaces, RF, etc.) using different technologies. This assortment requires a diversity of solutions to test dies of the specific technologies corresponding to these embedded cores. Thus SoC test implies a highly structured DFT infrastructure to observe and control individual core test solutions. SoC test must include the appropriate combination of these solutions associated with individual cores, core test access, and full-chip testing that targets the interfaces between the cores and the top-level glue logic (i.e., logic not placed within a core) in addition to what is within each core instance. Effective hierarchical or parallel approaches and scan pattern compression techniques will be required to bring the overall quality and cost of the SoC to a level acceptable to customers.

At the same time, SoC test technology must improve to handle the progression of design technologies accelerated by evolving applications. The technologies and the requirements of the DFT design (the design intent) are addressed in the Design Chapter; readers should review the roadmap and the potential solutions that reflect these design intents. For example, low-power design methodologies, which improve chip performance, are widely adopted in various current SoCs. However, it is not easy to test such an SoC without deeply understanding its functional behaviors and physical structures. As a result, conventional DFT that focuses only on the static logic structure is no longer sufficient, and evolution to tackle this issue is strongly required.

The quantitative trends and requirements of a consumer logic chip are shown in Table TST8, given later in the Logic section, compared with an MPU chip. Table TST4 introduces the guidelines for DFT design and the requirements for EDA tools.

Spreadsheet Table TST4 – DFT Requirements

==== 1.1.1.1      Requirements for Logic Cores ====

Sophisticated DFT methods such as random pattern logic BIST or compressed deterministic pattern test are required to reduce the large amount of test data for logic cores. The adopted method should weigh the pros and cons regarding DFT area investment, design rule restrictions, and associated ATE cost. DFT area mainly consists of the test controllers, compression logic, core wrappers, and test points, which can be kept constant over time by using a hierarchical design approach.

Both SoC and MPU devices have an increasing amount of digital logic on the devices.  Table TST7, given later, shows a common view of the DFT techniques which are expected to be used moving forward in an effort to cover the most likely faults (as modeled by the EDA systems) while attempting to keep the test costs low by effectively managing the test data volume.

There are four basic approaches in use for scan test generation: 

1.      The EDA tools can consider the circuit in its entirety and generate what’s called a “flat” test without leveraging the hierarchical design elements or including pattern compression techniques. Virtually no one does this anymore, but it is useful for comparison with the more appropriate approaches briefly described below.

2.      The EDA tools can consider the hierarchical design elements to achieve an on-die parallel test setup. Parallel test would be applied to instances of wrapped cores to enable multiple instances to be tested in parallel.

3.      The EDA tools can embed compression and decompression circuitry around the scan chains, allowing many times more chains internally without increasing ATE scan pin resources, so that less data must be stored on the ATE for stimulus or output comparison purposes.

4.      The EDA tools can implement a combination of 2 and 3 for a compressed hierarchical approach. This would involve cores being wrapped for isolation and including compression within the cores. Further compression may be achieved by testing multiple instances with the same set of scan-in pins (scan pin sharing), allowing multiple instances of cores to be tested in parallel. The test data/scan outputs from each core instance may be observed independently or further compressed together and sent to a common set of chip scan-out pins, possibly resulting in more chip scan pin sharing.
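The relative data-volume impact of these approaches can be sketched with a toy model (all flop, pattern, and compression numbers below are hypothetical, not roadmap data):

```python
# Illustrative model of stored ATE test data volume under the approaches above.

def flat_volume(patterns, flops):
    """Approach 1: flat ATPG -- every flop's stimulus and response stored on ATE."""
    return patterns * flops * 2  # bits in + bits out

def compressed_volume(patterns, flops, compression):
    """Approach 3: on-die (de)compression cuts stored data by ~compression x."""
    return flat_volume(patterns, flops) // compression

def shared_core_volume(patterns, flops_per_core, instances, compression):
    """Approach 4: identical wrapped cores share scan-in pins, so the compressed
    stimulus is stored once and serves all 'instances' copies of the core."""
    return flat_volume(patterns, flops_per_core) // compression

flat = flat_volume(10_000, 2_000_000)
comp = compressed_volume(10_000, 2_000_000, 100)
shared = shared_core_volume(10_000, 500_000, 4, 100)
print(flat // comp)    # ~100x reduction from compression alone
print(comp // shared)  # further ~4x from sharing pins across 4 identical cores
```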

The approach used to apply tests to embedded cores will have a large impact on test time and perhaps also test data volume. One approach used in the past is to test a core in isolation and route its stimulus and expected responses up to the SoC pins so as to avoid having to run ATPG for the core at the SoC level. This saves CPU time for ATPG but fails to reduce test time for the SoC. A more effective approach when using test compression is to test multiple cores in parallel rather than in complete isolation from one another. Thus, while test compression may be used inside cores, it may also be used above the cores to send the scan stimulus to multiple cores in parallel and to compact the output from several cores before sending it off chip.

The tradeoff between test quality and test cost is a great concern. ATPG should support not only stuck-at and transition faults but also small delay and other defect-based faults to achieve a high level of test quality. Test pattern count will increase over the roadmap as logic transistor count increases. To avoid rising test cost, the test application time per gate should be reduced over the roadmap; therefore various approaches, such as test pattern reduction, scan chain length reduction, and scalable speed-up of the scan shift frequency, should be investigated. However, accelerating the scan shift speed may increase power consumption during scan shift cycles, making the test power problem more serious, so DFT and ATPG approaches that reduce power consumption during scan shift cycles are required. There is also the important issue of excessive power consumption during the capture cycle. Several approaches to relax this issue have been proposed, but most of them increase test pattern counts and consequently have an intolerable impact on test application time. Low capture power test approaches that minimize the increase in test pattern counts are also required. The impact on test data volume from these low-power scan sequences is shown as a 20% test data volume premium in the low-power rows. This will be too optimistic where very low (e.g., less than 15%) switching activity is required, since that could easily double the pattern count for the same coverage.

Another problem caused by the increase of test patterns is the volume of test data. Even assuming tester memory size doubles every three years, high test data compression ratios will be required in the near future; therefore, test data reduction will remain a serious issue that must be tackled. One possible solution for simultaneously reducing test application time and test data volume is simultaneous test of repeatedly used IP cores that can share a common set of chip scan pins. By broadcasting the same scan-in stimulus to all such core instances, we reduce the bandwidth of data sent from the ATE onto the chip and need less storage for that data on the ATE. Observing the outputs from each instance independently can aid in diagnosing failures, but compressing the core instance outputs together and observing them at a common set of chip pins further increases the effective compression beyond what is achieved within each core.
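The tension between data growth and tester memory growth can be made concrete with a small scaling sketch. The three-year memory doubling comes from the assumption above; the two-year data doubling (tracking logic transistor count) is an added illustrative assumption:

```python
# Hypothetical scaling sketch: if test data volume doubles every two years
# while ATE vector memory doubles only every three years, the compression
# ratio that DFT must supply grows steadily over the roadmap.

def required_compression(years, data_double=2.0, memory_double=3.0):
    data = 2 ** (years / data_double)      # relative test data volume
    memory = 2 ** (years / memory_double)  # relative ATE vector memory
    return data / memory

for y in (3, 6, 9, 12):
    print(y, round(required_compression(y), 2))
# After 12 years the gap alone demands 4x more compression than today.
```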

The increase of power domains may require some additional test patterns. However, the increase will be linear in the number of power domains, so it will not have a severe impact on overall test pattern counts. Nevertheless, the increase of power domains or restrictions on test power may prevent maximum simultaneous test of identical IP cores. The impact of this effect may be investigated for future editions of the roadmap.

The issue of power consumption during test mentioned above is one cause of the increase in test patterns, which will increase test data volume. The requirements on test data reduction therefore also take this issue into account.


Figure TST10 – DFT Compression Factors (Flat with No Compression = 1)

Figure TST12 shows the impact of hierarchy and compression DFT techniques on the problem of test data increase, along with the level of compression anticipated. Current compression technologies mainly exploit the fact that each test vector has many ‘X-values’ (don’t-care bits that do not contribute to the increase of test coverage), and factors of more than 100x compression are often achieved. However, even 500x compression will not be enough for SoC, as shown in Table TST9; therefore, more sophisticated technologies will be required in the future. Similarity among the test vectors applied across scan chains offers a chance to achieve higher compression ratios, and similarity of test vectors across time may allow further compression. Thus, exploiting such multi-dimensional similarity is a potential solution.
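Why X-density matters can be seen from a back-of-the-envelope bound: if an encoding must store roughly only the care bits plus some overhead, the attainable compression is about the reciprocal of the care-bit density. The densities and the 20% encoding overhead below are assumed values, not measured ones:

```python
# Sketch of the care-bit bound on X-value-based test data compression.

def attainable_compression(care_density, overhead=0.2):
    """Approximate compression when only care bits (plus a fractional
    encoding overhead) must be stored; the rest are don't-cares (X)."""
    stored_fraction = care_density * (1 + overhead)
    return 1.0 / stored_fraction

print(round(attainable_compression(0.02)))   # ~42x at 2% care bits
print(round(attainable_compression(0.002)))  # ~417x at 0.2% care bits
```

This is why compression ratios beyond ~100x require ever sparser care bits, and why additional similarity across chains and across time becomes attractive once care-bit density stops falling.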

Note: the SOC-CP data volume and compression factors do not change smoothly over time. Scan pins are shared to reduce data volume, but because these devices have only a few scan pins available, the number of cores that can be tested in parallel rises with the shared-pin percentage like a step function due to rounding. When more cores can be tested in parallel, data volume drops and compression factors improve; since not every change in the pin-sharing percentage changes the number of cores testable in parallel, some years see less improvement than others.

In order to map this anticipated test data volume to tester and test time requirements one must take into account the number of externally available scan chains and the data rate used to clock the test data into and out of the device.  Estimates for these important parameters are shown in the SOC and MPU sections of Table TST8, which is given later.    Since these parameters may vary on a part by part basis, the resulting data will need to be adjusted based on the approach taken on one part versus another:

·        Designing more scan chains into a device results in more parallel test efficiency and a proportionally faster test time and less memory per pin in the test system. This assumes the scan chain lengths are proportionately reduced.

·        Clocking the scan chains at a faster speed also results in a faster test time but doesn’t reduce the pattern memory requirements of the ATE.
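The two bullets above follow from a simple shift-dominated test-time model (an assumed formula, with hypothetical flop, pattern, and frequency values):

```python
# Rough scan test-time model: shift cycles dominate, so
# time ~ patterns * longest_chain_length / shift_frequency.

def scan_test_time(patterns, flops, chains, shift_mhz):
    chain_length = -(-flops // chains)   # ceiling division: longest chain
    cycles = patterns * chain_length
    return cycles / (shift_mhz * 1e6)    # seconds

base = scan_test_time(20_000, 4_000_000, 32, 50)
more_chains = scan_test_time(20_000, 4_000_000, 64, 50)    # halves time AND memory per pin
faster_clock = scan_test_time(20_000, 4_000_000, 32, 100)  # halves time, same ATE memory
print(round(base, 2), round(more_chains, 2), round(faster_clock, 2))
```

Doubling the chain count and doubling the shift clock give the same test time here, but only the former reduces the pattern depth each ATE pin must store.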

The other question when looking at the ATE memory requirements is which pattern compression technique is chosen for use on a given device.   This question is impacted by many parameters including device size, personal preference and time to market constraints.   As such, the analysis in Table TST8 shows the minimum patterns per pin necessary to test the most complex devices. Thanks to the usage of more elaborate pattern generation techniques the data suggests that the minimum pattern requirement will only grow by 2x to 3x over the roadmap period.

The test time needed to drive and receive this much test data volume is affected by the data rate in use. As cost-effective, higher-speed tester capabilities are deployed, these data rates will be used to speed up the tests and reduce the test time per device. The analysis calculates this impact and suggests that test times will drop over time due to these faster scan shifting speeds. It should be noted that keeping the test application time per gate constant does not immediately mean a stable test application cost. Therefore approaches to reduce ATE cost, such as increasing the number of parallel sites, using low-cost ATE, or speeding up the test, are also required to establish a scalable reduction of test cost per transistor.

Concurrent parallel test in the core hierarchy has the potential to reduce test time, and ATPG/DFT-level reduction technologies should be developed in the future. “Test per clock” refers to a non-scan test methodology quite different from scan test: a test is applied at each clock pulse, and no scan shift operation is needed. Some research exists on this methodology, but more is required before industrial use.

High-level design languages are being used to improve design efficiency, and it is preferable that DFT is applied at the high-level design phase. DFT design rule checking is already available to some extent. Testability analysis, estimation of fault coverage, and DFT synthesis in high-level design that includes non-scan approaches are required in the next stage. Yield loss is a concern. Because test patterns excite all possible faults on the DUT, they cause excessive switching activity that does not occur in normal functional operation. The resulting excessive power consumption can make functional operation unstable and eventually make the test fail, causing over-kill. In addition, signal integrity issues due to resistive drop or crosstalk can make functional operation unstable or marginal and eventually cause failures. Therefore, predictability and control of power consumption and noise during DFT design are required. The leakage current of the test circuitry itself should also be considered as part of the power consumption.

The discussion so far in this section has focused on automatically generated scan-based testing requirements. Functional test techniques continue to be broadly deployed to complement the scan-based techniques and confirm the device’s suitability for the desired end-use application. Additionally, more and more memory arrays are being embedded inside both MPU and SOC devices.

==== 1.1.1.2      Requirements for Embedded Memory Cores ====

As process technology advances, and due to some special application needs, both the number of memory instances and the total capacity of memory bits increase, which will drive an increase in area investment for BIST, repair, and diagnostic circuitry for memories. As the density and operating frequency of memory cores grow, the following memory DFT technologies are implemented on SOCs and become factors in the area investment increase:

·        To cover new types of defects that appear in the advanced process technologies, dedicated optimal algorithms must be applied for a given memory design and defect set. In some cases, a highly programmable BIST that enables flexible composition of the testing algorithms is adopted.

·        Practical embedded repair technologies, such as built-in redundancy allocation (BIRA), which analyzes the BIST results and allocates redundancy elements, and built-in self-repair (BISR), which performs the actual reconfiguration (hard-repairing) on-chip, are implemented for yield improvement.

·        On-line acquisition of failure information is essential for yield learning. A built-in self-diagnostic (BISD) technology distinguishes failure types such as bit, row, and column failures, or combinations of them, on-chip without dumping a large quantity of test results, and passes the results to the ATE for yield learning. The testing algorithm programmability mentioned above must become more sophisticated to contribute to diagnostic resolution enhancement: it must flexibly combine algorithms with test data and conditions, and provide a diagnostics-only test pattern generation capability that is not used in volume production testing.

·        All the above features need to be implemented in a compact size, and operate at the system frequency.

The embedded memory test, repair, and diagnostic logic size was estimated at up to 35 K gates per million bits in 2013. This includes BIST, BIRA, BISR, and BISD logic, but not the repair programming devices such as optical or electrical fuses. The ratio of area investment to the number of memory bits should not increase over the next decade. This requirement is not easily achievable; in particular, as the memory redundancy architecture becomes more complex, it will be difficult to implement repair analysis with a small amount of logic, so a breakthrough in BIST, repair, and diagnostic architecture is required. Dividing the BIST, repair, and diagnostic logic of memory cores into a high-speed portion and a low-speed portion might reduce the area investment and the turn-around time for timing closure work. The high-speed portion, consisting of counters and data comparators, can be embedded in the memory cores, which relaxes the restrictions on system-speed operation in test mode. The low-speed portion, consisting of logic for scheduling, pattern programming, etc., can be either designed to operate at low speed or shared by multiple memory cores, reducing area investment and easing logical and physical design work. Modern SoCs very often contain many small memory cores, which require more DFT gates than a single memory core of the same total bit count. Consolidating memory cores into a smaller number of memory blocks can therefore reduce memory DFT area investment drastically. Testability-aware high-level synthesis should realize this in the memory cell allocation process while considering the parallelism of memory access in system operation.
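As a rough illustration of the consolidation argument, using the 35 K gates-per-Mbit figure quoted above plus an assumed fixed per-instance overhead (the 2,000-gate overhead and the instance counts are invented for the sketch):

```python
# Illustrative memory DFT area model: per-bit logic scales with capacity,
# while each memory instance also carries a fixed control overhead.

GATES_PER_MBIT = 35_000        # 2013 estimate quoted in the text
PER_INSTANCE_OVERHEAD = 2_000  # assumed fixed gates per memory instance

def memory_dft_gates(total_mbits, instances):
    return total_mbits * GATES_PER_MBIT + instances * PER_INSTANCE_OVERHEAD

scattered = memory_dft_gates(64, 200)    # many small memories
consolidated = memory_dft_gates(64, 20)  # same bits, fewer blocks
print(scattered - consolidated)          # gates saved by consolidation
```

Under these assumptions, consolidating 200 instances into 20 saves the per-instance overhead of 180 memories while the per-bit logic stays constant.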

==== 1.1.1.3      Requirements for Integration of SoC ====

Reuse of IP cores is the key to design efficiency. When an IP core is obtained from a third-party provider, its predefined test solution must be adopted. Many EDA tools already leverage a standard format for logic cores (for example, IEEE 1500[1]); this format must be preserved and extended to other core types, such as analog cores. The DFT-ATE interface is being standardized (for example, IEEE 1450[2]), and it should include not only test vectors but also parametric factors. An automated design and test development environment is required to construct the SOC-level test logic structure and generate tester patterns from the test design information and test data of each IP core. This environment should realize the concurrent testing described below.

Test quality of each core is now evaluated using various types of fault coverage, such as stuck-at, transition delay, or small delay fault coverage. A unified method to obtain overall test quality that integrates the test coverage of each core should be developed. Conventionally, functional test has been used to compensate for the quality limitations of structural test; however, automated tests for inter-core logic and core interfaces should be developed in the near future. SoC-level diagnosis requires a systematic hierarchical diagnosis platform for learning the limiting factors in a design or process (such as systematic defects). It should hierarchically locate the defective core, the defective part within the core, and the defective X-Y coordinate within that part. The menu of supported defect types must be enhanced to keep up with the growing population of physical defects in the latest process technologies. Smooth, standardized interfaces between design tools and ATE or failure analysis machines are also required. Volume diagnosis is required to collect consistent data across multiple products containing the same design cores; the data is stored in a database and analyzed statistically using data mining methods. The choice of data items is crucial for efficient yield learning, but it currently remains a matter of accumulated know-how.

==== 1.1.1.4      Concurrent Testing ====

For SoC test time reduction, concurrent testing, which tests multiple (non-identical) IP cores at the same time, is a promising technology. For instance, the long test time of high-speed IO can be mitigated if other tests are performed at the same time, decreasing the total test time drastically. To realize the concurrent testing concept, several items must be carefully considered in the product design process, including the number of test pins, power consumption during test, and restrictions of the test process. These items are classified as either DFT or ATE required features in Figures TST13 and TST14. IP cores should have a concurrent test capability that reduces the number of test pins (Reduced Pin Count Test: RPCT) without a test time increase, along with a DFT methodology that enables concurrent testing for various types of cores. As these requirements differ according to the core types on a chip, a standardized method of integrating RF, MEMS, and optical devices into a single SoC with conventional CMOS devices should be developed. This includes unification and standardization of the test specifications used as interfaces by IP vendors, designers, DFT engineers, and ATE engineers, combined with breakthrough analog/mixed-signal and RF DFT methodologies (e.g., efficient integrated interfaces for testing the core itself and for core test access, or wide adoption of IEEE Std 1500 and its extension to analog, etc.).

DFT and ATE must cooperatively consider concurrent testing requirements and restrictions. This may not be an easy task as there are multiple challenges to enable concurrent testing. For instance, ATE software needs to be able to perform concurrent test scheduling after analyzing the power and noise expected during testing based upon design and test information specified for each IP core and chip architecture by the designer.
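A minimal sketch of such a scheduler, assuming a greedy first-fit grouping under a peak-current budget (all core names, test times, and currents below are invented for illustration):

```python
# Hypothetical concurrent-test scheduler: pack IP-core tests into groups
# whose summed peak current stays under the tester's supply budget.

def schedule(cores, max_current):
    """cores: list of (name, test_time_s, peak_current_a).
    Returns groups of cores that may be tested concurrently."""
    groups = []
    for core in sorted(cores, key=lambda c: c[2], reverse=True):
        for group in groups:  # first-fit: reuse an existing group if it fits
            if sum(c[2] for c in group) + core[2] <= max_current:
                group.append(core)
                break
        else:                 # no group had headroom: start a new one
            groups.append([core])
    return groups

cores = [("cpu", 2.0, 3.0), ("serdes", 5.0, 1.5), ("sram", 1.0, 2.0),
         ("rf", 4.0, 1.0), ("logic", 3.0, 2.5)]
groups = schedule(cores, max_current=5.0)
serial_time = sum(t for _, t, _ in cores)                    # test one at a time
concurrent_time = sum(max(t for _, t, _ in g) for g in groups)  # groups run serially
print(len(groups), serial_time, concurrent_time)
```

A production scheduler would also honor the noise and measurement-precision restrictions listed in Figure TST13, not just power.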


Figure TST13 – Required Concurrent Testing DFT Features

{| class="wikitable"
! Feature !! Contents
|-
| External test pin sharing || Each JTAG-enabled IP core must use the 5-pin JTAG interface (TRST, TMS, TCK, TDI, TDO). Cores with non-JTAG interfaces must be able to share external test pins with other cores.
|-
| Design for concurrent testing || The test structure of an IP core must be operationally independent from that of all other IP cores.
|-
| Identification of concurrent test restrictions || Any test restrictions for each IP core must be identified to the scheduler (e.g., some IP cores cannot be tested at the same time due to noise, measurement precision, etc.).
|-
| Dynamic test configuration || Test structures/engines that can change the order of tests and the combination of simultaneous tests for each IP core.
|-
| Test data volume || The test data volume of all IP cores must fit in the tester memory.
|-
| Test scheduling || Critical information on each IP core must be available to the test scheduler: a) test time, b) peak current and average power consumption, c) test frequency.
|-
| Common core interface || The test access interface of IP cores must be common among all IP cores (e.g., IJTAG).
|-
| Defective IP identification || There must be a mechanism to identify defective IP cores prior to and during test.
|}

Figure TST14 – Required Concurrent Testing ATE Features

{| class="wikitable"
! Feature !! Contents
|-
| Numerous tester channels with frequency flexibility || A large number of test channels covering a wide range of frequencies will enable efficient concurrent testing. Test channels must provide test data such as clocks, resets, data, or control signals to many corresponding IP blocks. Testing is more flexible if channel assignments are dynamically changeable.
|-
| Mixed data type support || The capability to load/unload test data that mixes digital, analog, and high-speed I/O data is required.
|-
| IP block measurement accuracy || Measurement accuracy (e.g., for high-speed I/O tests) must be preserved in concurrent testing to match the specifications.
|-
| Test data handling efficiency || Test data loadable to each divided test channel should closely match the memory usage efficiency of non-concurrent test.
|-
| Power supply capability || A large number of capable power supply pins will enable a large number of IP blocks to be tested simultaneously.
|-
| Multi-site testing capability || The capability to perform both multi-site testing and IP-level concurrent testing at the same time will enable efficient testing.
|-
| Capable software || Automated test scheduling software that can decide test scheduling configurations while considering many constraints is required.
|}

Multi-site testing is another approach to reduce the effective test time per die or chip. The cost reduction achieved by each approach depends mainly on the number of test pins and the production volume, as shown in Figure TST15 – Comparison between Multisite and Concurrent. A larger number of test pins reduces the number of sites that can be tested in parallel, while a higher production volume gains more from the cost reduction. To estimate the benefit accurately, the cost of jigs and the engineering and design expenses should also be considered.
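The pin-count trade-off can be sketched as follows: with a fixed number of tester channels, fewer pins per device (e.g., via RPCT) allows more sites in parallel. The channel counts and the multisite efficiency factor below are assumptions, not roadmap figures:

```python
# Illustrative multisite capacity and throughput model.

def sites(tester_channels, pins_per_device):
    """How many devices fit on the tester at once."""
    return tester_channels // pins_per_device

def effective_test_time(test_time_s, n_sites, multisite_efficiency=0.9):
    """Per-device test time; efficiency < 1 models shared-resource
    serialization across sites (assumed value)."""
    return test_time_s / (n_sites * multisite_efficiency)

high_pin = sites(1024, 400)  # pin-heavy device: only 2 sites
low_pin = sites(1024, 64)    # after pin-count reduction: 16 sites
print(high_pin, low_pin)
print(round(effective_test_time(10.0, high_pin), 2),
      round(effective_test_time(10.0, low_pin), 2))
```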


Figure TST15 – Comparison between Multisite and Concurrent

{| class="wikitable"
! Pin Count !! Production Volume !! Multisite Testing Efficiency !! Concurrent Testing Efficiency
|-
| Many || Large || Medium || High
|-
| Many || Small || Low || High
|-
| Few || Large || High || Medium
|-
| Few || Small || Low || Low
|}

Considerations:

·        Cost of jigs (initial cost): probe card, test board, etc.

·        Cost of tester: pin count, power supplies, etc.

·        Reduction of test pins (RPCT)

·        Cost of chip: impact on area, etc.

·        Cost of design

==== 1.1.1.5      DFT for Low-Power Design Components ====

Low-power design is indispensable for battery-powered devices to enhance system performance and reliability. Such a design includes multiple power domains independently controlled by a PMU (Power Management Unit), and special cells used for controlling power at the physical level, such as level shifters, isolators, power switches, state retention registers, and low-power SRAMs.

However, low-power design raises new test requirements: these special cells demand dedicated test functions. For example, an isolator should be tested in both active and inactive modes to fully verify its functionality, and a state retention cell requires a specific sequence of control signals to check that it satisfies its specification. Refer to Figure TST16 – Low-power Cell Test for more low-power cells. Some defects in these special low-power cells may be detected accidentally in an ordinary test flow, but that is usually not enough to ensure all the low-power features of a design. These functions have not been covered by historical scan test, which focuses only on the structure of circuits. Therefore, full support of dedicated test functions for special low-power cells is strongly required.

Figure TST16 – Low-Power Cell Test

{| class="wikitable"
! # !! Component !! Test Contents
|-
| 1 || Isolator || Generate patterns controlling power-on/off of the power domain
|-
| 2 || Level shifter || Include the cell faults in the ATPG fault list
|-
| 3 || Retention F/F || Generate patterns to confirm saved data after the RESTORE operation
|-
| 4 || LP SRAM || Generate patterns that activate the peripheral circuit inside the macro during sleep mode and confirm cell data retention
|-
| 5 || Power switch || Generate patterns to measure IDDQ with domain power on/off
|}

=== 1.1.2    System in a Package ===

In contrast to SoC, SiP offers the option of testing components prior to integration. This is important since integrating one bad component could negate several good components in the SiP, severely limiting SiP yield. In addition, this component testing must typically be done at wafer probe test since integration occurs at assembly and packaging. A key challenge then is identifying good die prior to integration. The term “known good die,” or KGD, was coined during the mid-1990s to designate bare die that could be relied upon to exhibit the same quality and reliability as the equivalent single chip packaged device.

In most instances, testing and screening the device in a single chip package format achieves the outgoing quality and reliability figures for IC products shipping today. Wafer probe test is not generally suitable for performance sorting, reliability screening, or effective parallel contacting, so it is generally more efficient to do these tests at the package level using test and burn-in sockets, burn-in chambers, and load boards. Consequently, KGD processing implies that die will be up-binned at probe or with a subsequent insertion of die level tests and screens to meet acceptable quality and reliability targets. The key short term challenges are to determine the quality and reliability targets required in different market segments, develop cost effective tests and reliability screens that can be applied at the wafer or die level, and to develop quality and reliability methods that provide high confidence regarding quality and reliability levels achieved. Longer-term challenges will be to move to a complete self-test strategy with error detection and correction available in the end application.

==== 1.1.2.1      Stacked Die Testing and Equipment Challenges ====

Stacked Die (SiP and TSV) products can present many unique challenges to backend manufacturing flows because these products can contain die from more than one supplier. This can create problems in the areas of:

·        development of a package test strategy to realize both cost and DPM goals

·        production flows to accommodate the necessary reliability screening methods (burn-in, voltage stress, etc) of diverse product/process technologies

·        failure analysis methodologies for fault localization in order to resolve quality problems and systematic yield issues

Stacked Die test at the package level closely resembles the test problem of complex SoC products: a variety of IP, each with specialized test requirements, must be consolidated into a single consistent test flow. In the SoC case, because everything is on one chip and designed together, the various block test strategies can be consolidated via test shell wrappers, test control blocks, etc., using strategies such as those defined in the IEEE 1500 specifications. In the Stacked Die case, die suppliers may be reluctant to provide the information needed to access special test modes (sometimes considered confidential, especially for commodity memory products), and the individual die may not have the test infrastructure needed to implement test strategies commonly used for SoC.

Even in the case of SiPs that use only KGD, a certain amount of testing is necessary after final assembly to ensure that the die have been assembled properly. When final assembly includes die thinning and stacking, which can damage or change KGD, additional testing may be necessary. For fault localization, the ability to narrow the failure to a specific die, and further to a small region of that die, may require full understanding of the detailed test strategies for that die, even if this is not necessary in normal production.

In the case of reliability screens, some die may require burn-in while others may require only voltage stress. Stress conditions for one die may be inconsistent with (or even detrimental to) other die in the same package. Resolution is more difficult since the different die in a SiP product often come from entirely different processes. One solution is to avoid reliability screens after final packaging, but this can increase overall costs (for example, wafer level burn-in is typically more costly than package level burn-in).

When heterogeneous die are assembled into a multi-chip package, several test insertions on different platforms may be required to test the assembled module fully. The multiple test insertions may result in test escapes or yield fallout due to mechanical damage. New testing equipment will be required to accommodate contacting the top side of the package for package stacking. For wafer stacking technologies, better redundancy/repair technologies are needed so that the final stack can be “fixed” to achieve yield/cost targets. Design and production of electronic systems that can detect failed components and invoke redundant elements while in service is a key challenge for SiP reliability.
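The "fix via redundancy" idea above can be illustrated with a toy model. The sketch below is hypothetical code, not part of any standard flow: it uses a simplified greedy heuristic to decide whether the spare rows and columns of a memory die can cover the failing bits found after stacking. Production repair-allocation algorithms are considerably more sophisticated.

```python
# Hypothetical sketch: can a stacked memory die be "fixed" after assembly
# by allocating spare rows/columns to cover its failing bit addresses?

def repairable(fail_bits, spare_rows, spare_cols):
    """Greedy heuristic: repeatedly replace the row or column that covers
    the most remaining failures, until spares or failures run out."""
    fails = set(fail_bits)
    while fails and (spare_rows or spare_cols):
        rows, cols = {}, {}
        for r, c in fails:
            rows[r] = rows.get(r, 0) + 1
            cols[c] = cols.get(c, 0) + 1
        best_row = max(rows, key=rows.get)
        best_col = max(cols, key=cols.get)
        if spare_rows and (not spare_cols or rows[best_row] >= cols[best_col]):
            fails = {(r, c) for r, c in fails if r != best_row}
            spare_rows -= 1
        else:
            fails = {(r, c) for r, c in fails if c != best_col}
            spare_cols -= 1
    return not fails

# Two failures clustered in row 3 plus one isolated bit:
# one spare row and one spare column suffice.
assert repairable([(3, 1), (3, 7), (9, 4)], spare_rows=1, spare_cols=1)
```

The greedy choice is not optimal in general, but it captures the yield/cost trade-off: the more spares a die carries, the more post-assembly damage the stack can absorb.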

==== 1.1.2.2      Wafer Testing and Equipment Challenges/Concerns ====

The probe card technologies in common use today are less than ideal as a “final test” environment. Since much of the performance-based testing (speed-critical, RF, delay, and analog) is presently performed at the package level, a critical challenge for KGD processing is the development of cost-effective, production-worthy, reliable, and accurate methods of rapidly identifying devices that are defective, or that will fail early in an application, before those devices are transferred to the next level of assembly.

Test time for certain technologies, such as display drivers or state-of-the-art DRAM, is exceedingly long. Because of limitations in the wafer probing process, test throughput is much lower than for packaged components. The challenges for fully testing DRAM die cost-effectively at the wafer level include developing probe technology that can contact multiple die on a wafer without overlapping previously probed die or stepping off the wafer, and avoiding wasting test time and power on previously rejected and obviously non-functional die.
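As a toy illustration of that stepping problem (hypothetical code; real probe step planning must also account for probe card geometry and wafer edge effects), the sketch below groups only the untested die into multi-site touchdowns, so die already marked as rejected consume no further test time or power:

```python
# Hypothetical sketch: plan multi-site probe touchdowns over a wafer map,
# skipping die already marked as failed or passed.

PASS, FAIL, UNTESTED = "P", "F", "U"

def plan_touchdowns(wafer_map, sites_per_touchdown):
    """Group untested die positions into touchdowns of at most
    sites_per_touchdown sites each, in row-major stepping order."""
    untested = sorted(pos for pos, status in wafer_map.items()
                      if status == UNTESTED)
    return [untested[i:i + sites_per_touchdown]
            for i in range(0, len(untested), sites_per_touchdown)]

wafer = {(0, 0): UNTESTED, (0, 1): FAIL, (0, 2): UNTESTED,
         (1, 0): UNTESTED, (1, 1): PASS, (1, 2): UNTESTED}
touchdowns = plan_touchdowns(wafer, sites_per_touchdown=2)
# The four untested die are probed in two 2-site touchdowns; the
# rejected die at (0, 1) is never contacted again.
```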

==== 1.1.2.3      Wafer Test for RF Devices ====

A key challenge for applying KGD processes to RF die is the development of high-performance, fine-pitch probe cards. Because RF die are small, their pad pitch is very small: in some products it falls below 75 µm, which is at the limit of current probe technology.

To obtain good signal integrity during RF probing, a GND-Signal-GND configuration for RF signals is required. A key challenge for KGD processing of RF devices is to ensure that this GND-Signal-GND configuration is designed into the die so that, with proper probe card design and RF probing techniques, the RF path can be maintained at a controlled impedance.

==== 1.1.2.4      Reliability Screening at the Wafer or Die Level ====

Voltage and temperature over time are the known stresses for accelerating silicon latent defects to failure. These are more readily applied at the package level than at the wafer or die level. Applying these stresses prior to packaging the die is a key challenge for KGD.

Development of a cost-effective full-wafer contact technology with the process capability required for manufacturing is a key challenge for the industry. Contact process capability is a function of not only the contactor technology performance but also the burn-in stress requirements for a given product.

==== 1.1.2.5      Statistical Processing of Test Data ====

Techniques using statistical data analysis to identify subtle and latent defects are gaining favor in the industry. They are especially attractive for device types whose low shipping volumes, part number profusion, and short product lifetimes make burn-in an untenable option, and for products where intrinsic process variation makes it impossible to separate good die from defective die using traditional on-tester limits. The advantages of reliability screening at a test insertion instead of burn-in are savings in time, fixtures, equipment, and handling. The KGD implication is that screens can be performed at the wafer level with standard probes and testers, so every device can be considered fully conditioned in compliance with data sheet specifications and with the shipped quality and reliability targets for that process, regardless of the final package in which the device is to be shipped. With off-tester statistical methods, the test measurements (for example, Idd, Vddmin, Fmax) of each die are recorded instead of being binned. These measurements can be recorded for different test conditions, pre- and post-stress testing, and at different temperatures. Pass or fail criteria are then determined by statistical analysis of the recorded measurements using off-tester post-processing algorithms. Outliers to the statistical distribution are graded according to their statistical likelihood of being system failures or early-life failing devices, and the inkless wafer maps are modified accordingly. The challenge for testing using statistical methods is to reach an acceptable trade-off between the potential failing population and the intrinsic yield loss.
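A minimal sketch of such an off-tester outlier screen is shown below, assuming a robust median/MAD statistic; the function name, the 3.5-sigma threshold, and the sample Idd values are illustrative, not from the roadmap.

```python
import statistics

def mad_outliers(measurements, k=3.5):
    """Return {die: robust z-score} for die whose recorded measurement
    deviates more than k robust sigmas from the population median."""
    med = statistics.median(measurements.values())
    mad = statistics.median(abs(v - med) for v in measurements.values())
    sigma = 1.4826 * mad or 1e-12  # MAD -> sigma for a normal distribution
    return {die: abs(v - med) / sigma
            for die, v in measurements.items()
            if abs(v - med) / sigma > k}

# Recorded Idd per die position (hypothetical values, in arbitrary units).
idd = {(0, 0): 1.01, (0, 1): 0.99, (1, 0): 1.02, (1, 1): 0.98, (2, 2): 2.50}
outliers = mad_outliers(idd)
# Only die (2, 2) is flagged; its grade (robust z-score) would drive the
# inkless wafer-map update, rather than a fixed on-tester limit.
```

In practice such screens combine many measurements across conditions and temperatures, and the grading threshold is itself tuned against the failing-population versus yield-loss trade-off described above.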

==== 1.1.2.6      Subsequent Processing Affects the Quality of the Die ====

The processing that occurs during assembly can damage some technologies. Wafer thinning is one example: when DRAM wafers are thinned, a shift in refresh characteristics has been observed, so a die that was fully tested at wafer level may fail the exact same test after being thinned and assembled into a SiP or MCP. The thermal processing steps in the assembly flow can also change the refresh characteristics of individual bits. This phenomenon, known as variable retention time (VRT), is impossible to screen prior to the assembly process.

A key challenge is to re-establish the quality levels achieved by the die supplier. This can be accomplished through additional post assembly testing, invoking redundant elements in the individual failing die within the multi-chip package, or using components that are specifically designed for multi-chip applications.


[1] 1500-2005 IEEE Standard Testability Method for Embedded Core-based Integrated Circuits.

[2] P1450.6, Draft Standard for Standard Test Interface Language (STIL) for Digital Test Vector Data—Core Test Language (CTL).