| Copyright © 2009. National Academy of Sciences. All rights reserved. Terms of Use and Privacy Statement |
Chapter 2
MICROELECTRONIC SYSTEM TRENDS AND PACKAGING NEEDS
Electronic systems needed in the next few years will require
unprecedented packaging technology. In this chapter, the demands placed on
the packages by anticipated chip technologies are discussed. The approach
is to list requirements that, if met, will ensure that the
inherent performance capabilities of the chips can be achieved and will not be
degraded by the package. Some of these requirements deal with the interfacing
of individual chips, whereas others deal with the interconnecting of groups of
chips. In today's technology, these two functions are most often fulfilled by
first-level packages, such as dual in-line packages, and by second-level
packaging, such as printed-circuit boards, respectively.
Packaging requirements for the mid-1990s, in particular, are of interest here. It is
important to avoid any implicit assumption that the packages appropriate at
that time can be categorized in the same way they are now. Indeed, there is
already strong evidence that combining traditional packaging levels can lead
to improved performance. For example, the IBM "thermal conduction" module
eliminates one level of packaging by combining two levels into a single
structure. In this chapter, those requirements that deal with interfacing
individual chips are covered, as are those that deal with interconnecting
several chips; the terms first-level and second-level packaging are not used.
For interfacing to a single chip, the following additional requirements are
important:
· die attachment
· chip pinout
· pinout configuration
· heat removal
· signal rise time
· power lead inductance
· power supply current
· interline coupling
· protection from the environment
For connecting two or more chips, the following additional requirements are
important:
· wiring configuration
· propagation delay
· signal rise time
Each of these requirements must be satisfied at acceptable cost and
reliability, without adversely affecting the other requirements. It is
assumed in this chapter that the signal-interconnection techniques do not
employ multiplexing or optics, so there is exactly one electrical connection
per signal (plus, perhaps, multiple ground and power pins).
It is possible to estimate many of these requirements rather accurately
by extrapolating past characteristics of chips and systems, using scaling
theory as a guide. The various types of chip scaling and the related theories
are discussed in the next section. Scaling theory alone is not sufficient
because different chips will be built using different architectural styles.
An empirical relationship between pinout and circuit complexity, known as
Rent's rule, can be used to characterize architectural style with sufficient
accuracy for this discussion. Rent's rule and related items are also
discussed in a later section.
Three principal system types have been identified, each of which appears
to make different demands on packaging: low-end digital, high-end digital,
and high-speed. Low-end digital systems typically use silicon MOS circuits
packaged separately, with printed wiring boards (PWB) for chip
interconnection. High-end digital systems typically use silicon bipolar
technology, often with packages in modules that carry many chips. The advent
of bipolar complementary MOS (BiCMOS) technology will blur the distinction between
the two digital system types in the future, but for the purposes of this study
it is assumed that BiCMOS will have interconnection requirements similar to
CMOS and power requirements intermediate between CMOS and bipolar. High-
speed circuits typically use gallium arsenide (GaAs) chips. The assumptions
about the chip technologies are presented in later sections in this chapter.
The requirements listed above are qualified, where possible, for each of the
system types.
SCALING THEORY
Scaling theory is important in understanding the driving forces that
affect the trends of integrated circuit chips. The semiconductor industry
learned, more than 20 years ago, that shrinking the photolithographic
dimensions on the wafer and increasing the chip and wafer size increased the
productivity of the semiconductor plant. The benefit to the user was lower
cost per circuit, more functions per chip, and higher performance. The end
result has been a quadrupling of the level of integration every 3 years. This
section draws on scaling theory to permit a projection of the performance
trends of bipolar and MOS integrated circuit chips and of generic module and
board configurations. The intent is to understand the evolutionary trends and
try to determine the material properties that may limit or even dead-end those
trends.
Lithography
Lithography is of fundamental importance in semiconductor fabrication,
and, therefore, a look at lithography is needed to get a feel for the future
direction of the semiconductor industry. A key parameter is the minimum
feature size that a given lithographic technology is capable of patterning on
a chip at a given point in time. Minimum feature size has a first-order
bearing on both the circuit performance and the circuit density of a chip.
To predict future packaging needs in the mid-1990s, it is important to have
some feeling for what minimum feature size can be patterned in production in
that time frame.
Bakoglu (1986) points out that between 1959 and 1983 the minimum feature
size shrank at an average rate of 11 percent per year. Assuming that in 1988
the minimum feature size being patterned in production is about 1.0 μm, then
by mid-1995, or seven years later, the minimum feature size will be about 0.5
μm. This assumes that the feature size will shrink about 10 percent per year.
This is optimistic, since this dimension is near what is generally agreed to
be the limit for optical lithography, and the rate of progress generally
decreases as the limit of a technology is approached.
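
The projection above is a compound-shrink calculation, and a minimal sketch of it follows. The function name and the choice to compound the shrink once per year are assumptions made for illustration; only the 1.0 μm starting point, the seven-year span, and the 10 and 11 percent rates come from the text.

```python
def projected_feature_size(start_um, shrink_per_year, years):
    """Return the feature size after compounding a fixed fractional shrink each year."""
    return start_um * (1.0 - shrink_per_year) ** years


# Values quoted in the text: about 1.0 um in 1988, projected 7 years to mid-1995.
START_UM = 1.0
YEARS = 7

# Rate assumed in the text (10 percent) and the historical average (11 percent).
size_at_10_percent = projected_feature_size(START_UM, 0.10, YEARS)
size_at_11_percent = projected_feature_size(START_UM, 0.11, YEARS)

print(f"10%/yr shrink: {size_at_10_percent:.2f} um")
print(f"11%/yr shrink: {size_at_11_percent:.2f} um")
```

Under these assumptions both rates land a little below 0.5 μm, which is consistent with the rounded figure used in the text.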
Optical lithography has been used since integrated circuits were first
invented. The minimum feature size that optical lithography is capable of
producing is limited by the wavelength of light used, and it therefore has a
very fundamental limitation. Electron beam lithography can provide very small
feature sizes with the use of proper photoresist material. However, it is
highly doubtful that electron beam systems will ever be used in a production
environment because of their slow throughput caused by their limited
bandwidth. It is projected (Kern et al., 1988) that optical lithography
sources will be available for resolutions down to about 0.35 μm and
will be used extensively into the 1990s.
The one technology that has the capability of providing a minimum
feature size smaller than 0.35 μm in volume production is x-ray lithography.
Whether an x-ray lithographic production system can be developed and installed
by 1995 is dependent on many factors. Two major nontechnical factors are the
need and the return on investment. Memory chips provide both a need, because
of increased density requirements, and a better return on investment than
logic chips, because of higher volume per part number, fewer part numbers, and
increased yield. The increase in yield is a result of the fact that very
small dirt particles are transparent to x-rays. If the very difficult
technical problems associated with introducing x-ray lithography into a
manufacturing operation can be overcome, then it is quite probable that it
will be first used to fabricate memory chips, for these reasons.
In view of the foregoing discussion, it will be assumed that during the
mid-1990s the minimum feature size practiced in production volumes will be
about 0.5 μm.
Scaling of MOSFETs
As photolithographic techniques have improved, it has been possible to
reduce the minimum feature sizes on chips. However, power supply voltages
tend to remain standardized for economic reasons. As a result, it is not too
meaningful to perform a scaling analysis while holding the electric field
constant in the device being scaled. Baccarani and coworkers (1984) and
Dennard (1986) have developed the general scaling relationship shown in Table
2-1. In this analysis, α is the factor by which the dimensions are reduced
and ε/α is the factor by which the applied voltage and threshold voltage are
multiplied. The depletion regions are scaled down, along with the other
dimensions, by multiplying the doping concentration within the scaled device
by the factor εα.
Table 2-1 Generalized Scaling Relationships

Physical Parameter            Scaling Factor
Linear dimensions             1/α
Electric field intensity      ε
Voltage (potential)           ε/α
Impurity concentration        εα
Wiring current density        ε^2 α
Gate delay                    1/(εα)
Power/gate                    ε^3/α^2

Source: Based on Baccarani et al., 1984; Dennard, 1986
When the power supply voltage is held fixed, then α = kε, where k is a
proportionality constant, and the gate delay scaling factor becomes k/α^2.
There are limits to how far generalized scaling can be extended, since, as ε
increases, gate-to-insulator failure increases and hot-carrier mechanisms
produce long-term degradation. In addition, the current density in devices
and metallization increases, which can lead to electromigration-type failures.
Another problem that is aggravated as device dimensions shrink is the effect
of alpha particles.
Therefore, within limits, as the device dimensions on a chip shrink, the
delay per gate decreases, as does the power per gate. Devices have been made
and tested (Dennard, 1986) that confirm the scaling analysis. The
experimental results show that a loaded NMOS NOR circuit constructed with a
1.0 micrometer channel length and again with a 0.5 micrometer channel length
exhibited gate delays of approximately 1.0 and 0.5 nsec, respectively. The
power per circuit decreased by about a factor of 4, to about 50 μW
dissipation. In this experiment, the power supply voltage was scaled by about
a factor of 2, from 2.5 to 1.2 volts, so as to keep the electric field in the
device constant.
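
As a rough check, the experiment just described can be compared with the generalized scaling factors. The sketch below is illustrative only: the symbol names alpha (dimension reduction factor) and epsilon (electric field factor) are assumed here, and the gate delay and power expressions, 1/(εα) and ε^3/α^2, are the forms consistent with the fixed-voltage delay factor k/α^2 quoted in the text rather than a transcription of the cited papers.

```python
def generalized_scaling(alpha, epsilon):
    """Return multiplicative scaling factors for a device scaled by alpha and epsilon.

    alpha   -- factor by which linear dimensions are reduced (assumed symbol name)
    epsilon -- factor by which the electric field changes (assumed symbol name)
    """
    return {
        "linear_dimensions": 1.0 / alpha,
        "electric_field": epsilon,
        "voltage": epsilon / alpha,
        "impurity_concentration": epsilon * alpha,
        "gate_delay": 1.0 / (epsilon * alpha),
        "power_per_gate": epsilon ** 3 / alpha ** 2,
    }


# Constant-field case from the experiment: channel length 1.0 um -> 0.5 um.
constant_field = generalized_scaling(alpha=2.0, epsilon=1.0)
scaled_delay_ns = 1.0 * constant_field["gate_delay"]   # starting from about 1.0 nsec
scaled_voltage = 2.5 * constant_field["voltage"]       # starting from 2.5 volts
power_ratio = constant_field["power_per_gate"]         # new power / old power

# Fixed-supply-voltage case: alpha = k * epsilon, so epsilon = alpha / k.
k = 1.0
fixed_voltage = generalized_scaling(alpha=2.0, epsilon=2.0 / k)

print(f"constant field: delay {scaled_delay_ns} nsec, supply {scaled_voltage} V, "
      f"power ratio {power_ratio}")
print(f"fixed voltage: delay factor {fixed_voltage['gate_delay']}, "
      f"power factor {fixed_voltage['power_per_gate']}")
```

With the field held constant, halving the dimensions halves the delay and the supply voltage and quarters the power, in line with the reported measurements. With the voltage held fixed instead, the delay improves further, but under these assumed expressions the power per gate rises, which is one way to see why the text warns about limits to generalized scaling.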
These scaled NMOS NOR circuits were patterned with a vector-scan
electron beam exposure system having a capability of producing 0.5 μm features
with a standard deviation of ±0.05 μm in both feature size and level-to-level
overlay. Circuits with channel lengths as short as 70 nm have been fabricated
with five levels of electron-beam lithography overlaid, with an accuracy of
better than 30 nm (Kern et al., 1988; Sai-Halasz et al., 1987). In a ring
oscillator, these silicon field effect transistors (FETs) have a delay per
stage of 13 psec.
In addition to the technical problems associated with shrinking the
dimensions of NMOS devices, this technology has a serious competitor in the
form of CMOS. It is highly probable that CMOS will be the dominant technology
by the mid-1990s, with bipolars and GaAs relegated to specific applications
where their unique properties are superior to those of CMOS.
Scaling of CMOS
The CMOS technology has a very important and positive characteristic
because its circuits dissipate power essentially only during switching. When
these circuits are in a quiescent state, they consume very little power. This
is an important feature, since power that is not consumed does not need to be
removed. In addition, the copper required in the conductors supplying power
to the back planes and modules is greatly reduced. The designer of a CMOS
system must not, however, expect to operate very densely packaged chips at
very high clock rates without due concern for heat dissipation. The nature of
a CMOS gate is that its power dissipation increases in direct proportion to
its clock rate, all other terms held constant. Noise generated by these gates
increases at high clock rates to such an extent that they may not be
sufficiently noise-tolerant in digital systems beyond clock rates of 75 to 100
MHz. It is doubtful that high-clock-rate problems can be corrected by either
scaling or by operation at liquid nitrogen temperatures. (Appendix C gives
projections on operating and structural parameters.)
CMOS had to overcome two major handicaps before it became a very
acceptable technology: latch-up and a complex manufacturing process. Latch-
up, caused by the current gain of parasitic lateral transistors, produces a
high current path between the power supply and ground lines, a feature that
destroyed early CMOS chips. This is no longer a problem. The CMOS
manufacturing process approaches the complexity of the bipolar process. By
slightly increasing the complexity of the CMOS process, bipolar transistors
can be fabricated on the same chip as CMOS devices. Bipolar transistors have
a much higher transconductance, or di/dv (change of output current to input
voltage change), and as a result take up less area on a chip for a given drive
capability.
Baccarani and associates (1984) applied general scaling theory to a 0.25
μm NMOS FET and calculated that a device with a fan-out of 3 would have a gate
delay of about 200 psec, with a power dissipation of 50 μW at a power supply
voltage of 1.0 volt. In relating their results to CMOS, they state, "Due to
the lower hole mobility, and to the larger sheet resistance of p+ shallow
junctions, however, quantitatively different results are obtained in this
case, leading to somewhat modified design tradeoffs." They are saying that
the design of the n-channel FET in the CMOS circuit must be optimized
differently than the p-channel FET. The lower hole mobility will adversely
affect performance of the CMOS circuit.
Boudon and associates (1988) describe a 20K two-way NAND equivalent CMOS
gate array prototype with 0.5 μm channel length FETs. The 7.5- x 7.5-mm chip
is designed for high performance, with 245 psec gate delay with a fan-out of
3. A 32-bit reduced-instruction set computer (RISC) processor, with a 16- x
16-bit multiplier implemented on the chip, has been measured at 17 nsec cycle
time with a 3.4-volt power supply. This experimental result is 1.6 times
faster than the same implementation with a CMOS 0.9 μm gate.
Cong and associates (1988) describe a low-power CMOS dual-modulus
(divide by 128/129) prescalar integrated circuit. They point out that the
prescalar has been traditionally implemented in GaAs or bipolar technologies.
The best prescalar, fabricated with a 17.5 nm gate oxide, functions at 2.06 GHz
with 25 mW power consumption. The channel length is 0.5 μm, and the circuit
operates on a supply voltage of 3.5 volts with a ring oscillator (unloaded)
delay of 110 psec.
When CMOS devices are designed for low-temperature operation, at liquid
nitrogen temperatures of 77 K, the circuit speed is enhanced by a factor of
two. The reasons include decreased leakage, increased carrier mobility,
sharper subthreshold turn-off transition, lower interconnect line resistance,
and improved reliability (Sai-Halasz et al., 1987). In addition, latch-up
effects are greatly reduced at low temperatures because of lower bipolar
gains. As device dimensions decrease, the benefits of operating at 77 K
become more attractive.
Scaling of Bipolars
The bipolar transistor has had a long history of development. During
the period when it was produced as a discrete device, two techniques were
invented to improve its performance by keeping it out of saturation. They
were the Schottky clamp circuit, invented in 1953, and the emitter-coupled
logic (ECL) circuit family, invented in 1956. Since the invention of the
integrated circuit, many improvements have been made to the bipolar device
structure. In parallel with these improvements have been improvements in
photolithography that have reduced the size of the device, with an attendant
increase in performance.
A few of the important structural improvements made to bipolar planar
devices are self-aligned base contact, deep-trench isolation, and a
polysilicon emitter contact. Both the self-aligned structure and the trench
isolation greatly reduce the device area and the associated parasitic
capacitance, and hence significantly reduce the power-delay product and
increase the density of bipolar circuits (Ning and Tang, 1986). Experimental
evidence has overwhelmingly shown that polysilicon emitter contacts make it
possible to vertically scale bipolar transistors and improve circuit
performance without unacceptable degradation in current gain.
Ning and Tang (1986) state, "The trend in bipolar device technology is
then to develop the version or versions of self-aligned structure, deep-
trench isolation, and polysilicon emitter contact that are manufacturable and
applicable to both high-speed as well as high-density applications....The
central idea is to reduce the horizontal and vertical dimensions in a
coordinated manner so that all the key delay components are reduced
approximately proportional in scaling."
The scaling rules for ECL circuits are shown in Table 2-2. The
projected delay as a function of the switch current of the scaled circuit is
shown in Figure 2-1. Reduction in gate delay can be expected as chip power is
increased in projected future circuits.
Table 2-2 ECL Scaling Rules

Parameter                          Rule*
Base width, Wb                     a^0.8
Base doping level, Nb              Wb^-2
Collector current density, Jc      a^-2
Collector doping level, Nc         Jc
Circuit delay                      a

*a = minimum feature dimension and emitter-stripe width
Source: After Ning and Tang, 1986
It can be seen from Figure 2-1 that the maximum benefit in performance,
from scaling, is obtained when the current, and hence the power, is held
fixed. Naturally, it is possible to reduce the current as the emitter width
is reduced and accept a smaller improvement in performance. This approach has
generally been resisted by the ECL enthusiasts, since they constantly strive
for improved performance. As a result, ECL-based systems consume quite a bit
of power (2 to 5 mW per circuit), which must be supplied and removed. A
further complication is that the power supply voltage is in the range of 2
volts, which means that a system with a power requirement of 5000 W will
require a current of 2500 A. This magnitude of current requires a copper
conductor of very large cross section between the power supply and the circuit
modules or board.
GALLIUM ARSENIDE TECHNOLOGY
An additional category of digital components, which has emerged from the
laboratory and entered general use during the 1980s, is that of the extremely
high-speed devices. Such devices, because of the lower level of integration
at this time, typically exhibit a smaller number of signal pins than, for
example, CMOS chips, but in a few years, they can be expected to have the same
pinout needs as today's slower-speed silicon counterparts. Logic gate delays
in these ultrafast chips of as little as 10 to 100 psec make it possible to
design signal processors that are already achieving clock rates as great as 2
GHz. Furthermore, gallium arsenide (GaAs) digital integrated circuits have
been demonstrated in the laboratory that perform useful functions at even
higher clock rates, of up to 25 GHz. Present GaAs chips of 1000-gate
complexity, capable of 6 GHz, are available in experimental quantities (in
1989); 10-GHz digital chips will be available by 1990 to 1991; and 20- to
25-GHz chips are expected to be readily available by 1995.
[Figure 2-1 plot: gate delay against switch current, with horizontal scales CURRENT (mA), 0.01 to 1.0, and GATES/CHIP (P = 2 WATTS), 50K to 1K; curves are labeled by emitter stripe width, including 2.5 μm and 0.25 μm.]
FIGURE 2-1 Gate delay for a 2 watt chip as a function of switch current. The
labels on the curves refer to emitter stripe width. (After Ning and Tang, 1986)
Very-high-frequency digital integrated circuits are employed in a wide
range of equipment, including supercomputers, telecommunications transmission
equipment, communications satellites, radar, video image processing, military
electronic countermeasures, image processing, and air traffic control
displays. Silicon ECL components have been used traditionally, and GaAs
components are now being introduced. Clock rates have been increasing
steadily and are now at about 0.5 GHz. The fast digital portions of these
circuits may be small enough to fit onto a multichip module, which will be the
heart of the system.
Very-high-speed electronics require attention to electromagnetic issues
that are often unfamiliar to digital systems designers. Fast rise-time
devices radiate electrical energy in that portion of the electromagnetic
spectrum traditionally reserved for analog microwave communication channels
and radar systems. Bandwidths must be preserved as the signals propagate
through the packaging and interconnect structures, if the robustness and noise
immunity of the processors are to be maintained. Despite these problems for
interconnect designers, these enabling technologies will be pursued vigorously
during the next decade, because the higher system clock rates can lead to
signal processing rates that are one or two orders of magnitude greater than
those currently available.
RENT'S RULE
About 1960, Edward Rent, working at IBM, observed a relationship between
the complexity of a logic circuit (expressed, for example, by the number of
gates in it) and the number of signal wires (pins) connecting to it. [Rent
himself never published an account that bears his name, but two early
references describe the relationship (Logue, 1966; Landman and Russo, 1971).]
In its simplest form, it is

    $N_p = K_p N_g^{\beta}$

where Np is the number of pins, Ng is the number of gates, and Kp and β are
constants. The relationship has been applied to a variety of systems,
including digital computer systems, integrated circuits, random logic, and
even animal eyes and brains. Rent's rule is used here to predict the pinout
of future integrated circuits and the interchip wiring complexity of highly
parallel computer architectures of the future.
Two empirical constants, Kp and β, appear in Rent's rule. Of these, β
is the more critical. In a two-dimensional world, such as inside an
integrated circuit or on a printed-circuit board, the rule is qualitatively
different, depending on whether β is above or below 0.5. If β is greater than
0.5, then, as more and more complexity is added to a circuit, the circuit
becomes harder and harder to wire. To appreciate this, consider a chip on
which the perimeter is used for bonding pads, and suppose that all the space
on the (one-dimensional) perimeter is used for these pads. If the size of the
circuit quadruples (e.g., by making each dimension of the chip twice as
large), the required number of pins more than doubles, yet the perimeter only
doubles, and as a result not all the required pins will fit in the available
space. A similar argument applies to wiring within the chip if the
subcircuits on the chip themselves obey Rent's rule with the same exponent.
Values of β below 0.5 do not pose such difficulties. Incidentally, in a
three-dimensional setting, the critical exponent is 2/3 rather than 1/2,
because this is the exponent governing the ratio of surface to volume.
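
The perimeter argument can be made concrete with a short sketch. It assumes, for illustration, that gate density and pad pitch stay fixed, so chip area grows in proportion to the gate count and the perimeter grows as its square root; the function names and the symbol beta for the Rent exponent are likewise assumptions.

```python
def pin_growth(gate_growth, beta):
    """Factor by which the Rent's-rule pin count grows when gates grow by gate_growth."""
    return gate_growth ** beta


def perimeter_growth(gate_growth):
    """Factor by which the chip perimeter grows, assuming area is proportional to gates."""
    return gate_growth ** 0.5


GATE_GROWTH = 4.0  # circuit size quadruples, as in the example in the text

for beta in (0.12, 0.25, 0.5, 0.63):
    pins = pin_growth(GATE_GROWTH, beta)
    edge = perimeter_growth(GATE_GROWTH)
    verdict = "fits on the perimeter" if pins <= edge else "outgrows the perimeter"
    print(f"beta = {beta}: pins x{pins:.2f}, perimeter x{edge:.2f} -> {verdict}")
```

For an exponent of exactly 0.5 the pin demand and the perimeter grow together, which is why that value marks the boundary in two dimensions.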
Recent examinations of Rent's rule (Bakoglu, 1986; Ferry, 1985) have
focused on the critical nature of the exponent, and it has been observed that
different styles of system architecture or different types of systems seem to
be characterized by different exponents. The values reported by Bakoglu
(1986) are 0.63 for chip- and module-level design of high-speed computers (in
agreement with Rent's original value), 0.5 for gate arrays, 0.45 for
microprocessors, 0.25 for board- and system-level computers, and 0.12 for
memories. The value reported by Ferry (1985) is 0.21 for a mix of logic,
microprocessors, and memory. Bakoglu's value for microprocessors appears to
be heavily biased by a single early example and two RISC chips; without them,
the value is less than 0.2.
Rent's rule is empirical, and empirical observations invite fundamental
explanations. It may be that an exponent of 2/3 can be explained by the
surface-to-volume ratio of a design produced by evolution that is truly three-
dimensional, such as animal brains. It is also obvious that memories should
have a low exponent, since address coding permits the number of address pins
to be a logarithmic function of the size of the memory. For the other types
of systems, however, fundamental explanations seem less satisfactory.
However, one fundamental distinction does seem appropriate. Ferry
(1985) attributes to McGroddy and Solomon (1982) the distinction between
highly partitioned and functionally partitioned circuits. The former are
defined as those for which chip or module boundaries do not tend to coincide
with system or subsystem boundaries. Gate arrays, random TTL logic, and
indeed designs where many components are required for a system, are like this.
Functionally partitioned circuits, on the other hand, are defined as those in
which the chip or module boundaries do coincide with system or subsystem
boundaries. Microprocessors are like this. The definition of what is a
subsystem is a human one, based on the partitioning of a total system for
easier human understanding. Human understanding is more likely to occur when
the subsystems do not have complex interactions but instead interface with
minimal information interchange. The following values of the
constants appear to characterize different types of chips and systems:
· Memory chips, Kp = 6, β = 0.12
· Functionally partitioned chips, Kp = 10, β = 0.2
· Modules and boards, Kp = 82, β = 0.25
· Highly partitioned chips, Kp = 2, β = 0.5
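
To give a feel for what these constants imply, the sketch below evaluates Rent's rule for each category at a common gate count. The dictionary keys, the function name, and the example size of 10,000 gates are hypothetical; the constant pairs are read from the list above, with the multiplier called kp and the exponent called beta as assumed symbol names.

```python
# (kp, beta) pairs as read from the list in the text; names are illustrative.
RENT_CONSTANTS = {
    "memory chips": (6.0, 0.12),
    "functionally partitioned chips": (10.0, 0.2),
    "modules and boards": (82.0, 0.25),
    "highly partitioned chips": (2.0, 0.5),
}


def rent_pins(n_gates, kp, beta):
    """Estimate the pin count from Rent's rule: pins = kp * gates ** beta."""
    return kp * n_gates ** beta


N_GATES = 10_000  # hypothetical example size, not a value from the text

estimates = {
    category: rent_pins(N_GATES, kp, beta)
    for category, (kp, beta) in RENT_CONSTANTS.items()
}

for category, pins in estimates.items():
    print(f"{category}: about {pins:.0f} pins for {N_GATES} gates")
```

The spread between categories at the same gate count illustrates why architectural style, and not scaling theory alone, has to be considered when estimating pinout.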
These values are generally consistent with the data presented by Ferry
(1985), although they differ from the numbers given by Bakoglu (1986). It is
likely that the data for gate arrays (highly partitioned) are based on the
fact that, in most present packaging schemes, signal pins are located on the
periphery of chips. As a result, a natural evolution from one gate array to
the next, keeping design style and design tools similar, will necessarily
scale the pinout as the square root of the number of gates. Thus, the pinout
of highly partitioned chips may in fact be limited more by the interconnect
technology available than by the inherent needs of the logic, and therefore,
in designing packages for the future, perhaps higher exponents might be
appropriate.
Other types of chips can be categorized according to whether they are
highly partitioned or functionally partitioned. For example, systolic arrays
and some signal-processing chips may be functionally partitioned, whereas the
"glue logic" that seems to surround microprocessors in many systems is
probably highly partitioned.
CHIP TECHNOLOGIES
The committee's assumptions about chip technology in use in systems in
the mid-1990s are summarized in Table 2-3. These data were supplied by Donald
R. Franck of the Empire Planning Group (personal communication to the
committee, October 1988), except as follows: The linewidth estimates are
justified earlier, and the inter-latch delays are an assumed logic depth (20
for MOS, 15 for bipolar, and 10 for GaAs) times the gate delay, plus an
estimate of on-chip wiring delay. This estimate is the "Elmore
time constant" of an aluminum wire 4 mm long with the linewidth cited, 0.3 μm
thick, over and under 0.5-μm-thick oxide insulators; for GaAs, silicon nitride
insulator above and insulating GaAs below (Elmore, 1947; Rubenstein et al.,
1983). This wire has a resistance of 700 ohms and a capacitance of 0.4 pF,
for an "Elmore time" of 140 psec (1 pF and 350 psec for GaAs). Even the
relatively long length of 4 mm assumed here will require restraint on the part
of circuit designers, since chip sizes are expected to be up to 3 cm on a side
in 1994, and thus circuit designers will have to use careful placement of
combinational logic blocks and perhaps buffers for long signal paths. The
clock frequency calculated assumes that during half a clock cycle a signal
must settle and the settling time should be at least 1.5 times the inter-
latch delay. In other words, the clock period is three times the inter-latch
delay. The power supply current is the power divided by the assumed supply
voltage, and the power per gate is calculated from the chip power and gate
count.
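
The chain of calculations described in this paragraph can be sketched as follows. The function names and data layout are hypothetical, and the factor of one half in the Elmore time is an assumption (the usual result for a uniformly distributed RC line) chosen because it reproduces the 140 and 350 psec figures quoted above. The MOS logic depth is applied to the CMOS column. The arithmetic reproduces most of the Table 2-3 entries, but the tabulated GaAs clock frequency and bipolar current differ somewhat from it, so the table presumably reflects further assumptions or rounding.

```python
def elmore_time_ps(resistance_ohm, capacitance_pf):
    """Elmore time of a distributed RC wire, taken here as R*C/2 (ohm * pF = psec)."""
    return 0.5 * resistance_ohm * capacitance_pf


def inter_latch_delay_ps(logic_depth, gate_delay_ps, wire_delay_ps):
    """Logic depth times gate delay, plus the on-chip wiring delay estimate."""
    return logic_depth * gate_delay_ps + wire_delay_ps


def clock_frequency_mhz(inter_latch_ps):
    """Clock period is three times the inter-latch delay, as stated in the text."""
    period_ps = 3.0 * inter_latch_ps
    return 1.0e6 / period_ps


WIRE_RESISTANCE_OHM = 700.0

# Inputs quoted in the text and in Table 2-3.
TECHNOLOGIES = {
    "CMOS": dict(depth=20, gate_ps=200.0, wire_pf=0.4, power_w=20.0, volts=3.0, gates=400_000),
    "Bipolar": dict(depth=15, gate_ps=40.0, wire_pf=0.4, power_w=40.0, volts=1.3, gates=20_000),
    "GaAs": dict(depth=10, gate_ps=50.0, wire_pf=1.0, power_w=20.0, volts=2.0, gates=100_000),
}

results = {}
for name, t in TECHNOLOGIES.items():
    wire_ps = elmore_time_ps(WIRE_RESISTANCE_OHM, t["wire_pf"])
    latch_ps = inter_latch_delay_ps(t["depth"], t["gate_ps"], wire_ps)
    results[name] = {
        "inter_latch_ns": latch_ps / 1000.0,
        "clock_mhz": clock_frequency_mhz(latch_ps),
        "current_a": t["power_w"] / t["volts"],
        "power_per_gate_uw": t["power_w"] / t["gates"] * 1.0e6,
    }
    r = results[name]
    print(f"{name}: inter-latch {r['inter_latch_ns']:.2f} nsec, "
          f"clock {r['clock_mhz']:.0f} MHz, current {r['current_a']:.1f} A, "
          f"power/gate {r['power_per_gate_uw']:.0f} uW")
```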
Table 2-3 Mid-1990s Integrated Circuit Chip Technologies

Chip Interface               CMOS             Bipolar          GaAs
Linewidth (μm)               0.5              0.5              0.5
Gate count                   400,000          20,000           100,000
Power and signal pinout      600              600              300
Pinout configuration         Two dimensions   Two dimensions   One dimension
Device gate delay (psec)     200              40               50
Inter-latch delay (nsec)     4.1              0.74             0.85
Clock frequency (MHz)        80               450              250
Power supply voltage (V)     3                1.3              2
Power (W)                    20               40               20
Current (A)                  6.7              33               10
Power per gate (μW)          50               2000             200

Source: Based on data from D. R. Franck (personal communication to the
committee, October 1988) and some prepared by the committee.
The "pinout configuration" entry in Table 2-3 requires an explanation.
Consider the problem of providing pins for integrated circuits, which are
planar (two-dimensional). The "boundary" of a two-dimensional region is
one-dimensional, in this case the perimeter of the chip. Modest pinout (say up to
300) can be satisfied by one row (or two) of pads at the perimeter of the
chip, and, in fact, most chips fabricated today use perimeter bonding pads.
Therefore, without revolutionary reductions in pad size, the larger pinout
that will be needed in 1994 cannot be satisfied with perimeter pads, so the
two-dimensional chip area must be used. Indeed, this technology is in some
use even today. Thus, the demand for more pinout must be satisfied by
"escaping" to a higher dimension.
If the full performance of the chips, as summarized in Table 2-3, is to
be realized in a system, the packaging requirements listed in Table 2-4 are
necessary. Any deviations from meeting these specifications will force
compromises on chip and system designers and will, therefore, mean that system
performance is limited more by the packaging than by the chips.
Chip Interface Packaging
For interfaces to the chip not to inhibit the chip performance described
earlier, the following packaging requirements apply (see Table 2-4). The
package pinout must, of course, equal the chip pinout. The package pinout
configuration cannot be accommodated using the perimeter of the package for
the same reasons that this will not be possible for chips. The chip power
cited is from D. R. Franck (personal communication to the committee, October
1988). The chip drivers and receivers, together with the package signal lead
inductance, must be capable of responding in the inter-latch delay cited
above--in other words, in about a third of the clock cycle. The power (and
ground) lead inductance is calculated by requiring that the L di/dt voltage
dropped across the pins not exceed 0.1 times the supply voltage, when as much
as 50 percent of the current for MOS, 5 percent of the current for bipolar or
10 percent of the current for GaAs is switched in a time equal to the signal
rise time. Clearly, this requirement cannot be met unless multiple pins are
used for both ground and supply voltage. This requirement can, however, be
relaxed if a multiphase clock or on-chip voltage regulation is used.
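The inductance requirement described above amounts to solving L * (delta I / rise time) <= 0.1 * (supply voltage) for L. The following is a minimal sketch of that arithmetic, not the committee's own calculation. The function name is hypothetical, and the 2 V supply used in the example is an assumption: this section does not state supply voltages, and 2 V is simply the 20 W divided by the 10 A listed for the high-speed column of Table 2-4.

```python
def max_power_lead_inductance(supply_v, supply_current_a, switched_fraction,
                              rise_time_s, droop_fraction=0.1):
    """Largest total power (or ground) lead inductance, in henries, that keeps
    the L*di/dt drop within droop_fraction of the supply voltage."""
    # Current that switches within one signal rise time.
    delta_i = switched_fraction * supply_current_a
    di_dt = delta_i / rise_time_s
    return droop_fraction * supply_v / di_dt


# Fractions of the supply current assumed to switch, as quoted in the text.
SWITCHED_FRACTION = {"MOS": 0.50, "bipolar": 0.05, "GaAs": 0.10}

# High-speed column: 10 A and 0.85 nsec rise time; the 2 V supply is an assumption.
l_gaas = max_power_lead_inductance(2.0, 10.0, SWITCHED_FRACTION["GaAs"], 0.85e-9)
print(f"{l_gaas * 1e9:.2f} nH")
```

Under these assumptions the result is about 0.17 nH, which agrees with the high-speed entry in Table 2-4. Because the bound applies to the combined inductance of all power leads, it also shows why many parallel supply and ground pins are needed.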
Table 2-4 Mid-1990s Chip Interface Technology
Chip Interface                  Low-End Digital   High-End Digital   High-Speed
Chip pinout                     600               600                300
Package pinout configuration    Two dimensions    Two dimensions     One dimension
Heat removal per chip (W)       20                40                 20
Signal rise time (nsec)         4.1               0.74               0.85
Power lead inductance (nH)      0.4               0.07               0.17
DC power supply current (A)     6.7               33                 10
Environmental protection        Essential         Essential          Essential
Chip Interconnection Packaging
If the signal propagation delay from chip to chip and the signal rise
time for interchip communication at least match the inter-latch delay for the
chips, then signals can be transmitted from one chip to another during a
single clock cycle, and the packaging will not substantially degrade system
performance. The requirements are given in Table 2-5.
Table 2-5 Mid-1990s Chip Interconnection Technology
Chip Interconnection        Low-End Digital   High-End Digital   High-Speed
Wiring configuration        Two dimensions    Three dimensions   Two dimensions
Propagation delay (nsec)    4.1               0.74               0.85
Signal rise time (nsec)     4.1               0.74               0.85
The key requirement for interchip packaging is the ability to have a
large number of interconnect wires between and among chips. This is necessary
whenever an overall system is too complex to be put on a single chip.
Generally, total systems have a relatively small number of signal pins,
because systems with complex interfaces are difficult to understand and
systems are, after all, defined by humans who must understand their input and
output behavior. The need for complex interconnections arises when the
limitations of chip technology force a system to be implemented on more than
one chip.
Systems are conceived in all sizes, and, therefore, it is difficult to
be quantitative in the general sense about the interconnect needs. For this
reason, no estimates are given regarding pinout of modules that perform chip
interconnection. Rent's rule for highly partitioned chips and modules is
probably valid for systems, both high-end and low-end, that are sufficiently
complex so that many chips are necessary.
The entry "wiring configuration" in Table 2-5 requires further
discussion. Today, the most common interchip wiring is done on printed wiring
boards (PWBs) in which a very small number of two-dimensional routing surfaces
are used. This works well only for limited chip pinout and limited board
pinout. It works best for chips with perimeter bonding, or whose first-level
packaging provides perimeter connections, because of the difficulty of using
essentially a two-dimensional scheme to connect to a two-dimensional pinout
array, given the normal wire size, spacing needed to reduce crosstalk and
adjacent-conductor shorts, and the pad or connector size.
Chip pinout for 1994 will require, for systems with several chips, a
correspondingly high number of connections between chips. For example,
consider a system that requires several 1994 chips, each with a pinout of 600.
The partition of functionality between two adjacent chips might be "highly
partitioned" in the sense used earlier, with a Rent's rule exponent of 0.5.
In that case, the two chips taken as a unit would, between them, need about
850 wires (600 times the square root of 2) to connect with the rest of the
system. The remaining 350 of the 1200 pins on the two chips might go between
these chips, implying the need for 175 signal paths between these adjacent
chips. (This number might be increased slightly because some electrical nodes
have multiple connections.)
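The two-chip estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch assuming that Rent's rule scales the external pin count of a group of identical chips as (pins per chip) x (number of chips)^p, and that every pin not leaving the group is one end of a two-ended chip-to-chip wire. The function names are hypothetical.

```python
def external_pins(pins_per_chip, n_chips, rent_exponent):
    """Rent's-rule estimate of the pins a group of identical chips needs to
    reach the rest of the system: T_group = T_chip * n_chips**p."""
    return pins_per_chip * n_chips ** rent_exponent


def interchip_paths(pins_per_chip, n_chips, rent_exponent):
    """Signal paths that stay inside the group, assuming each internal pin
    is one end of a point-to-point wire between two chips."""
    total_pins = pins_per_chip * n_chips
    internal_pins = total_pins - external_pins(pins_per_chip, n_chips, rent_exponent)
    return internal_pins / 2


ext = external_pins(600, 2, 0.5)
paths = interchip_paths(600, 2, 0.5)
print(f"external wires: {ext:.0f}, paths between the two chips: {paths:.0f}")
```

With 600 pins per chip and an exponent of 0.5 the sketch gives roughly 849 external wires and 176 internal paths, which the text rounds to 850 and 175.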
The required interconnect density, although difficult to quantify in
general because of the varied size of systems and the degree of partitioning
necessary, is clearly beyond the capabilities of today's PWB technology and
also will be difficult to satisfy with advanced multichip modules. It is
believed that, for high-end systems with many chips, the wiring congestion can
be overcome only by using a three-dimensional interconnect structure. By this
is meant a structure in which the wiring density in the third dimension is
comparable to that in the other two dimensions. This kind of structure
actually is not as far-fetched as it sounds; the IBM thermal conduction module
and today's best PWBs have horizontal and vertical pitches for horizontal
wires, and horizontal pitches for vertical wires, that are within a factor of
five. In the case of the thermal conduction modules, the horizontal spacing
between wires in one plane is 5 mils, the distance between planes is 10 mils,
or 20 mils if a ground plane lies between for shielding, and the pitch of vias
is 25 mils.
In contradistinction to the moderate clock rate described throughout
this report, packaging intended for the fastest clock rate devices (both
silicon and GaAs) must assume that the interchip signal connections will be
transmission line in nature. It is difficult to overstate the impact on chip
interconnect caused by the need for a transmission line environment on the
substrate, which, as a rule of thumb, arises whenever the off-chip signals
exhibit risetimes of 2 nsec or less. The risetimes of typical silicon ECL and
GaAs components are all less than 1 nsec at present. It should be noted that
the fastest risetimes are currently in the 200-psec range, with 100 psec
achieved on a small subset of the very fastest GaAs devices intended for
communications applications. Although even 30-psec risetimes have been
demonstrated, these ultrashort risetimes will not be necessary until the mid-
1990s, when clock rates exceeding 10 GHz are employed in communications and
radar processors.
Driven by the operating parameters of high-clock-rate systems described
above, all of the parameters of concern for silicon CMOS chips assume even
greater importance for the fastest devices. The use of single-chip surface-
mounted packages is already giving way to the method of placing bare chips
nearly side by side on very dense metal-on-organic dielectric structures
(e.g., copper-on-polyimide or the equivalent). The ability to fabricate very
uniform transmission line structures on these dense "chip-on-board"
substrates, with low DC and AC resistive losses in the lines, will be
important.
To minimize the high-frequency crosstalk between densely-packed
interconnect lines, very low values of dielectric constant ($\varepsilon'$ about 2.0) will
be required for the materials that separate the signal planes from their
ground reference or shield planes. Such low $\varepsilon'$ values not only increase
wavefront propagation velocities, but also allow ground planes to have minimum
separation from the signal planes for a given line impedance, thereby
decreasing interline coupling effects. The interplane dielectrics must also
not be lossy and must not become lossy at higher frequencies because of
adsorbed or chemically-bound water. For interconnection between the chips and
the substrates, the frequency limits of wire bonding must be better assessed,
and current TAB technology must be extended (e.g., with "flip-TAB" or modified
"flip-chip" techniques) to provide improved high-frequency transmission line
behavior and shorter total lengths of the TAB structures. Finally, the
ability to provide integral high-frequency local power plane decoupling
adjacent to the active chips must be achieved in a more cost-effective manner
than is done now.
The flip-chip technique provides an excellent ultra-high-frequency
connection between chip and substrate. The technique permits the chip to be
removed a limited number of times. However, the ability to provide signal
integrity, with demountability between the MCM and the back panel, becomes
increasingly difficult as signal risetimes approach 100 psec. When
innovative approaches (e.g., fuzz buttons and elastomeric materials) are
considered for solving connector problems of high-performance electronic
systems, materials issues must be considered. (See Appendix D for two
innovative approaches.)
SOME PACKAGE DESIGN CONSIDERATIONS
The single-chip and multichip modules are described in this section to
point out the techniques used to handle the interconnects and the problems
encountered in each type.
Single-Chip Modules
Single-chip modules (SCMs) can be divided into two categories: the
first is the surface-mount module, and the second is the pin-through-hole
module. The surface-mount package, of necessity, requires that the leads come
out around the perimeter of the package. The lead pitch must therefore
become finer, because the number of leads supplying signals and
power to the package increases as more circuits are placed on the chip. To
support 400 to 700 signal and power connections to a surface-mount package
in the 1994 time frame would require about a 12-mil lead pitch.
The leads would have to be staggered, since the card or board
to which they are soldered or otherwise connected must have more than one
plane of connections for signal and power. It is doubtful that vias that make
connections from the surface to internal planes can be placed on 12-mil
centers. The problem is solved by staggering the length of the leads exiting
from the surface-mount module. Handling 400 to 700 very fine leads that have
a pitch of 12 mils is a difficult problem in production, as is the problem of
removing 15 to 20 W of heat from the chip in this package.
The pin-through-hole single-chip module has the pins in an area array.
The pins are inserted into holes in the card or board or the next level of
package. It is conceivable that the single-chip module need not have pins,
but it could have contacts that mate with contacts on the next level of
package. In any case, it is necessary to have vias under the module that
permit connection to the internal wiring planes in the next level of package.
If pins do not need to be inserted in holes, then these holes can be smaller
in diameter than holes that must contain pins. In any case, there is wiring
congestion under the module. If, as is usually the case, there is a limit to
the number of lines that can pass between two vias, then more wiring planes
are needed in the next package level to overcome the wiring congestion under
the module.
Multichip Modules
The advantage that a multichip module (MCM) provides is greater density
at both the module level and at the system level, because with the greater
density comes improved performance from the shorter transmission paths. This
improved performance is attained in present MCMs even though the dielectric
constant of the ceramic is approximately nine. Since the velocity of
propagation is proportional to $\varepsilon^{-1/2}$, the velocity of signals propagated on
lines within the ceramic is one-third of the velocity of light. Future MCMs
will require insulating materials whose dielectric constants are as close to 1
as possible. Naturally, the insulating materials used must be patternable;
that is, it must be possible to make via holes in the material that can be
filled with a conductor to provide a connection path from one level of
conductor to the next. The insulator must permit a conductor to be attached
to it by some means, such as evaporation, sputtering, or vapor deposition.
The process of laying down the insulator and patterning should be a dry
process, because dry processes cause fewer ecological problems and provide
high resolution in a patterning process.
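The dependence of signal velocity on dielectric constant noted above can be made concrete with a short calculation. This is a minimal sketch that assumes a nonmagnetic insulator completely surrounding the line, so that the velocity is the speed of light divided by the square root of the dielectric constant. The function names are illustrative only.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def propagation_velocity(dielectric_constant):
    """Signal velocity (m/s) on a line fully embedded in the dielectric."""
    return C_M_PER_S / math.sqrt(dielectric_constant)


def delay_ps_per_cm(dielectric_constant):
    """Propagation delay per centimeter of line, in picoseconds."""
    return 0.01 / propagation_velocity(dielectric_constant) * 1e12


# Ceramic (about 9), a low-constant dielectric (about 2), and the ideal limit of 1.
for eps in (9.0, 2.0, 1.0):
    fraction_of_c = propagation_velocity(eps) / C_M_PER_S
    print(f"eps = {eps:3.1f}: v = {fraction_of_c:.2f} c, "
          f"delay = {delay_ps_per_cm(eps):.0f} ps/cm")
```

For a dielectric constant of nine the velocity is one-third of the speed of light, or roughly 100 psec of delay per centimeter of line, which is why insulators with constants close to 1 are attractive for future MCMs.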
The chips, mounted on the MCM, must be attached in such a way as to
provide electrical connection to the 600 signal and power supply connections
that must be made to each chip. Since each chip may dissipate as much as 50 W
of power, large thermal stresses can be set up in the connections made to the
chip. When the system containing these MCMs is cycled on and off, the
connections between the chip and the module are stressed cyclically and can
and do fail from fatigue. This requires that the thermal expansion coefficient of
the substrate material of the MCM match that of the silicon chip. The problem
is complicated by the fact that the substrate material dissipates much less
I²R power than does the chip, which results in a short-term transient thermal
mismatch between the chips and the MCM substrate. The use of Peltier-junction
cooling close to the chip is worthy of consideration here. In addition, it
should be noted that the thermal mismatch problem is not peculiar to MCMs, but
is equally important with SCMs.
The mismatch in thermal expansion coefficient between the MCM substrate
material and the chip material is aggravated when the distance between the
farthest connections to the chip increases. Since there are 600 such
connections, there will be an array of connections with about 25 connections
on a side. If the chips are 1 cm on a side, then the array of connections
must exist within the chip footprint, or within about 0.8 cm, which places
connections on 0.32-mm (12.6-mil) centers. If the via holes in the insulating
material, which contain the metal that connects to the wiring plane below, are
5 mils in diameter, then the distance between the via walls is 7.6 mils with
no tolerance. If there is an allowed tolerance or a guard band of 1 mil
around each via, only 5.6 mils are left in which to place lines that conduct
signals within the wiring plane. Clearly, something must be done to decrease
the wiring congestion immediately under the chip. Today, this is done by the
addition of more wiring planes to redistribute the congestion to lower wiring
levels (Blodgett, 1983).
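The via-spacing arithmetic in the preceding paragraph can be sketched as follows. The calculation assumes a square area array and a via diameter of 5 mils, the value implied by the 12.6-mil pitch and the 7.6-mil wall-to-wall distance quoted in the text. The function name and its parameters are hypothetical.

```python
import math

MM_PER_MIL = 0.0254


def routing_channel_mils(n_connections, array_width_mm,
                         via_diameter_mils, guard_band_mils):
    """Return (connections per side, pitch, wall-to-wall gap, usable channel),
    with all lengths in mils, for a square array of vias under a chip."""
    per_side = math.ceil(math.sqrt(n_connections))
    pitch = array_width_mm / per_side / MM_PER_MIL
    wall_gap = pitch - via_diameter_mils
    # A guard band is kept clear around each of the two neighboring vias.
    usable = wall_gap - 2 * guard_band_mils
    return per_side, pitch, wall_gap, usable


per_side, pitch, wall_gap, usable = routing_channel_mils(600, 8.0, 5.0, 1.0)
print(f"{per_side} per side, pitch {pitch:.1f} mils, "
      f"gap {wall_gap:.1f} mils, usable channel {usable:.1f} mils")
```

With 600 connections in a 0.8-cm array this gives 25 connections per side on a 12.6-mil pitch, leaving 7.6 mils between via walls and 5.6 mils of usable channel once a 1-mil guard band is reserved around each via.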
As more ECL circuits are crowded onto each chip, and since the system
designers are reluctant for performance reasons to reduce the power needed per
circuit, the power required and that must be dissipated per chip goes up. For
reliability reasons, the maximum junction temperature on the chip must be
limited to about 85 or 90°C. Assuming that the ECL circuits need 2 mW and
that there are 20,000 circuits per chip, the power required per chip is 40 W.
If the MCM contains 100 chips, the total power required by the MCM is 4000 W.
If the power supply voltage is assumed to be about 2 volts, then the current
required by the module is 2000 A.
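The power and current figures above follow from simple multiplication, sketched below. All inputs are the values assumed in the text (2 mW per ECL circuit, 20,000 circuits per chip, 100 chips per MCM, a supply of about 2 volts). The function name is illustrative.

```python
def mcm_power_budget(mw_per_circuit, circuits_per_chip, chips_per_module, supply_v):
    """Return (watts per chip, watts per module, amperes per module)."""
    chip_w = mw_per_circuit * 1e-3 * circuits_per_chip
    module_w = chip_w * chips_per_module
    module_a = module_w / supply_v
    return chip_w, module_w, module_a


chip_w, module_w, module_a = mcm_power_budget(2.0, 20_000, 100, 2.0)
print(f"{chip_w:.0f} W per chip, {module_w:.0f} W per module, "
      f"{module_a:.0f} A per module")
```

The sketch returns 40 W per chip, 4000 W per module, and 2000 A of supply current, matching the figures in the text.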
The MCM of 1994 with 2,000,000 circuits will require about 5000 signal
and power pins. The idea of separating the pins that supply signals from
those that supply power by connecting bus bars to the MCM for the power and a
smaller number and size of pins for the signals is not workable, because there
must be ground returns for the signal pins. There is no requirement in 1989
for a ground pin for each signal pin; the ratio of signal to ground pins today
is about 4 to 1. In the 1994 time frame, this ratio will probably decrease
because of the steeper risetime of the signals expected at that time. The
idea of separating the signal pins from the power pins is not a good one
because at least the same number of signal pins is needed in each case.
Certainly, smaller pins can be used if they carry only signal currents and not
power-supply currents. It must be noted that a significant portion of the
circuit current flows from other sources and in other directions; however, the
current and pin problems remain major issues to be dealt with.
Removing 40 W of power from a chip is a very challenging problem, as is
removing 4000 W from the MCM. Since MCMs are quite expensive, they must be
repairable. It must be possible to remove and replace chips on the MCM and to
reroute signal paths, not only because of failure modes but because of the
need for engineering changes. By 1994, engineering changes might possibly be
made by external electrical signals.
REFERENCES
Baccarani, G., M. R. Wordeman, and R. H. Dennard. 1984. Generalized scaling
theory and its application to 1/4 micrometer MOSFET design. IEEE Trans.
Electron Devices, vol. ED-31, no. 4, pp. 452-462.
Bakoglu, H. B. 1986. Circuit and System Performance Limits on VLSI:
Interconnection and Packaging. Stanford Electronics Laboratories,
Technical Report No. 541-4, Stanford University.
Boudon, G., P. Mollier, J. P. Nuez, F. Wallart, A. Bhattacharyya, and S.
Ogura. 1988. A 20K CMOS array with 200-ps gate delay. IEEE J. Solid-
State Circuits, vol. SC-23, no. 5, pp. 1176-1181.
Blodgett, A. J., Jr. 1983. Microelectronic packaging. Scientific American,
vol. 249, no. 1, pp. 86-96.
Cong, H. I., J. M. Andrews, D. M. Boulin, S.-C. Fang, S. J. Hillenius, and J.
A. Michejda. 1988. Multigigahertz CMOS dual modulus prescalar IC. IEEE
J. Solid-State Circuits, vol. 23, no. 5, pp. 1189-1194.
Dennard, R. H. 1986. Scaling limits of silicon VLSI technology. Pp. 352-
369 in The Physics and Fabrication of Microstructures and Microdevices, M.
J. Kelly and C. Weisbuch, eds. New York: Springer-Verlag.
Elmore, W. C. 1947. The transient response of damped linear networks with
particular regard to wide-band amplifiers. J. Appl. Phys. vol. 19, no. 1,
pp. 55-63.
Ferry, D. K. 1985. Interconnection lengths and VLSI. IEEE Circuits and
Devices Mag. vol. 1, no. 4, pp. 39-42.
Kern, D. P., T. F. Kuech, M. M. Oprysko, A. Wagner, and D. E. Eastman. 1988.
Future beam-controlled processing technologies for microelectronics.
Science, vol. 241, August.
Landman, B. S., and R. L. Russo. 1971. On a pin versus block relationship
for partitions of logic graphs. IEEE Trans. Computers, vol. C-20, pp.
1469-1479.
Logue, J. C. 1966. Large-scale integration--Status and utilization.
Electronica (Munich), October.
McGroddy, J. C., and P. M. Solomon. 1982. Device technology comparison in
the context of large scale digital applications. IEDM Technical Digest,
pp. 2-5.
Ning, T. H., and D. D. Tang. 1986. Bipolar Trends. Proc. IEEE, vol. 74, no.
12, pp. 1669-1677.
Rubenstein, J., P. Penfield, Jr., and M. A. Horowitz. 1983. Signal delay in
RC tree networks. IEEE Trans. Computer-Aided Design, vol. CAD-2, no. 3,
pp. 202-211.
Sai-Halasz, G. A., M. R. Wordeman, D. P. Kern, E. Ganin, S. Rishton, H. Y. Ng,
D. S. Zicherman, D. Moy, T. P. H. Chang, and R. H. Dennard. 1987.
Experimental technique for characterizing of IEEE International Electron
Device Meeting, Technical Digest, pp. 397-400.