Copyright © 2009. National Academy of Sciences. All rights reserved.
A Core CS&E Research
Agenda for the Future
Core CS&E research is characterized by great diversity. Some
core research areas are fostered by technological opportunities, such
as advances in microelectronic circuits or optical-fiber communica-
tion. Such research generally involves system-building experiments.
The successful incorporation of the remarkable advances in technology
over the past several decades has been largely responsible for
making computer systems and networks enormously more capable,
while reducing their cost to the point that they have become ubiqui-
tous. For other research areas, computing itself provides the inspira-
tion. Complexity theory, for example, examines the limits of what
computers can do. Computing-inspired CS&E research has often provided
the key to effective use of computers, making the difference between
the impossible and the routine.
The diversity of technical interests within the CS&E research com-
munity, of products from industry, of demands from commerce, and
of missions among the federal research-funding agencies has created
an intellectual environment in which a broad range of challenging
problems and opportunities can be addressed. Indeed, the subdisci-
plines of CS&E exhibit a remarkable synergy, one that arises because
the themes of algorithmic thinking, computer programs, and infor-
mation representation are common to them all; Box 3.1 provides il-
lustrative examples. Narrowing the focus to a few research topics to
the exclusion of others would be a mistake. Thus the description of
promising research areas that follows should not be regarded
as definitive or exclusive.
As the saying goes, precise predictions are difficult, particularly
about the future. Nevertheless, the committee is confident that
technology-driven advances will be sustained for many more years and
that computing and CS&E will continue to thrive on the philosophy,
well stated by Alan Kay, that the best way to predict the future is to
invent it. Major qualitative and quantitative advances will continue
in several technological dimensions:
· Processor capabilities and multiple-processor systems;
· Available bandwidth and connectivity for data communications
and networking;
· Program size and complexity;
· Management of multiple types, sources, and amounts of data; and
· Number of people who use computers and networks.
For all of these dimensions, change will be in the same direction:
systems will become larger and more complex. Coping with such
change will demand substantial intellectual effort and attention from
the CS&E research community, and indeed in many ways the overall
theme of "scaling up" for large systems defines a core research agen-
da. (See also Box 3.2.)
Parts of the following discussion incorporate, augment, and ex-
tend key recommendations from recent reports that have addressed
various fields within CS&E: the 1988 CSTB report The National Chal-
lenge in Computer Science and Technology; the 1989 Hopcroft-Kennedy
report, Computer Science Achievements and Opportunities; the 1990 La-
gunita report, Database Systems: Achievements and Opportunities; and
the 1989 CSTB report Scaling Up: A Research Agenda for Software Engineering.2
PROCESSOR CAPABILITIES AND
MULTIPLE-PROCESSOR SYSTEMS
As noted in Chapter 6, future advances in computational speed
are likely to require the connection of many processor units in paral-
lel; Box 3.3 provides more detail. This trend was recognized in the
Hopcroft-Kennedy report, which advocated research in parallel com-
puting as described in Box 3.4.
Computing performance will increase partly because of faster pro-
cessors. Advances in technology, the good fit of reduced-instruction-
set computing (RISC) architectures with microelectronics, and opti-
mizing compiler technology have permitted processor performance
to rise steeply over the past decade. Continued improvements are
pushing single-processor performance toward speeds of 10^8 to 10^9
instructions per second and beyond.
Even larger gains in performance will be achieved by the use of
multiple processors that operate in parallel on different parts of a
demanding application. The Hopcroft-Kennedy report, written in
1986-1987, described a goal of 10-fold to 100-fold speedups.3 But
since that time, technological advances have made this goal far too
modest. Today, it is plausible to aim for increases in speed by factors
of 1000 or more for a wide class of problems. Apart from this point,
the Hopcroft-Kennedy outline remains generally valid.
Vector supercomputers have gained speed more from multiple
(10 to 100) arithmetic units than from increases in single-processor
performance. Massively parallel supercomputers (1,000 to 100,000
very simple processors) have now passed vector supercomputers in
peak performance. Workstations with multiple (2 to 10) processors
are becoming more common, and similar personal computers will
not be far behind.
The ability of parallel systems to handle many demanding computing
problems has been demonstrated clearly during the period
since the Hopcroft-Kennedy report was written. It had not been
clear that linear speedups are practically achievable by using processors
in parallel; indeed on some problems they are not.4 The practical
difficulty in exploiting parallel systems is that their efficient use
generally requires an explicitly parallel program, and often a program
that is tailored for a specific architecture. Although such programs
are generally more difficult to write than are sequential programs,
the investment is often justified. On well-structured problems
in scientific computing, visualization, and databases, results have been
obtained that would not otherwise be affordable.5 Insights into pos-
sibilities for programming and architectures are provided by the study
and instrumentation of problems that are less regular or algorithmi-
cally more difficult and that "push the envelope" of parallel systems.
Parallel computing will be a primary focus of the high-performance
computing systems component of the HPCC Program.
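The terms "speedup" and "linear speedup" used above can be made concrete with a small calculation. The following is only an illustrative sketch: the function names and the timing figures are hypothetical and are not measurements from any of the reports cited.

```python
def speedup(t_one: float, t_parallel: float) -> float:
    """Ratio of single-processor run time to parallel run time."""
    return t_one / t_parallel


def efficiency(t_one: float, t_parallel: float, processors: int) -> float:
    """Speedup per processor; a value of 1.0 corresponds to linear speedup."""
    return speedup(t_one, t_parallel) / processors


# Hypothetical run times (seconds) for one problem on 1, 10, and 100 processors.
timings = {1: 100.0, 10: 12.5, 100: 4.0}

if __name__ == "__main__":
    for p, t in timings.items():
        s = speedup(timings[1], t)
        e = efficiency(timings[1], t, p)
        print(f"{p:>3} processors: speedup {s:5.1f}, efficiency {e:.2f}")
```

In this made-up data the efficiency falls as processors are added, which is the pattern the text describes for problems on which linear speedup is not achieved.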
Distributed computing, another focus of the Hopcroft-Kennedy
report, is perhaps a more pressing concern now than when that report's
recommendations were formulated.6 Computing environments
have been evolving from individual computers to networks of computers.
Seamless integration of heterogeneous components into a
coherent environment has become crucial to many applications. Customers
are increasingly insisting on the freedom to buy their computer
components (software as well as hardware) from any vendor
on the basis of price, performance, and service, and they still expect these
various components to operate well together. Such pressure from
customers has hastened the movement toward "open system" archi-
tectures. Businesses are becoming more dispersed geographically,
yet more integrated logically and functionally. Distributed computer
systems are indispensable to this trend.
As computing penetrates more and more sectors of society, reliability
of operation becomes ever more important. Many applications
(e.g., space systems, aircraft, air-traffic control, factory automation,
inventory control, medical delivery systems, telephone networks,
stock exchanges) require high-availability computing. Distributed
computing can foster high availability by eliminating vulnerability to
single-point failures in software, hardware, electric service, the labor
pool, and so on.
What intellectual problems arise in parallel and distributed com-
puting? As discussed at length in the Chapter 6 section "Systems
and Architectures," parallel and distributed computing systems are
capable of nondeterministic behavior, producing different results de-
pending on exactly when and where different parts of a computation
happen. Unwanted conditions may occur, notably deadlock, in which
each of two processes waits for something from the other. These
complications are exacerbated when the system must continue to op-
erate correctly in the presence of hardware, communication, and soft-
ware faults.
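The deadlock condition described above, in which each of two processes waits for something from the other, can be reproduced in a few lines. The sketch below is an illustration under stated assumptions, not a description of any particular system: the function names are hypothetical, barriers are used to force the unlucky timing on every run, and a timeout stands in for what would otherwise be an endless wait. The second function shows one common remedy, acquiring locks in a fixed order.

```python
import threading


def opposite_order(timeout=0.2):
    """Two threads take the same two locks in opposite order.

    Returns {thread name: True if it obtained both locks, else False}.
    """
    x, y = threading.Lock(), threading.Lock()
    both_hold_first = threading.Barrier(2)
    both_tried = threading.Barrier(2)
    got_both = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()          # each thread now holds one lock
            ok = second.acquire(timeout=timeout)
            got_both[name] = ok
            if ok:
                second.release()
            both_tried.wait()               # keep the first lock until both have tried

    threads = [
        threading.Thread(target=worker, args=("A", x, y)),
        threading.Thread(target=worker, args=("B", y, x)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_both


def same_order():
    """Both threads take lock x before lock y, so neither can block the other forever."""
    x, y = threading.Lock(), threading.Lock()
    got_both = {}

    def worker(name):
        with x:
            with y:
                got_both[name] = True

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_both


if __name__ == "__main__":
    print("opposite order:", opposite_order())
    print("same order:    ", same_order())
```

Without the barriers, the opposite-order version would fail only on some runs, depending on exactly when each thread is scheduled, which is the nondeterministic behavior the text refers to.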
Sequential programming is already difficult; the additional be-
havioral possibilities introduced by concurrent and distributed sys-
tems make it even harder to assure that a correct or acceptable result
is produced under all conditions. New disciplines of parallel, con-
current, and distributed programming, together with the develop-
ment and experimental use of the programming systems to support
these disciplines, will be a high priority of and a fundamental intel-
lectual challenge for CS&E research for at least the next decade.
DATA COMMUNICATIONS AND NETWORKING
Compared with copper wires, fiber-optic channels provide enor-
mous bandwidths at extremely attractive costs. A 1000-fold increase
in bandwidth completely changes technology trade-offs and requires
a radically different network design for at least three reasons.
One reason is that the speed of transmission, bounded by the
speed of light, is about the same whether the medium of transmis-
sion is copper wire or optical fiber. Current computer networks are
based on the premise that transit time (i.e., the time it takes for a
given bit to travel from sender to receiver) is small compared to the
times needed for processing and queuing.7 However, data can be
entered into a gigabit network so fast that transit time may be com-
parable to or even longer than processing and queuing time, thereby
invalidating this premise. For example, a megabyte-size file can be
queued in a gigabit network in ten milliseconds. But if the file is
transmitted coast to coast, the transit time is about twice as long.
Under these conditions, millions of bits will be pumped into the cross-
country link before the first bit appears at the output.
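The arithmetic behind this example can be checked with a short calculation. The sketch below is a rough estimate only: the link rate and file size come from the text, while the path length (about 4,000 km) and the signal speed (about two-thirds of the speed of light in a vacuum) are assumptions chosen for illustration, as are all of the names.

```python
LINK_RATE_BPS = 1e9        # gigabit network (from the text)
FILE_BITS = 8e6            # one megabyte, taken here as 10**6 bytes (assumption)
DISTANCE_M = 4.0e6         # assumed coast-to-coast path length, about 4,000 km
SIGNAL_SPEED_MPS = 2.0e8   # assumed propagation speed, roughly two-thirds of c


def queue_time(bits, rate_bps):
    """Seconds needed to clock `bits` onto a link running at `rate_bps`."""
    return bits / rate_bps


def transit_time(distance_m, speed_mps):
    """Seconds for one bit to travel the length of the link."""
    return distance_m / speed_mps


def bits_in_flight(rate_bps, distance_m, speed_mps):
    """Bits sent before the first one arrives (the bandwidth-delay product)."""
    return rate_bps * transit_time(distance_m, speed_mps)


if __name__ == "__main__":
    print(f"queue time:     {queue_time(FILE_BITS, LINK_RATE_BPS) * 1e3:.0f} ms")
    print(f"transit time:   {transit_time(DISTANCE_M, SIGNAL_SPEED_MPS) * 1e3:.0f} ms")
    in_flight = bits_in_flight(LINK_RATE_BPS, DISTANCE_M, SIGNAL_SPEED_MPS)
    print(f"bits in flight: {in_flight:.0f}")
```

Under these assumptions the file is queued in under ten milliseconds, the transit time is roughly twice that, and tens of millions of bits are in the link before the first one arrives, in line with the figures quoted above.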
A second reason is that current networks operate slowly enough
that incoming messages can be stored temporarily or examined "on
the fly." Such examinations underlie features such as dynamic route
computation, in which the precise path that a given message takes
through a network is determined at intermediate nodes through which
it passes. In a gigabit network the volume of data is much larger and
the time available to perform "on-the-fly" calculations is much small-
er, perhaps so much so that store-and-forward operation and dynam-
ic routing may not be economically viable design options.
A third reason is that the underlying economics are very differ-
ent. In current networks, channel capacity (i.e., bandwidth) is expen-
sive compared with the equipment that allows many users to share
the channel. Sharing the channel (i.e., "multiplexing," or switching
among many users) minimizes the idle time of the channel. But fiber
optics is based on the transmission of light pulses (photons) rather
than electrical signals. The technology for switching light pulses is
immature compared with that for switching electrical signals, with
the result that switching devices for fiber optics are relatively more
expensive than channel capacity.
As noted in Chapter 6, a complete understanding of networks
based on first principles is not available at this time. Today's knowl-
edge of networking is based largely on experience with and observa-
tion of megabit networks. Gigabit networking thus presents a chal-
lenging research agenda, one that is an important focus of the HPCC
Program. Consider the following kinds of research problems that
arise in the study of gigabit networking:
· Network stability (i.e., the behavior of the flow of message traf-
fic) is particularly critical for high-speed networks. A network is an
interconnected system, with many possible paths for feedback to any
given node. A packet sent by one node into the network may trigger,
at some indeterminate point in the future, further actions in
other nodes that will have effects on the originating node. The inability
to predict just when these feedback effects will occur presents
many problems for system designers concerned about avoiding cata-
strophic positive feedback loops that can rapidly consume all avail-
able bandwidth. This so-called delayed-feedback problem is unsolved
for slower networks as well, but our understanding for slower net-
works is at least informed by years of experience.
· Network response is another issue that depends on empirical
understanding. In particular, the fiber-based networks of the future
will transmit data much more rapidly, the computer systems inter-
connected on these networks will operate much more quickly, the
number of users will be much larger, and computing tasks may well
be dispersed over the network to a much greater degree than today.
All of these factors will affect the behavior of the network.
· Network management itself requires communication between net-
work nodes. Gigabit networks will involve significant quantities of
this "overhead" information (e.g., routing information), primarily be-
cause there will be so many messages in transit. Thus, fast networks
require protocols and algorithms that will reduce to an absolute min-
imum the overhead involved in the transmission of any given mes-
sage.
· Network connections will have to be much cheaper. Scaling a
network from 100,000 connections (the Internet today) to 100 million
connections (the number of households in the United States, and the
ultimate goal of many networking proponents for which the National
Research and Education Network may be a first step) will require
radical reductions in the cost of installing and maintaining individu-
al connections. These costs will have to drop by orders of magni-
tude, a result possible only with the large-scale automation of opera-
tions, similar to that used in the telephone network today.
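The delayed-feedback problem raised in the first item of the list above can be illustrated with a toy control loop. This is not a model of any real network or protocol: the update rule, the gain, and the delays are arbitrary assumptions chosen only to show that a correction which converges when feedback is immediate can oscillate with growing amplitude when the same feedback arrives late.

```python
def simulate(delay, gain=0.5, target=1.0, steps=200):
    """Toy loop: a sender moves its rate toward `target` using a
    measurement that is `delay` steps old. Returns the rate history."""
    history = [0.0] * (delay + 1)           # rate starts at zero
    for _ in range(steps):
        stale = history[-1 - delay]         # the only value the sender can observe
        history.append(history[-1] + gain * (target - stale))
    return history


def final_error(delay, gain=0.5, target=1.0, steps=200):
    """Largest distance from the target over the last 20 steps."""
    tail = simulate(delay, gain, target, steps)[-20:]
    return max(abs(rate - target) for rate in tail)


if __name__ == "__main__":
    for d in (0, 1, 5):
        print(f"delay {d}: error near the end of the run = {final_error(d):.3g}")
```

With no delay the rate settles on the target; with a delay of five steps and the same gain, each correction overshoots and the swings grow, a small-scale analogue of the positive feedback loops the text warns about.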
SOFTWARE ENGINEERING
The problems of large-scale software engineering have been the
focus of many previous studies and reports, in particular the Hopcroft-Kennedy
report (Box 3.5) and the CSTB report Scaling Up (Table 3.1).
Nevertheless, large-scale software engineering remains a central chal-
lenge, as discussed in Box 3.6.
The committee recommends continuing efforts across a broad front
to understand large-scale software engineering, concurs with the re-
search agendas of the Hopcroft-Kennedy and CSTB reports, and wishes
to underscore the importance of two key areas, reengineering of ex-
isting software and testing.
Reengineering of Existing Software
Large-scale users of computers place great emphasis on reliabili-
ty and consistency of operation, and they have enormous investments
tied up in software developed many years ago by people who have
long since retired or moved on to other jobs. These users often rec-
ognize that their old software systems are antiquated and difficult to
maintain, but they are still reluctant to abandon them. The reason is
that system upgrades (e.g., converting an air traffic control system
that might be written in PL/1 to a more modern one written in Ada)
present enormous risks to the users who rely daily on that system.
The new system must do exactly what the old system did; indeed, a
new system may need to include bugs from the old system that pre-
viously necessitated "work-arounds," because the people and other
computer systems that used the old system have become accustomed
to using those work-arounds. In many cases the current operating
procedures of the organization are only encoded in (often undocu-
mented) programs and are not written down or known completely
by any identifiable set of people.
Thus effective reengineering requires the ability to extract from
code the essentials of existing designs. New technologies that support
effective and rapid upgrade within operational constraints would
have enormous value to software engineering, especially in the commercial
world. Such technologies could include graphical problem-description
methodologies that provide visual representations of program
or data flow or automated software tools that make it easier to
extract specifications from existing code or to compare different sets
of specifications for contradictions or inconsistencies.
TABLE 3.1 The "Scaling Up" Agenda for Software Engineering Research
Perspective
· Short Term (1-5 years): Portray systems realistically: view systems as systems and recognize change as intrinsic. Study and preserve software artifacts.
· Long Term (3-10 years): Research a unifying model for software development, for matching programming languages to applications domains and design phases. Strengthen mathematical and scientific foundations.
Engineering practice
· Short Term (1-5 years): Codify software engineering knowledge for dissemination and reuse. Develop software engineering handbooks.
· Long Term (3-10 years): Automate handbook knowledge, access, and reuse, and make development of routine software more routine. Nurture collaboration among system developers and between developers and users.
Research modes
· Short Term (1-5 years): Foster practitioner and researcher interactions.
· Long Term (3-10 years): Legitimize academic exploration of large software systems in situ. Glean insights from behavioral and managerial sciences. Develop additional research directions and paradigms: encourage recognition of review studies, contribution to handbooks.
SOURCE: Reprinted from Computer Science and Technology Board, National Research Council, Scaling Up: A Research Agenda for Software Engineering, National Academy Press, Washington, D.C., 1989, p. 4.
Testing
As noted in Box 3.2, testing is a severe bottleneck in the delivery
of software products to market.8 Moreover, while program verifica-
tion, proofs of program correctness, and mathematical modeling of
program behavior are feasible at program sizes on the scale of hun-
dreds of lines, these techniques are inadequate for significantly larg-
er programs. One reason is sheer magnitude. A second, more im-
portant reason is the inherent incompleteness, if not incorrectness, of
large sets of specifications. Program verification can show that a
program conforms to its specifications or that the specifications con-
tain inadvertent loose ends, but not that the specifications describe
what really needs to be done.
Thus theories and practical methods of software testing that are
applicable to real-world development environments are essential. Some
relevant questions are the following:
· How can competent test cases be generated automatically?
· How can conformity between documentation and program function
be achieved?
· How can requirements be tested and verified?
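One simple way to approach the first question in the list above is to generate inputs at random and check each result against an executable statement of the specification. The sketch below assumes a sorting routine as the program under test; the helper names and the deliberately faulty `buggy_sort` are hypothetical and serve only to illustrate the idea, not a method proposed in the reports cited.

```python
import random
from collections import Counter


def satisfies_spec(inp, out):
    """Specification for sorting: output is ordered and is a permutation of the input."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return ordered and Counter(inp) == Counter(out)


def generate_cases(count, seed=0, max_len=8, lo=-5, hi=5):
    """Produce `count` random integer lists; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [[rng.randint(lo, hi) for _ in range(rng.randint(0, max_len))]
            for _ in range(count)]


def first_failure(func, cases):
    """Return the first generated input on which `func` violates the spec, or None."""
    for case in cases:
        if not satisfies_spec(case, func(list(case))):
            return case
    return None


def buggy_sort(xs):
    """Deliberately faulty: silently drops duplicate values."""
    return sorted(set(xs))


if __name__ == "__main__":
    cases = generate_cases(200)
    print("built-in sorted:", first_failure(sorted, cases))
    print("buggy_sort:     ", first_failure(buggy_sort, cases))
```

The check is only as good as `satisfies_spec`: as the text notes, a program can conform to its specification without the specification describing what really needs to be done.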
INFORMATION STORAGE AND MANAGEMENT
The Lagunita report described an important and far-reaching agenda
for database research (Box 3.7). The committee believes that the Lagunita
research agenda remains timely and appropriate, and also commends
for attention:
· Data mining and browsing techniques that can uncover previously
unsuspected relationships in data aggregated from many sources.
Box 3.8 describes some database research questions that are motivat-
ed by commercial computing.
· Systems architectures, data representations, and algorithms to ex-
ploit heterogeneous, distributed, or multimedia databases on scales
of terabytes and up. Multimedia databases will be especially useful
to modern businesses, most of which make substantial use of text
and images; document and image scanning, recognition, storage, and
display are at the core of most office systems. Current networks,
databases, tools, and programming languages do not handle images
or structured text very well. Image searches, in particular, usually
depend on keyword tags assigned to images manually and in advance.
Distributed databases maintained at different nodes are increas-
ingly common. Integrating data residing in different parts of the
database (e.g., in different companies, or different parts of the same
company) will become more important and necessary in the future.
Thus research on multimedia and distributed databases would have
a particularly high payoff for commercial computing.
RELIABILITY
Reliability, informally defined here as the property of a computer
system that the system can be counted on to do what it is supposed
to do, is an example of a research area that potentially builds
on, or is a part of, many other areas of CS&E. Distributed systems
provide one promising method for constructing reliable systems.
Assuring that a program behaves according to its specification is one
of the first requirements of software engineering. Large (terabyte-
scale) databases will need to be maintained on line and to be accessi-
ble for periods longer than the time between power failures, the time
between media failures (disk crashes), and the lifetime of data for-
mats and operating software.
As computing becomes a crucial part of more and more aspects
of our lives and the economy, the reliability of computing correctly
comes into question (Box 3.9). The following technical problems are
often relevant to decisions regarding whether computers should be
used in critical applications:
· As failures in telephone and air traffic control systems have
demonstrated, errors can propagate catastrophically, causing service
outages out of proportion to the local failures that caused the prob-
lem. The problems of ensuring reliability in distributed computing
are multiplying at least as rapidly as solutions.
· Software systems, particularly successful ones, usually change
enormously with time. Yet almost everything in the system builder's
tool kit is aimed at building static products. Existing techniques do
not contemplate making a system so that it can change and evolve
without being taken off line. Upgrading a system while it runs is an
important challenge. A related challenge is doing large computa-
tions whose running time exceeds the expected "up time" of the computer
or computers on which the computation is executed. Ad hoc meth-
ods of checkpointing and program monitoring are known, but their
use may introduce debilitating complications into the programs.
· Nonexpert users of computers need graceful recovery from er-
rors (rather than cryptic messages, such as "Abort, Retry, or Fail?")
and automatic backup or other mechanisms that insulate them from
the penalties of error.
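The ad hoc checkpointing mentioned in the second item of the list above can be sketched for a trivially simple long-running computation. Everything here is hypothetical and for illustration only: the function names, the JSON file format, and the simulated failure are assumptions, and a real system would have far more state to capture.

```python
import json
import os
import tempfile


def save(state, path):
    """Write the checkpoint to a temporary file, then rename it over the old
    one so that a failure does not leave a half-written checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run_sum(n, checkpoint_path, every=1000, crash_at=None):
    """Sum the integers below n, saving progress every `every` steps.

    `crash_at` simulates a failure by raising an exception at that step.
    """
    state = {"next": 0, "total": 0}
    if os.path.exists(checkpoint_path):      # resume from the last checkpoint
        with open(checkpoint_path) as f:
            state = json.load(f)

    for i in range(state["next"], n):
        if crash_at is not None and i == crash_at:
            raise RuntimeError("simulated failure")
        state["total"] += i
        state["next"] = i + 1
        if state["next"] % every == 0:
            save(state, checkpoint_path)
    return state["total"]


def demo():
    """Run once with a simulated failure, then resume and finish."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "checkpoint.json")
        try:
            run_sum(10_000, path, crash_at=4_500)
        except RuntimeError:
            pass                              # work since step 4,000 is lost
        return run_sum(10_000, path)


if __name__ == "__main__":
    print(demo())
```

Even in this small case the checkpoint logic is threaded through the computation itself, which hints at the complications the text says such methods can introduce into larger programs.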
USER INTERFACES
User interfaces, one dimension of a subfield of CS&E known as
human-computer interaction, offer diverse research challenges.9 The
keyboard and the mouse remain the dominant input devices today.
Talking to a computer is in certain situations more convenient than
typing, but the use of speech as an input medium poses many prob-
lems, some of which are listed in Box 3.10. If it is to cope with
situations of any complexity, a computer must be able to interpret
imprecisely or incompletely formulated utterances, recognize ambi-
guities, and exploit feedback from the task at hand. Analogous prob-
lems exist in recognizing cursive handwriting, and even printed mat-
ter, in which sequences of letters are merged or indistinct.
The experimental DARPA-funded SPHINX system, described more
fully in the Chapter 6 section "Artificial Intelligence," is a promising
start to solving some of the problems of speech recognition, and Ap-
ple Computer expects to bring to market in the next few years (and
has already demonstrated) a commercial product for speech recogni-
tion called "Plaintalk" based on SPHINX.
Recognition of gestures would also increase the comfort and ease
of human-computer interaction. People often indicate what they want
with gestures: they point to an object. Touch-sensitive screens can
provide a simple kinesthetic input in two dimensions, but the recog-
nition of motions in three dimensions is much more difficult. Pen-
based computing, i.e., the use of a pen to replace both the keyboard
(for the input of characters) and the mouse (for pointing), is another
form of gesture recognition that is enormously challenging and yet
has the potential for expanding the number of computer users con-
siderably. Indeed, the ability to recognize handwritten characters,
both printed and cursive, will enable computers to dispense entirely
with keyboards, making them much more portable and much easier
to use.
The primary output devices of today adhere to the paper meta-
phor; even the CRT screen is similar to a sheet or sheets of paper on
which two-dimensional visual objects (e.g., characters or images) are
presented, albeit more dynamically than on paper. People can, how-
ever, absorb information through, and have extraordinary faculties
for integrating stimuli from, different senses.
Audio output systems can provide easily available sensory cues
when certain actions are performed. Many computers today beep
when the user has made a mistake, alerting him or her to that fact.
But sounds of different intensity, pitch, or texture could be used to
provide much more sophisticated feedback. For example, an audio
output system could inform a user about the size of a file being
deleted, without forcing the user to check the file size explicitly, by
making a "clunk" sound when a large file is deleted and a "tinkle"
sound when a small one is deleted.
Touch may also provide feedback. Chapter 6 (in the section ti-
tled "Computer Graphics and Scientific Visualization") describes the
use of force feedback in the determination of molecular "fitting," that is,
how a complex organic molecule fits into a receptor site in another
molecule. But the use of a joystick is relatively unsophisticated com-
pared to the use of force output devices that could provide resistance
to the motion of all body parts.
Three-dimensional visual output provides other interesting re-
search issues. One is the development of devices to present three-
dimensional visual output that are less cumbersome than the elec-
tronic helmets often used today. A second issue cuts across all problem
domains and yet depends on the specifics of each domain: many
appealing examples of "virtual reality" displays have been proposed
and even demonstrated, but conceiving of sensible mappings from
raw data to images depends very much on the application. In some
cases, the sensible mappings are obvious. A visual flight simulator
simulates the aircraft dynamics in real time and presents its output
as images the pilot would see while flying that airplane. (Along the
lines of the discussions above, the sounds, motions, control pres-
sures, and instruments of the simulated aircraft may also be present-
ed to the pilot. These simulations are so realistic that an air-trans-
port pilot's first flight in a real aircraft of a given type may be with
passengers.10) But in other cases, such as dealing with abstract data,
useful mappings are not at all obvious. What, for example, might be
done with the reams of financial data associated with the stock market?
SUMMARY AND CONCLUSIONS
The core research agenda for CS&E has been well served in the
past by the synergistic interaction between the computer industry,
the companies that are the eventual consumers of computer hardware,
software, and services, and the federal research-funding agencies.
As a result, CS&E research exhibits great diversity, a diversity
that is highly positive and beneficial. In turn, this diversity allows a
broad range of challenging problems and opportunities to be ad-
dressed by CS&E research. Thus, although the committee cannot
escape its obligation to address priorities and to provide examples of
research areas that it believes hold promise, it must be guarded in its
judgment of what constitutes today's most important research.
That said, the committee believes that major qualitative and quan-
titative advances in several dimensions will continue to drive the
evolution of computing technology. These dimensions include pro-
cessor capabilities and multiple-processor systems, available band-
width and connectivity for data communications and networking,
program size and complexity, the management of increased volumes
of data of diverse types and from diverse sources, and the number of
people using computers and networks. Understanding and managing
these changes of scale will pose many fundamental problems in
computer science and engineering, and using these changes of scale
properly will result in more powerful computer systems that will
have profound effects on all areas of human endeavor.
NOTES
1. The definition of which subareas of CS&E research constitute the "core" is
subject to some debate within the field. For example, the Computer Science and
Technology Board report The National Challenge in Computer Science and Technology
(National Academy Press, Washington, D.C., 1988) identified processor design, dis-
tributed systems, software and programming, artificial intelligence, and theoretical
computer science as the subfields most likely to influence the evolution of CS&E in the
future, noting that "the absence of discussion of areas such as databases does not
mean that they are less important, but rather that they are likely to evolve further
primarily through exploitation of the principal thrusts that [are discussed]" (p. 39). In
its own deliberations, the committee included such areas in the "core" of CS&E, moti-
vated in large part by its belief that their importance is likely to grow as CS&E ex-
pands its horizons to embrace interdisciplinary and applications-oriented work.
2. Computer Science and Technology Board, National Research Council, The Na-
tional Challenge in Computer Science and Technology, National Academy Press, Washing-
ton, D.C., 1988; John E. Hopcroft and Kenneth W. Kennedy, eds., Computer Science:
Achievements and Opportunities, Society for Industrial and Applied Mathematics, Phila-
delphia, 1989; Avi Silberschatz, Michael Stonebraker, and Jeff Ullman, eds., "Database
Systems: Achievements and Opportunities," Communications of the ACM, Volume 34(10),
October 1991, pp. 110-120; Computer Science and Technology Board, National Re-
search Council, Scaling Up: A Research Agenda for Software Engineering, National Acad-
emy Press, Washington, D.C., 1989.
3. Hopcroft and Kennedy, Computer Science: Achievements and Opportunities, 1989,
p. 72.
4. A simple simulation argument shows in general that super-linear speedup on
homogeneous parallel systems (i.e., systems that connect the same basic processor
many times in parallel) is not possible. Super-linear speedup would involve, for ex-
ample, applying two processors to a problem (or to a selected class of problems) and
obtaining a speedup larger than a factor of two. In addition, for many interesting
applications it turns out that linear speedup is impossible even when the machine
design should "in principle" allow linear scaleup. Both the limitations of real
machines and the issues of what scaling implies for problems from the physical world
make even linear scaleup impossible for many real problems.
5. For example, parallel processors are emerging as effective search engines for
terabyte-size databases. Automatic declustering of data across many storage devices
and automatic extraction of parallelism from nonprocedural database languages such
as SQL are demonstrating linear speedup and scaleup. Teradata, Inc., has demonstrat-
ed scaleups and speedups of 100:1 on certain database search problems.
6. Distributed computing refers to multiple-processor computing in which the
overall cost or performance of a computation is dominated by the requirements of
communicating data between individual processors, rather than the requirements of
performing computations on individual processors. Parallel computing refers to the
case in which the requirements of computations on individual processors are more
important than the requirements of communications.
7. Transit time is important to gigabit networks because the arrival of messages at
a given node is a statistical phenomenon. If these messages arrive randomly (i.e., if
the arrival times of messages are statistically independent), the node can be designed
to accommodate a maximum capacity determined by well-understood statistics. How-
ever, if the arrival time of messages is correlated, the design of the node is much more
complicated, because "worst cases" (e.g., too many messages arriving simultaneously)
will not be smoothed out by statistical averaging.
In gigabit networks, the network-switching and message-queuing time for small
files will be much smaller than the transit time. The result is that the end-to-end
transmission time for all messages will cluster around the transit time, rather than
spread out over a wide range of times as in the case of lower-speed networks.
8. The impact of software testing on product schedules has been known for a long
time. In 1975, Fred Brooks noted that testing generally consumed half of a project's
schedule. See Frederick Brooks, The Mythical Man-Month, Addison-Wesley, Reading,
Mass., 1975, p. 29.
9. Human-computer interaction is a very broad field of inquiry, some other areas
of which are discussed in Box 2.8 in Chapter 2. Human-computer interaction is highly
interdisciplinary, drawing on insights provided by fields such as anthropology, cogni-
tive science, and even neuroscience to develop ways for computer scientists and engi-
neers to maximize the effectiveness of these interactions.
10. For example, the flight simulator for the A320 Airbus is sufficiently sophisticat-
ed that pilots can receive flight certification based solely on simulator training. See
Gary Stix, "Along for the Ride," Scientific American, July 1991, p. 97.
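The comparison in note 7 between transit time and per-message handling time can be illustrated with a rough calculation. The sketch below is not taken from the report: the link rates, path length, and signal speed are assumed figures chosen only for illustration, and the function names are hypothetical. It also ignores queuing and switching and uses the time to clock a message onto the link as a stand-in for per-message handling time.

```python
# Illustrative figures only; none of these numbers come from the text.
GIGABIT_RATE_BPS = 1e9       # assumed gigabit link: 1e9 bits per second
SLOW_RATE_BPS = 56e3         # assumed lower-speed link: 56,000 bits per second
DISTANCE_M = 4.0e6           # assumed path length: 4,000 km
SIGNAL_SPEED_MPS = 2.0e8     # assumed signal speed in fiber, about 2/3 of c


def transit_time_s(distance_m=DISTANCE_M, speed_mps=SIGNAL_SPEED_MPS):
    """Transit (propagation) time: how long one bit takes to cross the path."""
    return distance_m / speed_mps


def transmission_time_s(size_bytes, rate_bps):
    """Time to clock a message of size_bytes onto a link of rate_bps."""
    return size_bytes * 8 / rate_bps


def end_to_end_s(size_bytes, rate_bps):
    """End-to-end time with queuing ignored: transmission plus transit."""
    return transmission_time_s(size_bytes, rate_bps) + transit_time_s()


if __name__ == "__main__":
    for size in (100, 1_000, 10_000):
        fast_ms = end_to_end_s(size, GIGABIT_RATE_BPS) * 1e3
        slow_ms = end_to_end_s(size, SLOW_RATE_BPS) * 1e3
        print(f"{size:>6} bytes: gigabit {fast_ms:.3f} ms, "
              f"56 kbit/s {slow_ms:.1f} ms")
```

Under these assumptions the transit time is 20 ms, while a 1,000-byte message takes 8 microseconds to transmit at the gigabit rate. End-to-end times for small messages on the gigabit link therefore differ by well under 1 percent, whereas on the slower link they vary by more than a factor of ten with message size.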