PART I
Workshop Summary
INTRODUCTION
In the summer of 2002, the Office of Naval Research asked the Committee on Human Factors to hold a
workshop on dynamic social network modeling and analysis. The primary purpose of the workshop was to bring
together scientists who represent a diversity of views and approaches to share their insights, commentary, and
critiques on the developing body of social network analysis research and application. The secondary purpose of
the workshop was to assist government and private-sector agencies in assessing the capabilities of social network
analysis to provide sound models and applications for current problems of national importance, with a particular
focus on national security. Some of the presenters focused on social network theory and method; others attempted
to relate their research or expertise to applied issues of interest to various government agencies. This workshop is
one of several activities undertaken by the National Research Council that bear on the contributions of various
scientific disciplines to understanding and defending against terrorism, a topic raised by Bruce M. Alberts,
president of the National Academy of Sciences, in his annual address to the membership in April 2002.
The workshop was held in Washington, D.C., on November 7-9, 2002. Twenty-two researchers were asked to
prepare papers and give presentations. The presentations were grouped into four sessions, each of which
concluded with a discussant-led roundtable discussion among presenters and workshop attendees on the themes and
issues raised in the session. The sessions were: (1) Social Network Theory Perspectives, (2) Dynamic Social
Networks, (3) Metrics and Models, and (4) Networked Worlds. The opening address was presented by workshop
chair Ronald Breiger, of the University of Arizona; Kathleen Carley, of Carnegie Mellon University, offered
closing remarks summarizing the sessions and linking the work to applications in national security. Part II of this
report contains the opening address, the closing remarks, and the papers as provided by the authors. The agenda
and biographical sketches of the presenters are found in the appendixes.
This summary presents the major themes developed by the presenters and discussants in each session and
concludes with research issues and prospects for both the research and applications communities.
WORKSHOP SESSIONS AND THEMES
Overall, the workshop provided presentations on the state of the art in social network analysis and its potential
contribution to policy makers. The papers run the gamut from the technical to the theoretical, and examine such
DYNAMIC SOCIAL NETWORK MODELING AND ANALYSIS
application areas as public health, culture, markets, and politics. Throughout the workshop a number of themes
emerged, all based on the following understandings:
· Both network theory and methodology have expanded rapidly in the past decade.
· New approaches are combining social network analysis with techniques and theories from other research
areas.
· Many of the existing tools, metrics, and theories need to be revisited in the context of very large scale and/or
dynamic networks.
· The body of applied work in social networking is growing.
Several common analytical issues underlie much of the research reported. First, traditional social network
analysis is "data greedy": very detailed data are required on all participants. Questions to be addressed in the
analysis of these data concern how to estimate the data from high-level indicators, how sensitive the measures are
to missing data, and how network data can be collected rapidly and/or automatically. Furthermore, advances
require the development of additional shareable data sets that capture heretofore understudied aspects such as
large-scale networks, sampling errors, linkages to other types of data, and over-time data. Second, traditional
social network analysis (especially prior to the last couple of decades)1 has focused on static networks, whereas
much of the work discussed here focuses on the processes by which networks change or emerge. While ongoing
data collection and analysis are providing key new insights, researchers need new statistical methods, simulation
models, and visualization techniques to handle such dynamic data and to use these data to reason about change.
Third, social network theories are beginning to outstrip the measures and data. For example, theories often posit
ties as being flexible, probabilistic, or scaled; most data and metrics, however, are still based on binary data.
Session I: Social Network Theory Perspectives
Presenters and Papers
Discussant: Ronald Breiger
1. Linton C. Freeman, Finding Social Groups: A Meta-Analysis of the Southern Women Data
2. Harrison C. White, Autonomy vs. Equivalence Within Market Network Structure?
3. Noah E. Friedkin, Social Influence Network Theory: Toward a Science of Strategic Modification of
Interpersonal Influence Systems
4. David Lazer, Information and Innovation in a Networked World
Themes
The papers in this session illustrate the breadth of areas that can be addressed by social network analysis. On
one hand, the work can be used to explain, predict, and understand the behavior of small groups and the influence
of group members on one another, as seen in the work of Freeman and Friedkin. On the other hand, social network
analysis can be "writ large" and applied at the market or institutional level, as described in the papers by White and
Lazer. Regardless of network size, all four papers demonstrate that a structural analysis that focuses on connections
can provide insight into how one person, group, or event can and does influence another. These people or
groups or events cannot, and do not, act in an autonomous fashion; rather, their actions are constrained by their
position in the overall network, which is in turn constrained by the other networks and institutions in which they
are embedded (the overall ecology). Further, the papers presented in this session review the range of methodological
approaches and styles of analysis that are compatible with a social network approach. Unlike many other
scientific methods, the social network approach can be used with ethnographic and field data (Freeman), experimental
laboratory data (Friedkin), historical examples (White), and policy/technology evaluation (Lazer).
1 Early work on dynamic modeling was done by P.W. Holland and S. Leinhardt (A dynamic model for social networks, Journal of
Mathematical Sociology 5:5-20, 1977).
Freeman provided a framework for assessing comparative analyses of a long-standing object of sociological
theory: the small social collectivity characterized by interpersonal ties. White integrated theoretical perspectives
on a network-based sociology of markets and firms. Friedkin advocated applications of social influence network
theory to problems of network modification. Lazer considered governance questions arising from the "informational
efficiency" of different network architectures (spatial, organizational, emergent) and the prospects of "free
riding," in which governments become complacent about innovating in the hope that another government will bear the
cost of a successful innovation. Three major themes crosscut these papers and the resultant discussion.
Scaling up and uncertainty. How well do the different analytical techniques and algorithms "scale up" to large
networks with hundreds or thousands of actors and multiple types of relations? Perhaps a more useful phrasing of
this question is, Under what conditions and for which analytical purposes do models of social networks scale up,
and how well do existing techniques deal with uncertainty in information? Spirited discussion arose in response
to the question of whether the same social network model may be posed at "micro" and "macro" levels of social
organization, or whether scaling up must involve the addition of substantially more complex representation of
social structure within the network model.
White's model requires the analyst to account explicitly for the varied circumstances of particular industries.
In his work, markets constructed among firms in networks are mapped into a space of settings with interpretable
parameters that govern the ratio of demand to producers' costs; key parameters pertain to growth in volume and to
variation in product quality. White's model is also distinctive in treating uncertainty not as a technical problem
implicated in parameter estimation (statistical models being a main concern of the second and third sessions of this
workshop), but as a substantive force that drives the evolution of ranking and manipulation as organizing features
of a space of markets. During discussion, White expressed the view that scaling up is indeed a formidable
challenge.
Friedkin, on the other hand, felt that the only practical constraint on social influence models is the problem of
data-gathering ability. Friedkin's social influence network theory describes an influence process in which members'
attitudes and opinions on an issue change recursively as the members revise their positions by taking
weighted averages of the positions of influential fellow members. One example of an application would be
producers who eye, and orient to, their competitors while figuring out the cost of their product, as in White's
market model. The mathematics is general; Friedkin suggested applications to the modification of group structure
such that, for example, outcomes are rendered less sensitive to minor changes in influence structure or to
initial opinions of group members. However, Friedkin's model focuses on convergent interpersonal dynamics
rather than on the structuring of qualitatively distinct network outcomes, as in White's market ecology.
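The recursive averaging described above can be illustrated with a minimal sketch. It shows only the simplest case, in which each member's new position is a weighted mean of all current positions; it ignores any attachment members may keep to their initial positions. The weight matrix `W`, the starting opinions, and the function name `iterate_opinions` are illustrative assumptions and are not taken from Friedkin's paper.

```python
def iterate_opinions(weights, opinions, steps=50):
    """Repeatedly replace each opinion with a weighted average of all opinions.

    weights[i][j] is the (hypothetical) influence of member j on member i;
    each row must sum to 1 so that the update is a true weighted average.
    """
    for row in weights:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of weights must sum to 1")
    current = list(opinions)
    for _ in range(steps):
        current = [
            sum(w * x for w, x in zip(row, current))
            for row in weights
        ]
    return current


# Three members; member 0 listens mostly to itself, member 2 mostly to others.
W = [
    [0.8, 0.1, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.4, 0.4],
]
initial = [0.0, 5.0, 10.0]
final = iterate_opinions(W, initial)
print([round(x, 3) for x in final])
```

With every weight positive, the positions in this toy group converge toward a single shared value that lies between the lowest and highest starting opinions, which is the "convergent" dynamic the paragraph contrasts with White's market ecology.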
Network outcomes. Both Lazer and Friedkin argued that the partial structuring of interdependence, a definitive
aspect of social networks, must be taken into account in theorizing the production of outcomes. Taking innovative
information, such as knowledge of policies or innovations that work, as a desired network outcome, Lazer's paper
asks how interdependence can be governed in large and complex systems. Lazer develops the argument that,
where the production of information requires costly investment, there is a paradoxical possibility that the more
efficient a system is at spreading information, the less new information the system might generate. This suggests
that in networked (as distinct from hierarchical) worlds, incentives should be provided to continue experimenting
with innovations.
Breiger suggested the benefits of a comparative reading of Lazer's paper and Friedkin's: In Friedkin's model
the dependent variable is actor opinions at equilibrium; in Lazer's, it is knowledge created by or held by actors.
Can the free riding that motivates Lazer's rational actors be usefully applied to suggest processes that structure the
interior of Friedkin's influence networks? Conversely, can Friedkin's model provide a concrete format for
specifying Lazer's theories of information interdependency?
Meta-analysis of network analysis methods. Freeman was able to locate 21 analyses of the same data set
(concerning the participation of women in social events in a southern city in the 1930s). He examined these studies
by use of a form of meta-analysis, but it is an unusual form in that multiple analytic techniques are applied to a
single data set. There is an underlying dimension on which the analytic methods converge, and the several most
effective techniques allow identification of a single (and, in this convergent sense, most informed) description of
the data. Generally, the "best" of these 21 analytic procedures agree more with one another than with the anecdotal
description supplied by the original data gatherers. The possibility is raised that in certain circumstances (perhaps
in cases where the various methods applied yield results that are not too far from the true network structure) the
intensive application of multiple analytic methods may compensate for problems in data quality.
Session II: Dynamic Social Networks
Presenters and Papers
Discussant: Stanley Wasserman
1. Jeffrey C. Johnson, Informal Social Roles and the Evolution and Stability of Social Networks (coauthors
Lawrence A. Palinkas and James S. Boster)
2. Kathleen M. Carley, Dynamic Network Analysis
3. Tom A.B. Snijders, Accounting for Degree Distributions in Empirical Analysis of Network Dynamics
4. Michael W. Macy, Polarization in Dynamic Networks: A Hopfield Model of Emergent Structure
(coauthors James A. Kitts and Andreas Flache)
5. Martina Morris, Local Rules and Global Properties: Modeling the Emergence of Network Structure
6. H. Eugene Stanley, Threat Networks and Threatened Networks: Interdisciplinary Approaches to Stabilization
and Immunization (coauthor Shlomo Havlin)
Themes
The papers in this session address the evolution, emergence, and dynamics of network structure. Methods
range across ethnography and participant observation, statistical modeling, simulation studies, and models employing
computational agents. The methods are not mutually exclusive and were often used together to create a
more complete understanding of the dynamics of social networks. Four themes emerged from these papers and the
roundtable discussion.
What makes networks effective or ineffective? This theme added an explicitly dynamic focus to the "network
outcomes" theme of the previous session. Various factors were considered, including the presence or absence of
certain roles, structural characteristics (patterns of ties), and connections to other networks. The cross-cultural
ethnographies of network evolution conducted by Jeffrey Johnson and his colleagues demonstrate that group
dynamics can vary dramatically from one group to another even within the same physical and cultural setting and
in the presence of similar organizational goals and formal structure. Johnson et al., studying Antarctic research
station teams, found five features of emergent social roles that are associated with the evolution of effective
networks: heterogeneity (such that members' roles fit in with one another), consensus (agreement on individuals'
status and function), redundancy (such that removal of a single actor still ensures proper functioning, avoiding
vulnerability), latency (promoting adaptive responses to unforeseen events), and isomorphism of formal and
informal social roles (promoting agreement on group goals and objectives). The studies make use of quantitative
modeling of network structures over time as well as direct observation over extended periods.
Multiagent network models are featured in the dynamic network analysis of Kathleen Carley and in the
computational modeling of Michael Macy and his colleagues. Carley has formulated a highly distinctive approach
to the dynamic modeling of social networks. Her simulation models and her formulation of ties among actors as
probabilistic (such that connections can go away, get stronger, or change in strength dynamically over time) rather
than deterministic allow her to investigate how networks may endeavor to regain past effectiveness (or to "regrow")
to make up for destabilizing losses. "What if" exercises allow theoretical exploration of assumptions that
an analyst makes about network vulnerabilities and about alternative distributions of resources, with reference to
the system the analyst has constructed. For example, Carley found that removal of the most central node might
leave a network less vulnerable than removal of an emergent leader.
Macy et al. studied conditions in which a group might be expected to evolve into increasingly antagonistic
camps. Their computational model allows the manipulation of qualities attributed to agents: for example, their
tendency to focus on a single issue, the degree of their conviction, and their rigidity or openness to influence from
others. A surprising finding of this simulation study is that global alignment along a single polarizing definition of
opposing ideologies is facilitated by ideological flexibility and open-mindedness among local constituents, as seen
in the elegantly simple system of actors and relations postulated by the type of neural network employed by Macy
et al.
Researchers in the statistical physics community have recently focused attention on "scale-free" social networks,
characterized, as Eugene Stanley pointed out in his presentation, by a power-law distribution of ties
emanating from the nodes, loosely analogous to an airline route map showing a very small number of
well-connected "hubs" and many less well-connected nodes. It has been proven that scale-free networks are optimally
resilient to random failure of individuals; Stanley and Havlin point out at the same time, however, that such
networks are highly susceptible to deliberate attack. Under the assumption that social networks of people exposed
to disease have the scale-free property, the authors review possible strategies for immunization.
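The contrast between random failure and deliberate attack can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not the authors' analysis: it grows a hub-dominated graph by a simple preferential-attachment rule (a common stand-in for a scale-free network), removes 20 percent of the nodes either at random or in order of degree, and compares the size of the largest surviving connected component. All function names and parameter values are hypothetical.

```python
import random
from collections import deque


def preferential_attachment_graph(n, m, rng):
    """Grow a graph in which each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    adj = {i: set() for i in range(n)}
    # Start from a small fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
    # Each node appears in this list once per tie, so a uniform draw
    # from it is a degree-proportional draw.
    endpoints = [i for i in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend([new, t])
    return adj


def largest_component_size(adj, removed):
    """Size of the largest connected component after deleting `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


rng = random.Random(0)
graph = preferential_attachment_graph(n=1000, m=2, rng=rng)
k = 200  # remove 20% of the nodes

random_loss = set(rng.sample(sorted(graph), k))
hubs_first = sorted(graph, key=lambda v: len(graph[v]), reverse=True)
targeted_loss = set(hubs_first[:k])

after_random = largest_component_size(graph, random_loss)
after_attack = largest_component_size(graph, targeted_loss)
print("largest component after random failure:", after_random)
print("largest component after targeted attack:", after_attack)
```

The same comparison bears on immunization: if removing a node is read as immunizing a person, targeting the best-connected individuals breaks up the contact network far more effectively than immunizing people at random.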
Dynamics of local structure. The papers presented by Macy and by Carley also speak effectively to another theme
that crosscut many of the papers in this session: the advancement of modeling techniques that focus on behavior
among small sets of actors and on the implications of such local behavior for the evolution of a network macrostructure.
This theme provides a dynamic cast to the ideas of "scaling up" presented in the first session. Computational
models of social networks are precisely about exploring the evolution of whole systems on the basis of
rules (such as when an actor should form a tie with another) and endowments postulated at the level of individual
actors.
Coming from quite a different direction, that of the formulation of models that allow the fit of models to data
to be assessed within a statistical context, Snijders' paper reports an investigation of network evolution on the basis
of a stochastic, actor-oriented model that embeds discrete-time observations in an unobserved continuous-time
network evolution process. Just one arc is added or deleted at any given moment, and actors are assumed to try to
obtain favorable network configurations for themselves.
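A toy version of such a process may help fix ideas. The sketch below is not Snijders' model or its estimation software: it omits the continuous-time waiting times and any fitting to data, and the evaluation function (a cost per outgoing tie, a reward per reciprocated tie) and its parameter values are assumptions chosen only for illustration. What it does keep is the core mechanism described above: at each step one actor may change exactly one outgoing arc, and changes that the actor evaluates more favorably are more likely.

```python
import math
import random


def objective(actor, ties, beta_out=-1.0, beta_recip=2.0):
    """Hypothetical evaluation of an actor's position: a cost per outgoing
    tie plus a reward per reciprocated tie."""
    out = [j for (i, j) in ties if i == actor]
    recip = sum(1 for j in out if (j, actor) in ties)
    return beta_out * len(out) + beta_recip * recip


def micro_step(ties, n, rng):
    """One actor gets the chance to add or delete exactly one outgoing arc
    (or to leave things as they are), favoring changes it evaluates highly."""
    actor = rng.randrange(n)
    options = [ties]  # first option: no change
    for other in range(n):
        if other != actor:
            options.append(ties ^ {(actor, other)})  # toggle one arc
    weights = [math.exp(objective(actor, opt)) for opt in options]
    return rng.choices(options, weights=weights, k=1)[0]


rng = random.Random(1)
n_actors = 6
ties = frozenset()  # directed arcs stored as (sender, receiver) pairs
for _ in range(200):
    ties = micro_step(ties, n_actors, rng)
print(sorted(ties))
```

Because the network is stored as an immutable set of arcs, each micro-step returns a new network that differs from the previous one by at most one arc, mirroring the "one arc at a time" assumption.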
The paper by Martina Morris continues the focus on relating local rules to global structure, relying on
empirical data (from sexual partner networks) and statistical modeling as well as simulations. Her question was
whether the overall network structure can be explained by recourse to a small number of partnership formation
rules that operate on the local, individual level. Two such rules were evinced in selective mixing patterns, such as
the degree of matching on race or age, and the timing/sequencing of partnership formation (e.g., serial monogamy
versus concurrency). Morris linked network data to network simulation by means of a statistical modeling
framework: statistical models for random graphs as implemented via the Markov chain Monte Carlo (MCMC)
estimation method. Putting together local rules and simulations based on random graph (statistical) methods
allows empirical modeling of global networks on the basis of local properties.
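How a local rule can be turned into simulated global networks can be sketched with a small Metropolis sampler. The example below is a deliberately simplified assumption, not Morris's model: it uses only one local rule (a preference for partners with the same group label, standing in for selective mixing), treats ties as undirected, and ignores timing and sequencing. The function name and parameter values are hypothetical.

```python
import math
import random
from itertools import combinations


def simulate_mixing_graph(groups, theta_edge, theta_match, steps, rng):
    """Metropolis sampler for a random graph whose log-odds of a tie are
    theta_edge, plus theta_match when both nodes share a group label."""
    n = len(groups)
    pairs = list(combinations(range(n), 2))
    edges = set()
    for _ in range(steps):
        i, j = rng.choice(pairs)
        # Change in the log-probability if the tie (i, j) were added.
        delta = theta_edge + (theta_match if groups[i] == groups[j] else 0.0)
        if (i, j) in edges:
            delta = -delta  # the proposal is a deletion instead
        if delta >= 0 or rng.random() < math.exp(delta):
            edges ^= {(i, j)}  # accept the proposal: toggle the tie
    return edges


rng = random.Random(2)
groups = [0] * 10 + [1] * 10
edges = simulate_mixing_graph(groups, theta_edge=-2.5, theta_match=2.0,
                              steps=20000, rng=rng)
within = sum(1 for i, j in edges if groups[i] == groups[j])
between = len(edges) - within
print("within-group ties:", within, "between-group ties:", between)
```

In this simple case every tie is independent of the others, so MCMC is not strictly needed; the sampler is shown because the same toggle-and-accept loop carries over to models with dependent ties, where direct simulation is not available.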
Key features of networks in modeling evolution. Physicists including Stanley and Havlin, who have made many
recent contributions to the modeling of scale-free networks, are interested in the dynamic evolution of degree
distributions (number of ties emanating from each node). Sociologists tend to emphasize that other features of the
network are also of great importance in network evolution, including the degree of transitivity (which is one way
of measuring hierarchy), cyclicity (a hierarchy-defeating principle), segmentation into subgroups, and so on. In
his paper, Snijders demonstrates that it is possible to formulate a model for network evolution in which the
evolutionary process of the degree distribution is decoupled from other, arguably important, features of the
network's evolution such as those mentioned above. Snijders' paper elaborates a statistical context within which
the contribution of each of several features of network evolution might be comparatively assessed. Carley's paper
elaborates mechanisms for evolution that draw on nonstructural properties such as individual learning and resource
depletion, as well as exogenous changes such as the removal of specific nodes.
Simulation and computational-actor models versus validation and statistical models. A lively roundtable
discussion of the six papers in this session focused in particular on the preceding theme and on the relative merits
of conceptual modeling versus validation techniques. It was argued, on one hand, that an emphasis on validation
is healthy because the analyst can be in a position to distinguish real patterns from mere noise or from wishful
thinking. On the other hand, it was argued that simulations can help an analyst to explore the logic of postulated
mechanisms that may be driving an empirical result, and to explore realms of the possible rather than predicting
the future of a specific event or action. Such predictions are generally not possible outcomes of simulation studies.
It was argued that each broad project approach (statistical rigor and simulations that explore various insights) is
valuable, as are efforts to integrate them more closely. As more data become publicly available to the social
networking community, they can be used to improve both simulation and statistical techniques.
Session III: Metrics and Models
Presenters and Papers
Discussant: Philippa Pattison
1. Stanley Wasserman, Sensitivity Analysis of Social Network Data and Methods: Some Preliminary
Results (coauthor Douglas Steinley)
2. Andrew J. Seary and William D. Richards, Spectral Methods for Analyzing and Visualizing Networks:
An Introduction
3. Mark S. Handcock, Assessing Degeneracy in Statistical Models for Social Networks
4. Stephen P. Borgatti, The Key Player Problem
5. Elisa Jayne Bienenstock and Phillip Bonacich, Balancing Efficiency and Vulnerability in Social Networks
6. Christos Faloutsos, ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs (coauthors
Christopher R. Palmer and Phillip B. Gibbons)
Themes
The papers in this session present methodological developments at the forefront of efforts to construct
statistical models and metrics for understanding social networks. Although a number of papers in the other
sessions also contributed significantly to this effort, a key distinction between these papers and those in the other
sessions is that they focus on what can be learned if we only have network data. At least four important themes
guiding the development of new models and metrics can be identified.
Exploratory data analysis (EDA) for networks. The first is the development of techniques that fall under the
broad class of methods for exploratory data analysis (EDA) for networks. Such methods include descriptive
measures and analyses that assist in summarizing and visualizing properties of networks and in investigating the
dependence of such measures on other network characteristics. Under this general heading, Seary and Richards
provided a comprehensive review of what can be learned from spectral analyses of matrices related to the
adjacency matrix of a network. They also illustrated the application of these analyses to empirical networks
using the computer program NEGOPY (a key concept in this program is that of liaisons). Faloutsos summarized
a number of relationships between node measures (such as degree of connectivity and the "hop exponent") and
their frequency in power-law terms and presented a fast algorithm for computing the approximate neighborhood
of each node. Wasserman and Steinley presented the first stages of a study designed to explore the "sensitivity"
of network measures by assessing the variation in some important network statistics (e.g., degree centralization,
"betweenness" centralization, and proportion of transitive triads) as a function of specified random graph
distributions.
Model development and estimation. The second guiding theme is the value of developing plausible models for
social networks whose parameters can be estimated from network data. Mark Handcock outlined the general class
of exponential random graph models and presented a compelling analysis of difficulties associated with estimating
certain models within the class. He showed how model degeneracy (the tendency for probability mass to be
concentrated on just a few graphs) interferes with attempts to apply standard simulation-based estimation
approaches, and he described an important alternative model parameterization in which such problems can be
handled.
Impact of network change on network properties. A third theme underlying the work presented in this session is
the importance of understanding how different measures of network structure change following "node removal" or
"node failure." For example, Borgatti considered two versions of the "key player" problem: Given a network, find
a set of k nodes that, if removed, maximally disrupts communication among remaining nodes or is maximally
connected to all other nodes. He proposed distance-based measures of fragmentation and reach as relevant to these
two versions of the key-player problem and presented an algorithm for optimizing the measures as well as several
applications.
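The disruption version of the key-player problem can be made concrete with a short sketch. The fragmentation score used here (one minus the mean reciprocal distance among remaining nodes, with unreachable pairs counted as zero) is one plausible distance-based measure and is an assumption, as is the greedy one-node-at-a-time search, which is a simple heuristic swapped in for illustration and not necessarily the optimization algorithm Borgatti presented.

```python
from collections import deque


def distances_from(adj, source, removed):
    """Breadth-first search distances from source, skipping removed nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in removed and nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def fragmentation(adj, removed=frozenset()):
    """1 minus the mean reciprocal distance over ordered pairs of remaining
    nodes; unreachable pairs count as 0, so 1.0 means fully disconnected."""
    nodes = [v for v in adj if v not in removed]
    n = len(nodes)
    if n < 2:
        return 1.0
    total = 0.0
    for v in nodes:
        dist = distances_from(adj, v, removed)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return 1.0 - total / (n * (n - 1))


def greedy_key_players(adj, k):
    """Pick k nodes one at a time, each time removing the node whose
    deletion raises fragmentation the most (a heuristic, not an optimum)."""
    chosen = set()
    for _ in range(k):
        candidates = [v for v in adj if v not in chosen]
        best = max(candidates, key=lambda v: fragmentation(adj, chosen | {v}))
        chosen.add(best)
    return chosen


# A "bow-tie": node 0 links two otherwise separate pairs, (1, 2) and (3, 4).
bow_tie = {
    0: {1, 2, 3, 4},
    1: {0, 2},
    2: {0, 1},
    3: {0, 4},
    4: {0, 3},
}
print(fragmentation(bow_tie))
print(greedy_key_players(bow_tie, 1))
```

Greedy selection can miss the best set when k is larger than 1, because two nodes that matter little individually may be highly disruptive together; that is one reason the problem calls for a dedicated optimization procedure.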
Bienenstock and Bonacich contrasted the notions of efficiency and vulnerability in networks, and of random
and strategic attack, and examined the efficiency and resilience of four network forms (random, scale-free,
lattice, and bipartite) under both forms of attack. A distance-based efficiency measure similar to Borgatti's
fragmentation measure was proposed, and vulnerability was measured as the average decrease in efficiency of
the network after a sequence of successive attacks. Bienenstock and Bonacich found that, of the class of networks
assessed, scale-free networks were most susceptible to attack, and lattice and bipartite models with a small
proportion of random ties offered the best balance of efficiency and resilience.
Finally, Faloutsos examined three types of network node failure: random, in order of degree, and in order of
approximate neighborhood size. He argued that the Internet at the router level was robust in the case of random
node failure but sensitive to the other two forms.
In the roundtable discussion, it was noted that the research topic of network vulnerability appears to be an
emerging area in which there are many useful, and usefully interrelated, results, with reference in particular to the
papers by Borgatti and by Bienenstock and Bonacich in this session, as well as to those by Carley and by Stanley
and Havlin in the previous session. Fast algorithms such as those developed by Faloutsos and colleagues are
necessary to extend these investigations to very large contexts such as the Internet.
Processes or flows on networks. A fourth theme is the importance of distinguishing the structure of a network
from the different types of dynamic processes or flows that the network might support. Borgatti described a
framework for distinguishing different interpersonal processes (e.g., disease transmission, dissemination of knowledge)
that might involve network partners and considered the implications of such distinctions for analyses of
network structure.
Session IV: Networked Worlds
Presenters and Papers
Discussant: David Lazer
1. Alden S. Klovdahl, Social Networks in Contemporary Societies
2. David Jensen, Data Mining in Social Networks (coauthor Jennifer Neville)
3. Peter D. Hoff, Random Effects Models for Network Data
4. Carter T. Butts, Predictability of Large-Scale Spatially Embedded Networks
5. Noshir S. Contractor, Using Multi-Theoretical Multi-Level (MTML) Models to Study Adversarial Networks
(coauthor Peter R. Monge)
6. Michael D. Ward, Identifying International Networks: Latent Spaces and Imputation (coauthors Peter D.
Hoff and Corey Lowell Lofdahl)
Themes
The papers in this session focus on the modeling of large-scale social networks. It is important to stress that
tools and data sets for addressing large-scale networks are still in their infancy. An overarching concern is the
extent to which standard social network metrics provide information in large-scale networks. A related need is
publicly available, large-scale network data sets that can serve as examples for systematic comparative methodological
analysis. Four themes emerged from the papers and the roundtable discussion in this session.
Understanding the structure of large-scale networks. Klovdahl's paper demonstrates how social structure
can be exploited to obtain a sample of a large-scale network. In random-walk sampling, a small set of persons is
randomly sampled from a large population and interviewed; from the contacts provided by each interviewee, one
is randomly selected for an interview, and this process is repeated until chains of (say) length 2 are constructed. In
essence, this procedure allows observation of random sets of connected nodes, which provide the basis for
statistical inferences of the structural properties of large networks.
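The sampling procedure just described can be sketched in a few lines. This is an illustration under assumptions, not Klovdahl's fieldwork protocol: "length" is taken to mean the number of links in a chain, a walk is allowed to step back to the person it came from, and the contact lists, names, and function name are invented for the example.

```python
import random


def random_walk_sample(contacts, n_seeds, chain_length, rng):
    """Draw random seeds, then follow one randomly chosen contact per
    interview until each chain has chain_length links (or dead-ends)."""
    seeds = rng.sample(sorted(contacts), n_seeds)
    chains = []
    for seed in seeds:
        chain = [seed]
        while len(chain) - 1 < chain_length:
            named = contacts[chain[-1]]
            if not named:
                break  # this interviewee named no contacts
            chain.append(rng.choice(sorted(named)))
        chains.append(chain)
    return chains


# Hypothetical contact lists, as they might be reported in interviews.
contacts = {
    "ann": {"bob", "cy"},
    "bob": {"ann", "dee"},
    "cy": {"ann"},
    "dee": {"bob", "eli"},
    "eli": {"dee"},
    "fay": set(),
}
for chain in random_walk_sample(contacts, n_seeds=3, chain_length=2,
                                rng=random.Random(4)):
    print(" -> ".join(chain))
```

Each printed chain is a small set of connected persons observed from a random starting point, which is the raw material for the statistical inferences mentioned above.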
The feature of structure addressed in the paper by Butts is geographical distance, which he discusses as a
robust correlate of interaction (e.g., most participants in the 9/11 airplane hijackings were from a particular,
relatively small region of Saudi Arabia). His paper models the predictive power of geographical distance in
large-scale, spatially embedded networks, and argues that in many realistic situations distance explains a very high
proportion of variability in tie density.
Understanding processes in large-scale networks. What are the processes that sustain large networks? Why do
people maintain, dissolve, and reconstitute communication links as well as links to information? Contractor and
Monge systematically reviewed large bodies of empirical literature and distilled many propositions concerning the
maintenance and dissolution of links. To date, much research on social networks has looked at just one of these
mechanisms at a time. Ironically, however, many of the mechanisms contradict one another. For example,
creating a tie with someone because many others do so is consistent with social contagion theory but contradicts
self-interest theories that suggest the marginal return from an additional tie would be slight. In their larger project,
Contractor and Monge develop a framework that tests multiple theories such as these at multiple levels, allowing
many theories to be brought to bear on the same data set.
Understanding data on large-scale networks. Papers by several participants present models, methods, and
illustrative analyses oriented toward the study of large-scale networks. Jensen and Neville joined social network
analysis with data mining and related techniques of machine learning and knowledge discovery in order to
investigate large networks. At the intersection of statistics, databases, artificial intelligence, and visualization,
data mining techniques have been extended to relational data. One example, useful in detecting cell phone fraud,
is that fraudulent telephone numbers are likely to be not one but two degrees away (because various phone
numbers are stolen but they tend to be used to call the same parties). Jensen and Neville's paper reports their
effort to predict an outcome (the success of a film) on the basis of a data set that interlinks features of a large
network (such as movies, studios, actors, and previous awards). Whereas recent work in machine learning and
data mining has made impressive strides toward learning highly accurate models of relational data, Jensen and
Neville suggest that cross-disciplinary efforts that make good use of social network analysis and statistics should
lead to even greater progress.
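The cell phone fraud example above can be made concrete with a short sketch. It flags numbers that share at least one called party with a known fraudulent number, that is, numbers two steps away in the call records. The data, the phone numbers, and the function name `shared_party_counts` are hypothetical, and this is not the method of any particular fraud detection system.

```python
def shared_party_counts(calls, known_fraud):
    """Count called parties shared with known fraudulent numbers.

    calls: dict mapping each phone number to the set of parties it called
    (hypothetical toy data). Returns a dict mapping each other number to
    the count of called parties it shares, summed over the known
    fraudulent numbers.
    """
    # Invert the call records: party -> set of numbers that called it.
    callers = {}
    for number, parties in calls.items():
        for party in parties:
            callers.setdefault(party, set()).add(number)

    flagged = {}
    for bad in known_fraud:
        for party in calls.get(bad, set()):
            for other in callers[party]:
                if other not in known_fraud:
                    flagged[other] = flagged.get(other, 0) + 1
    return flagged

calls = {
    "555-0100": {"p1", "p2"},  # known fraudulent number
    "555-0101": {"p1", "p2"},  # calls both of the same parties
    "555-0102": {"p3"},        # no overlap with the fraudulent number
    "555-0103": {"p2", "p3"},  # one shared party
}
flagged = shared_party_counts(calls, known_fraud={"555-0100"})
```

Numbers with higher counts share more called parties with known fraudulent numbers and would be the first candidates for review.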
Hoff presented random effects models for social networks, which provide one way to model the statistical
dependence among the network connections. The models assume that each node has a vector of latent
characteristics and that nodes relate preferentially to others with similar characteristics. Hoff employs a Markov chain
Monte Carlo (MCMC) simulation procedure to estimate the model's parameters.
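The idea that nodes with similar latent characteristics are more likely to be tied can be illustrated as follows. The sketch assumes a commonly used latent distance form, in which the log-odds of a tie equal an intercept minus the Euclidean distance between the two nodes' latent positions. The positions, the intercept value, and the function name are hypothetical, and the MCMC estimation step is not shown.

```python
import math

def tie_probability(z_i, z_j, intercept):
    """Tie probability under an assumed latent distance model:
    logit(p) = intercept - Euclidean distance between latent positions."""
    distance = math.dist(z_i, z_j)
    return 1.0 / (1.0 + math.exp(-(intercept - distance)))

# Hypothetical two-dimensional latent positions for three nodes.
positions = {"A": (0.0, 0.0), "B": (0.5, 0.0), "C": (3.0, 4.0)}

p_near = tie_probability(positions["A"], positions["B"], intercept=1.0)
p_far = tie_probability(positions["A"], positions["C"], intercept=1.0)
```

Under these assumptions the nearby pair (A, B) receives a higher tie probability than the distant pair (A, C). In an actual analysis the positions and intercept would be estimated from the observed ties rather than fixed in advance.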
Ward, Hoff, and Lofdahl report in their paper an application of Hoff's latent spaces model to data on
interactions among primary actors in Central Asian politics over an 11-year period ending in 1999, based on 1
million iterations of the MCMC estimation procedure using geographic distance as the only covariate. Countries
closer together in the dimensional space resulting from model estimation were predicted to have a higher
probability of connection. Imputation techniques were investigated and found to predict ties that were not sampled. The
paper provides a favorable initial application of the latent spaces model to a data context of interest.
A final observation from the resultant discussion concerned the similarities (which are very great) as well as
the differences (which are nonetheless consequential) among the various statistical models, notably those of
Handcock's and Hoff's papers as well as the random graph models presented by Wasserman, Pattison, and
colleagues. In brief, the similarities concern the increased focus on formulating parametric models for random
graphs within the exponential family. The differences pertain to different paths taken in the estimation of model
parameters.
Understanding the adversary versus understanding ourselves. In the roundtable discussion, Lazer began his
remarks by considering, in the post-9/11 context, the contributions that network analysis might make to decision
makers who confront security challenges and suggested that the first problem to be considered is that of
understanding the adversary. Severe information overload coupled with a great deal of missing or nonexistent data and
the need for quick, real-time decisions are factors that hamper efforts to understand an adversary's vulnerabilities.
Social network models cannot always lead an analyst to make a prediction that has perfect accuracy, but they can
certainly improve the process of making such predictions by identifying relevant linkages and related sources of
uncertainty that may be easily overlooked. For example, the paper by Ward and his colleagues provides a major
extension to the usual international relations models that paradoxically ignore relational structures. Further, Butts
was able in his paper to project the likely structure of a network based on a small amount of publicly available
information.
Lazer then turned to the question of understanding ourselves, arguing that a different set of challenges is
implicated in this second concern. Here the needs center around questions of who needs to coordinate,
communicate, and cooperate. Challenges concern critical self-evaluation (turf issues, organizational cultures, entrenched
constituencies), design challenges (including needs for security as well as coordination), and the ability to be up
and running in real-time, emergent situations. An opportunity in this area is that there are many more chances to
gather data. Therefore, in contrast to a project designed to understand the adversary's vulnerabilities, there are
perhaps more obvious openings for data-analytic approaches and for rigorous research.
RESEARCH ISSUES AND PROSPECTS
In this final section we identify near-term prospects for improving social networks research that emerged from
the workshop papers and discussion. In doing so we focus on three areas: formulation of models; data and
measurement; and research relevant to national security needs. The concerns and questions that we identify, while
voiced by various workshop participants speaking as individuals, do not represent conclusions or
recommendations of the workshop itself.
Formulation of Models
Networks "Plus": Generalized Relational Structures
The social network community has often found it useful to view social networks as "skeletal" abstractions of
a much richer social reality. An important question, though, is whether attempts to model and
quantify network properties can rely on the network observations alone or whether they would instead be enhanced
by additional information about actors and ties and their embedding in other social forms (the constellation of
which might be termed "generalized relational data structures"). In other words, to what extent do we need to
develop a more systematic (and quantitative) understanding of such generalized relational data structures, for
example through the meta-matrix approach? Such an understanding could lead to the development of models and analytic
approaches that reflect the social context in which networks reside and the interaction of network processes with
other aspects of this reality, including the background, intentions, and beliefs of the actors involved and the
cultural and geographical settings in which they find themselves. Breiger raised similar points in his opening
address.
Processes on Networks
An important reason to examine network structure is that such examination provides an understanding of the
constraints and opportunities for social and cognitive processes enacted through network ties. Yet there is a
limited understanding of the extent to which we can predict the course of social and cognitive processes from
network topology alone. Should we be engaging in empirical and methodological programs of study that enable
us to articulate more clearly the relationship between network structure and various types of social processes in
which we are interested? Research presented at this workshop demonstrates that models in which the network ties
and other network-based diffusion or contagion processes coevolve significantly expand our ability to understand,
interpret, and predict social and cognitive behavior. Further work involving empirical analysis, simulation, and
statistical modeling in this area seems to hold particular promise.
Scaling Up
An evident theme emerging from Sessions I (Social Network Theory Perspectives) and II (Dynamic Social
Networks) is the potential value of juxtaposing methods used to describe large-scale networks, primarily in
physics and computer science, with methods for evaluating models for social networks at a smaller scale, primarily
in the social sciences. A number of statistical models that have been developed for social networks of medium size
have attempted to express network structure as the outcome of regularities in interactive interpersonal processes at
a "local" level. Can we extend the focus of such statistical modeling approaches to develop theoretically
principled and testable models for social networks at a larger scale and in the process evaluate some of the claims
emerging from the more descriptive analyses? A number of network theories are based on cognitive principles and
small-group social theory. As we move to large-scale networks, does this microlevel behavior still appear, or does
the variance inherent in human action cause such microbehavior to be lost as macrolevel bases for relations
become visible?
Model Development and Evaluation
In addition to extending models to include richer sources of relational data, how can we best incorporate
potentially more complex dependence structures and longitudinal observations? Can we develop measurement
models for ties and structural models for networks that take account of measurement and sampling issues? Can we
also develop more rigorous and diagnostic approaches to model evaluation? How can simulation studies be best
utilized to contribute to the resolution of questions of model specification and evaluation for evolving networks?
Can we evaluate the fit of power laws more carefully; indeed, can we construct and evaluate models predicting the
emergence of scale-free networks? Furthermore, can we extend and possibly integrate the very different and
distinctive programs of fruitful work currently being done (and presented at the workshop in papers by Carley,
Faloutsos et al., Friedkin, Macy et al., Morris, Snijders, and Stanley and Havlin) in order to build models for the
coevolution of network ties, actor orientations, and actor affiliations?
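As one possible starting point for the question about evaluating power laws more carefully, the sketch below computes the standard maximum likelihood estimate of the exponent of a continuous power law, together with its asymptotic standard error. It is an illustration only: the workshop papers do not prescribe this estimator, the function name and the sample values are hypothetical, and node degrees are discrete, so the continuous formula is an approximation when applied to them.

```python
import math

def power_law_exponent(values, x_min):
    """Maximum likelihood estimate of alpha for a continuous power law,
    p(x) proportional to x**(-alpha), using only values >= x_min.

    Returns (alpha, standard_error), where the standard error is the
    usual large-sample approximation (alpha - 1) / sqrt(n).
    """
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all tail values equal x_min; exponent is undefined")
    alpha = 1.0 + len(tail) / log_sum
    std_error = (alpha - 1.0) / math.sqrt(len(tail))
    return alpha, std_error

# Hypothetical values; a real analysis would use an observed degree sequence.
alpha, std_error = power_law_exponent([1, 2, 4, 8], x_min=1)
```

An estimate of the exponent is not by itself evidence that a power law fits. A careful evaluation would also compare the fitted distribution with the data and with alternative distributions.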
Modeling: Estimation and Evaluation
Significant issues that emerged from the workshop session on dynamic social networks are the
complementarity of simulation from complex models for dynamic and interactive network-based processes (in
order to understand model behavior) and the task of formulating models in such a way that model parameters can
be estimated from observed data and model fit can be carefully evaluated. The potential value of developing,
estimating, and evaluating models in conjunction with empirical data is evident, and a major research domain is
the development of models for network observations that will allow us to close the gap between what can be
hypothesized from simulation-based explorations of theoretical positions and what can be verified empirically
from well-designed network studies.
Data and Measurement
The Design of Network Studies: Sampling Issues and Data Quality
A significant set of issues surrounds the question of network sampling. Although only a few presentations in
this workshop directly addressed questions of sampling, such issues were always close to the surface. Networks
rarely have boundaries, and almost all empirical networks have been based on sampling decisions or sampling
outcomes of some form. A principled means for handling sampling issues would be very valuable and indeed is
a natural extension of model-based formulations. Several factors that might be considered here include (1) the
biases inherent in the collection of nodes and ties obtained by a given sampling procedure, (2) the tendency to
over- or undersample certain types of relations, and (3) the extent to which such errors are uniformly distributed
over the network or focused in some portion of the overall network.
Related issues concern missing data, unreliable data, and data arising from actors (ranging from school
children to corporate executives) who may strategically misreport their ties. Methods for analyzing network data
exhibiting these properties include intensive application of multiple analytic procedures to compensate for
problems of data quality (see Freeman's paper and related discussion in Session I above); addition of actors by means
of random-walk sampling (see discussion of Klovdahl's presentation in Session IV); and the possibility that
missing links can be implied by the existence of other linkages, as reviewed in the section entitled "Data Quality
and Network Sampling" in Breiger's opening address.
Important questions to address include: What methodological steps can be taken to minimize the
consequences of missing nodes and tie measurement errors? Can we deal with missing data in model construction and
develop model-based approaches to estimate missing data? Would it be prudent to develop more effective
measurement strategies for each tie of interest, as well as models for the measurement of ties? How can we go
about providing evidence for the validity of network measurement? Further, should we focus more effort on
characterizing the multifaceted nature of network ties?
Network Estimation
It is important to note that most network studies are done in a "data greedy" fashion, with the result that the
underlying network is mapped out or sampled at a fairly high level of accuracy. This is practical in some contexts,
such as situations in which archival or observational data are available and reliable; however, it is likely to be
highly impractical in many other contexts. Thus, it would be useful to specify systematically the classes or
dimensions of networks that exhibit fundamentally different behavior. Which high-level indicators can be used to
determine the location of an unobserved network of interest in this space of possibilities? In other words, what can
be done to provide a first-order estimate of the shape of the unobserved network? What kinds of questions can be
answered by having even this high-level estimate? Basic research both on characterizing the impact of networks
and on their fundamental form would be useful in this regard.
Exploratory Data Analysis
In light of the issues summarized above, it will be important to consider how we can augment descriptive
analyses of networks so as to incorporate information from other sources (e.g., node attributes, orientations and
locations, tie properties, group and organizational affiliations). More generally, can we extend these approaches to
more complex and longitudinal relational data structures, and can they be developed so as to assist in the
evaluation and development of model-based approaches? Many visualization tools provide valuable means for the
simultaneous presentation of relational and other data forms, but can their capacities be further enhanced? In
relation to the notion of resistance of network statistics, is special treatment required for certain network concepts?
For example, many network analyses have been based on the role of cut points and bridge ties, observations that
lead to statistics that may be inherently nonresistant (e.g., number of components, reachability). Should
researchers identify observations associated with lack of resistance (e.g., investigate their measurement quality)?
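The lack of resistance of such statistics can be seen in a small sketch. In the hypothetical network below, two triangles are joined by a single bridge tie. Dropping that one observation doubles the number of components, while a statistic such as density changes only slightly. The network and the function names are illustrative assumptions.

```python
def count_components(nodes, ties):
    """Count connected components of an undirected network."""
    neighbors = {node: set() for node in nodes}
    for a, b in ties:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1  # found a node not reached from any earlier start
        seen.add(start)
        stack = [start]
        while stack:
            current = stack.pop()
            for nxt in neighbors[current] - seen:
                seen.add(nxt)
                stack.append(nxt)
    return components

def density(nodes, ties):
    """Share of possible undirected ties that are present."""
    n = len(nodes)
    return 2 * len(ties) / (n * (n - 1))

# Hypothetical network: two triangles joined by the bridge tie ("c", "d").
nodes = ["a", "b", "c", "d", "e", "f"]
ties = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
        ("d", "e"), ("e", "f"), ("d", "f")]
without_bridge = [tie for tie in ties if tie != ("c", "d")]

components_before = count_components(nodes, ties)
components_after = count_components(nodes, without_bridge)
density_before = density(nodes, ties)
density_after = density(nodes, without_bridge)
```

If the bridge tie were a measurement error, or a true tie that went unrecorded, conclusions about reachability would change entirely, which is why the measurement quality of such observations deserves particular scrutiny.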
Research Relevant to National Security
Impact of Network Change on Network Properties
What is learned by integrating an understanding of network "interventions" with a model-based approach?
One of the core features of social networks is arguably their potential to self-organize, which is especially likely in
response to an intervention. Research presented at this workshop illustrates the potential for network models, both
simulation and mathematical, to be used to foreshadow the probable network response to various types of
interventions such as the removal of a node that is high in centrality or cognitive load, a "key player." This work suggests
that, to be effective, strategies for altering networks need to be tailored to the processes by which the networks
change, recruit new members, and diffuse goods, messages, or services.
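A deliberately simple sketch of a "key player" intervention follows. It removes the node with the most ties from a hypothetical network and compares the size of the largest connected component before and after. Degree is used here only as a stand-in for centrality, and the sketch is static: it ignores the self-organizing response that the research described above identifies as essential, so it shows a starting point rather than a prediction.

```python
def largest_component_size(neighbors):
    """Size of the largest connected component of an undirected network,
    given as a dict mapping each node to the set of its neighbors."""
    seen = set()
    best = 0
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            current = stack.pop()
            size += 1
            for nxt in neighbors[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best

def remove_node(neighbors, node):
    """Return a copy of the network without the given node or its ties."""
    return {n: nbrs - {node} for n, nbrs in neighbors.items() if n != node}

# Hypothetical network with one highly connected actor.
neighbors = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
    "d": {"hub"},
}

# Degree is an assumed proxy for centrality; name order breaks any tie.
key_player = max(sorted(neighbors), key=lambda n: len(neighbors[n]))
size_before = largest_component_size(neighbors)
size_after = largest_component_size(remove_node(neighbors, key_player))
```

A dynamic model would go on to let the remaining actors form new ties after the removal, which may restore much of the lost connectivity.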
Stabilization and Destabilization Strategies
Basic research is needed to determine the set of factors that influence network stabilization and destabilization
strategies. Papers in each of the workshop's four sessions address these concerns. A problem analogous to the
"key player" problem that was not addressed at the workshop but is equally critical is the problem of the "key tie."
Can we develop metrics to identify key ties and the impact of their removal or addition on the overall behavior of
the network? What are the basic properties that make a group, an organization, or a community resilient, efficient,
and adaptive? Can we identify network structures or roles in networks that optimize these properties? While the
research on dynamic networks, both empirical and simulation, suggests that this is possible, there is still much
work to be done. Can we combine our analysis of the consequences of network change with a model-based
understanding of measurement error? Can we identify a program of empirical research that would evaluate
predictions about the impact of diverse types of intervention under different levels and types of error?
Closing the Gap
As Carley emphasized in her closing address, the ideas, measures, and tools being developed by network
analysts hold promise with respect to the needs of the defense and intelligence community. However, there is still
a large gap. Fundamental new science on dynamic networks under varying levels of uncertainty is needed. To fill
this gap, Carley suggested that new empirical studies, metrics, statistical models, computer
simulations, and theory building are all needed.
How can the gap between scientific research on networks and national needs be narrowed? Carley put
forward four proposals. First, universities need to produce more master's and Ph.D. students who are trained in
social network analysis and who enter government work. Second, illustrative data sets that are suitable for
dissemination within the community of networks researchers and that suggest the types of problems faced by the
defense and intelligence community need to be made publicly available. Third, effort needs to be made to
establish a dialogue with social networks researchers in which the needs of the defense and intelligence
community can be articulated without compromising national security. Finally, academicians in this research area need to
continue to strive for clarity in the articulation of the practical implications of their theoretical results.
Representative terms from entire chapter:
social networks