National Academies Press: OpenBook

Proceedings of a Workshop on Statistics on Networks (CD-ROM) (2007)

Chapter: Dynamic Networks--Embedded Networked Sensing (Redux?)--Deborah Estrin, University of California at Los Angeles

Suggested Citation:"Dynamic Networks--Embedded Networked Sensing (Redux?)--Deborah Estrin, University of California at Los Angeles." National Research Council. 2007. Proceedings of a Workshop on Statistics on Networks (CD-ROM). Washington, DC: The National Academies Press. doi: 10.17226/12083.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

Dynamic Networks

Embedded Networked Sensing (Redux?)
Deborah Estrin, University of California at Los Angeles

FIGURE 1

DR. ESTRIN: When I looked at the audience and I didn’t recognize most of you, I added a question mark in my title, because maybe this isn’t something that you have heard discussed, or perhaps not by me.

FIGURE 2

Let me start off with some caveats. The first of these is that I don’t know if embedded networked sensing or sensor networks are that interesting as networks. I do know that they are very interesting systems and data sources. Throughout this talk, recognize that I am an engineer, so most of what I focus on is the design of new and, hopefully, useful functionality. Unlike the other examples here (power networks, communication networks, social networks, even biological networks and neural networks), these networks are just now beginning to be prototyped. There are things that can retrospectively be called sensor networks that have been around for a long time, like our seismic grids. Sensor networks of the type I am talking about are relatively new. We have a National Science Foundation Science and Technology Center. It started three years ago, and its purpose was to develop this technology. There are lots of active programs all over the country and the world of people developing this technology.

As I said here, statements that begin “sensor networks are” or “most sensor networks” should always be called somewhat into question, because there aren’t very many sensor networks, which makes it difficult to talk about statistics on or of these things. One thing is that there clearly want to be statistics in these networks, as I hope will become clearer, in the sense that there are statistics in image processing, and there are statistics in a lot of the visualization and data analysis that we do. This is a little bit of a mind shift from some of what I heard this morning, and I hope it will be of some use. If not, Rob knew all this when he invited me. So, you can talk to him about it.

FIGURE 3

Why embedded sensing? Remote sensing has been around for a long time, and generates lots of very interesting data that continues to be a real source of information for scientists, and a source of interesting algorithmic challenges. In situ sensing is a complement to that because, as everyone knows, a pixel in a remote sensing image represents an average over a very large area.

FIGURE 4

There are many phenomena, particularly biological phenomena, for which the average over a large area simply isn’t what you are interested in. What you are interested in is the particulars, the variations within that larger region. When that is the case you actually want to be able to embed sensing up close to the phenomenon, with the hope that it will reveal things that we were previously unable to observe. For us that means embedded networked sensing, or sensor networks. I use that term, embedded networked sensing, so you already get a little bit of a clue that I don’t think it is so much about the network.

FIGURE 5

It is important that it is networked, if that is actually a verb or an adjective or whatever. It is important that you have collections of these things that are networked. That is an important part of the system. What is really important about it is that this is a sensing system and a networked system, more so than the details of the communications network.

In any case, where this technology has been most relevant is where you actually have a lot of spatial variability and heterogeneity. If you don’t have variability, you don’t need to have embedded sensing because you can take a measurement at a single point, or take an average, and you learn a fair amount from that. If you don’t have heterogeneity, you can develop fairly good models that will allow you to estimate a value at a point where you are not actually measuring. So, embedded sensing is important where you can’t do a good job of that estimation until you understand the system better.

In many contexts where we are building this apparatus, this instrumentation, for people now, it is for contexts in which they are trying to develop their science. It might not be that they end up deploying long-lived permanent systems in places; rather, they are trying to develop a model for the physical phenomenon that they are studying. They need an instrument that gives them this level of spatial resolution and then, from that, they will develop appropriate models and won’t necessarily need to monitor continuously in all places. In many places, early on, these systems are of interest to scientists to develop better models and, as the technology matures, we expect it to be increasingly used for the more engineering side of that problem.

Initially, you have scientists who need to study what is going on with the runoff that is increasing the nitrate levels in urban streams, which is leading to larger amounts of algal formation and things like that. They need to understand that whole dynamic, because it is actually a fairly complex problem, and it is the same kind of story in the soils. Right now we are building instruments for scientists to be able to understand and model those processes. For the longer term, the regulatory agencies will have models and know what levels of runoff you are allowed to have from agriculture and from urban areas, and they will want to be able to put up systems that monitor for those threshold levels. When we talk about sensor networks, we are talking about both the design of those instruments initially to develop very detailed data sets, and in the longer term the ability to put out systems that last for long periods of time.

Embedded networked sensing is the embedding of large numbers of distributed devices, and what we mean by large has changed over time. These get deployed in a spatially dense manner and in a temporally dense manner, meaning you are able to take measurements continuously, although there are many interesting systems where you might do this over a short period of time, go out and do a survey to develop a model, and you don’t necessarily need the system to be there and live for a very long period of time. Very important to these systems, as I will be describing, is that the devices are networked, and you are not simply putting out a lot of wireless sensors and streaming all the data back to a single location.

FIGURE 6

The fact is that there is a tremendous amount of potential data, and there are many modalities that you end up deploying, which makes these systems fairly interesting even when we are not talking about tens of thousands, or necessarily even thousands, of individual devices. Most of the examples I will be describing are from things we are doing now with scientists, although the expectation is that, over time, the engineering enterprise, and even health-related applications, will end up dominating.

FIGURE 7

The systems that we are building are ones that we want to be programmable, autonomous (although I will talk about interactive systems as well), distributed observatories that, for now, address largely compelling science and environmental engineering issues. Where we are right now, sort of three, five, or 10 years into this, depending on when you think it all began, is that we have really first generation technology, basic hardware and software, that lets you go out and deploy a demonstration sensor network.

FIGURE 8

In particular, we have mature first generation examples of basic routing, of reliable transport, time synchronization, and energy harvesting. Many of these nodes are operating on batteries without any form of continuous recharge of that battery although, as I will be describing, in all the systems we deploy, we have some nodes that are larger and more capable. Energy harvesting largely refers to solar. We have basic forms of in-network processing, tasking and filtering. To some extent the systems are programmable in the sense that you can pose different queries to them, and have them trigger based on different thresholds and temporal patterns. We have basic tools for putting these systems together and doing development, simulation and testing.

The things that aren’t in bold are things that are still not very mature, things like localization. What we mean by that is, when you are doing your data collection on a node, you actually want to time stamp each of those data points. You also want to know where it is collected in three-space. One of the things I would say in the early days is that, obviously, we are not going to be able to deploy these systems without some form of automated localization of the nodes, meaning a node has to know where it is in three-space so that it can appropriately stamp its data. Fortunately, I was wrong in identifying that as something that would be a non-starter if we didn’t achieve it, because doing automated localization is as hard as all the other problems. You have to solve all the sensor network problems in order to do automated localization.

Although I was wrong about that, sometimes two wrongs make a right, because I was also wrong about the numbers and the rate at which we would be deploying huge numbers of these things. Two wrongs make a right in the sense that the numbers of these things are still in the hundreds when you are doing a deployment. You can go out with a GPS device, identify where each node is, and configure its location, so the localization problem isn’t what is stopping us from going to bigger numbers. That is not the biggest thing in our way, but it is a very interesting problem to work on. That is why I don’t have that in bold: localization, while one of the most interesting problems, still isn’t one that has an off-the-shelf solution, although we are getting nice results from MIT and other places. It is getting closer to that.

FIGURE 9

This is where we are, and here is a little slide on how we got there: lots of work over time that was pretty small and scattered at the beginning, and then started to build up steam. One of the contexts in which it did that, which seems appropriate to mention here, was an NRC report in about 2000, our Embedded Everywhere report. Then came various DARPA programs, while DARPA was still functional in this arena, and now lots of NSF programs in the area, and interesting conferences to go to and venues for people to publish, possibly too many venues for people to publish. I have said before that our price-to-earnings ratio, in terms of numbers of papers to numbers of deployments, is a little frightening at times, so that is where we are. It is a field that seems to have captured people’s imaginations. People are doing prototyping, and there is a small amount of industry activity, actually growing, in the area.

When I look back at what our early themes were, and what the themes were in terms of the problems to solve, this summarizes what I think are the primary changes. First of all, in the early themes, we talked a lot about the thousands (actually, I think I even had slides early on that said tens of thousands) of small devices, and the focus was absolutely on minimizing what each individual node had to do, on exploiting those large numbers, and on making sure that the system was completely self-configuring because, obviously, at that scale, you have to make the system self-configuring. Hopefully, all of those will come back as being key problems to solve. They just aren’t necessarily the first problems to solve and, since I am in the business of actually building and deploying these things, it doesn’t help me to solve future problems before the current ones.

FIGURE 10

Our focus has turned to systems that are much more heterogeneous, and I will talk about a few types of heterogeneity, particularly in the capability of the nodes, in having mobility in the system, and in the types of sensing that we do. Also, the systems that we are finding very interesting to design and use early on are, in many cases, not fully autonomous. They are interactive systems where you are providing the scientists with a surround-sound, three-dimensional or, when you think of all the different sensor modalities, N-dimensional view. It is the ability of the scientists to actually be in the field and combine their human observations, but also their ability to collect physical samples or go around with more detailed analytical instruments, that is providing a very rich problem domain for us. I mention it here in particular because there are lots of interesting statistical tools that scientists want to be carrying around with them in the field. That is probably where I come back to why I put that in the title. It is because there is this shift from very large numbers of the smallest devices, in our experience, to a more heterogeneous collection of devices, and a shift from a focus on the fully autonomous to interactive systems.

FIGURE 11

Let me say a little bit more about each of those things. First of all, what is important about this heterogeneity, what are we doing to design for it, and what, if anything, does it have to do with you if you are somebody in statistics interested in this technology? All of our deployed systems, and all the systems I know of, contain several classes of nodes. At the smallest level you have little microcontroller-based devices that operate off coin cell batteries, and have on them a little bit of processing, memory, a low-bandwidth wireless radio and an interface to sensors, things like basic microclimate sensors and, for soil, a whole array of relatively low sampling rate chemical and physical sensors. That is one class of device we put out there but, with every such collection of motes, we always put out what we refer to as a microserver, or a master node, and usually several of those. These devices are very similar to what you have in your PDA or old laptops. It is an embedded PC: it is a 32-bit device, runs Linux, has WiFi, and has relatively high-bandwidth communications on it. As a result of all of that, currently at least, it needs to be connected to some sort of energy supply, such as a solar panel, which is the case for us most of the time. You can think of that as just being a gateway. In many of the systems that do exist, you tend to have those 32-bit devices acting as gateways.

FIGURE 12

It turns out, increasingly, as we are doing more interesting things on the motes themselves, that those microservers serve a more interesting role than that. Let me give you an example. In the old days we used to talk about doing a lot of in-network processing and aggregation among this large collection of motes, on the data that they were collecting as they were hopping it back to the edge of the network. Well, there are a few things. First of all, not surprisingly, you can go to P.R. Kumar’s paper and other work on the scaling limitations of wireless networks. It turns out that we don’t build really deep multi-hop wireless networks. We build networks in which the data hops maybe three or four of these low-bandwidth wireless hops before it gets to an Internet-connected point.

Secondly, it turns out that putting in any communications and having these devices listen to one another interferes with them doing aggressive duty cycling, because you want these small battery-operated devices to be able to go to sleep as much as they can and have pretty deterministic schedules. If you add onto that job, you know, route things for your neighbor, or aggregate your neighbor’s data coming from any direction, and know where your neighbors are, and choose a leader and then be the aggregation point, when you add on all of that stuff, it ends up interfering with their ability to do the simple job: collect the data, do some local processing in terms of temporal compression, look for patterns or something like that, and send that data upstream when it sees something, or if it matches a query.

We found over time that it became easier for us to design systems where what the little nodes are doing is very simple. The node is programmable. You can adapt the thresholds and adapt the little local analysis that it is doing, but it is not a node that is trying to have a view of what is going on across the network. Rather, the microserver that you have embedded out there is the natural point that is seeing the data coming from the different distributed points, and it is a natural place where you can adjust the ambient levels, where you can adjust the thresholds, and where you can effect what is more of a collective behavior. The microservers in the network form what is more of a peer-to-peer, any-to-any network, because they have the resources to do that. The clouds of motes, by contrast, tend to be doing very simple tree forwarding and localized processing, just based on an individual node, and passing that data forward, with a rich interface in terms of what a microserver can tell a mote to do.
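The division of labor described above can be sketched in a few lines. This is a minimal illustration only, not the deployed software: the class names `Mote` and `Microserver`, the `margin` parameter, and the use of a median as the ambient level are all assumptions made for the example. The mote applies one simple, remotely adjustable threshold rule; the microserver sees readings from many points and retunes that rule.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class Mote:
    """Hypothetical small node: samples, filters locally, forwards only events."""
    mote_id: str
    threshold: float  # set remotely by the microserver

    def process(self, timestamp, value):
        """Return a report only when the reading exceeds the current threshold."""
        if value > self.threshold:
            return {"mote": self.mote_id, "t": timestamp, "value": value}
        return None  # nothing to send, so the radio can stay asleep


@dataclass
class Microserver:
    """Hypothetical larger node: sees all motes and retunes their thresholds."""
    motes: list
    margin: float = 5.0  # assumed offset above the ambient level
    reports: list = field(default_factory=list)

    def retune(self, ambient_samples):
        """Set every mote's threshold relative to the median ambient level."""
        ambient = median(ambient_samples)
        for mote in self.motes:
            mote.threshold = ambient + self.margin
        return ambient

    def collect(self, mote, timestamp, value):
        """Run the mote's local rule and keep whatever it forwards upstream."""
        report = mote.process(timestamp, value)
        if report is not None:
            self.reports.append(report)
        return report


motes = [Mote("a", threshold=25.0), Mote("b", threshold=25.0)]
server = Microserver(motes)
server.collect(motes[0], 0.0, 24.0)   # below the initial threshold: not forwarded
server.retune([18.0, 19.0, 20.0])     # ambient has dropped, so thresholds drop too
server.collect(motes[0], 1.0, 24.5)   # now above the retuned threshold: forwarded
```

The point of the design is that the mote never needs a view of its neighbors; all cross-node knowledge lives in the microserver, which only pushes down a new parameter.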
That is interesting because we are starting to put more interesting sensors on our motes themselves, such as image sensors or acoustic sensors, and I will describe one in a little bit. With the image and acoustic sensors, we are not actually wanting to get an image feed or an acoustic feed. We are actually programming those nodes to look for particular image or acoustic patterns. The patterns that we want them to look for, such as a change in color histogram or a significant change in geometry or size, are things that are very sensitive to ambient conditions. It is hard to have a simple algorithm running on this eight-bit microcontroller that works under all conditions. You need that larger device there that is seeing what the ambient levels are, and what the general conditions are, to adjust that simple local rule that the device is applying. So, that is something that we are just beginning to do, both in the acoustic and image realm, and it is an interesting design problem and an interesting context in which to design algorithms: things that are like image processing problems, but in this interesting sort of heterogeneous, distributed architecture.
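The division of labor described above, in which a mote applies a simple local rule and a micro server retunes that rule from the ambient levels it observes, can be sketched roughly as follows. This is only an illustrative sketch: the class names `Mote` and `MicroServer`, the sensitivity parameter `k`, and the "mean plus k standard deviations" rule are assumptions made for the example, not details given in the talk.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Mote:
    """Hypothetical simple node: compares each reading with a threshold it was given."""
    threshold: float

    def process(self, reading: float) -> bool:
        # Report upstream only when the local rule fires.
        return reading > self.threshold


class MicroServer:
    """Hypothetical micro server: sees ambient data and retunes the motes' thresholds."""

    def __init__(self, k: float = 3.0):
        # Assumed rule: threshold = ambient mean + k * ambient standard deviation.
        self.k = k

    def retune(self, motes, ambient_readings):
        level = mean(ambient_readings)
        spread = pstdev(ambient_readings)
        new_threshold = level + self.k * spread
        for mote in motes:
            mote.threshold = new_threshold
        return new_threshold


motes = [Mote(threshold=10.0), Mote(threshold=10.0)]
server = MicroServer(k=2.0)
new_threshold = server.retune(motes, [20.0, 22.0, 24.0])
```

The mote never needs a view of the whole network; it only stores one number, which keeps its duty cycle simple, while the adaptation logic lives on the better-resourced device.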

FIGURE 13
I am going to go back for a moment. One of the shifts here is that when you take this heterogeneous collection of devices and you decide, both at design time and at run time, what should be done where and who should be doing the processing and the adaptation, you take the perspective that it is really about whole-system optimization, not just looking at an individual node and seeing how to optimize its particular energy usage.
FIGURE 14

The second form of heterogeneity that I want to mention that has become very important to us is related to mobility. This is part of a project that is headed up by Bill Kaiser. Something that I neglected to mention earlier is that Bill Kaiser and Greg Pottie are really the folks who I consider as being the inventors of wireless internet working technology. Back around 1996 they had a first DARPA project. I wasn’t at UCLA or in any way involved, but they had a first DARPA project where they had this effort of combining computation, sensing and communication, and having the devices coordinate to do in-network processing, to make the system long lived. These same folks, with a collection of other collaborators, had another excellent insight a couple of years ago, which is that we always wanted to be able to include mobility in these systems. Static sensors are static. You are stuck on this manifold on which you place the sensors. Those are the only points at which you are able to do your measurements. In some sense, you are always under-sampling. You are just destined to always under-sample, particularly in this mode in which we are trying to create models of phenomena that the scientists don’t yet understand, so they don’t have a characterization to be able to say whether they are adequately sampling the system. Being able to move things around has always been very attractive but, just like the localization problem, robotics is every bit as hard as sensor networking. Looking to regular robotics, where you navigate around on a surface, to solve our problem in the near term wasn’t a very effective approach. Theoretically it was a good idea, but practically, it wasn’t. The insight that Greg Pottie and Bill Kaiser had was that the way you move around more easily and the way you navigate more easily is by using infrastructure, be it roads or rails or what have you.
They put the robots up on aerial cables, and the robotic devices are still robots and they are autonomously moving around, but they are elevated above these complex environments in which we tend to deploy things, and they can also lower and elevate sensing and sampling devices. If you put up multiple cables, you start to get the ability to actually sense and sample in a 3-D volume, and you are not just stuck on the manifold on which you have placed your sensors. We actually use mobility and articulation at many levels. At this relatively coarse-grained level, we use it in aquatic environments with a robotic boat, a little robo-duck thing that we have that goes around and does automated sample collection. In fact, there are things that you still can’t sense in situ, and it will be a long time before you can actually do in situ biological sensing. What you have to do is do triggered sample collection where you are able to carefully identify the exact location, time and physical conditions under which a sample was collected, but then you

have to go and do the analysis back in the lab to identify what organisms are actually there. We use articulation in many contexts; this one, mobility, is key. What is interesting about this particular capability is that it might sound like a gimmick. It has that quality to it; the news stations like to pick it up as the Tarzan robot or something like that. It has that appeal, but it turns out to be far more than the gimmick I had expected it to be. What has happened with our scientists is that this has allowed them not to have to leave behind their higher-end imaging, spectroscopy and other instrumentation that otherwise you are not going to attach to a mote any time soon. You also don’t want to stick it in one place. The ability to combine these higher-end imaging and data collection tools with a statically deployed, simpler sensor array has turned out to be of great interest to them, and this is being used by folks who are studying, to return to the example I used before, nitrate runoff into streams. This is being used not just above ground, but actually to study what is going on inside the stream as well. It is suspended over the stream, and this hydrolab sensor does a traversal along the base of the stream, and at various elevations. So, we have several fielded systems, and the difference between the top and the bottom is that we have pictures of the real ones at the top, and the bottom is depictions of bigger things that we want to do, but they are PowerPoint because they are not real yet.
FIGURE 15
This is an example. This is Medea Creek out in Thousand Oaks, not far from Los Angeles. You can’t see it very well, but there is a NIMS deployment that is traversing the stream, and that cylindrical object is a hydrolab, which contains a number of different sensors.

We started to do some very interesting things such as trying to deal with the problem of calibration. The Achilles heel of sensor networks is and will be calibration, and we are not going to be able to solve it with gross over-deployment, as we thought initially, because it is very hard to over-deploy in this context, both due to cost and practicality. As you get into these applications, all of these sensors have a tremendous amount of drift to them. You are largely taking sensors that were used for analytical instruments, where a human being would go out and do measurements, and would therefore be regularly servicing the sensor. We now have a regime in place, because this is being actively used. This isn’t just a bunch of computer scientists and electrical engineers and mechanical engineers who are trying to do this. We do this with a public health faculty member and a graduate student who is trying to get a thesis out of this and actually needs usable data. One of the things that we have come up with is a calibration procedure, whereby this robotic device comes up and dips itself into some fresh water to clean itself off, dips itself into a solution of known nitrate level to do a calibration, dips itself back into the fresh water, and then continues on its run. That is a low-level mechanism, but it is hard to figure out how you are going to solve this problem without some form of automated actuation.
FIGURE 16
Moving on, these systems actually being more fully three dimensional is one important point. The other very important point is that it is through these systems that we discovered just how many interesting problems there are to address through portable and rapid deployment

systems. Initially, when we talked about these sorts of systems, we were always talking about very long lived permanent systems, and how you could put them out there, have them be fully autonomous and last the longest period of time. That is still of great interest, but it turns out that we skipped over, and have now come back to, a capability that is of tremendous interest to scientists and engineers, which is the ability to go out and do a three hour, three day or three week detailed survey and study of some location. If you are focused just on that very long lived system, you leave behind a lot of higher-end capability that is very powerful. The last form of heterogeneity that I wanted to mention, and again, I chose these things... I forgot to say something. In this robotic context, one of the reasons I chose to say this, aside from its importance in sensor networks, is because there is a lot of opportunity for doing statistics in the network here. If you have a phenomenon that has temporal variability that is faster than your ability to do just a standard raster scan, what you want to do is adaptive sampling. So, there are very interesting algorithms to be designed here, where the system itself, based on what it is observing, spends more time in places where there is more spatial variability. We do both adaptive sampling and triggered sampling, where you have some static nodes that are temporally continuous in doing their measurements, and that can then give indications to the robotic node as to where there might be interesting phenomena to come and observe further. There are all kinds of interesting resource contention problems and lots of algorithms to be designed there.
FIGURE 17
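One very simple way to express the idea of "spending more time where there is more variability" is to split a fixed time budget across locations in proportion to the variance observed at each. The sketch below is only an illustration under that assumption; the function name `allocate_dwell_time`, the `floor` parameter (a share of time divided equally so that no site is ignored), and the site labels are hypothetical and are not taken from the deployed system.

```python
from statistics import pvariance


def allocate_dwell_time(readings_by_site, total_time, floor=0.1):
    """Split total_time across sites in proportion to observed variance.

    floor is the fraction of total_time shared equally among all sites,
    so that a site that currently looks quiet is still revisited.
    """
    sites = list(readings_by_site)
    variances = {s: pvariance(readings_by_site[s]) for s in sites}
    total_var = sum(variances.values())
    base = floor * total_time / len(sites)
    adaptive_budget = (1 - floor) * total_time
    plan = {}
    for s in sites:
        # Fall back to an equal split if nothing varies anywhere.
        share = variances[s] / total_var if total_var > 0 else 1 / len(sites)
        plan[s] = base + adaptive_budget * share
    return plan


readings = {"a": [1.0, 1.0, 1.0], "b": [0.0, 2.0, 4.0]}
plan = allocate_dwell_time(readings, total_time=100.0, floor=0.1)
```

In this toy case site "a" shows no variation and receives only its equal share of the floor, while site "b" receives the rest of the budget. A real system would also have to account for travel time along the cable and for the triggered requests coming from static nodes.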

The third and last example of heterogeneity I bring up both because I think it is cool and interesting, and is going to expand the utility of these networks, and because I think there are interesting statistical techniques and algorithms to be designed here, and that is when we begin to make use of imagers as sensors in these networks. Occasionally, you might pull back a full frame to look at an image, but the main point here is to enable us to make observations that can currently be made only in the optical domain. Many of my customers or my colleagues on this are biologists and, in general, all I can give them off the shelf are physical and chemical sensors. In some sense, these imagers can act as biological sensors, because you can observe phenological events, blooming events, major changes in color histograms, and changes in the size and shape of objects.
FIGURE 18
Yet web cams and phone cameras, which are all over the place, are not embeddable, because they largely collect rather high data rate images and continuously send them back. What you need is to take that little camera off your cell phone, put it on a small low-powered device, and program into it simple computations that allow it to quickly, or not so quickly, locally analyze those images. We are not talking about taking images continuously; we are talking about many times a day, because these are phenomena that are not that rapid. The same sort of thing applies in the acoustic realm, and this is actually one of those technologies, both on the image and the acoustic side, where we start to see, or can at least think about, fun applications in the consumer realm, even if I am not sure how serious they are. You can imagine leaving behind, in places or

with people that you care about, something that is going to grab an optical or an acoustic snapshot, not necessarily privacy invasive, because the bandwidth of that radio is small, since the battery is small and you have to be low energy. But you can leave behind, near your kid or your pet or your parents, depending on where you are in the life cycle, something that gives you some indication, or leave it at your favorite restaurant or the cafe where you happen to work or, in my case, in my office, where there is construction outside, and I want to know how noisy it is in my office.
FIGURE 19
This is a frivolous thing, but the ability to velcro onto the wall little acoustic and image activity level indicators starts to give us some interesting things with which to play. It certainly gives us something that we want to be able to program with very tightly crafted analyses. If you put these on a graph where you have high position cameras, or clusters of web cams, we are talking about down here, where we have really small image sizes and really low algorithmic complexity, but with the potential of putting out many of these things, you can actually deal with things like occlusion and multiple perspectives pretty easily. One of the things that we have to our advantage here is that we are talking about putting these things in well-defined environments where you know what you are looking for. They don’t have to solve high-end vision problems, and you can have a lot of information about all sorts of priors and other things you need for these things to be able to work well.

FIGURE 20
The last thing that I wanted to mention, which could just be called another form of heterogeneity in these systems, and which I referred to before separately as this issue of moving from completely autonomous to interactive systems, is that last sort of tier in our system, which is actually the user. In retrospect, this is completely obvious. You know, artificial intelligence went through this as well. Looking for complete autonomous intelligence isn’t what produced the useful technology that AI has produced. It was starting to look at expert systems and all sorts of interactive systems that act as decision-making aids to human beings. The same thing applies here. We have no excuse for having made the same mistakes. That being said, again we are

finding that our scientist customers find it very useful to look at these as systems that allow them to interact with their experiments, their setups, and their measurements interactively, converting, if you will, their experimental mode from batch to interactive.
FIGURE 21
Now what we are trying to provide them with is the ability to go out there in the field and have access to their various GIS models and data, to remote sensing data, and to their statistical models.
FIGURE 22
What they get back are these points of measurement

that they can start to combine into their analysis, which is what they are going to ultimately do with the data anyway, and use to identify where else they might need to deploy and what physical samples they might need to collect to actually be able to say anything with any certainty about the phenomenon they are studying. This turns out to be an interesting requirement for them. I don’t know how much research there is to do, but there is definitely system building needed to equip them with pretty nice statistical tool kits that they can take out into the field. Obviously, this in situ data, these points at which I say we are always under-sampling, is not most valuable by itself. It is most valuable in the context of other data and models that the scientists have.
FIGURE 23
In conclusion, we have been working with lots of individual scientists and small groups of scientists and then, over this past year, things have been getting a little more interesting in the context of planning for what are really continental-scale observatories and, in particular, the National Ecological Observatory Network (NEON), whose planning is well underway. CLEANER is another one, for environmental engineers, where it really is a multi-scale, multi-modal sensor network that is being planned.

FIGURE 24
Again, as you can see, we are back to cartoons; therefore you know these things aren’t real, but you have a very large community of serious scientists defining how this technology needs to play out to serve them. Obviously this has lots of relevance to other issues, and there are lots of interesting problems to be solved in here, and lots of statistics throughout it, at least from an engineer’s perspective. There are lots of places to follow up, such as conferences in the field. How important are statistics to sensor networks? My answer is that Mark Hansen is a co-PI on our CENS center renewal, and that speaks to the point, because statistics, and Mark in particular, have become quite central. There are lots of planned observatories. You can Google (“NEON, CLEANER, GEOSS”) and there is lots of work going on around the country.

FIGURE 25
QUESTIONS AND ANSWERS
DR. BANKS: Thank you very much, Deborah. We have time for a few questions while Ravi gets set up.
DR. KLEINFELD: Part way through the talk you mentioned static sensors, the problem with undersampling, and the need for mobile sensors. You talked about a lot of things you were sensing (temperature, flora, fauna), but what are the scales? At what point do you know you are no longer undersampling? What are the physical scales for temperature, and the physical scales for observing?
DR. ESTRIN: Obviously, that all depends on the question. I can give you specific answers. For the ecologist who is asking about micro climates, and how changes in forest canopy will end up changing the life structure and the micro climates that the ground plant species will end up experiencing, the relevant scales are cubic meters. Am I answering the question?
DR. KLEINFELD: Yes.
DR. ESTRIN: If you are talking about, for example, our aquatic system, with these algal blooms, one of the things they are trying to understand is how, with different thermoclines, it is the combination of light, temperature and current conditions that causes this phase transition from small numbers of these algae to one of these algal blooms in the ocean or

in streams. There, you are actually talking about what could be a smaller scale than even a cubic meter. So, it is very important to send this robotic sample collector around, which can get to sampling on even smaller spatial scales than that. For the most part, a cubic meter is about as small as we get in terms of being willing to accept a single measurement as representative of that volume. Of course we are not going to have uniformity with cubic meters. You are talking about some experimental design that is not uniform.

The Functional Organization of Mammalian Cells
Ravi Iyengar, Mount Sinai School of Medicine
FIGURE 1
DR. IYENGAR: I should tell you a little bit about my background because much of what I am going to tell you probably won’t sound like the language that we all heard this morning. I majored in chemistry and am a biochemist by training. I have spent most of my research career purifying and characterizing proteins that work on cell signaling pathways. Since many of these pathways interact with one another, the interacting pathways are called signaling networks. I come from a biology background, and I am trying to use network analysis to understand cellular functions such as those shown in Figures 1 and 2.
FIGURE 2
What I hope to convince you of is that graph theory-based statistical analysis of networks turns out to be a very powerful tool to understand how things are put together within a cell. We are still at the beginning stages of developing such an understanding. Our long-term goal is the

functional organization of mammalian cells. The system we are focused on is the neuron, and we want to understand the function of biochemical networks that control how the behavior of neurons changes in an activity-dependent manner. Function is manifested in a very tightly regulated manner. There is continuous regulation of the intracellular processes, and this regulation, in turn, allows for many different events to occur in a coordinated way. I have been studying signaling systems for 25 years. Over this time, the number of known signaling molecules has greatly increased. Proteins of the extracellular matrix which, when I was in graduate school, were thought to be proteinaceous glue that held cells together, have now been shown to be signaling molecules that tell cells about their immediate environment on an ongoing basis. Almost no component in living organisms is really inert. They exist because they communicate constantly with other components, and this constant communication allows the components to work together. This constant communication is an important feature of the living cell. Underlying all of these analyses is the notion that all life processes arise from interacting components. These components are mostly proteins. But cell regulatory networks are not made up of just proteins, because within all our cells, as Dr. Kopell discussed, ions are important constituents. Along with proteins and ions, nucleotides, sugars and lipids all come together to form a multifunctional network. So we can approach the problem with the hypothesis that phenotypic behavior arises from interactions between cellular components. Here my focus is on single neurons. Although networks between neurons are very important for information processing, quite a bit of information processing actually goes on within single neurons.
One of the nicest examples of physiological consequences of information processing in single neurons comes from the work of Wendy Suzuki and Emery Brown and others, who published a paper in Science showing that, in live monkeys, during the learning process, there are changes in spike frequency of individual neurons that correlate with learning.
FIGURE 3

Signaling pathways have been studied for 50 years. This particular signaling pathway, the cAMP pathway, which I have studied for a very long time, is a well-characterized one. The most famous example of this pathway is the adrenaline “fight or flight” response. When adrenaline binds to its receptor, it initiates a series of steps. The steps eventually lead to a change in activity of the enzyme that mobilizes glucose. The glucose released into the blood provides energy to run or to fight. The cAMP pathway is shown in Figure 3. As you can see, the cartoon in Figure 3 encompasses two sorts of information. One is the underlying chemical reactions, which are protein-protein interactions and ligand-protein interactions. There are also arrows (activating interactions), plungers (inhibitory interactions) and dumbbells (neutral interactions) that can be used for defining the edges that allow the nodes to come together to form a network. Such networks with direction-defined edges are called directed graphs. The overall organization of the hippocampal neuron is summarized in Figure 4. The organization indicates that receptors that receive extracellular signals regulate the levels and/or activity of upstream signaling molecules such as calcium, cyclic AMP, and small GTPases, which in turn regulate the activity of the key protein kinases. The change in activity of the protein kinases modulates the activity of the various cellular machines in a coordinated manner. This change in activity of the cellular machines results in changes in phenotypic behavior.
FIGURE 4
This minimal view of cellular organization has been developed from a large number of studies from many investigators. The NIH funding for basic research over the past 50 years has allowed for the collection of a lot of information on binary interactions in many systems.
It has been possible, in the last 10 years or so, to put this information together, and this synthesis gives the high-level schematic representation of what a hippocampal neuron might look like, as shown in Figure 4.

FIGURE 5
Such an abstracted representation can be used to depict many different mammalian cell types. Three examples are shown in Figure 5: neurons, T cells and pancreatic beta-cells. If we are studying electrical properties of neurons and how they change in an activity-dependent manner, we might focus on early biochemical modifications such as phosphorylation of channels. With signals flowing through the network one can modify a channel, regulate gene expression, move components around the cell, and so on. This minimal view can be expanded to depict many cellular components and interactions, as shown in Figure 6.
FIGURE 6
This brings one to a complex picture. Often these complex diagrams are considered to have minimal value because people question just what can be learned from them. Actually, the map in Figure 6 has just a few hundred components. It has about 250 nodes, but one can make this even bigger. People are struggling to understand how such complex systems can be tackled to make some sense of the functionally relevant organization. This question and

approach we use are shown in Figure 7.
FIGURE 7
John Doyle in his talk also asked this question: how does one make sense of the extensive parts list? The reasoning we use to answer this question is based on the assumption that you can make sense of the function of complex systems by understanding the underlying regulatory patterns that these systems possess. These patterns are likely to be dynamic. I will explain our use of the term dynamics in a minute. First, let us focus on what the word pattern means within the context of cellular networks. This is where I found the papers of Uri Alon and his colleagues on motifs as building blocks of networks relevant. We have used the approach of Milo, Alon and colleagues to define the configuration of motifs as the connectivity propagates through the cellular network to identify patterns. This approach turns out to be a very useful way of understanding the regulatory capability of the cellular network. The occurrence of motifs does not mean that regulation occurs, but that the capability for regulation exists. In an experimental sense these analyses give us a set of initial ideas and hypotheses that we can go and test in the laboratory. All of what I am going to describe is the work of one graduate student. We have 15 co-authors on the 2005 paper in Science that is cited in Figure 8. Most of the others have made valuable contributions in curating and validating the interaction data set for the construction of the network. All of the analysis has been done by one graduate student, Avi Maayan. In the sort of biology area that I come from, everybody keeps saying that the days of individual laboratories are finished and that you need this massive network biology. In reality, when one gets one smart graduate student who challenges your way of thinking and brings something new to the lab, it is possible to make real progress, as has been the case here.

FIGURE 8
Cellular networks can be represented as graphs at varying levels of detail. The various representations are shown in Figure 9.
FIGURE 9
The graphs consist of nodes that are cellular components, such as proteins and small molecules such as metabolites and ions, connected by edges that are chemical interactions between nodes. These graphs can be undirected (B), directed (C), directed with weighted nodes and conditional edges (D), and directed with spatially specified nodes and edges (E). In the analysis described here we have used directed graphs (Type C, Figure 9). They are directed in two ways. There are edges that are positive, meaning that A stimulates B, and edges that are negative, meaning that A inhibits B. Then there are neutral edges, which are important for cellular networks because they represent interactions with anchors and scaffolds. In this representation we have not actually dealt with space in an explicit way; however, these neutral

edges give the initial incorporation of spatial information into these networks. In this representation we have made a simplification, which I should state up front. We treated these three classes of edges as independent interactions. In reality, what happens in cells is that two components interact when they come together in a spatially restricted manner. That means there may be an active scaffold protein C that binds both A and B, allowing for the A → B interaction. The A to B edge exists because the A to C and B to C edges are operational. We have not incorporated such conditionality between the edges themselves, and hence type C graphs are a simplified representation of biological systems. A type E graph would be the case where one would actually incorporate spatial constraints for the same components when they are in different dynamic compartments, interacting with different sets of components. There is also the issue of weighting the links. Such weighting is needed to represent the levels of changes in activity of the components leading to differing interactions. These complexities have to be dealt with under the underlying assumption that all of biology is a continuum. It is a question of when one needs to move to weighted nodes and edges to capture the behavior of the system. For the initial analysis we have assumed that Type C graphs will suffice. We constructed a network in silico using a function-based approach. For this we have focused on binary interactions, since the validity of these interactions is clear from the biological literature. From lots of studies in biochemistry, cell biology, and physiology, people have shown that A interacts with B, and there is a consequence to that interaction. These are well-characterized interactions, and there is a vast literature on such binary interactions. From the literature we have identified that A talks to B, B talks to C, and C talks to D to build the network. The approach we have used is shown below in Figure 10.
FIGURE 10
In many cases, because groups have studied input-output relationships, it is also possible

to constrain the network from the published data. Starting from A, one can get to D, or starting from B one can get to F. Such distal relationships are not always known and there is some ambiguity about pathways within networks. What we did was to use a function-based approach to identify direct interactions and develop a network as a series of binary interactions. For this we searched the primary literature. That is a daunting task, so initially we developed a natural language processing program to pull out the papers. We then decided that we could not use the papers directly, so I recruited everybody else in my laboratory to actually read, verify and sort the literature. So we are reasonably confident about the validity of our network. In almost all cases, we searched for input-output relationships to constrain the connectivity relationships with respect to function, and we have a database with all the primary papers and the references that go to make up these interactions. What one can do by this process is to generate a large series of subgraphs, as shown in Figure 11. This is one for calcineurin, an important cellular enzyme that is a phosphatase.
FIGURE 11
In this subgraph, calcineurin is activated by calmodulin. Activation is represented by the green arrow. Calcineurin is anchored by AKAP. This means calcineurin is held in a certain location because of this protein, and that is shown by the blue dumbbell. Activated calcineurin in turn can either activate (green arrow) the transcription factor NFAT, or inhibit (red plunger) another transcription factor, CREB. You can construct a huge number of these subgraphs, put them all together, and use them to define how a CA1 neuron might work.
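As a rough illustration of the Type C representation, the calcineurin subgraph just described can be written down as a small signed, directed edge list. The code below is only a sketch: the sign labels, the function names `build_graph` and `targets`, and the choice to store the neutral anchoring interaction as an edge pointing from AKAP to calcineurin are assumptions made for the example, not the format of the authors' database.

```python
from collections import defaultdict

# Hypothetical labels for the three edge classes described in the text.
ACTIVATE, INHIBIT, NEUTRAL = "+", "-", "0"

# The four interactions named for the calcineurin subgraph (Figure 11).
edges = [
    ("calmodulin", "calcineurin", ACTIVATE),   # green arrow
    ("AKAP", "calcineurin", NEUTRAL),          # blue dumbbell (anchoring)
    ("calcineurin", "NFAT", ACTIVATE),         # green arrow
    ("calcineurin", "CREB", INHIBIT),          # red plunger
]


def build_graph(edge_list):
    """Return an adjacency map: source -> list of (target, sign)."""
    graph = defaultdict(list)
    for source, target, sign in edge_list:
        graph[source].append((target, sign))
    return graph


def targets(graph, node, sign):
    """List the nodes that `node` acts on through edges of the given sign."""
    return sorted(t for t, s in graph.get(node, []) if s == sign)


graph = build_graph(edges)
activated = targets(graph, "calcineurin", ACTIVATE)
inhibited = targets(graph, "calcineurin", INHIBIT)
```

Merging many such edge lists into one adjacency map is the sense in which subgraphs can be "put together"; as noted in the talk, this representation treats every edge as independent and does not capture conditionality between edges.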

FIGURE 12
Functionally, as shown in Figure 12, a CA1 neuron changes activity in response to stimulation, and this change can be either long or short term. These changes in activity arise from changes in the behavior of its components. This is now well documented. A list of changes in activity and the cellular components associated with these changes is shown in Figure 13.
FIGURE 13
There are groups who study changes in the activity of channels and the NMDA receptors. Others study changes in gene expression and translation, and even the way the spines are formed. Karel Svoboda from Cold Spring Harbor Laboratory has been studying how the morphology of the connections changes as the stimulus propagates. There are other groups who have characterized in detail the components that underlie these functional changes. One can put the data from these two groups of investigators together and make a functional network such as the one shown in Figure 14.

FIGURE 14

For this network we know the underlying biochemistry of all of the reactions, but we don’t use this information in the graph theory analysis. Overall we have a system of some 546 components and nearly 1,300 edges. You can parse these nodes out by function, as shown in Figures 14 and 15.

FIGURE 15

Figure 15 is a modified Pajek diagram of the CA1 neuron network, with triangles as nodes and the size of each triangle indicating the density of associated edges. The characteristics of the network are summarized below in Figure 16.
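Two of the summary characteristics reported for this network are a characteristic path length and a clustering coefficient. The sketch below shows one common way to compute both on a small toy graph. It assumes an undirected graph, the mean shortest-path definition, and the average local clustering coefficient; the talk does not say which definitions were used, and the graph `TOY` is made up.

```python
from collections import deque


def shortest_path_lengths(adj, start):
    """Breadth-first search distances from start in an undirected graph."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected pairs of distinct nodes."""
    total, pairs = 0, 0
    for node in adj:
        for other, d in shortest_path_lengths(adj, node).items():
            if other != node:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(adj):
    """Average, over nodes, of the fraction of neighbor pairs that are linked."""
    values = []
    for node, neighbors in adj.items():
        nbrs = list(neighbors)
        k = len(nbrs)
        if k < 2:
            values.append(0.0)  # assumption: nodes with < 2 neighbors count as 0
            continue
        linked = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs[j] in adj[nbrs[i]]
        )
        values.append(2.0 * linked / (k * (k - 1)))
    return sum(values) / len(values)


# Hypothetical toy graph: a triangle A-B-C with a pendant node D attached to C.
TOY = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}

path_length = characteristic_path_length(TOY)
clustering = clustering_coefficient(TOY)
print(round(path_length, 3), round(clustering, 3))
```

Comparing the clustering value against the same statistic on a randomized network is what supports a statement such as "highly clustered compared to a randomized network".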

FIGURE 16

This is largely a scale-free, small-world network. The characteristic path length is 4.22. The clustering coefficient is 0.1, indicating that the network is quite highly clustered compared to a randomized network. The clustering coefficient has turned out to be one of the most informative parameters in understanding the cellular space, because it becomes an estimate of how things are put together by anchors and scaffolds. Neurons, especially, have substantial geometry, and components are not evenly distributed within the cell. Understanding how things are in close proximity to one another becomes very useful for understanding local function.

FIGURE 17

We used Uri Alon’s Pathfinder program to characterize the various types of motifs that are present in this network. Motifs are groups of nodes that act as a unit and have the ability to process information to alter input-output relationships. In cell signaling systems, information transfer occurs through chemical reactions. Typically, information processing by motifs involves changing the input-output relationship such that there is a change in the amplitude of the output

signal. There can be a change in the duration of the output signal, or the signal can move to a different location within the cell. In many cases, signaling networks produce each of these effects to varying degrees. Many of these motifs, when organized together, give rise to signal processing. The various configurations of the motifs are given in Figure 17. I want to draw your attention to the feedforward motif, which is motif number 44. I really like this motif since it gives you two for the price of one. One, it allows for redundancy of pathways, which is very important in these systems to ensure reliability of signal flow, and two, it essentially works as a positive feedback loop that allows for persistence of the output signal, which almost always alters the interpretation of the signal for the mounting of functional responses. Another noteworthy motif is the bifan motif, which gives rise to local interconnectivity and signal integration. The minimal size of these motifs is three or four nodes. We have found that motifs with five and six nodes arise from juxtapositions of the smaller ones.

FIGURE 18

To understand how signal flows through this network, we conducted a type of Boolean dynamic analysis that we have termed pseudodynamic analysis, as outlined in Figure 18. We used the adjective pseudo to signify that although the analysis represents the dynamics of the underlying coupled chemical reactions, the values of the links are all equal and hence do not capture the different reaction rates of the different reactions. This simplifying assumption is largely valid because inside the cell, past the membrane receptors and not quite at the level of gene expression, the reaction rates are largely similar for 80-85 percent of the reactions. The term pseudodynamics is similar to the term “apparent Kd” in biochemistry and pharmacology. Although rigorous measurements for Kd determination require equilibrium dialysis, most of us

often use steady-state methods to determine Ki, from which apparent Kds can be calculated. Such a simplified approach can also be used to study propagation of signal from the receptor into the cellular network.

FIGURE 19

We looked at connectivity propagation going forward from the receptor. Each step signifies the formation of a link (edge) and represents a direct interaction. Such direct interactions may represent a single chemical reaction for noncovalent reversible binding interactions, or 2-4 chemical reactions when the interaction is enzymatic. The number of links engaged as signal propagates from the many ligands that regulate the hippocampal neuron is shown in Figure 19. This is a complex plot with many ligands that affect the hippocampal neurons. At one end is the major neurotransmitter, glutamate. Signals from glutamate rapidly branch out and by 8 steps engage most of the network. At the other end is the Fas ligand, which causes apoptosis in neurons. Fas takes nearly 12 steps to engage most of the network. By the time we reach 10 or more steps, about 1,000 links are engaged, indicating that most of the network becomes interconnected. We then counted the number of links it takes to go from ligand binding at a receptor to a component that produces a functional effect, such as a channel or a transcription factor that turns on a gene. Eight is the average number for going from receptor to effector. When we track paths from ligand-receptor interactions to channels or transcription factors, we can identify the regulatory motifs that emerge as connectivity propagates through the network. This is shown in Figure 20 for three important ligands for the hippocampal neuron: glutamate, norepinephrine, and BDNF.
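The step-by-step propagation described above, with every link given equal weight, can be sketched as a breadth-first expansion that counts how many distinct links have been engaged after each step. This is an illustration under assumptions, not the published pseudodynamic code: the function names and the six-edge toy network are hypothetical.

```python
from collections import deque


def links_engaged_by_step(edges, source, max_steps):
    """Cumulative number of distinct links engaged after each step.

    edges is an iterable of directed (src, dst) links, all of equal weight.
    """
    out = {}
    for src, dst in edges:
        out.setdefault(src, []).append(dst)
    reached = {source}
    frontier = {source}
    engaged = set()
    counts = []
    for _ in range(max_steps):
        next_frontier = set()
        for node in frontier:
            for nxt in out.get(node, []):
                engaged.add((node, nxt))  # one step engages one direct link
                if nxt not in reached:
                    reached.add(nxt)
                    next_frontier.add(nxt)
        counts.append(len(engaged))
        frontier = next_frontier
    return counts


def steps_to_target(edges, source, target):
    """Fewest links from source to target, or None if target is unreachable."""
    out = {}
    for src, dst in edges:
        out.setdefault(src, []).append(dst)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in out.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None


# Hypothetical toy network: ligand -> receptor R -> two branches -> effector E.
TOY_EDGES = [
    ("ligand", "R"), ("R", "A"), ("R", "B"),
    ("A", "C"), ("B", "C"), ("C", "E"),
]

engaged_counts = links_engaged_by_step(TOY_EDGES, "ligand", 5)
receptor_to_effector = steps_to_target(TOY_EDGES, "ligand", "E")
print(engaged_counts, receptor_to_effector)
```

Averaging `steps_to_target` over all receptor-effector pairs gives a receptor-to-effector distance of the kind quoted in the talk.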

FIGURE 20

We counted feedback loops of sizes three and four, and we then counted size three and four feedforward motifs. One thing we see is that there are a lot more feedforward motifs than feedback loops in the cells. I think this represents the molecular basis for redundancy. Feedforward loops can arise from the presence of isoforms. The higher we go in the evolutionary hierarchy, the more isoforms there are for many signaling proteins. The same protein comes in three, four, or five different forms. They often have somewhat different connections, so we think of these proteins not quite as full siblings, but more like half brothers and sisters, with some common connections and some unique connections. What we found most interesting was the non-homogeneous organization of the positive and negative motifs. As we start at the receptor, at the outside of the cell, the first few steps yield many more negative feedback and feedforward loops, which would tend to limit the progression of information transfer. As you go deeper into the system, in each of these cases, with glutamate and norepinephrine, we pick up the positive feedback loops and positive feedforward motifs. It appears that if the signal penetrates deep into the cell, it has a much greater chance of being consolidated within the cell. This consolidation may trigger many of the memory processes by changing the cell state. This was our first big breakthrough in understanding the configuration of the network, and it may satisfy my biology friends.
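Counting signed three-node feedback loops and feedforward motifs can be sketched by brute force over node triples. Two conventions here are assumptions, since the talk does not define them: a feedback loop is called positive when the product of its edge signs is positive, and a feedforward motif is called positive only when both the direct link and the two-step path are net activating. The node names are hypothetical.

```python
from itertools import permutations


def count_three_node_motifs(signed_edges):
    """Count signed 3-node feedback loops and feedforward motifs.

    signed_edges maps (src, dst) to +1 (activation) or -1 (inhibition).
    """
    nodes = sorted({n for edge in signed_edges for n in edge})
    counts = {"feedback+": 0, "feedback-": 0, "feedforward+": 0, "feedforward-": 0}
    for a, b, c in permutations(nodes, 3):
        ab = signed_edges.get((a, b))
        bc = signed_edges.get((b, c))
        if ab is None or bc is None:
            continue
        # Feedback loop a -> b -> c -> a. Requiring a to be the smallest
        # node counts each directed cycle exactly once.
        ca = signed_edges.get((c, a))
        if ca is not None and a == min(a, b, c):
            counts["feedback+" if ab * bc * ca > 0 else "feedback-"] += 1
        # Feedforward motif: path a -> b -> c plus the direct link a -> c.
        ac = signed_edges.get((a, c))
        if ac is not None:
            positive = ac > 0 and ab * bc > 0  # assumed sign convention
            counts["feedforward+" if positive else "feedforward-"] += 1
    return counts


# Hypothetical toy network: one positive feedforward motif (R, K, T)
# and one negative feedback loop (K -> T -> P -| K).
TOY_SIGNED = {
    ("R", "K"): +1, ("K", "T"): +1, ("R", "T"): +1,
    ("T", "P"): +1, ("P", "K"): -1,
}

motif_counts = count_three_node_motifs(TOY_SIGNED)
print(motif_counts)
```

The triple loop is cubic in the number of nodes, so a network with several hundred components would call for a dedicated motif-finding tool rather than this sketch.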

FIGURE 21

I will take two brief detours into standard differential-equation-based modeling to illustrate the capabilities of the feedback and feedforward regulatory motifs. Positive feedback loops work as switches. We showed this a while ago for the MAP kinase 1,2 system, initially from modeling analysis and subsequently by experiments in a model cell-culture system, NIH-3T3 fibroblasts. A comparison of a model and experiments showing the input-output relationship due to the presence of a feedback loop is shown in Figure 21. Positive feedforward motifs also give you extended output, and this is shown below in Figure 22. Although what is shown is a toy model, we can see that over a range of rates the presence of a feedforward motif affects input-output relationships. So, the presence of these regulatory motifs can have real functional consequences.

FIGURE 22

The next analysis was to simulate the formation of motifs as signals propagated from the receptor to an effector protein. Generally, in our initial analyses we started at the receptor and

allowed connectivity to propagate all over the cell. That does happen to some extent, but in most situations information flow is constrained by the input and output nodes. When we stimulate the hippocampal neurons, it results in changes in the activity of the channels or changes in the activity of transcription factors like CREB that result in altered gene expression. So, we decided to look at the system using a breadth-first algorithm to go from receptor to effector with a progressively increasing number of steps. This is shown in Figure 23.

FIGURE 23

This analysis actually yielded the most satisfying part of our observations. It had been known for a long time that glutamate by itself would not allow this neuron to change state and become potentiated for an extended period without engaging the cAMP pathway. When we think about the cyclic AMP pathway, we think of the neurotransmitter norepinephrine, which in the hippocampus facilitates the glutamate-dependent potentiation. When neurons become potentiated, they behave differently in response to stimuli. The pseudodynamic analyses showed that for norepinephrine, when the feedforward and feedback loops were counted according to whether they were positive or negative, far more positive loops were engaged with increasing numbers of steps. The preponderance of positive feedback and feedforward loops from norepinephrine to CREB provides an explanation of why this route is so critical for the formation of memory processes. CREB is often called the memory molecule since its activity is crucial for the formation of memory in animal experiments. In contrast, for glutamate by itself, the numbers of positive and negative motifs are equal, and BDNF actually turned out to be a bonus. For neuronal communication there are always two cells, the presynaptic neuron and the postsynaptic neuron. Neurons can be potentiated by the actions that go on within themselves, and they can also be potentiated by changes in the

presynaptic neuron. It has been shown that BDNF actually works in the presynaptic CA3 neuron, and in the postsynaptic CA1 neuron that we have modeled for the network analyses, the regulatory motifs for the BDNF-induced network evenly balance out. This statistical analysis gives us insight into how the configuration of motifs can affect state change in cells. If we engage more positive feedback and feedforward loops than negative loops, we can induce state change (i.e., plasticity). If the positive loops and negative loops are balanced, then although signals propagate and acute effects are observed, there is no state change. This is summarized in Figure 24.

FIGURE 24

I want to make two further points. First, when we conducted this pseudodynamic analysis, we actually sampled the system for dynamic modularity. Modularity means different things to scientists in different fields. For those of us who come from a cell biology background, the word module means either a functional module, like the components of one linear pathway, or a group of components in an organelle, such as the proteins in the cell membrane or the nucleus. In our analyses we used a functional approach going from receptor to effector protein. This analysis is shown in Figure 25.
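The rule of thumb stated earlier on this page, that engaging more positive than negative loops favors state change while a balanced set does not, can be written as a running tally over steps. This is only a bookkeeping sketch: the function names and both example ligand profiles are hypothetical and are not data from the talk.

```python
def cumulative_motif_balance(motifs, max_steps):
    """Cumulative (step, positives, negatives) counts of engaged motifs.

    motifs is a list of (step_engaged, sign) pairs with sign +1 or -1.
    """
    balance = []
    for step in range(1, max_steps + 1):
        positives = sum(1 for s, sign in motifs if s <= step and sign > 0)
        negatives = sum(1 for s, sign in motifs if s <= step and sign < 0)
        balance.append((step, positives, negatives))
    return balance


def favors_state_change(positives, negatives):
    """More positive than negative loops engaged; a balanced tally returns False."""
    return positives > negatives


# Hypothetical profiles: negative motifs appear early, positive ones later.
LIGAND_X = [(1, -1), (2, -1), (3, +1), (4, +1), (5, +1)]
LIGAND_Y = [(1, -1), (2, +1), (3, -1), (4, +1)]

x_balance = cumulative_motif_balance(LIGAND_X, 5)
y_balance = cumulative_motif_balance(LIGAND_Y, 5)
print(favors_state_change(*x_balance[-1][1:]), favors_state_change(*y_balance[-1][1:]))
```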

FIGURE 25

As we follow the number of links engaged when we go from the NMDA receptor, an initial glutamate target, to the AMPA receptor, which is also a glutamate target and the final effector in the system, or from the NMDA receptor to the transcription factor, we found that these lines were relatively linear. To validate this profile, Avi came up with a clever trick: creating shuffled networks that could be used as controls for this biological network. What he did was maintain the biological specificity of the first, uppermost connection, which is from the ligand to the receptor, and maintain the biological specificity of the last connection, that is, the links that come into the AMPA channels or CREB. So, there may be 10 components that feed into CREB, with seven or eight protein kinases that regulate it, and those links did not change. He then randomized everything in the middle. When he tracked paths in these shuffled networks, we got paths that yielded plots that fit either a power law or an exponential function. Two features of these paths are noteworthy. One is that the path is linear and, two, there are many more links engaged in the CA1 neuron network than in the shuffled networks, even though only 10-20 percent of the links are engaged. I point that out to you because if you look at the scale, the number of links here is either 100 or 200, and if you go 10 steps without output constraint, you engage about 1,000 links. So the input-output constraint and the number of steps (if we use the number of steps as a surrogate for time) allow us to constrain the number of interactions that can occur from the point of signal entry to the final effector target. This boundary defines the module, and everything else on the outside becomes “separate” because one cannot engage these links within the time period defined by the number of steps.
This type of analysis indicates that we are likely to have a series of dynamic functional modules. The properties of these functional modules are summarized in Figure 26.
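The shuffled-network control can be sketched as follows: links leaving a ligand and links entering an effector are held fixed, and the remaining links are rewired. The talk says only that everything in the middle was randomized, so the degree-preserving pairwise swap used here is an assumption, as are the function name and the abstract toy network.

```python
import random
from collections import Counter


def shuffle_middle_links(edges, ligands, effectors, n_swaps, seed=0):
    """Rewire links that neither leave a ligand nor enter an effector.

    Uses pairwise target swaps, (a, b), (c, d) -> (a, d), (c, b), which keep
    every node's in-degree and out-degree unchanged. Assumes no duplicate edges.
    """
    rng = random.Random(seed)
    fixed = [e for e in edges if e[0] in ligands or e[1] in effectors]
    middle = [e for e in edges if not (e[0] in ligands or e[1] in effectors)]
    present = set(edges)
    for _ in range(n_swaps):
        if len(middle) < 2:
            break
        i, j = rng.sample(range(len(middle)), 2)
        (a, b), (c, d) = middle[i], middle[j]
        new_first, new_second = (a, d), (c, b)
        # Reject swaps that would create a self-loop or a duplicate link.
        if a == d or c == b or new_first in present or new_second in present:
            continue
        present -= {middle[i], middle[j]}
        present |= {new_first, new_second}
        middle[i], middle[j] = new_first, new_second
    return fixed + middle


# Hypothetical toy network: ligand L, receptor R, intermediates n1..n6, effector E.
TOY_NETWORK = [
    ("L", "R"), ("R", "n1"), ("R", "n2"), ("n1", "n3"), ("n2", "n4"),
    ("n3", "n5"), ("n4", "n6"), ("n5", "E"), ("n6", "E"),
]

shuffled = shuffle_middle_links(TOY_NETWORK, {"L"}, {"E"}, n_swaps=50, seed=1)
print(shuffled)
```

Running the same receptor-to-effector path tracking on many such shuffled copies gives the control curves against which the biological network is compared.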

FIGURE 26

Second, we were interested in figuring out what these highly connected nodes do as part of the network. Avi started out with a system of mostly unconnected nodes and then asked what happens to the system if we add nodes with four, five, six links, iteratively. He then determined both the number of islands, as a measure of networking, and the motifs that are formed as the network coalesces. This approach is described below in Figure 27.

FIGURE 27

Initially, at four links per node, he had around sixty islands, and by the time he reached 21 links per node, he was able to form one large island (i.e., a fully connected network). This is shown in Figure 28.
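The island-counting experiment can be sketched by admitting nodes in order of increasing connectivity and counting connected components at each cutoff. The details are assumptions: node degree is measured in the full network, links are treated as undirected, and the five-node toy network is made up to show a case where the network coalesces before the most connected node is admitted.

```python
def islands_by_degree_cutoff(edges, cutoffs):
    """Return [(cutoff, island_count)] using only nodes with degree <= cutoff."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1

    results = []
    for cutoff in cutoffs:
        included = {node for node, deg in degree.items() if deg <= cutoff}
        parent = {node: node for node in included}

        def find(node):
            # Union-find root lookup with path halving.
            while parent[node] != node:
                parent[node] = parent[parent[node]]
                node = parent[node]
            return node

        for a, b in edges:
            if a in included and b in included:
                parent[find(a)] = find(b)
        results.append((cutoff, len({find(node) for node in included})))
    return results


# Hypothetical toy network: a chain a-b-c-d plus a hub H linked to all four.
TOY_LINKS = [
    ("a", "b"), ("b", "c"), ("c", "d"),
    ("H", "a"), ("H", "b"), ("H", "c"), ("H", "d"),
]

island_counts = islands_by_degree_cutoff(TOY_LINKS, [2, 3, 4])
print(island_counts)
```

In this toy case the nodes already form a single island at a cutoff of three links, before the hub `H` (four links) is admitted.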

FIGURE 28

The surprise from this analysis was that none of the major players, the highly connected nodes that we know are biologically important, were needed to form the network. So what might the role of these biologically important, highly connected nodes be? To answer this question, we decided to determine what types of motifs are formed as these highly connected nodes come into play. The results from this analysis are shown in Figure 29.

FIGURE 29

What we found is that the highly connected nodes disproportionately contribute to the formation of regulatory motifs. Eighty percent of the feedback loops and feedforward motifs occurred as these highly connected nodes came into play. For this network, it appears that the highly connected nodes are not needed for the structural integrity of the network; rather, they are required for the formation of the regulatory motifs that process information. Thus the pseudodynamic analysis has allowed us to move from thinking about individual

components to groups of components within these coupled chemical reaction networks. The location of these regulatory motifs within the network allows us to define areas within the network that are capable of information processing. Maps specifying the density of motifs at specific locations and their relative positions with respect to receptors and effector proteins are shown. A heat map representing the density of motifs as a function of steps from the receptor is shown in Figure 30. A detailed distribution of the various types of motifs in the interaction space between receptor and effector proteins is shown in Figure 31.

FIGURE 30

FIGURE 31

Detailed analyses of the location of the various motifs indicate that the motifs are densely clustered around the functional center of the network, as shown in Figure 31. The map in Figure 31 would suggest that there are no regulatory motifs between channels. Channels pose an interesting representation problem for interactions between each other. Often channels use membrane voltage and membrane resistance to interact with each other. But voltage and resistance are not represented as entities within this network, and hence these motifs are not

“seen” in our network. So, there are some corrections we need to make for biological networks that use electrical and physical forces as entities. In spite of these limitations, we can draw a number of conclusions from the type of analyses we have conducted. These are summarized in Figure 32.

FIGURE 32

The major features of the cellular network within hippocampal cells are: 1) highly connected nodes can consolidate information by participating in regulatory loops; and 2) early regulation is designed to limit signals and presumably filter spurious signals. As signals penetrate deep into the network, the positive loops that are formed favor signal consolidation that leads to state change. Thus a description of the statistics of networks, for somebody like me who works at a cellular level, is very useful in providing an initial picture of the regulatory capabilities of the cellular network. It allows me to design experiments to test which of these regulatory motifs are operative and to develop an overall picture that I would never get from a bottom-up approach, if I were just studying one feedback loop or two feedback loops at a time. That is my experience with the statistics of networks. Thank you very much.

[Applause.]

QUESTIONS AND ANSWERS

DR. JENSEN: I am interested in the motif-finding aspect. Usually, with these algorithms you find all frequently occurring patterns, and then at some point you apply a threshold. You say that the things above the threshold are the unexpected ones, but because of their nature it is often difficult to do good hypothesis tests and say where we should draw that threshold. So, what approach did you use for saying these are motifs that seem big and interesting?

DR. IYENGAR: Actually, I did not have a real initial statistical threshold, because I was going from a biological point of view. The protein that participated was known to have important biology.

REFERENCES

Bhalla, U.S., and R. Iyengar. 1999. “Emergent Properties of Networks of Biological Signaling Pathways.” Science 283:5400.

Bhalla, U.S., P.T. Ram, and R. Iyengar. 2002. “MAP Kinase Phosphatase as a Locus of Flexibility in a Mitogen-Activated Protein Kinase Signaling Network.” Science 297:5583.

Eungdamrong, N.J., and R. Iyengar. 2007. “Compartment-Specific Feedback Loop and Regulated Trafficking Can Result in Sustained Activation of Ras at the Golgi.” Biophysical Journal 92:808-815.

Jordan, J.D., E.M. Landau, and R. Iyengar. 2000. “Signaling Networks: The Origins of Cellular Multitasking.” Cell 103:193-200.

Ma’ayan, A., R.D. Blitzer, and R. Iyengar. 2005. “Toward Predictive Models of Mammalian Cells.” Annual Review of Biophysics and Biomolecular Structure 34:319-349.


A large number of biological, physical, and social systems contain complex networks. Knowledge about how these networks operate is critical for advancing a more general understanding of network behavior. To this end, each of these disciplines has created different kinds of statistical theory for inference on network data. To help stimulate further progress in the field of statistical inference on network data, the NRC sponsored a workshop that brought together researchers who are dealing with network data in different contexts. This book - which is available on CD only - contains the text of the 18 workshop presentations. The presentations focused on five major areas of research: network models, dynamic networks, data and measurement on networks, robustness and fragility of networks, and visualization and scalability of networks.
