WEBVTT 1 00:00:00.160 --> 00:00:01.919 Welcome back to the deep dive. Just think about it 2 00:00:01.960 --> 00:00:05.280 for a second. You're connected right now to this, well, 3 00:00:05.480 --> 00:00:11.119 fundamentally chaotic system, thousands, maybe millions of devices, all different software, 4 00:00:11.119 --> 00:00:14.279 talking over fiber, copper or even just thin air, and 5 00:00:14.279 --> 00:00:18.839 they're interacting in milliseconds. It's this huge, kind of unstable beast, 6 00:00:18.879 --> 00:00:20.120 the whole global network. 7 00:00:20.120 --> 00:00:22.480 It really is mind-bogglingly complex. 8 00:00:22.559 --> 00:00:25.359 So the big question for engineers, the one we're tackling today, 9 00:00:25.760 --> 00:00:27.960 is how on earth do you predict how it's going 10 00:00:28.000 --> 00:00:30.079 to behave? How do you test a new idea, maybe 11 00:00:30.120 --> 00:00:33.280 a new protocol, without accidentally breaking the Internet? 12 00:00:33.439 --> 00:00:35.920 Yeah, good luck testing on the real thing. Exactly. 13 00:00:36.719 --> 00:00:39.439 So our mission today, just for you listening, is to 14 00:00:39.479 --> 00:00:43.600 dive into the essential answer: network simulation. We've got some 15 00:00:43.679 --> 00:00:46.840 great sources here really focusing on mobile ad hoc networks. 16 00:00:46.880 --> 00:00:50.200 They call them MANETs, and this powerhouse simulator, NS-3. 17 00:00:50.439 --> 00:00:52.759 You really see why simulation is critical when you look 18 00:00:52.799 --> 00:00:55.799 at the sheer scale involved. Our sources, they point to 19 00:00:55.840 --> 00:00:58.119 Cisco estimates, I think it was for twenty twenty one, 20 00:00:58.399 --> 00:01:01.840 global mobile data traffic at something like forty nine exabytes 21 00:01:02.039 --> 00:01:02.920 every single month. 22 00:01:03.119 --> 00:01:06.519 Wow, forty nine exabytes. You just can't replicate that in a 23 00:01:06.560 --> 00:01:09.239 lab physically, no way. 
I mean, your best bet for 24 00:01:09.319 --> 00:01:12.319 mimicking the real world directly is emulation, actually setting up 25 00:01:12.319 --> 00:01:15.400 the hardware. But like our guide says, that's just extensive 26 00:01:15.400 --> 00:01:19.560 and expensive, it doesn't scale. Right. So simulation, it gives 27 00:01:19.599 --> 00:01:23.719 us this controlled virtual space. We can generate data, test hypotheses, 28 00:01:23.799 --> 00:01:26.599 try out new ideas, all without, you know, needing a 29 00:01:26.640 --> 00:01:29.920 warehouse full of routers or spending millions. It's about getting 30 00:01:29.959 --> 00:01:31.439 reliable data for research. 31 00:01:31.599 --> 00:01:34.000 Okay, so that's the plan. Over the next few minutes, 32 00:01:34.040 --> 00:01:36.680 we're going to unpack why simulation is so vital for 33 00:01:36.799 --> 00:01:40.400 taming this chaos. We'll look at what makes these dynamic 34 00:01:40.519 --> 00:01:45.120 wireless systems like MANETs so tricky to model. Then we'll 35 00:01:45.120 --> 00:01:47.799 get into how a tool like NS-3 actually builds 36 00:01:47.799 --> 00:01:51.239 this virtual world, and crucially, we'll talk about the statistics. 37 00:01:51.400 --> 00:01:52.879 How do you make sure the numbers you get out 38 00:01:52.879 --> 00:01:54.519 are actually, you know, trustworthy? 39 00:01:54.640 --> 00:01:58.319 That last part's absolutely key. Garbage in, garbage out, even 40 00:01:58.359 --> 00:02:00.200 in simulation. Definitely. 41 00:02:00.480 --> 00:02:02.560 So let's start right at the beginning. What do we 42 00:02:02.640 --> 00:02:05.280 even mean by simulation here? Is it different from just 43 00:02:05.280 --> 00:02:06.000 making a model? 44 00:02:06.159 --> 00:02:08.879 Yeah, good question. They're related, but different. First, we 45 00:02:08.919 --> 00:02:10.800 need to figure out what kind of system a computer 46 00:02:10.840 --> 00:02:14.599 network actually is. 
And the foundation here is realizing these 47 00:02:14.639 --> 00:02:16.199 are mostly discrete systems. 48 00:02:16.280 --> 00:02:18.759 Okay, hold on. Discrete? But we talk about things like 49 00:02:18.840 --> 00:02:24.080 bandwidth, data flow. Those sound continuous, like water in a pipe. 50 00:02:24.680 --> 00:02:26.960 But think about what the computer cares about. It's not 51 00:02:27.039 --> 00:02:30.800 the smooth flow, it's the events. A packet arrives, click, 52 00:02:30.960 --> 00:02:33.759 a link goes down, click, a buffer gets full, click. 53 00:02:34.479 --> 00:02:38.199 These state variables, they change instantly at specific points in time. 54 00:02:38.319 --> 00:02:40.919 I see, so it jumps from event to event. Exactly. 55 00:02:41.240 --> 00:02:44.919 The simulator just advances its clock to the next scheduled event. 56 00:02:45.039 --> 00:02:47.400 That's way more efficient than trying to track variables that 57 00:02:47.439 --> 00:02:51.000 are constantly, smoothly changing over time, like in, say, a 58 00:02:51.039 --> 00:02:54.400 physics simulation. Okay. And this focus on events shapes the 59 00:02:54.439 --> 00:02:56.759 kinds of simulation we use. You might hear about Monte 60 00:02:56.800 --> 00:02:59.599 Carlo simulation. That's more for things without a time progression, 61 00:02:59.680 --> 00:03:03.599 like figuring out probabilities of dice rolls. But for networks 62 00:03:03.639 --> 00:03:07.039 we often lean towards trace-driven simulation. That's more like 63 00:03:07.240 --> 00:03:11.120 replaying a recording of actual events in sequence. You feed it 64 00:03:11.159 --> 00:03:12.960 a list of things that happened in the real world 65 00:03:13.039 --> 00:03:16.719 and the simulator steps through them. Time and order matter hugely 66 00:03:16.759 --> 00:03:20.919 there. Got it, so discrete events, maybe trace driven. That 67 00:03:21.039 --> 00:03:23.919 explains the how. 
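A minimal sketch of that event-by-event replay, assuming a toy trace of (time, label) pairs. The `run_events` name and the sample events are made up for illustration and are not taken from NS-3 or the sources; the point is only that the clock jumps straight to the next event instead of ticking smoothly.

```python
import heapq


def run_events(events):
    """Replay (time, label) events in timestamp order, jumping the clock between them."""
    queue = list(events)
    heapq.heapify(queue)  # min-heap: the earliest event is always popped first
    clock = 0.0
    log = []
    while queue:
        time, label = heapq.heappop(queue)
        clock = time  # advance directly to the next event; nothing happens in between
        log.append((clock, label))
    return log


if __name__ == "__main__":
    # Hypothetical trace, deliberately listed out of order.
    trace = [(2.5, "buffer full"), (0.002, "packet arrives"), (1.0, "link goes down")]
    for when, what in run_events(trace):
        print(f"t={when}s: {what}")
```

The heap is one common way to keep pending events ordered by time; a real simulator would also let each event schedule new ones while it runs.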
But why is this the only way 68 00:03:23.960 --> 00:03:28.520 for researchers? What's the big aha moment for you, the listener? 69 00:03:28.639 --> 00:03:31.759 The aha moment is realizing simulation is often the only 70 00:03:31.919 --> 00:03:36.639 escape route from, well, analytical paralysis. These networks are so complex, 71 00:03:36.800 --> 00:03:40.240 so nonlinear, you just can't solve them with equations on paper. 72 00:03:40.759 --> 00:03:43.360 Forget trying to write a formula for forty nine exabytes 73 00:03:43.400 --> 00:03:45.599 flowing through connections that appear and disappear. Right, it is 74 00:03:45.599 --> 00:03:49.759 too messy. Way too messy. Simulation lets researchers make educated guesses, 75 00:03:50.360 --> 00:03:53.280 well-founded conjectures about how these systems behave. But, and 76 00:03:53.319 --> 00:03:55.719 this is the big but, the quality of your results 77 00:03:55.759 --> 00:03:58.240 depends entirely on the quality of your model. You have 78 00:03:58.280 --> 00:04:01.000 to build it carefully and rigorously validate it. Otherwise it's 79 00:04:01.039 --> 00:04:02.599 just a fancy guess. Makes 80 00:04:02.360 --> 00:04:07.039 sense. Quality model, quality results. Speaking of complex and messy, 81 00:04:07.400 --> 00:04:12.360 let's pivot to these mobile ad hoc networks, MANETs. The 82 00:04:12.439 --> 00:04:16.079 name itself, ad hoc, sounds like pure chaos. What actually 83 00:04:16.079 --> 00:04:16.839 defines one? 84 00:04:17.079 --> 00:04:19.920 Chaos is a pretty good word for it. Basically, a 85 00:04:20.000 --> 00:04:22.560 MANET is a temporary network. It's built on the fly, 86 00:04:22.879 --> 00:04:26.399 just using the devices themselves. Could be phones, laptops, sensors, whatever. 87 00:04:26.480 --> 00:04:29.040 They connect directly using wireless interfaces. 88 00:04:28.519 --> 00:04:29.839 No central router or anything. 89 00:04:29.959 --> 00:04:33.680 Nope, that's the key. Two main characteristics. Self-organization: 
They 90 00:04:33.680 --> 00:04:36.240 have to figure out settings like IP addresses on their 91 00:04:36.240 --> 00:04:40.480 own. And decentralized infrastructure: no central control point. And because 92 00:04:40.519 --> 00:04:43.519 the devices, the nodes, are often moving, the connections are 93 00:04:43.560 --> 00:04:46.639 constantly forming and breaking. The whole network map, the topology, 94 00:04:46.959 --> 00:04:48.639 is incredibly dynamic. 95 00:04:48.360 --> 00:04:52.959 Constant change, decentralized control, sounds hard enough. But the sources 96 00:04:53.000 --> 00:04:55.959 also bring up cooperation. This sounds like where engineering meets, 97 00:04:56.000 --> 00:04:57.160 like, behavioral science. 98 00:04:57.399 --> 00:05:00.199 Oh absolutely. This is where it gets super interesting, I think. 99 00:05:00.360 --> 00:05:04.319 Because there's no central authority making rules, nodes in a 100 00:05:04.399 --> 00:05:08.240 MANET have to cooperate altruistically to forward each other's data. 101 00:05:08.560 --> 00:05:11.800 The network only works if everyone plays along. But 102 00:05:11.920 --> 00:05:15.680 cooperation costs something, right? It uses up precious battery life, 103 00:05:15.759 --> 00:05:19.560 it consumes bandwidth, so naturally you get selfish behaviors. Some 104 00:05:19.759 --> 00:05:22.560 nodes might just decide, hey, I'll use the network, but 105 00:05:22.639 --> 00:05:25.040 I'm not going to waste my battery forwarding packets for 106 00:05:25.120 --> 00:05:25.639 someone else. 107 00:05:25.759 --> 00:05:31.079 Uh, so network engineers have to become like virtual psychologists, 108 00:05:31.439 --> 00:05:34.040 trying to figure out how to make virtual devices play nice. 109 00:05:34.240 --> 00:05:36.120 It seems kind of weird. It sounds weird, but it's 110 00:05:36.160 --> 00:05:41.720 incredibly practical. 
The research actually simulates concepts like social contracts 111 00:05:41.759 --> 00:05:45.319 to encourage cooperation. Our sources highlight two main ways they 112 00:05:45.319 --> 00:05:49.240 do this in simulations. Okay, like what? One is payment systems. Basically, 113 00:05:49.680 --> 00:05:53.480 nodes earn some kind of virtual token or currency for cooperating, 114 00:05:53.839 --> 00:05:55.360 and they spend it to get their own 115 00:05:55.279 --> 00:05:58.560 data forwarded. So incentivizing good behavior. Exactly. 116 00:05:58.839 --> 00:06:03.079 The other approach is reputation mechanisms. Cooperate and your reputation 117 00:06:03.160 --> 00:06:06.319 score goes up. Be selfish and it drops. If it 118 00:06:06.399 --> 00:06:09.480 drops too low, other nodes might just start ignoring you, 119 00:06:09.519 --> 00:06:11.439 effectively kicking you out of the network. 120 00:06:11.480 --> 00:06:15.199 Wow. So they're literally building little simulated economies or social 121 00:06:15.240 --> 00:06:18.000 systems just to get packets flowing reliably. 122 00:06:18.120 --> 00:06:21.360 Pretty much. It's about managing resource contention when there's no 123 00:06:21.439 --> 00:06:22.199 central boss. 124 00:06:22.800 --> 00:06:28.000 That's fascinating. Yeah. Okay, so how do these volatile MANETs 125 00:06:28.360 --> 00:06:30.839 fit into the bigger picture? You mentioned the massive scale 126 00:06:30.839 --> 00:06:33.639 of data earlier, especially with the Internet of Things, IoT. 127 00:06:34.040 --> 00:06:36.600 Yeah, they're a really important piece of that puzzle. I mean, 128 00:06:36.639 --> 00:06:38.639 look at your own phone right now. It's probably juggling 129 00:06:38.680 --> 00:06:41.879 cellular, Wi-Fi, maybe Bluetooth. It's its own little dynamic network. 130 00:06:42.480 --> 00:06:45.680 Now scale that up with IoT. You have potentially billions 131 00:06:45.720 --> 00:06:49.800 of sensors and devices. 
The things layer, just churning out data. 132 00:06:50.439 --> 00:06:52.680 Sending all of that back to a centralized cloud for 133 00:06:52.759 --> 00:06:57.160 processing creates huge delays, latency, and bottlenecks. Right, the cloud 134 00:06:57.199 --> 00:07:00.000 can get overwhelmed. Totally. So the industry is moving towards 135 00:07:00.079 --> 00:07:03.759 what's called fog computing. The idea is to push computing 136 00:07:03.800 --> 00:07:06.279 power closer to the edge, closer to the things. You 137 00:07:06.360 --> 00:07:10.399 have this intermediate layer, sometimes called micronodes, that does some processing, 138 00:07:10.480 --> 00:07:12.240 filtering and optimization locally. 139 00:07:12.360 --> 00:07:15.199 Ah, like a mist before the cloud. Exactly. 140 00:07:15.240 --> 00:07:17.360 It reduces the load on the central cloud and cuts 141 00:07:17.399 --> 00:07:22.480 down latency dramatically. And MANETs, they're often the perfect technology 142 00:07:22.519 --> 00:07:26.360 for creating those dynamic local networks that form the fog layer. 143 00:07:26.519 --> 00:07:28.199 They connect the nearby things together. 144 00:07:28.360 --> 00:07:31.120 Okay, that connection makes sense. Yeah. We understand the challenge, 145 00:07:31.199 --> 00:07:34.720 these dynamic, sometimes selfish networks. Now let's talk about the 146 00:07:34.720 --> 00:07:37.160 tool, NS-3. Give us the quick pitch. What 147 00:07:37.319 --> 00:07:37.959 is NS-3? 148 00:07:38.160 --> 00:07:42.240 Okay, NS-3, Network Simulator 3. It's basically the gold 149 00:07:42.279 --> 00:07:45.399 standard for academic network research. It's open source. It's a 150 00:07:45.439 --> 00:07:48.920 discrete event simulator like we discussed, but its real power 151 00:07:49.160 --> 00:07:51.639 lies in its design to support emulation. 152 00:07:51.959 --> 00:07:53.920 Emulation, how's that different from simulation? 
153 00:07:54.319 --> 00:07:57.680 It means NS-3 can actually run real-world network 154 00:07:57.720 --> 00:08:01.879 protocol code. You can take code from, say, the Linux networking 155 00:08:01.920 --> 00:08:05.879 stack and run it inside the simulation. It blurs the 156 00:08:05.920 --> 00:08:09.199 line between the simulation and reality, making the results much 157 00:08:09.199 --> 00:08:09.839 more credible. 158 00:08:09.920 --> 00:08:12.519 Okay, that's cool. Let's make it really concrete for the listener. Say 159 00:08:12.560 --> 00:08:15.600 I just want to simulate two computers talking. What are 160 00:08:15.639 --> 00:08:19.399 the basic steps in NS-3 to build that little 161 00:08:19.480 --> 00:08:20.319 virtual world? 162 00:08:20.560 --> 00:08:22.720 It's kind of like setting up a mini lab experiment. 163 00:08:23.000 --> 00:08:27.040 Three main steps. First, virtual hardware. You create your nodes, 164 00:08:27.079 --> 00:08:29.680 node zero and node one, say. Then you define the 165 00:08:29.680 --> 00:08:32.080 connection between them, the channel. Maybe it's a point to 166 00:08:32.120 --> 00:08:35.320 point link. You set its properties, data rate like five 167 00:08:35.440 --> 00:08:37.600 Mbps and delay, maybe two milliseconds. 168 00:08:37.639 --> 00:08:39.480 Okay, got the nodes and the wire. Step two? 169 00:08:39.799 --> 00:08:42.279 Step two is the logic. You need to install the 170 00:08:42.279 --> 00:08:45.559 brains, the networking protocols. There's usually a helper like 171 00:08:45.600 --> 00:08:48.840 InternetStackHelper that installs all the standard stuff, IP, 172 00:08:49.080 --> 00:08:52.279 TCP, UDP, onto your virtual nodes. Now they know how to 173 00:08:52.279 --> 00:08:53.159 talk Internet. 174 00:08:53.000 --> 00:08:55.080 Right, so they have a physical layer and the network logic. 175 00:08:55.159 --> 00:08:59.240 What's left? Step three, the application. 
You need to give 176 00:08:59.279 --> 00:09:02.360 them something to do. So you'd install, say, a UDP 177 00:09:02.360 --> 00:09:05.600 echo server application on one node, listening on a specific 178 00:09:05.679 --> 00:09:08.120 port like port nine. Okay. And on the other node 179 00:09:08.120 --> 00:09:10.600 you install a UDP echo client. You tell the client, okay, 180 00:09:11.120 --> 00:09:14.320 at t equals one second, send one packet of one thousand 181 00:09:14.360 --> 00:09:17.440 and twenty four bytes to the server's IP address and port. 182 00:09:17.519 --> 00:09:20.440 And then you just hit run? Pretty much. You call 183 00:09:20.519 --> 00:09:24.960 Simulator::Run and NS-3 starts executing those scheduled events. 184 00:09:25.039 --> 00:09:28.240 The client sending the packet, the packet traveling across the link, 185 00:09:28.320 --> 00:09:31.000 the server receiving it, maybe sending a reply. It just 186 00:09:31.159 --> 00:09:32.960 steps through time, event 187 00:09:32.720 --> 00:09:35.919 by event. In the simulated time. Yeah, it's not 188 00:09:36.000 --> 00:09:37.679 real time, right? That's the point. If I run this 189 00:09:37.759 --> 00:09:41.559 on my old laptop versus a supercomputer, the simulated events 190 00:09:41.559 --> 00:09:43.960 happen at the same simulated time. Exactly. 191 00:09:44.039 --> 00:09:46.799 That is absolutely critical. NS-3 uses what's called next 192 00:09:46.799 --> 00:09:51.240 event time advance, NETA. The simulated clock only jumps forward 193 00:09:51.240 --> 00:09:54.039 to the timestamp of the very next event in its queue. 194 00:09:54.080 --> 00:09:57.679 Could be nanoseconds, could be seconds. It's completely decoupled from 195 00:09:57.679 --> 00:10:01.320 the real-world wall clock. Why is that so important? Reproducibility. 
196 00:10:01.559 --> 00:10:04.679 It guarantees that if you run the exact same script 197 00:10:04.759 --> 00:10:07.440 with the same starting conditions, you will get the exact 198 00:10:07.480 --> 00:10:10.919 same sequence of events and results every single time, no 199 00:10:10.960 --> 00:10:13.919 matter how fast or slow the actual computer running it is. 200 00:10:14.240 --> 00:10:16.200 That's essential for scientific rigor. 201 00:10:16.039 --> 00:10:17.320 Right, consistency is king. 202 00:10:17.440 --> 00:10:17.960 Ah. 203 00:10:18.000 --> 00:10:21.000 Okay, so we've built our model, we've run the NS-3 204 00:10:21.080 --> 00:10:24.840 script. Now the data starts pouring out. How do 205 00:10:24.879 --> 00:10:27.200 we make sure we can actually trust this data? This 206 00:10:27.240 --> 00:10:29.120 sounds like where statistics comes in heavily. 207 00:10:29.240 --> 00:10:31.120 Oh yeah, big time. You can't just run it once 208 00:10:31.120 --> 00:10:33.360 and call it a day. First off, you need a 209 00:10:33.440 --> 00:10:36.639 smart way to explore how different settings affect the outcome. 210 00:10:36.960 --> 00:10:40.480 Researchers often use techniques like factorial designs. A common one is 211 00:10:40.480 --> 00:10:42.279 the 2^k factorial design. 212 00:10:42.360 --> 00:10:43.440 Two to the k? What's that? 213 00:10:43.519 --> 00:10:46.360 It's an efficient way to test the impact of different 214 00:10:46.360 --> 00:10:49.080 input factors, maybe things like packet size, number of nodes, 215 00:10:49.080 --> 00:10:52.120 transmission power. For each factor, you just pick two levels, 216 00:10:52.440 --> 00:10:54.759 a low setting and a high setting. Then you run 217 00:10:54.799 --> 00:10:58.480 simulations for all possible combinations of those low and high settings. 218 00:10:58.759 --> 00:11:02.279 That's 2^k runs for k factors. 
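A small sketch of how those low/high combinations could be enumerated, assuming three factors. The factor names and the level values are invented for illustration; only the idea of two levels per factor and every combination comes from the discussion.

```python
from itertools import product


def full_factorial(factors):
    """Return every low/high combination for a dict of {factor: (low, high)}."""
    names = list(factors)
    level_pairs = (factors[name] for name in names)
    # product() walks through all 2^k ways of picking one level per factor
    return [dict(zip(names, levels)) for levels in product(*level_pairs)]


if __name__ == "__main__":
    # Hypothetical factors and levels, not values from the sources.
    design = full_factorial({
        "packet_size_bytes": (512, 1024),
        "num_nodes": (10, 50),
        "tx_power_dbm": (5, 20),
    })
    print(len(design), "runs")
    for run in design:
        print(run)
```

With three factors this gives eight configurations, each of which would then be simulated.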
Analyzing the results tells you really 219 00:11:02.320 --> 00:11:04.879 quickly which factors have a big impact on your output 220 00:11:04.960 --> 00:11:07.559 metric and which ones don't really matter much. It saves 221 00:11:07.639 --> 00:11:09.919 running tons of unnecessary simulations. 222 00:11:10.080 --> 00:11:12.240 That makes sense, you're testing parameters. But what about the 223 00:11:12.360 --> 00:11:16.919 randomness you mentioned? Networks are chaotic. If the simulation uses 224 00:11:17.039 --> 00:11:22.159 random numbers for things like background noise or packet errors, 225 00:11:22.799 --> 00:11:25.120 doesn't that make each run different? Isn't a single run 226 00:11:25.240 --> 00:11:26.240 kind of useless, then? 227 00:11:26.679 --> 00:11:30.240 Statistically, yes, a single run is pretty much useless on 228 00:11:30.279 --> 00:11:34.679 its own. Because the simulation involves random variables, the observations 229 00:11:34.720 --> 00:11:37.720 within that single run are likely to be autocorrelated. The 230 00:11:37.759 --> 00:11:40.320 outcome at time t might depend heavily on the outcome 231 00:11:40.320 --> 00:11:41.120 at time t minus one. 232 00:11:41.240 --> 00:11:42.519 Okay, so what's the fix? 233 00:11:42.639 --> 00:11:46.080 You have to perform multiple independent runs. We call these replicas. 234 00:11:46.279 --> 00:11:49.600 The goal is to get observations that are IID, independent 235 00:11:49.720 --> 00:11:53.320 and identically distributed. IID, right. How do you ensure that? 236 00:11:53.480 --> 00:11:56.519 For each replica, you start the simulation with the exact 237 00:11:56.639 --> 00:12:01.919 same initial configuration, same network setup, same parameters. But crucially, 238 00:12:02.360 --> 00:12:05.399 you give each replica a different starting seed for its 239 00:12:05.519 --> 00:12:06.759 random number generator. 240 00:12:07.320 --> 00:12:11.200 Ah, so each run uses a different stream of random numbers. 
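A toy sketch of that replica idea, assuming the "simulation" is just a draw of exponentially distributed packet delays. The `run_replica` function, the packet count and the 2 ms mean are placeholders for illustration and do not come from NS-3; what matters is that the configuration stays fixed and only the seed changes.

```python
import random
import statistics


def run_replica(seed, packets=1000, mean_delay_ms=2.0):
    """One replica: same configuration every time, only the RNG seed differs."""
    rng = random.Random(seed)  # private generator, so replicas do not share a stream
    delays = [rng.expovariate(1.0 / mean_delay_ms) for _ in range(packets)]
    return statistics.mean(delays)


def run_replicas(seeds):
    """Run one replica per seed and collect the per-replica mean delays."""
    return [run_replica(seed) for seed in seeds]


if __name__ == "__main__":
    results = run_replicas(range(1, 31))  # thirty replicas, seeds 1..30
    print("mean of replica means (ms):", statistics.mean(results))
    print("spread across replicas (ms):", statistics.stdev(results))
```

Re-running with the same seed reproduces a replica exactly, while different seeds give different samples to average over.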
241 00:12:11.440 --> 00:12:14.159 Exactly. It's like flipping a coin multiple times instead of 242 00:12:14.200 --> 00:12:18.200 just once. Running, say, thirty or fifty independent replicas gives 243 00:12:18.240 --> 00:12:20.360 you a set of results that you can actually analyze 244 00:12:20.360 --> 00:12:23.600 statistically to get confidence intervals and draw meaningful conclusions. 245 00:12:23.720 --> 00:12:27.039 Okay, multiple runs, different seeds, got it. Yeah. But even then, 246 00:12:27.519 --> 00:12:30.159 I remember reading about the startup problem. The simulation has 247 00:12:30.159 --> 00:12:32.559 to kind of warm up, right? Like the system isn't 248 00:12:32.600 --> 00:12:34.240 behaving normally right at the start. 249 00:12:34.399 --> 00:12:37.320 That's spot on. It's formally called the problem of the 250 00:12:37.320 --> 00:12:41.080 initial transient, or the startup problem. Think about it. When 251 00:12:41.080 --> 00:12:43.960 you first start the simulation, the network is probably empty, 252 00:12:44.360 --> 00:12:48.200 buffers are clear, nodes might be in unrealistic starting positions. 253 00:12:48.639 --> 00:12:51.200 The behavior in those first few moments or even minutes 254 00:12:51.240 --> 00:12:55.559 of simulated time isn't representative of the system's normal 255 00:12:55.519 --> 00:12:58.559 long-run operation. So the early data is biased. 256 00:12:58.480 --> 00:13:01.159 Exactly. Most of the time, we want to study the 257 00:13:01.200 --> 00:13:04.600 steady-state behavior, how the network performs after it's settled 258 00:13:04.600 --> 00:13:08.000 into a typical pattern where the key variables fluctuate around 259 00:13:08.120 --> 00:13:12.440 stable averages. That initial transient phase can skew your results 260 00:13:12.480 --> 00:13:13.799 badly if you include it. 261 00:13:14.000 --> 00:13:17.080 So how do you know when the warm-up is over? 262 00:13:17.679 --> 00:13:19.039 When does the real data begin? 
263 00:13:19.720 --> 00:13:23.000 That's the million dollar question. There are various methods, but 264 00:13:23.159 --> 00:13:26.120 a very common one, especially because it's visual, is the 265 00:13:26.159 --> 00:13:27.519 Welch graphical method. 266 00:13:27.639 --> 00:13:29.159 Welch, okay, how does that work? 267 00:13:29.399 --> 00:13:31.519 What you do is you take the output data from 268 00:13:31.519 --> 00:13:34.919 all your independent replicas, say packet delay measured over time. 269 00:13:35.559 --> 00:13:38.799 Then for each point in time, you calculate the average 270 00:13:38.799 --> 00:13:42.120 delay across all the replicas. Then you smooth those averages with 271 00:13:42.120 --> 00:13:45.960 a moving window, creating a smoothed average curve over time. 272 00:13:46.000 --> 00:13:47.279 Okay, smoothing out the noise. 273 00:13:47.600 --> 00:13:50.879 Right. You plot this smoothed average curve. Typically you'll see 274 00:13:50.879 --> 00:13:53.720 it start somewhere, maybe weirdly high or low, and then 275 00:13:53.759 --> 00:13:56.720 it will drift and eventually settle down, fluctuating around a 276 00:13:56.759 --> 00:14:01.159 relatively constant level. That point where it visually appears to stabilize, 277 00:14:01.360 --> 00:14:04.440 that's your estimated end of the initial transient. Let's call 278 00:14:04.480 --> 00:14:08.080 that time index l. And then you simply delete all 279 00:14:08.120 --> 00:14:11.679 the simulation data collected before time l from every single replica. 280 00:14:12.080 --> 00:14:15.039 This is called the replication/deletion approach. You throw away 281 00:14:15.080 --> 00:14:18.320 the biased startup data to make sure your final analysis 282 00:14:18.360 --> 00:14:21.120 only uses observations from the steady-state period. 283 00:14:21.360 --> 00:14:24.919 Wow. Okay, so it's quite a process. 
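A rough sketch of the averaging, smoothing and deletion steps just described, assuming each replica is a list of delay observations of equal length. This simplifies Welch's procedure (it uses a fixed centered window and simply skips the edges), and the function names and toy numbers are made up for illustration. Picking the cutoff l is still done by eye from the plotted curve.

```python
def welch_curve(replicas, window):
    """Average across replicas at each time index, then smooth with a centered moving window."""
    n = len(replicas[0])
    means = [sum(run[i] for run in replicas) / len(replicas) for i in range(n)]
    smoothed = []
    # Simplification: only indices with a full window on both sides are kept.
    for i in range(window, n - window):
        span = means[i - window : i + window + 1]
        smoothed.append(sum(span) / len(span))
    return smoothed


def delete_transient(replicas, cutoff):
    """Replication/deletion: drop the first `cutoff` observations of every replica."""
    return [run[cutoff:] for run in replicas]


if __name__ == "__main__":
    # Two toy replicas whose delays start high and then settle at 4.
    replicas = [[10, 8, 6, 4, 4, 4, 4], [12, 8, 6, 4, 4, 4, 4]]
    print(welch_curve(replicas, window=1))
    print(delete_transient(replicas, cutoff=3))
```

After deletion, each replica contributes only its steady-state observations to the final averages and confidence intervals.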
Build the model carefully, 284 00:14:25.200 --> 00:14:28.759 run many independent replicas with different seeds, analyze the output, 285 00:14:28.799 --> 00:14:31.919 find the steady state, delete the initial data, and then 286 00:14:31.960 --> 00:14:34.600 you can finally calculate your averages and confidence intervals. 287 00:14:34.600 --> 00:14:37.519 Precisely. It requires a lot of statistical rigor to go 288 00:14:37.639 --> 00:14:43.879 from running the simulation to actually having trustworthy, scientifically valid results. 289 00:14:44.000 --> 00:14:46.519 That really brings us full circle, doesn't it? We started 290 00:14:46.519 --> 00:14:50.720 with this image of real-world network chaos, these complex, heterogeneous, 291 00:14:50.840 --> 00:14:54.240 fast-moving systems, and we've seen how engineers use tools 292 00:14:54.279 --> 00:14:58.039 like NS-3 to create a virtual, manageable version. But 293 00:14:58.120 --> 00:15:01.200 getting reliable answers from that virtual world isn't automatic. 294 00:15:01.639 --> 00:15:05.200 It demands careful experiment design, like those factorial methods, and 295 00:15:05.279 --> 00:15:08.639 sophisticated output analysis, like the Welch method, to handle the 296 00:15:08.679 --> 00:15:10.440 inherent randomness and startup effects. 297 00:15:10.720 --> 00:15:13.639 Absolutely, and it's amazing what you can study. If we 298 00:15:13.679 --> 00:15:17.000 loop back to that MANET cooperation problem, remember the 299 00:15:17.000 --> 00:15:21.279 payment and reputation systems? Researchers are essentially using simulation to 300 00:15:21.399 --> 00:15:26.240 explore fundamental ideas about social interaction and resource allocation, sometimes 301 00:15:26.279 --> 00:15:30.200 even integrating things like agent-based simulation, maybe using frameworks 302 00:15:30.240 --> 00:15:34.720 like OpenAI Gym, where nodes can learn better cooperative strategies. 
303 00:15:34.320 --> 00:15:37.000 Which brings us to a really interesting final thought for you, 304 00:15:37.120 --> 00:15:40.519 the listener, to chew on. We've just discussed how network 305 00:15:40.559 --> 00:15:43.519 researchers can build these virtual worlds and get nodes to 306 00:15:43.559 --> 00:15:47.840 cooperate, solving complex resource problems by giving them really simple 307 00:15:47.879 --> 00:15:50.519 social rules: earn tokens, build reputation. 308 00:15:50.759 --> 00:15:53.519 Yeah, basic rules driving complex emergent behavior. 309 00:15:53.759 --> 00:15:56.919 So if researchers can successfully model and predict network behavior, 310 00:15:57.000 --> 00:16:00.399 tackling things like selfishness and cooperation with these relatively simple 311 00:16:00.480 --> 00:16:04.480 rule sets, what might that imply about the underlying complexity, 312 00:16:04.840 --> 00:16:08.159 or perhaps the surprising simplicity, of the decision making processes 313 00:16:08.360 --> 00:16:11.919 that govern how resources get allocated in our own vastly 314 00:16:11.960 --> 00:16:15.919 more complex, highly connected human society? Something to think about.