WEBVTT 1 00:00:00.120 --> 00:00:04.000 Welcome to the deep dive, your shortcut to truly understanding 2 00:00:04.040 --> 00:00:07.960 complex topics. Today we're taking a fresh look at something 3 00:00:08.080 --> 00:00:13.039 often dismissed as, well, tedious, maybe just a box-checking activity. 4 00:00:13.880 --> 00:00:14.759 Software testing. 5 00:00:14.960 --> 00:00:18.079 Yeah, it definitely gets that reputation sometimes. 6 00:00:17.600 --> 00:00:20.359 But what if I told you it's actually a deeply creative, 7 00:00:20.559 --> 00:00:24.480 intricate craft, almost like what a skilled artisan brings to 8 00:00:24.519 --> 00:00:25.320 their masterpiece. 9 00:00:25.440 --> 00:00:28.320 It's absolutely true. Software testing is far more than just, 10 00:00:28.600 --> 00:00:31.839 you know, finding bugs. It's really about understanding the very 11 00:00:31.879 --> 00:00:35.280 fabric of how software is built, how it behaves, and 12 00:00:35.399 --> 00:00:37.359 ultimately how it's validated for quality. 13 00:00:37.479 --> 00:00:39.799 That's exactly right. Today we're going to pull back the 14 00:00:39.880 --> 00:00:43.000 layers on the craft of software testing, drawing insights from 15 00:00:43.039 --> 00:00:46.320 Paul C. Jorgensen's Software Testing: A Craftsman's Approach. 16 00:00:46.399 --> 00:00:48.000 It's a foundational text in the field. 17 00:00:48.200 --> 00:00:51.439 Our goal here is to unpack the fundamental concepts, the 18 00:00:51.479 --> 00:00:55.399 diverse techniques, and maybe some surprising real world applications, so 19 00:00:55.439 --> 00:00:58.280 you walk away with a much richer understanding and maybe 20 00:00:58.320 --> 00:01:01.679 even a new appreciation for this key discipline that really 21 00:01:02.039 --> 00:01:04.400 underpins almost everything digital around us. 22 00:01:04.519 --> 00:01:06.599 Sounds like a great plan. Where should we start? 
23 00:01:06.799 --> 00:01:09.879 So let's start right at the beginning, what testing really is. 24 00:01:10.519 --> 00:01:15.760 Jorgensen gives us this clear progression of terms that are fundamental. 25 00:01:16.079 --> 00:01:19.319 It all begins with a human error, a. 26 00:01:19.319 --> 00:01:21.799 Mistake, right, someone makes a mistake while coding. 27 00:01:21.959 --> 00:01:24.359 That's the error, and that human error then shows up 28 00:01:24.359 --> 00:01:25.439 in the software as. 29 00:01:25.319 --> 00:01:29.079 A fault, which is the bug or defect people talk about. Yeah, 30 00:01:29.120 --> 00:01:31.920 it's a representation of that mistake in the code. 31 00:01:31.959 --> 00:01:34.519 And it's crucial to remember this human element, isn't it? 32 00:01:35.239 --> 00:01:37.640 We test because we know that we're fallible. 33 00:01:37.359 --> 00:01:39.400 Especially in something as complex as software. 34 00:01:39.439 --> 00:01:42.760 We make mistakes, and it continues from there. When that 35 00:01:42.840 --> 00:01:45.879 code containing the fault actually gets executed. 36 00:01:45.400 --> 00:01:47.280 That's when you get a failure. The software doesn't do 37 00:01:47.319 --> 00:01:48.120 what it's supposed to do. 38 00:01:48.280 --> 00:01:51.560 Okay. And finally, the incident is what the user actually 39 00:01:51.599 --> 00:01:52.840 sees. Exactly. 40 00:01:52.920 --> 00:01:55.719 It's the symptom, the alert, the thing that makes someone say, hey, 41 00:01:55.959 --> 00:01:58.280 something went wrong here. So it's this chain reaction. 42 00:01:58.640 --> 00:02:02.519 Error leads to fault, leads to failure, leads to incident. 43 00:02:02.719 --> 00:02:07.799 You got it: human mistake, bug in code, wrong behavior, user notices. 44 00:02:08.280 --> 00:02:12.240 So given that chain, a test becomes this deliberate act, 45 00:02:12.560 --> 00:02:15.360 right, exercising the software with specific. 
46 00:02:14.919 --> 00:02:17.680 Test cases, right, and it has two main goals. Either 47 00:02:17.759 --> 00:02:20.560 you're trying to find those failures, hunt them down, or 48 00:02:20.759 --> 00:02:25.280 you're trying to demonstrate confidently that the software is working 49 00:02:25.319 --> 00:02:27.120 correctly under certain conditions. 50 00:02:27.439 --> 00:02:30.520 And a test case, it's not just throwing random stuff 51 00:02:30.560 --> 00:02:31.120 at the program. 52 00:02:31.240 --> 00:02:34.199 No, no, not at all. A craftsman builds a test 53 00:02:34.240 --> 00:02:38.599 case carefully. It needs an identity, a clear purpose, like 54 00:02:39.039 --> 00:02:43.120 testing a specific business rule. Okay. It needs defined preconditions, 55 00:02:43.360 --> 00:02:47.560 specific inputs, and crucially the exact expected outputs. 56 00:02:47.719 --> 00:02:49.319 You need to know what right looks. 57 00:02:49.039 --> 00:02:51.639 Like. Absolutely, and even the expected state of the system 58 00:02:51.759 --> 00:02:54.280 after the test runs. Plus you keep track of its 59 00:02:54.280 --> 00:02:56.879 execution history. It's a complete, thoughtful construction. 60 00:02:57.159 --> 00:02:59.400 So that's the what and why. Now how does a 61 00:02:59.439 --> 00:03:04.479 craftsman actually approach testing? Jorgensen highlights two sort of fundamental philosophies. 62 00:03:04.680 --> 00:03:07.719 Yeah, specification based testing versus code based testing. 63 00:03:07.840 --> 00:03:10.840 Let's unpack specification based first. That's also known as black 64 00:03:10.879 --> 00:03:11.919 box testing. 65 00:03:11.879 --> 00:03:14.919 Right, or functional testing. The core idea is you're designing 66 00:03:14.919 --> 00:03:18.800 your tests based purely on the software's requirements, the specs, 67 00:03:19.400 --> 00:03:21.199 without looking at the internal code. 
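The parts of a test case listed above (identity, purpose, preconditions, inputs, expected outputs, expected end state, execution history) can be pictured as a small record. This is only a sketch: the `CaseRecord` class, its field names, and the `free_shipping` rule are hypothetical and are not taken from Jorgensen's book.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class CaseRecord:
    """One test case, holding the parts named in the discussion (hypothetical layout)."""
    case_id: str                      # identity
    purpose: str                      # e.g. the business rule being checked
    preconditions: List[str]          # what must hold before the test runs
    inputs: Dict[str, Any]            # specific inputs
    expected_output: Any              # the exact expected output
    expected_postconditions: List[str] = field(default_factory=list)
    history: List[Dict[str, Any]] = field(default_factory=list)

    def run(self, unit: Callable[..., Any]) -> bool:
        """Execute the unit under test and append the result to the history."""
        actual = unit(**self.inputs)
        passed = actual == self.expected_output
        self.history.append({"actual": actual, "passed": passed})
        return passed


def free_shipping(order_total: float) -> bool:
    """Hypothetical business rule: orders of 50 or more ship free."""
    return order_total >= 50


tc = CaseRecord(
    case_id="TC-001",
    purpose="An order of exactly 50 qualifies for free shipping",
    preconditions=["customer is logged in"],
    inputs={"order_total": 50},
    expected_output=True,
)
result = tc.run(free_shipping)
print(result, tc.history)
```

Keeping the history on the record itself is just one design choice; a real test framework would usually store results elsewhere.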
68 00:03:21.439 --> 00:03:23.840 Like testing a car, you use the steering wheel, 69 00:03:23.840 --> 00:03:25.240 the pedals, check the. 70 00:03:25.280 --> 00:03:27.639 Lights. But you don't need to know how the engine 71 00:03:27.719 --> 00:03:30.240 or the wiring actually works inside. You just know what 72 00:03:30.280 --> 00:03:33.840 it should do based on the user manual, basically. That sounds. 73 00:03:33.560 --> 00:03:36.840 Really useful during development. What are the big advantages there? 74 00:03:37.120 --> 00:03:39.840 Well, a huge one is that the test cases are 75 00:03:39.919 --> 00:03:44.599 independent of the actual implementation, meaning even if the developers 76 00:03:44.639 --> 00:03:47.800 completely rewrite a section of code, as long as it's 77 00:03:47.800 --> 00:03:50.400 supposed to do the same thing according to the spec, 78 00:03:50.520 --> 00:03:52.080 your test case is still valid. 79 00:03:52.199 --> 00:03:54.159 Oh okay, that's powerful. 80 00:03:53.800 --> 00:03:56.800 And it also means testing can start earlier, maybe even 81 00:03:56.879 --> 00:04:00.240 happen in parallel with development, which can speed things up. 82 00:04:00.400 --> 00:04:01.800 But I'm guessing there's a downside. 83 00:04:02.240 --> 00:04:05.280 A catch? There is. Because you're not looking inside, you 84 00:04:05.280 --> 00:04:08.400 can end up with redundant tests, multiple tests checking the 85 00:04:08.400 --> 00:04:12.840 same underlying logic without realizing it. Wasted effort. And more critically, 86 00:04:13.080 --> 00:04:16.439 you can have gaps, big blind spots where parts of 87 00:04:16.480 --> 00:04:20.160 the software might just never get tested because the spec 88 00:04:20.199 --> 00:04:23.240 didn't explicitly cover some weird internal case. 89 00:04:23.600 --> 00:04:25.759 Okay, so that leads us to the other side. Code 90 00:04:25.800 --> 00:04:27.199 based testing, right? 91 00:04:27.000 --> 00:04:30.560 Sometimes called white box testing. 
Here you are looking at 92 00:04:30.600 --> 00:04:33.560 the source code. The tests are designed based on the 93 00:04:33.600 --> 00:04:36.920 program's structure, its paths, its conditions. 94 00:04:37.279 --> 00:04:40.040 What's the main limitation there, then? It sounds more thorough. 95 00:04:40.360 --> 00:04:43.600 It can be for certain things, but its main weakness 96 00:04:43.720 --> 00:04:46.879 is identifying behaviors that were never programmed at all but 97 00:04:46.959 --> 00:04:47.600 should have been. 98 00:04:47.720 --> 00:04:51.639 Ah, omissions, things missing from the requirements in the code. Exactly. 99 00:04:51.720 --> 00:04:54.720 Or think about something malicious, like a Trojan horse someone 100 00:04:54.800 --> 00:04:57.240 slipped in. If it wasn't in the spec and it's 101 00:04:57.279 --> 00:05:01.040 just extra code doing bad stuff, spec based won't find it, 102 00:05:01.079 --> 00:05:03.519 and code based testing might just test the paths through 103 00:05:03.560 --> 00:05:07.199 it without realizing its malicious intent. It struggles with things 104 00:05:07.240 --> 00:05:09.120 that aren't there but should be, or things that are 105 00:05:09.160 --> 00:05:11.240 there but shouldn't be if they weren't specified. 106 00:05:11.360 --> 00:05:13.839 So this sounds like a classic debate, black box versus 107 00:05:13.839 --> 00:05:15.319 white box. Is one just better? 108 00:05:15.680 --> 00:05:17.879 Well, if you step back, you see pretty quickly that 109 00:05:18.000 --> 00:05:19.759 neither one alone is really sufficient. 110 00:05:19.920 --> 00:05:20.279 Why not? 111 00:05:20.560 --> 00:05:23.959 Code based testing, like we said, won't find requirements that 112 00:05:24.000 --> 00:05:27.759 were completely missed in the implementation, and spec based testing 113 00:05:27.839 --> 00:05:31.879 won't find extra, maybe unwanted behaviors that got coded in 114 00:05:31.959 --> 00:05:33.199 but weren't in the spec. 
115 00:05:33.759 --> 00:05:36.800 So the craftsman's approach isn't about picking a side in 116 00:05:36.839 --> 00:05:37.839 this great debate. 117 00:05:38.040 --> 00:05:41.920 Exactly, it's not either/or. The real answer, the craftsman's answer, 118 00:05:42.079 --> 00:05:43.240 is a smart combination. 119 00:05:43.600 --> 00:05:44.720 Okay, how does that work? 120 00:05:45.120 --> 00:05:48.199 You use both. You design tests based on the specification 121 00:05:48.319 --> 00:05:52.399 to ensure functionality, and you use code based techniques, especially 122 00:05:52.439 --> 00:05:55.399 coverage metrics, to see what parts of the actual code 123 00:05:55.439 --> 00:05:56.959 those tests are exercising. 124 00:05:57.199 --> 00:05:59.639 Ah, so the code coverage tells you about the gaps 125 00:05:59.639 --> 00:06:02.040 and redundancies in your spec based tests. 126 00:06:02.079 --> 00:06:05.399 Precisely, it gives you that measurement, that confidence. You get 127 00:06:05.399 --> 00:06:09.120 the functional assurance from spec based testing and the structural 128 00:06:09.120 --> 00:06:12.399 assurance and efficiency check from code based testing. It's about 129 00:06:12.480 --> 00:06:14.319 blending the strengths of both views. 130 00:06:14.639 --> 00:06:17.680 That makes a lot of sense, combining perspectives for a 131 00:06:17.720 --> 00:06:22.399 fuller picture. Okay, so with these foundational ideas, the error chain, 132 00:06:23.199 --> 00:06:26.600 the two approaches, how does the craftsman apply this in 133 00:06:26.680 --> 00:06:31.319 a real project? Jorgensen talks about levels of testing, often 134 00:06:31.360 --> 00:06:32.920 shown using the V model. 135 00:06:33.199 --> 00:06:35.399 Right. The V model is a variation of the older 136 00:06:35.439 --> 00:06:39.120 Waterfall model, but it's really useful because it visually emphasizes 137 00:06:39.560 --> 00:06:43.879 how testing activities should mirror development activities. 
How So, well, 138 00:06:43.920 --> 00:06:45.360 on the left side of the V you have the 139 00:06:45.360 --> 00:06:49.439 development phases going down requirements high level design, detailed design 140 00:06:49.680 --> 00:06:52.240 coding okay, and on the right side. Going up, you 141 00:06:52.279 --> 00:06:56.040 have the corresponding testing levels. Unit testing validates the code 142 00:06:56.040 --> 00:07:00.480 from detailed design, Integration testing validates the interfaces from high 143 00:07:00.519 --> 00:07:03.680 level design, and system testing validates the whole thing against 144 00:07:03.720 --> 00:07:04.920 the initial requirements. 145 00:07:05.120 --> 00:07:08.560 So each test level connects back to a design level exactly. 146 00:07:08.600 --> 00:07:11.399 It builds in quality checks at each stage rather than 147 00:07:11.439 --> 00:07:14.319 waiting until the very end. The idea is to catch 148 00:07:14.360 --> 00:07:16.240 faults as close as possible to where they. 149 00:07:16.199 --> 00:07:19.519 Were introduced, cheaper to fix them earlier, much much cheaper. Okay, 150 00:07:19.519 --> 00:07:23.079 So let's dive into that first level unit testing. This 151 00:07:23.120 --> 00:07:26.160 is where the craftsman is working on individual components, right, 152 00:07:26.279 --> 00:07:28.399 like a single function or class. 153 00:07:28.120 --> 00:07:31.759 Yep, the smallest testable pieces of the software. And there's 154 00:07:31.800 --> 00:07:34.000 a whole toolbox of techniques. 155 00:07:33.480 --> 00:07:36.079 For this level. Let's start with boundary value testing. You 156 00:07:36.120 --> 00:07:38.480 called these the off by one detectives earlier. 157 00:07:38.560 --> 00:07:40.240 Ah, yeah, that's a good way to think about it. 158 00:07:40.480 --> 00:07:43.879 The core idea is simple. 
Experience shows that programmers often 159 00:07:43.920 --> 00:07:47.199 make mistakes right at the edges the boundaries of input ranges, 160 00:07:47.319 --> 00:07:48.160 like using when. 161 00:07:48.000 --> 00:07:50.600 They meant or starting a loop counter at one instead 162 00:07:50.639 --> 00:07:51.079 of zero. 163 00:07:51.360 --> 00:07:54.000 Exactly those kinds of things off by one errors. So 164 00:07:54.160 --> 00:07:57.319 boundary value testing says, Okay, if an input is valid 165 00:07:57.319 --> 00:08:00.120 between one and one hundred, don't just test fifty. 166 00:08:00.040 --> 00:08:00.639 Test the edges. 167 00:08:00.680 --> 00:08:04.439 Test the minimum one, minimum plus one two a nominal 168 00:08:04.519 --> 00:08:08.319 value like fifty, maximum one ninety nine, and the maximum 169 00:08:08.319 --> 00:08:08.920 one hundred. 170 00:08:09.160 --> 00:08:11.360 Makes sense, and what about robust testing? 171 00:08:11.759 --> 00:08:14.439 Robust boundary value testing goes one step further. It says, 172 00:08:14.639 --> 00:08:17.160 let's also test values just outside the valid range, so 173 00:08:17.240 --> 00:08:20.519 minimum one zero in our example, and maximum plus one 174 00:08:20.560 --> 00:08:21.160 one oh one. 175 00:08:21.240 --> 00:08:23.319 Why test invalid inputs. 176 00:08:23.160 --> 00:08:27.279 Because that's often where unexpected crashes or even security vulnerabilities happen. 177 00:08:27.879 --> 00:08:31.240 How does the system handle bad data? Robust testing checks that. 178 00:08:31.920 --> 00:08:35.639 There's also worst case testing, which gets pretty complex testing 179 00:08:35.679 --> 00:08:37.440 combinations of boundary values. 180 00:08:37.519 --> 00:08:40.720 Okay, boundaries are critical. Yeah, but testing every boundary for 181 00:08:40.759 --> 00:08:43.960 every variable sounds like it could create a lot of tests. 
182 00:08:43.960 --> 00:08:46.759 It can. And that leads nicely into the next technique, 183 00:08:46.960 --> 00:08:48.519 equivalence class testing. 184 00:08:48.759 --> 00:08:49.480 How does that help? 185 00:08:49.840 --> 00:08:54.080 It's all about smart simplification. It tackles the potential redundancy 186 00:08:54.120 --> 00:08:57.759 you might get from just boundary testing. The idea comes 187 00:08:57.799 --> 00:09:01.679 from math, from partitions. Okay. You try to identify groups 188 00:09:01.799 --> 00:09:04.679 or classes of inputs that the program should treat exactly 189 00:09:04.720 --> 00:09:07.840 the same way. If you put in three, four, or five, 190 00:09:08.039 --> 00:09:10.440 and the code follows the exact same logic path for 191 00:09:10.480 --> 00:09:12.600 all of them, they form an equivalence class. 192 00:09:12.799 --> 00:09:14.879 Ah, so you don't need to test all three. Exactly. 193 00:09:14.919 --> 00:09:17.919 The assumption is if you test one representative value from 194 00:09:17.960 --> 00:09:20.960 that class, say four, it tells you how the program 195 00:09:21.000 --> 00:09:23.240 behaves for all the other values in that class, three 196 00:09:23.240 --> 00:09:23.639 and five. 197 00:09:23.759 --> 00:09:25.440 That sounds much more efficient. 198 00:09:25.240 --> 00:09:28.919 Hugely efficient. It aims for completeness, every possible input belongs 199 00:09:29.000 --> 00:09:32.840 to some class, and non redundancy, no input belongs to 200 00:09:32.919 --> 00:09:36.000 more than one class. Ideally you get great coverage with 201 00:09:36.080 --> 00:09:36.960 fewer tests. 202 00:09:37.440 --> 00:09:40.200 Are there different types, like with boundary testing? 203 00:09:40.240 --> 00:09:44.600 Yes, similar ideas. 
Weak normal tests one value from each 204 00:09:44.720 --> 00:09:49.159 valid class, Strong normal tests combinations of valid classes, and 205 00:09:49.200 --> 00:09:52.279 then weak robust and strong robust ad testing for the 206 00:09:52.320 --> 00:09:55.559 invalid equivalence classes inputs that should cause an error. 207 00:09:55.720 --> 00:09:59.399 So boundary values handle the edges. Equivalence classes handle the 208 00:09:59.440 --> 00:10:03.519 broad range efficiently. What if the logic itself is really complicated, 209 00:10:03.600 --> 00:10:06.360 lots of nested if statements or complex conditions. 210 00:10:06.399 --> 00:10:09.320 That's where decision table based testing shines. It's a very 211 00:10:09.399 --> 00:10:12.679 rigorous logical way to approach complex decision logic. 212 00:10:12.759 --> 00:10:14.440 How does it work? You literally build a. 213 00:10:14.440 --> 00:10:17.240 Table you do you list all the conditions, the inputs 214 00:10:17.320 --> 00:10:19.720 or system states that affect the decision, and all the 215 00:10:19.799 --> 00:10:23.120 possible actions what the software should do. Then you systematically 216 00:10:23.159 --> 00:10:27.480 map out every possible combination of condition outcomes true, false, 217 00:10:27.919 --> 00:10:29.039 and the corresponding action. 218 00:10:29.200 --> 00:10:30.440 Sounds like it could get huge. 219 00:10:30.679 --> 00:10:33.960 It can initially, but the real power comes when you 220 00:10:34.039 --> 00:10:39.600 analyze the table. You often find don't care conditions, situations 221 00:10:39.639 --> 00:10:42.519 where the outcome of one condition doesn't actually matter if 222 00:10:42.559 --> 00:10:43.679 another condition is met. 223 00:10:43.879 --> 00:10:45.320 Ah, So you can simplify the. 224 00:10:45.240 --> 00:10:49.799 Table exactly you collapse rules. 
Jorgensen uses the NextDate 225 00:10:49.840 --> 00:10:53.320 function example, a function to calculate the date after a 226 00:10:53.320 --> 00:10:56.320 given date. The initial table might seem to have hundreds 227 00:10:56.360 --> 00:10:59.720 of rules when you consider day, month, year, leap year rules. 228 00:10:59.799 --> 00:11:01.840 Yeah, yeah, that sounds complicated. 229 00:11:01.360 --> 00:11:04.759 But by using decision tables and identifying those don't care 230 00:11:04.879 --> 00:11:07.679 conditions, they could reduce it down to just a handful 231 00:11:07.720 --> 00:11:10.559 of essential test cases that covered all the logic. It 232 00:11:10.639 --> 00:11:13.799 shows how testing can actually improve the program's design by 233 00:11:13.799 --> 00:11:14.960 clarifying the logic. 234 00:11:15.360 --> 00:11:19.840 Testing clarifying the code, not just finding bugs. I like that. Okay, 235 00:11:19.840 --> 00:11:22.279 so we've looked at inputs and logic. What about the 236 00:11:22.279 --> 00:11:24.440 actual flow of the code, the paths. 237 00:11:24.080 --> 00:11:28.159 It takes? Right, that's path testing. Here the craftsman shifts focus 238 00:11:28.200 --> 00:11:30.000 to the control flow through the program. 239 00:11:30.159 --> 00:11:31.080 How do you visualize that? 240 00:11:31.519 --> 00:11:35.279 You often use program graphs. Think of nodes as chunks 241 00:11:35.320 --> 00:11:38.799 of code, like statement fragments, and edges as the flow 242 00:11:38.799 --> 00:11:41.879 of control between them, like an if statement creating a branch. 243 00:11:42.600 --> 00:11:45.840 We often look at DD-paths, decision to decision. 244 00:11:45.480 --> 00:11:47.639 Paths. Okay, paths between decisions. 245 00:11:47.679 --> 00:11:50.759 The goal is to design tests that exercise different paths 246 00:11:50.840 --> 00:11:53.799 through this graph. 
There are different levels of coverage you 247 00:11:53.879 --> 00:11:58.000 might aim for, Like what the simplest is node coverage 248 00:11:58.120 --> 00:12:01.679 or statement coverage. Just make making sure every single statement 249 00:12:01.720 --> 00:12:04.039 in the code gets executed at least once by some. 250 00:12:04.360 --> 00:12:06.600 Test seems like a minimum baseline. 251 00:12:06.679 --> 00:12:10.919 It is stronger is edge coverage or branch coverage. Making 252 00:12:10.960 --> 00:12:15.000 sure every possible outcome of every decision, like the shrewd 253 00:12:15.159 --> 00:12:18.360 and false branches of an if statement, gets executed at 254 00:12:18.440 --> 00:12:18.879 least once. 255 00:12:19.039 --> 00:12:21.080 That sounds more thorough it generally is. 256 00:12:21.159 --> 00:12:23.879 And this is where we often hear about cyclomatic complexity 257 00:12:24.039 --> 00:12:24.600 right VG. 258 00:12:24.879 --> 00:12:26.600 What does that number actually tell us? 259 00:12:26.679 --> 00:12:29.679 It's a metric calculated from the program graph number of 260 00:12:29.759 --> 00:12:32.559 edges minus number of nodes plus one. Essentially, it gives 261 00:12:32.559 --> 00:12:34.919 you the number of independent paths through the code. The 262 00:12:35.039 --> 00:12:38.679 independent paths basically the minimum number of paths you'd need 263 00:12:38.720 --> 00:12:41.519 to test to ensure you've covered every edge at least once. 264 00:12:42.120 --> 00:12:45.000 It's a measure of the code's structural complexity. 265 00:12:45.600 --> 00:12:46.879 Is there a rule of thumb for it? 266 00:12:47.360 --> 00:12:50.679 A common guideline is that if the cyclomatic complexity HERMBG 267 00:12:51.080 --> 00:12:54.519 gets above ten for a single function or module, that 268 00:12:54.639 --> 00:12:58.639 code is getting pretty complex. 
It'll likely be harder to understand, 269 00:12:58.759 --> 00:13:01.879 harder to test, and potentially more prone to errors. 270 00:13:02.200 --> 00:13:05.639 So it's a warning sign for developers and testers. Exactly. 271 00:13:06.000 --> 00:13:10.080 It suggests maybe breaking the code down into smaller, simpler pieces. 272 00:13:10.600 --> 00:13:12.720 But wait, can you have paths in the graph that 273 00:13:12.799 --> 00:13:15.320 look possible but you can't actually execute them? 274 00:13:15.519 --> 00:13:18.840 Absolutely. Those are called infeasible paths. It's a major headache 275 00:13:18.840 --> 00:13:21.759 in path testing. Why would that happen? Because the structure of 276 00:13:21.799 --> 00:13:25.679 the graph doesn't always capture the semantic dependencies. Maybe path 277 00:13:25.720 --> 00:13:29.120 A sets a variable to true and path B requires 278 00:13:29.120 --> 00:13:31.840 that variable to be false. You can draw the path, 279 00:13:31.919 --> 00:13:34.159 but you can never actually make the program follow it. 280 00:13:34.440 --> 00:13:37.440 Designing tests for infeasible paths is wasted effort. 281 00:13:37.759 --> 00:13:41.120 Tricky, and you mentioned something even more rigorous for critical systems. 282 00:13:41.440 --> 00:13:45.320 Yes, for safety critical stuff like aviation software level A, 283 00:13:45.759 --> 00:13:49.879 they often require modified condition decision coverage, or MC/DC. 284 00:13:50.159 --> 00:13:50.960 What does that involve? 285 00:13:51.080 --> 00:13:54.799 It's pretty intense. For every decision with multiple conditions, like 286 00:13:55.080 --> 00:13:57.639 A and B or C, you need test cases that 287 00:13:57.720 --> 00:14:01.720 show that each individual condition A, B, and C can 288 00:14:01.799 --> 00:14:05.159 independently affect the outcome of the entire decision, while the 289 00:14:05.240 --> 00:14:06.799 other conditions are held constant. 
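The independence requirement can be explored by brute force on a three-condition decision. In this sketch the spoken "A and B or C" is read as `a and (b or c)`, which is an assumption about the grouping; the helper name `independence_pairs` is hypothetical. For each condition it lists pairs of test vectors that differ only in that condition and produce different decision outcomes.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Vector = Tuple[bool, ...]


def decision(a: bool, b: bool, c: bool) -> bool:
    """Assumed grouping of the example decision: A and (B or C)."""
    return a and (b or c)


def independence_pairs(fn: Callable[..., bool], n: int) -> Dict[int, List[Tuple[Vector, Vector]]]:
    """For each condition index, find vector pairs that differ only in that
    condition and flip the decision outcome (the core MC/DC requirement)."""
    pairs: Dict[int, List[Tuple[Vector, Vector]]] = {}
    for i in range(n):
        for combo in product([False, True], repeat=n):
            if combo[i]:
                continue                       # visit each pair once, from the False side
            flipped = combo[:i] + (True,) + combo[i + 1:]
            if fn(*combo) != fn(*flipped):
                pairs.setdefault(i, []).append((combo, flipped))
    return pairs


pairs = independence_pairs(decision, 3)
for index, found in pairs.items():
    print("condition", "ABC"[index], found)
```

Picking one pair per condition and sharing vectors where possible gives four tests here, for example (T,F,F), (T,T,F), (T,F,T) and (F,T,F), compared with eight for exhaustive testing.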
290 00:14:06.879 --> 00:14:10.120 Wow, that's ensuring every part of the logic really matters. Precisely. 291 00:14:10.320 --> 00:14:13.639 It's about preventing situations where a condition seems to be 292 00:14:13.720 --> 00:14:18.480 tested but its effect is masked by other conditions. Extremely thorough. 293 00:14:18.639 --> 00:14:21.320 Okay, so path testing covers the flow, but what about 294 00:14:21.320 --> 00:14:24.000 the data itself? What happens to variables as they move 295 00:14:24.039 --> 00:14:24.840 along these paths? 296 00:14:24.919 --> 00:14:28.200 Excellent question. That brings us to data flow testing. This 297 00:14:28.360 --> 00:14:31.759 technique shifts the focus from just the control flow paths 298 00:14:32.080 --> 00:14:34.759 to the life cycle of variables within those paths. 299 00:14:34.879 --> 00:14:35.279 Life cycle? 300 00:14:35.440 --> 00:14:38.559 Yeah, where does a variable get defined, get a value, 301 00:14:38.720 --> 00:14:40.799 and where does it get used, its value is read. 302 00:14:41.360 --> 00:14:44.279 Data flow testing looks for paths between a definition of 303 00:14:44.320 --> 00:14:47.120 a variable and a subsequent use of that same variable. 304 00:14:47.159 --> 00:14:48.879 These are called definition use. 305 00:14:48.840 --> 00:14:50.960 Paths, or DU-paths. Why is that important? 306 00:14:51.279 --> 00:14:54.600 Well, it helps catch errors like using a variable before 307 00:14:54.639 --> 00:14:57.559 it's been initialized, or defining a variable and then never 308 00:14:57.600 --> 00:15:00.519 actually using its value. It acts as a kind of 309 00:15:00.559 --> 00:15:02.799 reality check on pure path. 310 00:15:02.600 --> 00:15:05.320 Testing, so it connects the control flow with what's actually 311 00:15:05.320 --> 00:15:06.960 happening to the data. Exactly. 
312 00:15:07.279 --> 00:15:11.320 There's a whole hierarchy of data flow coverage criteria, like all-defs, 313 00:15:11.440 --> 00:15:14.799 test at least one path from every definition; all-uses, 314 00:15:14.879 --> 00:15:18.000 test paths to every use; all-DU-paths, test 315 00:15:18.039 --> 00:15:21.639 every simple definition use path. It's particularly good for object 316 00:15:21.720 --> 00:15:24.200 oriented code where data interactions can be complex. 317 00:15:24.480 --> 00:15:27.039 Makes sense, okay. One more in the unit testing toolbox, 318 00:15:27.159 --> 00:15:29.159 program slicing. This sounds different. 319 00:15:29.399 --> 00:15:32.600 It is a bit different, but incredibly useful, especially for 320 00:15:32.679 --> 00:15:37.039 debugging and understanding code. A slice of a program relative 321 00:15:37.080 --> 00:15:39.879 to a specific variable at a specific point is the 322 00:15:39.919 --> 00:15:43.519 subset of program statements that could possibly affect the value 323 00:15:43.559 --> 00:15:44.919 of that variable at that point. 324 00:15:45.000 --> 00:15:47.919 So it's like highlighting only the relevant code. Exactly. 325 00:15:48.440 --> 00:15:52.480 You can do a backward slice, starting from a variable, 326 00:15:52.919 --> 00:15:56.080 trace back everything that could have influenced its value, or 327 00:15:56.159 --> 00:15:58.919 a forward slice, starting from where a variable is defined, 328 00:15:59.120 --> 00:16:01.720 see everything that it could possibly influence later on. 329 00:16:01.879 --> 00:16:04.080 I can see how that would help with debugging. Focuses 330 00:16:04.080 --> 00:16:05.440 your attention. Tremendously. 331 00:16:05.639 --> 00:16:08.320 It helps eliminate all the irrelevant detail and lets the 332 00:16:08.360 --> 00:16:12.120 craftsman focus precisely where the problem might be. 
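Backward and forward slices can be illustrated on a hand-written dependence map. This is a simplified sketch: the six-statement program and the `DEPENDS_ON` map are hypothetical, and only data dependences are tracked, whereas a real slicer would derive both data and control dependences from the code itself.

```python
from typing import Dict, Set

# Hypothetical six-statement program:
#   1: price = read()
#   2: qty = read()
#   3: tax_rate = 0.2
#   4: subtotal = price * qty
#   5: banner = "thanks"
#   6: total = subtotal * (1 + tax_rate)
# Each statement maps to the statements whose definitions it uses.
DEPENDS_ON: Dict[int, Set[int]] = {
    1: set(), 2: set(), 3: set(), 4: {1, 2}, 5: set(), 6: {3, 4},
}


def backward_slice(stmt: int, depends_on: Dict[int, Set[int]]) -> Set[int]:
    """All statements that could affect the value computed at stmt."""
    seen: Set[int] = set()
    stack = [stmt]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on[current])
    return seen


def forward_slice(stmt: int, depends_on: Dict[int, Set[int]]) -> Set[int]:
    """All statements that could be affected by the definition at stmt."""
    affects: Dict[int, Set[int]] = {s: set() for s in depends_on}
    for user, sources in depends_on.items():
        for source in sources:
            affects[source].add(user)          # invert the dependence edges
    return backward_slice(stmt, affects)       # same traversal on the inverted map


print(sorted(backward_slice(6, DEPENDS_ON)))   # [1, 2, 3, 4, 6]
print(sorted(forward_slice(1, DEPENDS_ON)))    # [1, 4, 6]
```

Statement 5 appears in neither slice, which is the point made above: a slice leaves out detail that cannot matter to the variable in question.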
Jorgensen even 333 00:16:12.159 --> 00:16:15.720 suggests that developing programs in terms of compilable slices could 334 00:16:15.759 --> 00:16:18.919 be a powerful way to build and understand complex software. 335 00:16:19.240 --> 00:16:21.919 Interesting idea. Okay, so that's a lot of techniques just 336 00:16:21.960 --> 00:16:26.799 for unit testing, boundary values, equivalence classes, decision tables, path testing, 337 00:16:26.879 --> 00:16:29.960 data flow slicing. How does the craftsmen put it all together? 338 00:16:30.000 --> 00:16:32.720 You mentioned Jorgensen's testing pendulum earlier, right. 339 00:16:32.679 --> 00:16:35.879 Let's bring that back. It's that metaphor of testing swinging 340 00:16:35.919 --> 00:16:39.759 between the specification based view, the black box. 341 00:16:39.559 --> 00:16:41.840 High level functional requirement. 342 00:16:41.320 --> 00:16:44.440 Focused and the code based view, the white box. 343 00:16:44.279 --> 00:16:47.399 Low level structural implementation focused. 344 00:16:47.320 --> 00:16:50.519 And the pendulum highlights that neither extreme is sufficient on 345 00:16:50.559 --> 00:16:54.799 its own. The real skill the craft lies in the combination. 346 00:16:55.399 --> 00:16:56.360 How does that play out. 347 00:16:56.360 --> 00:16:59.480 Practically, A common approach is to start by choosing a 348 00:16:59.480 --> 00:17:04.519 specification based technique, maybe equivalence classes or boundary values, to 349 00:17:04.599 --> 00:17:07.160 define an initial set of tests based on what the 350 00:17:07.160 --> 00:17:07.880 system should do. 351 00:17:08.079 --> 00:17:10.119 Okay, get the functional coverage first. 352 00:17:10.599 --> 00:17:13.559 Then you run those tests and use code coverage tools 353 00:17:13.559 --> 00:17:16.160 which come from the code based world, to measure which 354 00:17:16.200 --> 00:17:18.119 parts of the code were actually executed. 
355 00:17:18.480 --> 00:17:21.279 Ah, and that measurement reveals the gaps. Exactly. 356 00:17:21.519 --> 00:17:23.400 It shows you the parts of the code your spec 357 00:17:23.480 --> 00:17:27.400 based tests didn't reach. It might also reveal redundancies, if 358 00:17:27.440 --> 00:17:30.759 multiple tests exercise the exact same code path. Then you 359 00:17:30.799 --> 00:17:33.640 can design additional tests, perhaps using path or data flow 360 00:17:33.680 --> 00:17:36.119 ideas, specifically to fill those gaps. 361 00:17:36.240 --> 00:17:38.559 So it's an iterative refinement using both perspectives. 362 00:17:38.880 --> 00:17:42.960 Precisely, you leverage the strengths of both. Jorgensen uses an 363 00:17:42.960 --> 00:17:45.880 insurance premium case study to show this, how you'd pick 364 00:17:45.960 --> 00:17:49.519 different techniques based on whether variables are physical quantities or 365 00:17:49.559 --> 00:17:53.480 logical flags, whether faults are likely independent or interacting, and 366 00:17:53.519 --> 00:17:57.559 how to handle exceptions. It really emphasizes choosing the right 367 00:17:57.640 --> 00:18:00.680 tool or combination of tools for this job. 368 00:18:01.240 --> 00:18:04.079 The hallmark of a craftsman. Okay, let's zoom out. Now 369 00:18:04.119 --> 00:18:06.799 we've covered the unit level in detail. How does testing 370 00:18:06.960 --> 00:18:10.440 fit into the bigger picture across different software development life cycles? 371 00:18:10.519 --> 00:18:11.119 Good question. 372 00:18:11.480 --> 00:18:14.000 The traditional waterfall model, as we touched on with the 373 00:18:14.079 --> 00:18:17.759 V model, was very linear. Design everything first, then build it, 374 00:18:17.880 --> 00:18:21.279 then test it in stages: unit, integration, system. It kind 375 00:18:21.319 --> 00:18:23.960 of assumed you could know everything perfectly upfront. 376 00:18:23.559 --> 00:18:25.480 Which rarely happens in reality. 
377 00:18:25.160 --> 00:18:30.000 Exactly, so iterative models emerged, things like incremental development, evolutionary prototyping, 378 00:18:30.079 --> 00:18:33.559 the spiral model. The big shift there was moving away 379 00:18:33.559 --> 00:18:36.000 from doing everything in one big chunk to doing things 380 00:18:36.000 --> 00:18:39.039 in smaller cycles, building and testing parts of the system, 381 00:18:39.319 --> 00:18:40.960 getting feedback, and then. 382 00:18:40.920 --> 00:18:43.920 Iterating. More flexible, more adaptive. 383 00:18:43.720 --> 00:18:46.759 Much more. And this really paved the way for agile testing. 384 00:18:47.000 --> 00:18:50.240 What are the key characteristics of testing in an agile world? 385 00:18:50.400 --> 00:18:54.319 Agile testing is fundamentally driven by customer needs and feedback. 386 00:18:54.640 --> 00:18:58.559 It's typically bottom up, focusing on delivering working software components 387 00:18:58.599 --> 00:19:03.799 early and often, and it absolutely embraces changing requirements. 388 00:19:03.880 --> 00:19:04.920 Flexibility is key. 389 00:19:05.000 --> 00:19:08.839 Absolutely. Two big examples you hear about are Extreme Programming, 390 00:19:09.400 --> 00:19:13.359 XP, and Test driven development, TDD. 391 00:19:13.480 --> 00:19:14.920 That's the one where we write the test first. 392 00:19:14.759 --> 00:19:17.960 Right, that's the one. It sounds backward, but it's powerful. 393 00:19:18.640 --> 00:19:23.519 In TDD, developers work in very short cycles. First, write 394 00:19:23.559 --> 00:19:26.640 an automated test case for a tiny piece of functionality 395 00:19:26.640 --> 00:19:29.519 that doesn't exist yet. The test will obviously. 396 00:19:29.119 --> 00:19:30.880 Fail because the code isn't there, right. 
397 00:19:31.319 --> 00:19:33.920 Then write the minimum amount of code needed to make that 398 00:19:34.000 --> 00:19:38.240 test pass. Okay. And then, importantly, refactor the code, clean 399 00:19:38.279 --> 00:19:41.559 it up, improve its design, while continually rerunning all the 400 00:19:41.640 --> 00:19:43.200 tests to make sure nothing broke. 401 00:19:43.759 --> 00:19:46.000