WEBVTT 1 00:00:00.160 --> 00:00:03.080 Welcome to the deep dive. Today, we're jumping into the 2 00:00:03.120 --> 00:00:07.519 Linux kernel, specifically how it handles hardware. If you've ever 3 00:00:07.559 --> 00:00:10.599 wanted to write a device driver, or, you know, just 4 00:00:10.679 --> 00:00:13.800 understand how your keyboard actually talks to your OS, this 5 00:00:13.919 --> 00:00:16.679 is for you. We're building the conceptual map. Exactly. 6 00:00:16.760 --> 00:00:20.039 Our mission today is really about the architecture. We've looked 7 00:00:20.039 --> 00:00:24.600 at some pretty comprehensive guides on kernel and driver development, 8 00:00:25.039 --> 00:00:27.480 and we want to quickly lay out that core blueprint. 9 00:00:27.640 --> 00:00:29.320 You know, what do you have to know? What's different? 10 00:00:29.359 --> 00:00:31.359 How does the kernel see the world of devices? 11 00:00:31.519 --> 00:00:35.600 Okay, so starting basic: the device driver, it's fundamentally a 12 00:00:35.600 --> 00:00:37.200 translator, right? Sitting in the middle. 13 00:00:37.240 --> 00:00:40.159 It's the crucial link. Yeah, it's that specialized code connecting 14 00:00:40.200 --> 00:00:43.479 your apps, your user space stuff, to the actual physical 15 00:00:43.479 --> 00:00:45.560 hardware, and it all goes through the kernel. You could 16 00:00:45.560 --> 00:00:48.200 almost say it's the only bit of software allowed to, 17 00:00:48.240 --> 00:00:50.359 like, directly touch the hardware ports. 18 00:00:50.560 --> 00:00:53.039 Got it. Okay, so let's start where most drivers start, 19 00:00:53.479 --> 00:00:55.920 the kernel module. This is how Linux stays flexible, right? 20 00:00:56.000 --> 00:00:57.560 Lets you add stuff without a 21 00:00:57.479 --> 00:01:01.799 full reboot. Precisely. Modules let you extend the kernel dynamically 22 00:01:02.159 --> 00:01:07.239 at run time, and our sources
definitely emphasize this isn't 23 00:01:07.280 --> 00:01:10.959 for absolute beginners. You need solid C skills and to 24 00:01:10.959 --> 00:01:13.280 be comfortable on the Linux command line before you really 25 00:01:13.280 --> 00:01:14.640 dive into writing kernel code. 26 00:01:14.799 --> 00:01:18.040 Right. So, if we're building one, we need the basic structure, 27 00:01:18.480 --> 00:01:21.120 an entry point, and ideally an exit point. 28 00:01:21.200 --> 00:01:25.200 That's the core. You use module_init. That declares 29 00:01:25.239 --> 00:01:28.000 the function that runs when your module gets loaded, you 30 00:01:28.000 --> 00:01:31.680 know, with insmod or modprobe. And then module_exit declares 31 00:01:31.719 --> 00:01:34.079 the cleanup function for when you unload it with rmmod. 32 00:01:34.200 --> 00:01:37.000 Okay, now here's where it gets interesting compared to normal programming: 33 00:01:37.400 --> 00:01:41.640 memory optimization. Yeah, the kernel uses these macros, __init and __exit. 34 00:01:42.159 --> 00:01:43.439 Why are they so important? Ah, 35 00:01:43.519 --> 00:01:45.719 yeah, they're not just comments. They're instructions for the 36 00:01:45.760 --> 00:01:49.200 compiler and linker. Really important. When you mark your init 37 00:01:49.280 --> 00:01:52.000 function with __init, you're telling the linker: put this 38 00:01:52.159 --> 00:01:53.840 code in a special memory section. 39 00:01:54.000 --> 00:01:55.480 And what's special about that section? 40 00:01:55.879 --> 00:01:58.799 Well, if your module is built into the kernel, statically compiled, 41 00:01:59.200 --> 00:02:02.319 the kernel actually frees that memory once the init function 42 00:02:02.439 --> 00:02:04.280 is done, because, I mean, it's never going to call 43 00:02:04.319 --> 00:02:06.400 that init function again until the next reboot, right? 44 00:02:06.680 --> 00:02:09.120 So why keep the code around? It's a smart optimization.
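The section trick just described can be tried outside the kernel. What follows is a userspace sketch only, assuming a Linux ELF toolchain (GCC or Clang with GNU ld or lld); it is not kernel code. The macro demo_init, the section name demo_init_text, and every function name are made up for illustration. It shows a marked function landing in its own named section, which is what would let a loader discard that code as a unit.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __init marker: place the function's
 * code in a dedicated ELF section. "used" and "noinline" keep an out-of-line
 * copy so the section is never empty. */
#define demo_init __attribute__((section("demo_init_text"), used, noinline))

/* GNU ld and lld define these bounds automatically for any section whose
 * name is a valid C identifier. */
extern const char __start_demo_init_text[];
extern const char __stop_demo_init_text[];

/* Marked function: its code goes into demo_init_text. */
int demo_init demo_setup_once(void)
{
    return 0; /* 0 means success, as for a module init function */
}

/* Unmarked function: its code stays in the normal text section. */
int demo_ordinary(void)
{
    return 1;
}

/* Returns 1 if addr lies inside the demo_init_text section, else 0. */
static int demo_in_init_section(uintptr_t addr)
{
    return addr >= (uintptr_t)__start_demo_init_text &&
           addr < (uintptr_t)__stop_demo_init_text;
}

int demo_setup_is_in_init_section(void)
{
    return demo_in_init_section((uintptr_t)&demo_setup_once);
}

int demo_ordinary_is_in_init_section(void)
{
    return demo_in_init_section((uintptr_t)&demo_ordinary);
}
```

Calling demo_setup_is_in_init_section() and demo_ordinary_is_in_init_section() shows which function ended up inside the special section. In the kernel it is the boot code, not the module author, that releases the init section once initialization has finished.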
45 00:02:09.319 --> 00:02:10.080 Use it, lose it. 46 00:02:10.360 --> 00:02:13.520 That is clever, proactively cleaning up its own startup code. 47 00:02:13.879 --> 00:02:17.719 And __exit, is that the flip side? Pretty much, yeah. 48 00:02:17.840 --> 00:02:21.400 __exit tells the toolchain to just leave out the exit 49 00:02:21.439 --> 00:02:24.319 function's code entirely if the module is built in, because 50 00:02:24.319 --> 00:02:26.240 if it's built in, you can't unload it anyway, so 51 00:02:26.639 --> 00:02:28.400 it saves space, removes code you'd never run. 52 00:02:28.479 --> 00:02:33.919 Okay, moving to metadata. Every module needs info like MODULE_AUTHOR, 53 00:02:34.439 --> 00:02:37.639 but the big one seems to be MODULE_LICENSE. Why 54 00:02:37.719 --> 00:02:39.879 is that one so sensitive? 55 00:02:40.000 --> 00:02:43.120 It really is. It's about the GPL and the kernel's ecosystem. 56 00:02:43.479 --> 00:02:46.719 You must declare a GPL-compatible license using MODULE_LICENSE, 57 00:02:47.280 --> 00:02:49.840 especially if you want your module to use certain kernel functions, 58 00:02:50.039 --> 00:02:53.319 ones that are specifically exported only for GPL modules using 59 00:02:53.360 --> 00:02:54.520 EXPORT_SYMBOL_GPL. 60 00:02:54.599 --> 00:02:57.360 And if you don't, or if you use a proprietary 61 00:02:56.759 --> 00:02:59.479 license, then boom, your kernel gets marked as tainted. The kernel 62 00:02:59.520 --> 00:03:02.400 sets a flag. It basically says: warning, non-open or 63 00:03:02.520 --> 00:03:05.400 untrusted code has been loaded. If you then get a 64 00:03:05.400 --> 00:03:09.599 crash, a kernel panic, well, good luck getting help from 65 00:03:09.639 --> 00:03:12.800 the community. They'll likely see the taint flag and, you know, 66 00:03:12.919 --> 00:03:15.199 say they can't support it because proprietary code might be 67 00:03:15.240 --> 00:03:17.120 the cause. It's a serious line. 68 00:03:17.240 --> 00:03:19.960 Okay, that definitely sets the stage.
Now let's make that 69 00:03:20.000 --> 00:03:24.759 mental shift. We're not writing userspace programs anymore. We're inside 70 00:03:24.759 --> 00:03:27.759 the kernel. The rules change. The safety nets? 71 00:03:27.960 --> 00:03:31.400 Gone, completely gone. In user space, if something goes terribly wrong, 72 00:03:31.479 --> 00:03:34.639 your program crashes, the OS cleans up, fine. In the kernel, 73 00:03:34.919 --> 00:03:38.599 no way. If you allocate memory, grab an I/O port, whatever, 74 00:03:39.199 --> 00:03:41.680 you must clean it up yourself before your function returns. 75 00:03:41.719 --> 00:03:44.759 Fail to do that and you risk leaks, instability, maybe 76 00:03:44.800 --> 00:03:45.840 even a full system crash. 77 00:03:45.919 --> 00:03:48.560 So error handling is totally different. No more just returning 78 00:03:48.680 --> 00:03:50.360 zero for success and one for failure. 79 00:03:50.560 --> 00:03:53.599 Right. The standard is really strict. Kernel functions that interact 80 00:03:53.639 --> 00:03:57.680 with system calls, they must return errors as negative values, 81 00:03:57.840 --> 00:04:00.719 like return minus the error code. So you'll see -EIO 82 00:04:00.879 --> 00:04:03.199 for an I/O error or -ENOMEM if you couldn't 83 00:04:03.199 --> 00:04:05.759 get memory. It's a clear convention, so errors get passed 84 00:04:05.840 --> 00:04:06.960 up correctly. And 85 00:04:06.919 --> 00:04:09.280 to manage that cleanup, especially if you have, say, five 86 00:04:09.319 --> 00:04:12.560 steps that work, then the sixth one fails, the kernel 87 00:04:12.560 --> 00:04:17.480 style actually recommends using goto. That sounds controversial. It does. 88 00:04:17.560 --> 00:04:19.759 It goes against a lot of standard C teaching, but 89 00:04:19.879 --> 00:04:23.040 in the kernel it's pragmatic. You use goto strictly 90 00:04:23.079 --> 00:04:26.399 for error cleanup paths.
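Here is a minimal userspace sketch of the negative error code convention combined with goto-based cleanup. It uses malloc and free in place of kernel allocators, and the struct, the function names, and the fail_step parameter are invented for the demonstration. The point is only the shape: one label per acquired resource, unwound in reverse order.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device state holding two buffers. */
struct demo_dev {
    char *rx_buf;
    char *tx_buf;
};

/* fail_step is a made-up test knob: 0 = everything succeeds,
 * 1, 2 or 3 = pretend that step failed. Returns 0 or a negative errno. */
int demo_setup(struct demo_dev *dev, int fail_step)
{
    int ret;

    dev->rx_buf = (fail_step == 1) ? NULL : malloc(64);
    if (!dev->rx_buf) {
        ret = -ENOMEM;
        goto out;               /* nothing acquired yet */
    }

    dev->tx_buf = (fail_step == 2) ? NULL : malloc(64);
    if (!dev->tx_buf) {
        ret = -ENOMEM;
        goto err_free_rx;       /* undo step 1 only */
    }

    if (fail_step == 3) {       /* pretend the hardware did not respond */
        ret = -EIO;
        goto err_free_tx;       /* undo step 2, then step 1 */
    }

    return 0;

    /* Cleanup runs top to bottom, in reverse order of acquisition. */
err_free_tx:
    free(dev->tx_buf);
    dev->tx_buf = NULL;
err_free_rx:
    free(dev->rx_buf);
    dev->rx_buf = NULL;
out:
    return ret;
}

/* Releases everything a successful demo_setup() acquired. */
void demo_teardown(struct demo_dev *dev)
{
    free(dev->tx_buf);
    free(dev->rx_buf);
    dev->tx_buf = NULL;
    dev->rx_buf = NULL;
}
```

Whichever step fails, the caller gets a negative error code and a struct with nothing left allocated, without any nested if-else ladders.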
You set up labels like err_free_buffer, 91 00:04:26.439 --> 00:04:29.360 err_release_lock, maybe out. When an error happens, you 92 00:04:29.399 --> 00:04:31.920 goto the appropriate label. This jumps you straight into 93 00:04:31.920 --> 00:04:34.959 the sequence of cleanup steps needed, executed in reverse order 94 00:04:34.959 --> 00:04:35.519 of allocation. 95 00:04:35.639 --> 00:04:37.959 But doesn't that make the code harder to follow, all 96 00:04:38.000 --> 00:04:38.639 those jumps? 97 00:04:39.000 --> 00:04:41.160 You'd think so, but it actually tends to make it cleaner 98 00:04:41.160 --> 00:04:45.560 in this context. The alternative is deeply nested if-else blocks, 99 00:04:45.800 --> 00:04:48.360 which get really hard to read and are super easy 100 00:04:48.399 --> 00:04:51.839 to mess up, like forgetting a cleanup step inside one branch. 101 00:04:52.560 --> 00:04:55.879 The goto approach enforces a clear, linear cleanup path. 102 00:04:56.240 --> 00:04:59.040 It's considered the safest and most readable pattern for kernel 103 00:04:59.120 --> 00:05:00.959 error handling. Strange but true. 104 00:05:01.040 --> 00:05:03.240 Okay, I can see the logic there, given the stakes. 105 00:05:03.720 --> 00:05:06.120 What about functions that need to return a pointer but 106 00:05:06.199 --> 00:05:08.959 might also fail? You can't return both a pointer and 107 00:05:09.079 --> 00:05:10.800 a negative error code. Ah, 108 00:05:10.720 --> 00:05:13.560 classic C problem. The kernel has neat macros for this: 109 00:05:13.720 --> 00:05:17.120 ERR_PTR, IS_ERR, and PTR_ERR. If your function fails 110 00:05:17.120 --> 00:05:20.720 and needs to return, say, -EINVAL, invalid argument, instead of 111 00:05:20.759 --> 00:05:24.639 a pointer, it uses ERR_PTR(-EINVAL). This converts 112 00:05:24.639 --> 00:05:27.120 the error code into a special pointer value. The code 113 00:05:27.120 --> 00:05:29.759 calling that function then checks the returned pointer using IS_ERR.
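The error-pointer idea can be imitated in userspace. The sketch below is a re-creation for illustration, not the kernel's own definitions: the demo_ names and the lookup table are invented, and it assumes, as the kernel does, that the top 4095 addresses never hold a valid object. It covers all three operations, including the one that recovers the error code from the pointer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit on error numbers, mirroring the kernel's choice of 4095. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative error code as a pointer value. */
void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Recover the negative error code from an error pointer. */
long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if ptr falls in the top DEMO_MAX_ERRNO addresses, where small
 * negative numbers land after conversion to a pointer. */
int demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

/* Hypothetical lookup that returns either a valid name or an error pointer. */
static const char *const demo_names[] = { "uart0", "uart1" };

const char *demo_lookup(int index)
{
    if (index < 0 || index >= 2)
        return demo_err_ptr(-EINVAL);
    return demo_names[index];
}
```

A caller checks demo_is_err() on the result first and only then either uses the pointer or extracts the code with demo_ptr_err(). NULL is not treated as an error by this check, so it stays available to mean "nothing found".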
114 00:05:30.120 --> 00:05:32.040 If it returns true, it means it's an error pointer. 115 00:05:32.360 --> 00:05:34.319 Then if you need the actual error code back, you 116 00:05:34.439 --> 00:05:37.399 use PTR_ERR on that pointer and it gives you -EINVAL. 117 00:05:37.560 --> 00:05:39.720 It avoids ambiguity with returning a NULL pointer. 118 00:05:39.800 --> 00:05:43.600 That's elegant. Okay, last bit on the programming shift: logging. 119 00:05:43.639 --> 00:05:46.240 We're supposed to move beyond just using printk, right? 120 00:05:46.399 --> 00:05:49.720 Yeah. While printk is still the underlying engine, the recommendation 121 00:05:49.839 --> 00:05:53.120 is strongly towards using specific helper functions. Don't just use 122 00:05:53.120 --> 00:05:57.360 printk with log levels directly. Use things like pr_err or 123 00:05:57.399 --> 00:06:02.120 pr_info for general module messages. But, and this is important, 124 00:06:02.399 --> 00:06:05.319 if you're in a device driver, use the dev_ versions, 125 00:06:05.639 --> 00:06:07.879 dev_err or dev_info, which take a struct device *dev. 126 00:06:08.120 --> 00:06:11.319 And why the dev_ prefix ones specifically for drivers? 127 00:06:11.199 --> 00:06:15.040 Because they automatically include context about the specific device the 128 00:06:15.040 --> 00:06:17.720 message is coming from, its name, its position in the system. 129 00:06:18.000 --> 00:06:21.160 Makes debugging way easier when you have multiple identical devices. 130 00:06:21.240 --> 00:06:24.199 There are even netdev_ versions for network drivers. They all tie 131 00:06:24.199 --> 00:06:26.000 the message to a specific kernel object. 132 00:06:26.079 --> 00:06:28.000 Got it. And if we want our module's messages to 133 00:06:28.000 --> 00:06:30.839 stand out in the flood of kernel logs, how do 134 00:06:30.879 --> 00:06:31.759 we add a prefix? 135 00:06:32.240 --> 00:06:36.199 Simple. You use the pr_fmt macro. You put #define 136 00:06:36.240 --> 00:06:39.480 pr_fmt(fmt)
KBUILD_MODNAME ": " fmt at the top 137 00:06:39.519 --> 00:06:42.160 of your source file, before the includes. KBUILD_MODNAME gets set 138 00:06:42.199 --> 00:06:44.800 automatically during the build to your module's name. So now 139 00:06:44.839 --> 00:06:48.319 every pr_info, pr_err, etc. automatically prints like mydriver: 140 00:06:48.800 --> 00:06:50.800 error occurred. Super helpful. 141 00:06:50.839 --> 00:06:53.319 Okay, we've got the rules of engagement down. Now the 142 00:06:53.360 --> 00:06:58.560 big danger zone: concurrency, especially on SMP systems, symmetric multiprocessing, 143 00:06:58.839 --> 00:07:01.600 where multiple CPUs can access the same memory, the 144 00:07:01.639 --> 00:07:05.759 same hardware, simultaneously. Locks are essential. 145 00:07:05.920 --> 00:07:09.399 This is absolutely where things get tricky, even for experienced devs. 146 00:07:09.839 --> 00:07:12.560 The key is understanding the context. Are you in code 147 00:07:12.560 --> 00:07:16.519 that can sleep, user context, or code that absolutely cannot sleep, 148 00:07:16.680 --> 00:07:18.560 like an interrupt handler, atomic context? 149 00:07:18.639 --> 00:07:20.959 Let's start with the fast ones. Spin locks, meant for 150 00:07:21.040 --> 00:07:22.680 short atomic operations. 151 00:07:22.439 --> 00:07:25.639 Exactly, very short critical sections. When a CPU grabs a 152 00:07:25.680 --> 00:07:28.839 standard spin lock using spin_lock, it disables preemption on 153 00:07:28.839 --> 00:07:31.560 that specific CPU, so other tasks on the same core 154 00:07:31.600 --> 00:07:34.439 won't preempt it. But, and this is the critical part, 155 00:07:34.519 --> 00:07:37.560 a standard spin lock does not stop hardware interrupts from 156 00:07:37.600 --> 00:07:38.560 firing on that CPU. 157 00:07:38.879 --> 00:07:42.519 Ah, and that's the classic deadlock scenario waiting to happen. Precisely.
158 00:07:42.560 --> 00:07:45.519 Imagine task A on CPU zero calls spin_lock to 159 00:07:45.560 --> 00:07:49.759 protect some data. Then bam, a hardware interrupt fires on CPU zero. 160 00:07:50.079 --> 00:07:53.519 The CPU jumps to the interrupt handler, the IRQ. Now, if 161 00:07:53.560 --> 00:07:56.000 that IRQ handler needs the same data and tries to 162 00:07:56.040 --> 00:07:59.920 acquire the same spin lock, the IRQ spins forever, waiting for 163 00:08:00.120 --> 00:08:02.480 task A to release the lock. But task A has been 164 00:08:02.480 --> 00:08:04.879 interrupted by the IRQ and can't run to release 165 00:08:04.600 --> 00:08:09.360 it. Ouch. Catastrophic. So the rule is, if data protected 166 00:08:09.360 --> 00:08:11.519 by a spin lock might also be touched by an 167 00:08:11.519 --> 00:08:13.120 interrupt handler, you need something more. 168 00:08:13.319 --> 00:08:16.360 Yes, unless you are absolutely one hundred percent certain that 169 00:08:16.439 --> 00:08:18.720 no interrupt handler will ever try to access that data 170 00:08:18.800 --> 00:08:22.360 or acquire that lock, you must use the IRQ-save variants. 171 00:08:22.600 --> 00:08:24.560 Those are functions like spin_lock_irqsave and 172 00:08:24.560 --> 00:08:27.600 spin_unlock_irqrestore. They do two things: disable preemption and disable 173 00:08:27.639 --> 00:08:30.480 hardware interrupts on the local CPU. This makes the critical 174 00:08:30.480 --> 00:08:33.039 section truly atomic on that core. It's a vital lesson. 175 00:08:33.080 --> 00:08:36.240 Okay, so spin locks are for short atomic contexts, possibly 176 00:08:36.360 --> 00:08:39.399 needing IRQ disabling. How do mutexes differ? 177 00:08:39.600 --> 00:08:43.360 Mutexes are conceptually simpler but have different rules. They're built 178 00:08:43.480 --> 00:08:46.799 using spin locks underneath, but their behavior is different.
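To make the spin lock deadlock concrete, here is a small userspace model built on C11 atomics. It is a sketch under stated assumptions: the demo_ names are invented, this is not the kernel's spinlock_t, and it cannot disable preemption or interrupts. The pretend interrupt handler uses a try-lock so that the demonstration terminates instead of actually hanging.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal test-and-set spin lock. */
struct demo_spinlock {
    atomic_flag locked;
};

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Busy-wait until the flag was previously clear; the waiter never sleeps. */
void demo_spin_lock(struct demo_spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ; /* spin */
}

/* Single attempt: returns true if the lock was taken. */
bool demo_spin_trylock(struct demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

void demo_spin_unlock(struct demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Models an interrupt handler running on the same CPU as the lock holder.
 * Returns true when a blocking demo_spin_lock() here would spin forever,
 * because the interrupted holder cannot run again to release the lock. */
bool demo_irq_handler_would_deadlock(struct demo_spinlock *l)
{
    if (demo_spin_trylock(l)) {
        demo_spin_unlock(l);
        return false;
    }
    return true;
}
```

With the lock held by the interrupted task, the handler's attempt fails, which is the situation the IRQ-save variants avoid by keeping the interrupt from running on that CPU until the lock is released.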
If 179 00:08:46.840 --> 00:08:49.720 task B tries to acquire a mutex held by task A, 180 00:08:50.279 --> 00:08:53.360 instead of spinning, task B is put to sleep. The 181 00:08:53.440 --> 00:08:55.360 kernel puts it on a wait queue and lets the 182 00:08:55.360 --> 00:08:56.639 CPU schedule some other 183 00:08:56.559 --> 00:09:00.399 task. So they yield the CPU. Much better for 184 00:09:00.559 --> 00:09:02.399 potentially longer critical sections, 185 00:09:02.440 --> 00:09:05.919 then? Yes, much better for contention and longer holds. But, 186 00:09:06.399 --> 00:09:09.039 and this is a huge but, because they involve sleeping, 187 00:09:09.320 --> 00:09:12.360 you can only use mutexes in contexts where sleeping is allowed. 188 00:09:12.639 --> 00:09:15.480 That means primarily user context, like when handling a system call. 189 00:09:15.799 --> 00:09:18.000 You can never acquire a mutex from an interrupt handler 190 00:09:18.200 --> 00:09:21.360 or any other atomic context, because those contexts absolutely cannot 191 00:09:21.360 --> 00:09:25.200 sleep. Using a mutex there? Boom, a kernel bug, possibly a system panic. Got it. 192 00:09:25.240 --> 00:09:28.519 Spin locks for atomic, mutexes for sleeping contexts. Let's briefly 193 00:09:28.519 --> 00:09:30.639 touch on time. We hear about jiffies, but things are 194 00:09:30.639 --> 00:09:31.480 more abstract now. 195 00:09:31.600 --> 00:09:33.639 Yeah, jiffies was the old way. It's the kernel 196 00:09:33.639 --> 00:09:36.399 counter that increments at a certain frequency defined by the 197 00:09:36.720 --> 00:09:40.600 HZ value. It gave fairly low-resolution timing. The 198 00:09:40.639 --> 00:09:43.639 modern kernel abstracts this. It relies on two types of 199 00:09:43.639 --> 00:09:44.960 hardware components for time. 200 00:09:45.279 --> 00:09:47.759 The things that track time and the things that schedule 201 00:09:47.799 --> 00:09:48.519 events in time. 202 00:09:48.679 --> 00:09:52.559 You got it.
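As a quick aside on why jiffies count as low resolution, the arithmetic can be sketched in a few lines. These helpers are illustrative only: the names are invented, HZ is passed in as a plain parameter, and this is not the kernel's own conversion code.

```c
/* Length of one tick in microseconds: a tick lasts 1/HZ seconds. */
unsigned long demo_tick_usecs(unsigned long hz)
{
    return 1000000UL / hz;
}

/* Convert milliseconds to ticks, rounding up so that a timeout
 * never expires earlier than requested. */
unsigned long demo_msecs_to_ticks(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999UL) / 1000UL;
}
```

With HZ at 250, a tick is 4000 microseconds, so a request for 1 ms and a request for 4 ms both come out as one tick. A tick-based timer cannot tell them apart, which is the gap that nanosecond-granularity timers close.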
First, clock source devices. These provide a 203 00:09:52.600 --> 00:09:56.639 high resolution, always increasing monotonic counter. Think of it as 204 00:09:56.639 --> 00:10:00.679 the master clock for accurate time stamps. Second clock event 205 00:10:00.759 --> 00:10:04.799 devices these are the programmable timers. They can be told 206 00:10:05.039 --> 00:10:09.639 fire and interrupt exactly n nanoseconds from now. They're crucial 207 00:10:09.720 --> 00:10:12.759 for things like high resolution timers or timers much more 208 00:10:12.840 --> 00:10:13.840 precise than Jiffy's. 209 00:10:13.879 --> 00:10:17.200 Okay, and before we leave this area, tasklts they were 210 00:10:17.200 --> 00:10:19.559 for deferring work from interrupts, right the bottom half. 211 00:10:19.679 --> 00:10:22.159 Yes, they were a common mechanism. Yeah, but the sources 212 00:10:22.200 --> 00:10:24.840 we looked at had a very very strong warning. Task 213 00:10:24.919 --> 00:10:28.200 lits are being actively deprecated. They're slated for removal. You 214 00:10:28.240 --> 00:10:30.440 should not use them in any new code. They really 215 00:10:30.480 --> 00:10:32.799 only exist now for historical pedagogic reasons. 216 00:10:32.840 --> 00:10:35.320 Okay, loud and clear. So for deferring work now, the 217 00:10:35.360 --> 00:10:36.480 answer is work cues. 218 00:10:36.720 --> 00:10:40.159 Absolutely use workques. They provide a much more flexible and 219 00:10:40.279 --> 00:10:43.039 robust way to q work to be executed by dedicated 220 00:10:43.080 --> 00:10:45.879 kernel threads. Those threads can sleep, so they can handle 221 00:10:45.960 --> 00:10:47.840 much longer, more complex tasks safely. 222 00:10:48.159 --> 00:10:51.679 Right. Let's zoom out now to the big picture, how 223 00:10:51.679 --> 00:10:55.240 the kernel organizes all this hardware, the Linux Device Model LDM. 
224 00:10:55.480 --> 00:10:57.840 This is how physical stuff turns into those directories we 225 00:10:57.879 --> 00:10:58.480 see in sysfs. 226 00:10:58.840 --> 00:11:02.120 LDM is the framework, the glue. It creates that hierarchy. 227 00:11:02.559 --> 00:11:06.600 Buses contain devices, drivers attach to devices, and it uses 228 00:11:06.679 --> 00:11:09.720 three core low-level structures. At the very bottom, the 229 00:11:09.759 --> 00:11:12.960 most fundamental piece is the kobject. A kobject isn't a 230 00:11:12.960 --> 00:11:16.000 device itself or a driver. It represents the connection, or 231 00:11:16.039 --> 00:11:19.519 an object that can appear in sysfs. Basically, every single 232 00:11:19.519 --> 00:11:23.399 directory you navigate in the sysfs virtual file system corresponds 233 00:11:23.440 --> 00:11:26.159 to a kobject in kernel memory. It's the basic building 234 00:11:26.200 --> 00:11:27.279 block of that hierarchy. 235 00:11:27.399 --> 00:11:30.360 Ah, okay. So kobject is the foundation for creating that 236 00:11:30.399 --> 00:11:34.399 topology and exposing attributes out to user space through sysfs. 237 00:11:34.519 --> 00:11:35.480 That clicks. 238 00:11:35.559 --> 00:11:38.559 It's fundamental. These kobjects are then grouped into ksets, which 239 00:11:38.600 --> 00:11:41.720 often correspond to directories containing multiple kobjects, and they have 240 00:11:41.759 --> 00:11:46.000 associated kobj_types, which define behavior like attribute handling. Together 241 00:11:46.039 --> 00:11:46.879 they build the whole 242 00:11:46.679 --> 00:11:49.480 structure. And within the structure we find different kinds of buses. 243 00:11:49.759 --> 00:11:54.720 Broadly, yeah, you have discoverable buses. Think PCI, USB. You 244 00:11:54.759 --> 00:11:57.039 plug something in, the bus itself can figure out what 245 00:11:57.080 --> 00:11:59.919 it is and tell the kernel. The hardware handles the enumeration.
246 00:12:00.360 --> 00:12:04.080 Then you have non-discoverable buses. The platform bus is a 247 00:12:04.120 --> 00:12:07.639 common example, especially in embedded systems, where peripherals are just 248 00:12:07.679 --> 00:12:11.200 hardwired to the CPU. The bus itself doesn't announce devices. 249 00:12:11.679 --> 00:12:13.399 For these, the kernel needs to be told that a 250 00:12:13.440 --> 00:12:17.039 device exists at a certain address or uses certain resources. 251 00:12:17.279 --> 00:12:19.799 This information needs to be provided somehow. 252 00:12:19.440 --> 00:12:22.200 Which brings us neatly to the modern way of providing 253 00:12:22.200 --> 00:12:25.759 that information, the device tree. This replaced the old hard-coded 254 00:12:25.759 --> 00:12:27.360 board files. Exactly. 255 00:12:27.720 --> 00:12:31.200 Device tree, or DT, is a huge step for portability. 256 00:12:31.799 --> 00:12:34.159 Instead of C code describing the hardware layout for one 257 00:12:34.200 --> 00:12:37.600 specific board, you use a textual format, .dts source files. 258 00:12:37.919 --> 00:12:41.360 These files describe the hardware using nodes representing devices, buses, 259 00:12:41.559 --> 00:12:45.039 and properties like memory addresses, interrupt numbers. This .dts 260 00:12:45.039 --> 00:12:47.960 file is compiled into a binary blob, the .dtb, 261 00:12:48.200 --> 00:12:51.039 or FDT, flattened device tree. The bootloader loads the 262 00:12:51.120 --> 00:12:53.399 .dtb into memory, and the kernel parses it at boot 263 00:12:53.399 --> 00:12:55.879 time to learn about the non-discoverable hardware present. 264 00:12:56.039 --> 00:13:00.399 What about adding hardware after boot or changing configurations dynamically? 265 00:13:00.720 --> 00:13:04.039 That's where DT overlays come in. These are smaller, partial 266 00:13:04.120 --> 00:13:07.799 .dtbo, device tree blob overlay, files. They act like patches.
267 00:13:08.039 --> 00:13:10.600 You can load a .dtbo at runtime, usually via 268 00:13:10.679 --> 00:13:14.120 configfs, and it modifies the live in-memory device tree, 269 00:13:14.360 --> 00:13:17.519 maybe enabling an interface, changing a pin configuration, adding a 270 00:13:17.559 --> 00:13:19.480 new I2C device. It's very powerful. 271 00:13:19.639 --> 00:13:21.639 So when my driver starts up and it needs to 272 00:13:21.679 --> 00:13:24.240 know its memory address or its interrupt number, how does 273 00:13:24.279 --> 00:13:26.360 it get that from the device tree reliably? 274 00:13:26.519 --> 00:13:29.879 It uses specific API calls to query the tree. Crucially, 275 00:13:29.960 --> 00:13:33.240 the device tree uses named properties, so instead of asking 276 00:13:33.279 --> 00:13:35.720 for the first interrupt, your driver asks for the interrupt 277 00:13:35.759 --> 00:13:39.320 by name. Functions like platform_get_resource_byname 278 00:13:39.399 --> 00:13:41.759 or of_property_read_u32 let the driver 279 00:13:41.840 --> 00:13:44.600 request resources by name as defined in the .dts. 280 00:13:44.960 --> 00:13:47.679 This decouples the driver from the exact physical wiring order. 281 00:13:47.840 --> 00:13:51.039 Makes sense. Okay, last piece, let's connect this architecture to 282 00:13:51.080 --> 00:13:54.600 a real-world subsystem. The example given was Industrial I/O, 283 00:13:54.879 --> 00:13:56.279 IIO, right? IIO is 284 00:13:56.279 --> 00:13:59.919 a great example. It's a whole kernel subsystem specifically designed 285 00:13:59.919 --> 00:14:04.399 for sensors, analog-to-digital converters, ADCs, digital-to-analog 286 00:14:04.399 --> 00:14:07.559 converters, DACs, basically data acquisition hardware. 287 00:14:07.799 --> 00:14:10.759 It manages all the complexity of reading raw sensor values, 288 00:14:10.799 --> 00:14:12.879 scaling them, handling triggers.
289 00:14:12.440 --> 00:14:16.159 Exactly. But how does it expose that data to userspace? 290 00:14:16.600 --> 00:14:18.960 It doesn't require custom applications for every sensor. 291 00:14:19.039 --> 00:14:21.639 It uses the device model and sysfs. Precisely. 292 00:14:22.080 --> 00:14:25.480 The IIO framework models sensors and their data points as channels. 293 00:14:25.759 --> 00:14:29.080 Each channel, like the X-axis acceleration or a temperature reading, 294 00:14:29.519 --> 00:14:33.039 gets exposed as attributes within sysfs. So reading the raw 295 00:14:33.120 --> 00:14:35.600 value from an accelerometer might be as simple as reading 296 00:14:35.600 --> 00:14:39.519 a file like /sys/bus/iio/devices/iio:device0/in_accel_x_raw. 297 00:14:40.519 --> 00:14:44.120 User space just interacts with these standard file interfaces, underpinned 298 00:14:44.120 --> 00:14:47.480 by LDM and the kobject structure representing that channel attribute. 299 00:14:47.519 --> 00:14:50.120 Wow. Okay, we've covered a lot, from those tiny __init 300 00:14:50.240 --> 00:14:52.919 macros affecting memory all the way up to these complex 301 00:14:52.919 --> 00:14:55.399 frameworks like IIO built on the core architecture. 302 00:14:55.559 --> 00:14:58.720 Yeah, we hit the big points. The mindset shift for 303 00:14:58.799 --> 00:15:03.159 kernel programming, error cleanup with goto, logging, then 304 00:15:03.200 --> 00:15:06.679 the concurrency minefield, spin locks versus mutexes, atomic versus 305 00:15:06.720 --> 00:15:08.440 sleeping contexts. 306 00:15:08.000 --> 00:15:11.159 And then the overall structure. How the Linux device model 307 00:15:11.240 --> 00:15:14.240 uses kobjects to build the hierarchy, how the device tree 308 00:15:14.240 --> 00:15:18.399 describes hardware, and how subsystems like IIO plug into that. 309 00:15:18.879 --> 00:15:20.759 So if there's one final thought to leave you with, 310 00:15:20.840 --> 00:15:24.960 it's this: remember that kobject.
It seems abstract, almost trivial, 311 00:15:25.399 --> 00:15:28.919 but everything in the device hierarchy, from the simplest LED 312 00:15:29.080 --> 00:15:33.120 driver to the most complex network interface or IIO sensor framework, 313 00:15:33.519 --> 00:15:36.440 ultimately relies on kobjects to establish its existence and its 314 00:15:36.440 --> 00:15:40.000 attributes within the kernel's world, making it visible and controllable 315 00:15:40.039 --> 00:15:43.879 through sysfs. That tiny structure is the bedrock of Linux's 316 00:15:44.039 --> 00:15:45.480 entire device organization. 317 00:15:45.759 --> 00:15:49.120 Amazing how that one fundamental piece underpins so much complexity. 318 00:15:49.200 --> 00:15:51.480 That's a great takeaway. Thank you for walking us through this. 319 00:15:51.600 --> 00:15:53.000 My pleasure. It's fascinating stuff. 320 00:15:53.080 --> 00:15:55.080 It really is. We'll catch you next time for another 321 00:15:55.120 --> 00:15:55.600 deep dive.
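One practical footnote on the sensor example from the episode: because a channel shows up in sysfs as a small text file, reading it from user space needs nothing beyond ordinary file I/O. The helper below is a minimal sketch. The function name is invented, and the exact attribute path on a real system depends on the sensor and its driver.

```c
#include <errno.h>
#include <stdio.h>

/* Reads one decimal integer from a text file such as a sysfs attribute.
 * Returns 0 on success or a negative errno value on failure. */
int demo_read_long(const char *path, long *value)
{
    FILE *f = fopen(path, "r");
    int ret = 0;

    if (!f)
        return -errno;          /* e.g. -ENOENT if the attribute is absent */

    if (fscanf(f, "%ld", value) != 1)
        ret = -EINVAL;          /* file did not start with a number */

    fclose(f);
    return ret;
}
```

On a machine with an IIO accelerometer, passing a path of the form /sys/bus/iio/devices/iio:device0/in_accel_x_raw would fill in the raw X-axis reading. Turning that raw number into physical units is a separate step that depends on the driver.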