WEBVTT 1 00:00:00.120 --> 00:00:03.359 Have you ever wondered how your phone seems to, I 2 00:00:03.359 --> 00:00:07.240 don't know, finish your sentences, or how that streaming service 3 00:00:07.360 --> 00:00:11.000 uncannily suggests your next binge-worthy show? 4 00:00:11.240 --> 00:00:14.519 Right, it often feels like some kind of personalized magic. 5 00:00:14.400 --> 00:00:18.960 Exactly, but it's actually this unseen intelligence that powers so 6 00:00:19.079 --> 00:00:20.280 much of our digital world. 7 00:00:20.399 --> 00:00:22.480 And that's just it. This magic, well, it isn't some 8 00:00:22.719 --> 00:00:27.559 wizard pulling levers behind a curtain. It's sophisticated algorithms, constantly 9 00:00:27.640 --> 00:00:28.679 learning and adapting. 10 00:00:28.920 --> 00:00:31.480 That's where we're headed today. We're taking a deep dive 11 00:00:31.600 --> 00:00:35.759 into that very world. Machine learning, or ML. It's the 12 00:00:35.799 --> 00:00:39.640 engine behind all that digital intelligence, and honestly, it's far 13 00:00:39.679 --> 00:00:43.200 more pervasive than you might realize. Absolutely. So we've stacked 14 00:00:43.240 --> 00:00:46.560 up some fascinating insights from building machine learning systems using 15 00:00:46.600 --> 00:00:49.280 Python and some other related 16 00:00:48.920 --> 00:00:51.039 sources you've shared. Yeah, some really good stuff in there. 17 00:00:51.280 --> 00:00:54.600 Our mission today is really to unpack what machine learning 18 00:00:54.640 --> 00:00:58.039 truly is, maybe explore its surprising origins. 19 00:00:57.640 --> 00:00:58.719 Which are quite surprising. 20 00:00:59.039 --> 00:01:02.759 Yeah, its huge impact on our daily lives, and maybe 21 00:01:02.759 --> 00:01:05.760 most importantly, shine a light on the crucial challenges it faces, 22 00:01:06.599 --> 00:01:09.879 especially that often overlooked issue of bias and fairness. 23 00:01:10.040 --> 00:01:12.159 That's a big one, definitely. 
24 00:01:11.760 --> 00:01:15.599 So get ready for some hopefully genuine aha moments. 25 00:01:16.000 --> 00:01:19.400 You know, understanding ML isn't just for tech enthusiasts anymore, 26 00:01:19.439 --> 00:01:23.040 is it? It's becoming like an essential literacy for anyone just 27 00:01:23.159 --> 00:01:24.640 navigating our digital landscape. 28 00:01:24.680 --> 00:01:27.959 Totally agree. Okay, let's unpack this then. So what exactly 29 00:01:28.000 --> 00:01:29.280 is machine learning at its core? 30 00:01:29.719 --> 00:01:33.439 Well, our sources define ML pretty clearly as the ability 31 00:01:33.439 --> 00:01:37.760 of a system to learn automatically through experience without being 32 00:01:37.840 --> 00:01:40.040 explicitly programmed for every single step. 33 00:01:40.280 --> 00:01:42.560 Right. So, instead of a programmer writing rules for everything, 34 00:01:42.920 --> 00:01:45.159 the system learns the rules itself. Exactly. 35 00:01:45.280 --> 00:01:48.120 Imagine the sheer scale of problems we can tackle when 36 00:01:48.159 --> 00:01:51.879 software isn't limited by human programmers defining every possibility. It 37 00:01:52.000 --> 00:01:54.000 just teaches itself, adapts. 38 00:01:53.640 --> 00:01:57.359 Tackling everything from, what, medical diagnostics to climate modeling. 39 00:01:57.599 --> 00:01:59.920 You got it. That's the real power behind the definition. 40 00:02:00.040 --> 00:02:02.959 It's essentially building its own rule book just by looking 41 00:02:03.000 --> 00:02:05.599 at the data, teaching itself how things work. 42 00:02:05.719 --> 00:02:07.239 That self-teaching idea is key. 43 00:02:07.400 --> 00:02:10.159 Yeah, and the concept isn't entirely new either. The term 44 00:02:10.199 --> 00:02:13.479 machine learning itself, that was actually coined way back in 45 00:02:13.560 --> 00:02:14.520 nineteen fifty nine. 46 00:02:14.680 --> 00:02:16.159 Nineteen fifty nine, wow. 47 00:02:16.039 --> 00:02:19.319 Yeah, by Arthur Samuel. 
He was an American scientist, an 48 00:02:19.400 --> 00:02:22.599 expert in computer gaming and AI. He really laid the 49 00:02:22.639 --> 00:02:26.960 groundwork for this idea of computers learning without explicit step 50 00:02:26.960 --> 00:02:28.840 by step instructions. And then it got 51 00:02:28.639 --> 00:02:31.120 a more, let's say, formal definition later on. 52 00:02:31.280 --> 00:02:34.120 It did. In nineteen ninety seven, Tom Mitchell put it 53 00:02:34.159 --> 00:02:36.400 really well. He said, a computer program is said to 54 00:02:36.439 --> 00:02:39.360 learn from experience E with respect to some task T 55 00:02:39.840 --> 00:02:43.240 and some performance measure P if its performance on T, 56 00:02:43.479 --> 00:02:46.439 as measured by P, improves with experience E. 57 00:02:46.639 --> 00:02:48.520 Okay, that's a bit dense, but let's break it down. 58 00:02:48.759 --> 00:02:53.120 Experience E is like more data, more practice. Exactly. 59 00:02:53.159 --> 00:02:55.520 Task T is what it's trying to do, like recognize 60 00:02:55.560 --> 00:02:57.639 faces or predict traffic. 61 00:02:57.319 --> 00:03:00.360 And performance measure P is how well it's doing that task. 62 00:03:00.479 --> 00:03:03.639 Yeah, precisely. So if it gets better at the task, 63 00:03:04.080 --> 00:03:08.800 P improves, the more data or practice it gets, E increases, 64 00:03:09.400 --> 00:03:10.000 then it's 65 00:03:09.919 --> 00:03:12.639 learning. Kind of like a child learning to identify animals 66 00:03:12.639 --> 00:03:15.319 from pictures. Right, they get better with each new example. 67 00:03:15.439 --> 00:03:19.000 That's a perfect analogy, simple, but it captures the essence. 68 00:03:19.159 --> 00:03:21.919 Okay, so this brings up the history. How did we 69 00:03:22.000 --> 00:03:24.639 get from these early ideas to where we are now 70 00:03:24.719 --> 00:03:25.879 used at nineteen fifty nine. 
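NOTE Editor's sketch, not part of the spoken audio. Mitchell's E, T and P can be made concrete with a toy learner. Everything here is an assumption made up for illustration: the hidden rule (x >= 70), the order of the examples, and the midpoint learner. It only shows performance P on task T rising as experience E grows.

```python
# Task T: decide whether a number counts as "large" (the hidden rule is x >= 70).
# Experience E: labeled examples. Performance P: accuracy on the numbers 0..99.
# The rule, the examples, and the learner are all invented for this illustration.

def truth(x):
    return x >= 70

examples = [0, 99, 60, 75, 68, 71]  # order in which labeled examples are seen

def learn_threshold(seen):
    """Put the cutoff midway between the largest 'small' and smallest 'large' example."""
    largest_small = max(x for x in seen if not truth(x))
    smallest_large = min(x for x in seen if truth(x))
    return (largest_small + smallest_large) / 2

def accuracy(threshold):
    """Fraction of the numbers 0..99 classified the same way as the hidden rule."""
    return sum((x > threshold) == truth(x) for x in range(100)) / 100

sizes = range(2, len(examples) + 1)
accuracies = [accuracy(learn_threshold(examples[:n])) for n in sizes]
for n, acc in zip(sizes, accuracies):
    print(f"after {n} examples: accuracy {acc:.2f}")
```

In this toy setup accuracy never drops as examples are added, which is the sense of "learning" in the definition; real learners are noisier than this.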
71 00:03:26.080 --> 00:03:29.319 Well, the history has surprisingly deep roots. If you go 72 00:03:29.400 --> 00:03:32.039 back even further to the nineteen forties, with the invention 73 00:03:32.199 --> 00:03:36.080 of the first big electronic computers like the ENIAC. Right, 74 00:03:36.439 --> 00:03:39.080 the initial idea was already kind of there, this dream 75 00:03:39.240 --> 00:03:42.599 of building machines that could mimic human learning and thinking. 76 00:03:42.719 --> 00:03:44.719 It was very early days, of course. 77 00:03:44.560 --> 00:03:47.319 Incredible to think about that long ago. What were the 78 00:03:47.360 --> 00:03:51.560 first real maybe sparks of this? Where did it start 79 00:03:51.560 --> 00:03:51.960 to click? 80 00:03:52.039 --> 00:03:54.439 Well, a significant step was in the nineteen fifties. We 81 00:03:54.479 --> 00:03:57.000 saw Frank Rosenblatt's invention of the perceptron. 82 00:03:57.039 --> 00:03:58.080 The perceptron, what was that? 83 00:03:58.240 --> 00:04:00.400 It was a very simple type of classifier. Think of 84 00:04:00.400 --> 00:04:03.280 it as an early, very basic precursor to the neural 85 00:04:03.280 --> 00:04:05.960 networks we talk about today. A crucial first step. 86 00:04:05.800 --> 00:04:07.520 Okay, and then things really took 87 00:04:07.319 --> 00:04:11.199 off later. Definitely, the nineteen nineties was when machine learning 88 00:04:11.280 --> 00:04:12.919 truly started hitting the mainstream. 89 00:04:13.319 --> 00:04:14.960 Why then, specifically? A 90 00:04:14.879 --> 00:04:19.800 couple of things came together. These probabilistic approaches in AI, 91 00:04:20.600 --> 00:04:25.560 basically using statistics to handle uncertainty and make predictions, started 92 00:04:25.560 --> 00:04:29.759 merging really effectively with computer science. 
And crucially, this happened 93 00:04:30.040 --> 00:04:32.959 just as we started getting access to much larger amounts 94 00:04:32.959 --> 00:04:36.319 of data. Suddenly you had the methods and the fuel, 95 00:04:36.439 --> 00:04:39.199 the data, to build systems that could actually learn from 96 00:04:39.279 --> 00:04:40.519 vast amounts of information. 97 00:04:40.680 --> 00:04:42.240 And computers were getting more powerful too, 98 00:04:42.360 --> 00:04:45.480 I assume. Absolutely, that was essential. And then there was a 99 00:04:45.480 --> 00:04:50.000 big public moment. Ah, Deep Blue. Exactly, IBM's Deep Blue 100 00:04:50.079 --> 00:04:54.639 chess computer beating world chess champion Garry Kasparov. That was huge. 101 00:04:54.920 --> 00:04:57.519 It really captured the public imagination and showed what was 102 00:04:57.560 --> 00:04:58.439 becoming possible. 103 00:04:58.600 --> 00:05:01.680 Yeah, I remember that. It shifted from just academic papers 104 00:05:01.720 --> 00:05:05.319 into something real, something that could beat the best human 105 00:05:05.360 --> 00:05:06.480 minds at a complex task. 106 00:05:06.560 --> 00:05:07.839 Precisely, it was a landmark. 107 00:05:08.040 --> 00:05:12.240 So okay, we know what it is roughly and a 108 00:05:12.240 --> 00:05:14.639 bit about its history. But you mentioned it's not a 109 00:05:14.680 --> 00:05:17.199 one size fits all thing. There are different flavors 110 00:05:16.680 --> 00:05:20.199 of learning. That's right, and understanding these different types is 111 00:05:20.360 --> 00:05:22.439 key to seeing how it's applied everywhere. 112 00:05:22.600 --> 00:05:24.959 Right, let's do a quick tour then. First up is 113 00:05:25.000 --> 00:05:27.680 supervised learning. What's the deal there? 114 00:05:28.360 --> 00:05:31.720 Think of supervised learning as, well, learning with a teacher 115 00:05:31.879 --> 00:05:35.600 or like having the answer key. 
The system gets fed example 116 00:05:35.680 --> 00:05:36.240 data that's 117 00:05:36.079 --> 00:05:37.639 already labeled. Labeled how? 118 00:05:37.680 --> 00:05:41.000 So, like historical traffic data paired with the actual congestion 119 00:05:41.040 --> 00:05:44.560 outcomes that happened, or pictures of cats labeled cat and 120 00:05:44.639 --> 00:05:48.240 dogs labeled dog. The system learns the relationship between the 121 00:05:48.279 --> 00:05:50.399 input and the known correct 122 00:05:50.040 --> 00:05:53.720 output. Ah, okay. So it uses those examples to learn 123 00:05:53.720 --> 00:05:57.360 how to predict the outcome for new unseen data, like 124 00:05:57.439 --> 00:06:00.879 predicting tomorrow's traffic based on past patterns. Exactly. 125 00:06:00.920 --> 00:06:03.920 It learns a mapping from input to output. The supervision 126 00:06:04.079 --> 00:06:06.439 comes from those correct labels in the training data. 127 00:06:06.759 --> 00:06:07.879 Got it. So what's next? 128 00:06:08.000 --> 00:06:10.759 Then you have unsupervised learning, and this is more like 129 00:06:10.839 --> 00:06:13.360 learning without a teacher. There's no answer key provided. 130 00:06:13.560 --> 00:06:15.959 So what does it do then? 131 00:06:16.480 --> 00:06:20.639 Here the system analyzes data without any associated target responses 132 00:06:20.759 --> 00:06:24.360 or labels. Its goal isn't really to predict a specific output, 133 00:06:24.480 --> 00:06:28.040 but more to find hidden patterns, structures, or to segment 134 00:06:28.079 --> 00:06:29.519 the data into similar groups. 135 00:06:29.639 --> 00:06:30.639 Can you give an example? 136 00:06:30.800 --> 00:06:33.800 Sure, think about grouping customers based on their purchasing habits. 137 00:06:34.279 --> 00:06:37.879 With unsupervised learning, you wouldn't tell the system beforehand, find 138 00:06:37.920 --> 00:06:40.560 groups A, B and C. 
You just give it the 139 00:06:40.600 --> 00:06:43.279 purchase data and it figures out that maybe there are 140 00:06:43.279 --> 00:06:47.000 distinct clusters of customers who buy similar things. It discovers 141 00:06:47.000 --> 00:06:47.879 the structure itself. 142 00:06:47.920 --> 00:06:49.879 Okay, so it's finding patterns we might not have even 143 00:06:49.959 --> 00:06:51.839 known were there. Interesting. 144 00:06:52.000 --> 00:06:54.920 And the last one, the third main type, is reinforcement learning. 145 00:06:55.279 --> 00:06:58.800 This one is a bit different again. It's somewhat similar 146 00:06:58.839 --> 00:07:02.120 to unsupervised in that it often doesn't have explicit labels 147 00:07:02.120 --> 00:07:05.319 for every piece of data, but it learns by interacting 148 00:07:05.319 --> 00:07:07.879 with an environment and receiving feedback in the form of 149 00:07:07.920 --> 00:07:09.800 rewards or penalties for its actions. 150 00:07:09.879 --> 00:07:11.759 Ah, like training a dog with treats. 151 00:07:12.120 --> 00:07:15.480 Kind of. Think about training a robot to navigate a maze. 152 00:07:15.560 --> 00:07:17.399 If it takes a step that gets it closer to 153 00:07:17.480 --> 00:07:20.079 the exit, it gets a positive reward. If it hits 154 00:07:20.120 --> 00:07:23.600 a wall, it gets a negative penalty. Over time, it 155 00:07:23.680 --> 00:07:27.319 learns the sequence of actions, the policy, that maximizes its 156 00:07:27.360 --> 00:07:28.120 total reward. 157 00:07:28.240 --> 00:07:30.759 So it learns through trial and error guided by feedback. 158 00:07:30.879 --> 00:07:35.279 Precisely, it's really powerful for things like game-playing AI, robotics, 159 00:07:35.319 --> 00:07:36.319 and control systems. 160 00:07:36.399 --> 00:07:38.759 Okay, supervised, unsupervised, reinforcement. 
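NOTE Editor's sketch, not part of the spoken audio. The maze example can be shrunk to a five-cell corridor and learned with tabular Q-learning, a standard reinforcement learning method that the speakers do not name. Assumptions made for illustration: the corridor layout, the reward values, the learning rate and discount, and a reward given only on reaching the exit (the speakers describe a reward for each step closer).

```python
# Toy corridor "maze": cells 0..4, exit at cell 4. All numbers are illustrative assumptions.
N_CELLS, EXIT = 5, 4
ACTIONS = {"left": -1, "right": +1}
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (arbitrary choices)

def step(cell, action):
    """Return (next_cell, reward, done) for one move."""
    target = cell + ACTIONS[action]
    if target < 0:        # bumped into the wall: penalty, stay in place
        return cell, -1.0, False
    if target == EXIT:    # reached the exit: reward, episode over
        return target, 1.0, True
    return target, 0.0, False

# Q[cell][action] estimates the long-run reward of taking `action` in `cell`.
Q = {cell: {a: 0.0 for a in ACTIONS} for cell in range(N_CELLS)}

# To keep the sketch deterministic, try every action in every cell repeatedly
# rather than exploring at random as a real agent usually would.
for _ in range(100):
    for cell in range(EXIT):
        for action in ACTIONS:
            nxt, reward, done = step(cell, action)
            future = 0.0 if done else max(Q[nxt].values())
            Q[cell][action] += ALPHA * (reward + GAMMA * future - Q[cell][action])

# The learned policy: in each cell, pick the action with the highest estimate.
policy = {cell: max(Q[cell], key=Q[cell].get) for cell in range(EXIT)}
print(policy)
```

The update nudges each estimate toward the reward just received plus the discounted value of the best next move, which is the feedback-driven trial and error described in the conversation.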
161 00:07:39.160 --> 00:07:43.040 Different ways machines learn, and these different approaches, often working together, 162 00:07:43.279 --> 00:07:45.759 are what create that everyday magic we talked about at 163 00:07:45.759 --> 00:07:48.920 the start. ML really does shine in so many applications 164 00:07:48.920 --> 00:07:49.839 we use constantly. 165 00:07:49.959 --> 00:07:52.920 It really does. Like, let's talk specifics. Virtual personal assistants, 166 00:07:53.199 --> 00:07:57.000 Alexa, Siri, Google Now. Prime examples. 167 00:07:57.519 --> 00:08:01.360 They're constantly collecting and refining information based on your past requests, 168 00:08:01.360 --> 00:08:05.439 your preferences, even your location, to understand your queries better 169 00:08:05.480 --> 00:08:08.879 and give you relevant answers. They learn your voice, your habits. 170 00:08:09.040 --> 00:08:10.920 It's almost spooky sometimes. It learns. 171 00:08:10.959 --> 00:08:13.879 And then there's social media services. Oh boy, ML is 172 00:08:13.920 --> 00:08:14.560 everywhere there. 173 00:08:14.759 --> 00:08:16.920 How so, beyond just the ads? 174 00:08:17.319 --> 00:08:20.800 Oh yeah, think about the people you may know feature 175 00:08:20.839 --> 00:08:22.800 on platforms like Facebook or LinkedIn. 176 00:08:23.079 --> 00:08:24.040 Right, how does that work? 177 00:08:24.120 --> 00:08:28.680 It's analyzing tons of data, your existing connections, profiles you've visited, 178 00:08:28.959 --> 00:08:32.600 your workplace, groups you're in, common interests, to figure out 179 00:08:32.600 --> 00:08:34.120 who else you might realistically know. 180 00:08:34.600 --> 00:08:37.200 It's connecting the dots in a way a human couldn't 181 00:08:37.440 --> 00:08:39.120 just because of the scale. Exactly. 
182 00:08:39.519 --> 00:08:44.320 Or face recognition. Facebook's DeepFace project, for instance, it 183 00:08:44.399 --> 00:08:48.919 learns to identify unique features in photos to automatically suggest tags 184 00:08:48.519 --> 00:08:50.559 for your friends, even if the angles are weird or 185 00:08:50.559 --> 00:08:51.600 the lighting isn't great. 186 00:08:51.720 --> 00:08:55.320 Yeah, it learns to account for variations like poses and projections. 187 00:08:55.360 --> 00:08:58.039 It's incredibly complex stuff happening behind the scenes, but it 188 00:08:58.080 --> 00:08:59.120 feels seamless to us. 189 00:08:59.440 --> 00:09:02.879 Okay, moving beyond social media, self driving cars, it's a 190 00:09:02.960 --> 00:09:03.440 huge one. 191 00:09:03.480 --> 00:09:07.759 Absolutely. Companies like Tesla heavily rely on machine learning, particularly 192 00:09:07.840 --> 00:09:13.879 forms of unsupervised and reinforcement learning, for perception, detecting objects, pedestrians, 193 00:09:14.000 --> 00:09:18.200 other cars, lane lines, all in real time. That's ML 194 00:09:18.240 --> 00:09:19.679 interpreting sensor data. 195 00:09:19.840 --> 00:09:22.919 Mind boggling complexity there, and something maybe a bit 196 00:09:22.960 --> 00:09:26.799 more mundane but still powerful. Product recommendations. 197 00:09:26.879 --> 00:09:32.240 Ah, yes, the customers who bought this also bought magic, right? 198 00:09:32.519 --> 00:09:35.879 How does that work? Is it just based on my past purchases? That's 199 00:09:35.720 --> 00:09:38.600 part of it, but it also looks at items you've browsed, 200 00:09:38.639 --> 00:09:40.519 things you've put in your cart but didn't buy, what 201 00:09:40.600 --> 00:09:44.519 similar users bought, maybe even brand preferences inferred from your behavior. 202 00:09:44.840 --> 00:09:47.879 It's constantly building a profile to anticipate 203 00:09:47.320 --> 00:09:49.440 what you might want next. Trying to tempt me. 
204 00:09:49.600 --> 00:09:53.159 Basically, yes. And critically, ML plays a vital role in 205 00:09:53.200 --> 00:09:55.600 security too, like online fraud detection. 206 00:09:55.759 --> 00:09:57.320 How does that work? It must be like finding a 207 00:09:57.360 --> 00:09:58.840 needle in a haystack. It is. 208 00:09:59.120 --> 00:10:02.519 Companies like PayPal, banks, they use machine learning to 209 00:10:02.559 --> 00:10:07.519 analyze millions, even billions of transactions. The algorithms learn patterns 210 00:10:07.519 --> 00:10:13.159 associated with normal, legitimate activity versus suspicious, potentially fraudulent activity. 211 00:10:12.879 --> 00:10:14.639 So it can flag things that look out of the 212 00:10:14.720 --> 00:10:16.840 ordinary based on learned patterns. 213 00:10:16.960 --> 00:10:20.200 Exactly, it can spot anomalies much faster and more accurately 214 00:10:20.320 --> 00:10:23.240 than humans wading through that much data. It helps prevent 215 00:10:23.320 --> 00:10:25.799 things like money laundering or identity theft. 216 00:10:26.120 --> 00:10:30.000 Okay, wow. So from predicting my next purchase or binge watch 217 00:10:30.240 --> 00:10:34.279 to preventing serious financial crime, ML is truly woven into 218 00:10:34.279 --> 00:10:35.639 the fabric of our daily lives. 219 00:10:35.720 --> 00:10:36.240 It really is. 220 00:10:36.720 --> 00:10:40.159 But, and there's always a but, right? With such incredible 221 00:10:40.159 --> 00:10:45.399 power comes naturally some pretty significant challenges and maybe dangers 222 00:10:45.399 --> 00:10:46.240 we need to unpack. 223 00:10:46.320 --> 00:10:49.200 Absolutely, it's not all smooth sailing, and it's crucial we 224 00:10:49.279 --> 00:10:51.120 talk about the downsides and the risks. 225 00:10:51.360 --> 00:10:52.399 Where do we even start? 
226 00:10:52.480 --> 00:10:55.320 Well, a critical point is what happens when these powerful 227 00:10:55.360 --> 00:10:59.360 systems bump up against difficult ethical terrain or lead to 228 00:10:59.519 --> 00:11:03.840 unexpected, maybe harmful outcomes. Like what? One really compelling 229 00:11:03.879 --> 00:11:08.279 example our sources highlighted involves ethical dilemmas with autonomous weapons. 230 00:11:08.320 --> 00:11:10.240 Remember Google's Project 231 00:11:09.840 --> 00:11:12.799 Maven? Vaguely, the one using ML for drones. 232 00:11:12.519 --> 00:11:15.879 Right, exactly, using ML to analyze drone footage, potentially for 233 00:11:15.919 --> 00:11:20.240 targeting and military applications. It sparked massive protests from within 234 00:11:20.320 --> 00:11:22.919 Google, employees, scientists, and externally too. 235 00:11:23.039 --> 00:11:25.240 Why the protest? The ethical 236 00:11:24.879 --> 00:11:29.759 concerns were huge. Thousands signed petitions asking Google to abandon 237 00:11:29.799 --> 00:11:33.240 the project, worried about ML being used to create truly 238 00:11:33.320 --> 00:11:36.559 autonomous weapons that could make life or death decisions without 239 00:11:36.639 --> 00:11:43.320 human intervention. It highlighted this very real, very difficult ethical tightrope. 240 00:11:43.399 --> 00:11:46.440 Wow, that's a heavy example right off the bat. Technology 241 00:11:46.559 --> 00:11:48.639 definitely isn't neutral there. Not at all. 242 00:11:49.080 --> 00:11:52.480 And then there are other, maybe less dramatic, but still 243 00:11:52.519 --> 00:11:57.279 problematic challenges, like the phenomenon of false correlations, sometimes called 244 00:11:57.320 --> 00:11:58.519 spurious correlations. 245 00:11:58.519 --> 00:12:00.000 Okay, what's that? Sounds intriguing. 246 00:12:00.240 --> 00:12:03.000 This is when you have two things that seem statistically related. 
247 00:12:03.200 --> 00:12:05.799 Their trends move together on a graph, but there's absolutely 248 00:12:05.799 --> 00:12:09.159 no real world connection between them. They're independent, but the 249 00:12:09.240 --> 00:12:11.559 numbers look linked. Can you give an example? My favorite one 250 00:12:11.559 --> 00:12:14.919 from our sources, it's almost comical, is a documented false 251 00:12:14.919 --> 00:12:17.799 correlation between the increase in people using car seat belts 252 00:12:18.039 --> 00:12:20.960 and a decrease in astronaut deaths from spacecraft accidents. 253 00:12:21.039 --> 00:12:23.279 Wait, what? Seat belts and astronaut deaths? 254 00:12:23.679 --> 00:12:26.519 Exactly. They have absolutely nothing to do with each other, 255 00:12:27.039 --> 00:12:31.039 but maybe purely by coincidence, the graph showing seat belt 256 00:12:31.120 --> 00:12:33.039 use went up around the same time the graph for 257 00:12:33.080 --> 00:12:36.639 astronaut deaths went down. The numbers correlate, but it's meaningless. 258 00:12:36.840 --> 00:12:40.600 Huh. Okay, that's a great illustration. It's a stark reminder 259 00:12:41.000 --> 00:12:44.000 not to just assume causation from correlation. 260 00:12:44.159 --> 00:12:48.159 Right. Absolutely. It's a classic statistical trap, and algorithms, if 261 00:12:48.159 --> 00:12:51.080 they're not designed carefully, can fall right into it. They 262 00:12:51.120 --> 00:12:55.360 might identify these spurious correlations in data and base decisions on 263 00:12:55.320 --> 00:12:58.559 them, leading to potentially nonsensical or even harmful outcomes. 264 00:12:58.600 --> 00:13:02.759 Precisely, and maybe even worse than false correlations are feedback loops. 265 00:13:03.240 --> 00:13:05.120 Feedback loops? How are they different? 266 00:13:05.279 --> 00:13:08.399 This is more insidious. 
It's when an algorithm's decision actually 267 00:13:08.440 --> 00:13:12.080 affects the real world, changes the situation on the ground, okay, 268 00:13:12.120 --> 00:13:15.240 and then the algorithm uses that new altered reality, which 269 00:13:15.519 --> 00:13:18.600 its own past decisions helped create, as evidence to confirm 270 00:13:18.600 --> 00:13:22.600 its original conclusion, even if that conclusion was initially flawed 271 00:13:22.679 --> 00:13:23.279 or biased. 272 00:13:24.080 --> 00:13:27.159 That sounds circular and potentially dangerous. Could you give an 273 00:13:27.159 --> 00:13:27.759 example of that? 274 00:13:28.039 --> 00:13:31.039 Yeah. Think about a crime prediction algorithm. Let's say it 275 00:13:31.080 --> 00:13:35.519 analyzes historical crime data and suggests sending more police patrols 276 00:13:35.519 --> 00:13:39.000 to a specific neighborhood because reported crime is higher there. 277 00:13:39.120 --> 00:13:40.679 Okay, seems logical so far. 278 00:13:40.759 --> 00:13:43.240 But if you put more police in that neighborhood, what happens? 279 00:13:43.960 --> 00:13:47.039 People might report more minor incidents simply because there are 280 00:13:47.080 --> 00:13:50.759 officers readily available to take a report. Police might make 281 00:13:50.840 --> 00:13:54.440 more arrests for low level offenses because they're patrolling more intensely. 282 00:13:54.720 --> 00:13:57.720 Ah, so the reported crime rate goes up partly just 283 00:13:57.799 --> 00:14:00.720 because of the increased police presence. Exactly. 284 00:14:00.799 --> 00:14:03.840 And then the algorithm sees this higher reported crime rate 285 00:14:03.879 --> 00:14:06.480 in the next batch of data and says, see, I 286 00:14:06.679 --> 00:14:09.559 was right, this neighborhood has high crime. We need even 287 00:14:09.600 --> 00:14:10.440 more police here. 288 00:14:10.519 --> 00:14:14.519 Wow. 
So the algorithm's initial prediction, potentially based on biased 289 00:14:14.600 --> 00:14:18.279 historical data, creates the conditions that seem to validate it, 290 00:14:18.440 --> 00:14:19.320 leading to a cycle. 291 00:14:19.559 --> 00:14:23.080 That's the feedback loop. The algorithm effectively creates the data 292 00:14:23.120 --> 00:14:27.919 that justifies its own potentially biased decisions, reinforcing existing inequalities 293 00:14:28.000 --> 00:14:28.840 or errors. 294 00:14:28.919 --> 00:14:32.000 Yeah, that's a really clear and concerning example. So beyond 295 00:14:32.039 --> 00:14:37.159 these conceptual or ethical challenges, are there more practical hurdles? Oh, 296 00:14:37.200 --> 00:14:41.279 definitely. A big one is just the sheer computational needs. 297 00:14:41.840 --> 00:14:46.519 Our sources really emphasize this. Machine learning, especially deep learning 298 00:14:46.559 --> 00:14:50.720 with huge data sets, requires immense computational power. You mean 299 00:14:50.879 --> 00:14:55.159 like supercomputers? Often, yeah, or at least very powerful servers 300 00:14:55.279 --> 00:14:59.919 packed with specialized hardware like GPUs, graphics processing units. 301 00:14:59.600 --> 00:15:02.440 Those chips originally for video games? The very same. 302 00:15:02.519 --> 00:15:04.639 They turned out to be incredibly good at the kind 303 00:15:04.679 --> 00:15:08.639 of parallel calculations needed for ML. But accessing this kind 304 00:15:08.639 --> 00:15:12.200 of power is expensive, and even with it, training complex 305 00:15:12.240 --> 00:15:15.559 models on large data sets can still take days, sometimes weeks. 306 00:15:15.679 --> 00:15:17.600 It's not like running your typical software. 307 00:15:17.759 --> 00:15:21.120 So resources are a bottleneck. And what about the models themselves? 308 00:15:21.200 --> 00:15:21.960 Can they go wrong? 309 00:15:22.120 --> 00:15:24.679 Absolutely. A very common problem is called overfitting. 
310 00:15:24.840 --> 00:15:27.120 Overfitting, like a suit that's too tight? 311 00:15:27.240 --> 00:15:29.759 Kind of. It happens when a model learns the training 312 00:15:29.840 --> 00:15:33.000 data too well. It becomes excessively complex. It doesn't just 313 00:15:33.080 --> 00:15:35.679 learn the underlying patterns you want it to learn. It 314 00:15:35.720 --> 00:15:39.480 also learns the specific noise, the quirks, and the random 315 00:15:39.519 --> 00:15:42.600 outliers present in that particular training data set. So 316 00:15:42.559 --> 00:15:45.879 it memorizes the training examples instead of generalizing. Exactly. 317 00:15:45.919 --> 00:15:48.440 It's like that student who memorizes every single word in 318 00:15:48.480 --> 00:15:52.240 the textbook, including the typos, but can't apply the concepts 319 00:15:52.279 --> 00:15:55.519 to a new problem they haven't seen before. An overfitted 320 00:15:55.559 --> 00:15:57.919 model performs great on the data it was trained on, 321 00:15:58.279 --> 00:16:02.120 but fails miserably when you show it new, unseen data. 322 00:16:01.840 --> 00:16:04.559 Because the real world doesn't have those exact same quirks 323 00:16:04.559 --> 00:16:05.320 and noise. Right. 324 00:16:05.600 --> 00:16:08.679 The goal is what's called appropriate fitting, a model that 325 00:16:08.720 --> 00:16:12.759 captures the genuine patterns but ignores the noise. The opposite 326 00:16:12.759 --> 00:16:15.639 problem is underfitting, where the model is too simple and 327 00:16:15.679 --> 00:16:19.039 fails to capture even the basic patterns. Finding that sweet 328 00:16:19.039 --> 00:16:20.000 spot is key. 329 00:16:20.320 --> 00:16:26.679 Okay, overfitting, computational costs, feedback loops, ethical minefields. Quite a list. 330 00:16:26.919 --> 00:16:29.840 But there's one more huge one we flagged earlier. Bias 331 00:16:30.000 --> 00:16:30.720 and fairness. 
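NOTE Editor's sketch, not part of the spoken audio. The memorizing-student picture of overfitting can be reproduced with made-up numbers. Assumptions made for illustration: the true pattern (y = 2x), the alternating +1 / -1 noise, noise-free test targets, and the two models compared (a nearest-example memorizer and a least-squares straight line).

```python
# Toy illustration of overfitting; all data and names are made up for this sketch.

def true_pattern(x):
    return 2.0 * x

# Training data: the true pattern plus alternating "noise" of +1 / -1.
train_x = [float(i) for i in range(10)]
train_y = [true_pattern(x) + (1.0 if i % 2 == 0 else -1.0) for i, x in enumerate(train_x)]

# New, unseen inputs with noise-free targets (a simplification).
test_x = [x + 0.25 for x in train_x]
test_y = [true_pattern(x) for x in test_x]

def memorizer(x):
    """Overfitted model: repeat the label of the closest training example, noise included."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def fit_line(xs, ys):
    """Simpler model: ordinary least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def mse(model, xs, ys):
    """Mean squared error of `model` on the given inputs and targets."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

line = fit_line(train_x, train_y)
print("memorizer train/test error:", mse(memorizer, train_x, train_y), mse(memorizer, test_x, test_y))
print("line      train/test error:", mse(line, train_x, train_y), mse(line, test_x, test_y))
```

The memorizer is perfect on the examples it has seen and worse than the line on the new ones, while the line accepts some training error and generalizes better. That is the trade-off the speakers call finding the sweet spot.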
332 00:16:31.080 --> 00:16:33.759 Yes, and this is arguably one of the most critical 333 00:16:33.840 --> 00:16:37.600 challenges because it directly impacts people's lives in very real ways. 334 00:16:37.799 --> 00:16:40.720 Let's define it first. What is bias in the context 335 00:16:40.759 --> 00:16:41.240 of ML? 336 00:16:41.759 --> 00:16:45.720 Bias in ML usually refers to results that are systematically prejudiced. 337 00:16:45.799 --> 00:16:49.159 It's often a disproportionate weight in favor of or against 338 00:16:49.240 --> 00:16:53.399 an idea or thing, often stemming from underlying human biases 339 00:16:53.440 --> 00:16:58.360 that get encoded intentionally or unintentionally into the algorithm or 340 00:16:58.360 --> 00:16:59.440 the data it learns from. 341 00:16:59.519 --> 00:17:03.080 So the algorithms can essentially inherit our own societal biases. 342 00:17:03.159 --> 00:17:06.279 Precisely. If the data used to train an algorithm reflects 343 00:17:06.319 --> 00:17:10.200 existing societal inequalities or prejudices, the algorithm will likely learn 344 00:17:10.240 --> 00:17:12.559 and potentially even amplify those biases. 345 00:17:12.599 --> 00:17:14.960 That feels like a massive problem. If it's baked into 346 00:17:15.000 --> 00:17:17.720 the data, how do you even spot it? Our sources 347 00:17:17.720 --> 00:17:19.119 mentioned different types. They did. 348 00:17:19.160 --> 00:17:21.400 It can creep in at various stages. For example, during 349 00:17:21.480 --> 00:17:25.400 data collection, well, there's selection bias. Imagine you're developing a 350 00:17:25.400 --> 00:17:28.759 health app, but you only collect data from young, tech 351 00:17:28.799 --> 00:17:32.759 savvy users because they're easiest to reach. The resulting algorithm 352 00:17:32.839 --> 00:17:35.440 might not work well for older adults or less tech 353 00:17:35.519 --> 00:17:38.599 literate populations. The sample isn't representative. 
354 00:17:38.759 --> 00:17:40.079 Okay, that makes sense. What else? 355 00:17:40.240 --> 00:17:42.839 There's the framing effect. How you ask questions in a 356 00:17:42.880 --> 00:17:45.680 survey used to gather data can influence the answers you get, 357 00:17:45.839 --> 00:17:50.920 introducing bias. Or even systematic bias from faulty equipment. Imagine 358 00:17:50.920 --> 00:17:55.480 a sensor that consistently reads slightly too high. That error 359 00:17:55.519 --> 00:17:57.240 gets baked into the data. 360 00:17:56.920 --> 00:17:58.640 So bias can enter right from the start, 361 00:17:59.039 --> 00:18:01.440 just in how data is gathered. Absolutely, and then there's 362 00:18:01.480 --> 00:18:03.920 bias that can arise during data modeling itself. 363 00:18:03.960 --> 00:18:04.720 How does that happen? 364 00:18:05.039 --> 00:18:08.759 A really prominent real world example our sources discussed was 365 00:18:08.799 --> 00:18:12.279 Amazon's experimental hiring algorithm from a few years back. 366 00:18:12.400 --> 00:18:14.720 Oh, I think I remember hearing about this. What happened? 367 00:18:14.960 --> 00:18:17.119 They tried to build a tool to help screen job 368 00:18:17.160 --> 00:18:21.359 applicants' resumes, but it turned out the system effectively penalized 369 00:18:21.400 --> 00:18:25.519 resumes that included words like women's, like women's chess club captain, 370 00:18:26.200 --> 00:18:29.039 and it favored candidates who sounded more like the company's 371 00:18:29.039 --> 00:18:31.000 predominantly male workforce at the time. 372 00:18:31.079 --> 00:18:35.599 Wow. So it basically learned the existing gender imbalance from 373 00:18:35.640 --> 00:18:37.839 past hiring data. Exactly. 
374 00:18:38.039 --> 00:18:40.799 It wasn't explicitly programmed to be sexist, but it learned 375 00:18:40.839 --> 00:18:44.200 that male candidates had historically been hired more often, especially 376 00:18:44.200 --> 00:18:48.079 in technical roles, and it started associating male typical language 377 00:18:48.119 --> 00:18:52.000 patterns with success. Amazon ultimately scrapped the system. 378 00:18:52.200 --> 00:18:57.240 That's a powerful and sobering illustration of how historical bias 379 00:18:57.279 --> 00:19:00.519 gets perpetuated, even amplified by an algorithm. 380 00:19:00.599 --> 00:19:03.599 Really is. It shows how systems trained on biased data 381 00:19:03.799 --> 00:19:07.000 can easily replicate and even scale those biases. 382 00:19:07.240 --> 00:19:10.640 So if we know these biases exist and we can 383 00:19:10.680 --> 00:19:13.759 sometimes detect them, what on earth can we do to 384 00:19:13.799 --> 00:19:15.960 fix them? How do we strive for fairness? 385 00:19:16.039 --> 00:19:19.200 That's the million dollar question. Really, there's no single magic bullet, 386 00:19:19.240 --> 00:19:20.519 but there are approaches. 387 00:19:20.680 --> 00:19:21.640 What's the starting point? 388 00:19:21.759 --> 00:19:26.359 Well, a basic principle, though not always sufficient, is to 389 00:19:26.480 --> 00:19:31.559 try and avoid explicitly including sensitive attributes, things like race, gender, 390 00:19:31.759 --> 00:19:35.240 religion, as features in the model's training data, especially if 391 00:19:35.240 --> 00:19:37.160 they aren't directly relevant to the task. 392 00:19:37.559 --> 00:19:41.039 But that Amazon example shows bias can creep in even 393 00:19:41.079 --> 00:19:45.440 without explicitly using gender as a feature, right? Through correlated 394 00:19:45.519 --> 00:19:46.960 language patterns. Exactly. 395 00:19:47.039 --> 00:19:50.559 So simply removing sensitive attributes isn't enough. 
We need more 396 00:19:50.559 --> 00:19:55.200 sophisticated mitigation strategies. Are these approaches like just patching holes? 397 00:19:55.319 --> 00:19:57.680 Or can we build fair systems from the start? 398 00:19:57.839 --> 00:20:00.000 That's the crucial question. What are the sources? 399 00:20:00.680 --> 00:20:04.319 They outline several approaches, often categorized by when you intervene 400 00:20:04.319 --> 00:20:08.000 in the mL pipeline. Okay, like what First, there's preprocessing. 401 00:20:08.480 --> 00:20:11.160