WEBVTT 1 00:00:00.080 --> 00:00:04.040 Okay, imagine your AI assistant doing way more than just 2 00:00:04.280 --> 00:00:07.559 answering questions. Yeah, like, what if it could actually plan 3 00:00:07.599 --> 00:00:10.599 your entire week, figure out what you need before you 4 00:00:10.679 --> 00:00:13.919 even ask, and maybe even work with other AIs to 5 00:00:13.960 --> 00:00:15.439 get really complex stuff done. 6 00:00:15.560 --> 00:00:17.960 Right, we're talking about a pretty big leap beyond, you know, 7 00:00:18.039 --> 00:00:19.839 the typical chatbot experience you might have 8 00:00:19.920 --> 00:00:23.359 now. Exactly. What if AI wasn't just this tool we use, 9 00:00:23.559 --> 00:00:27.800 but more like an autonomous collaborator, something that can reason, adapt, 10 00:00:28.079 --> 00:00:30.160 strategize almost alongside you. 11 00:00:30.320 --> 00:00:32.759 That's precisely what we're here to unpack today. This whole 12 00:00:32.840 --> 00:00:35.159 deep dive is about agentic AI systems. 13 00:00:35.200 --> 00:00:36.679 Agentic AI, okay. Yeah. 14 00:00:36.479 --> 00:00:39.359 We're exploring AI that doesn't just spit out content, but 15 00:00:39.399 --> 00:00:44.560 can actively reason, plan, adapt, and act with quite a 16 00:00:44.560 --> 00:00:46.880 bit of autonomy. It can even reflect on its own 17 00:00:46.920 --> 00:00:48.079 experiences to get better. 18 00:00:48.399 --> 00:00:50.399 So our mission today is basically to take you on 19 00:00:50.439 --> 00:00:53.679 a journey to understand what this agentic AI really is. 20 00:00:54.320 --> 00:00:56.560 We'll look at how these, well, intelligent systems are built, 21 00:00:56.719 --> 00:00:58.960 the principles behind how they make decisions and 22 00:00:58.920 --> 00:01:01.000 learn, and where they're actually being used, the real-world 23 00:01:01.039 --> 00:01:02.960 applications across different industries.
24 00:01:03.320 --> 00:01:06.200 And importantly, we have to get into the crucial stuff 25 00:01:06.239 --> 00:01:11.239 around trust, safety, ethics, all of that. 26 00:01:11.680 --> 00:01:15.200 Absolutely. Think of this as your shortcut to really getting 27 00:01:15.239 --> 00:01:18.640 a handle on this pretty transformative moment in AI. 28 00:01:18.760 --> 00:01:20.359 Okay, sounds good. Where do we start? 29 00:01:20.439 --> 00:01:22.840 Well, let's maybe start with a quick refresher on generative 30 00:01:22.879 --> 00:01:24.920 AI just to set the stage, and then we can 31 00:01:25.000 --> 00:01:28.400 bridge that gap to what makes an AI system truly agentic. 32 00:01:28.799 --> 00:01:32.719 Perfect. So, for anyone following AI, generative models are probably 33 00:01:32.760 --> 00:01:36.599 familiar territory, but just for everyone, let's quickly clarify: what 34 00:01:36.799 --> 00:01:38.280 is generative AI at its core? 35 00:01:38.920 --> 00:01:42.000 At its heart, generative AI is all about creating brand 36 00:01:42.000 --> 00:01:46.439 new synthetic content, like text, images, audio, video, 37 00:01:46.719 --> 00:01:49.079 basically anything that looks like the real-world data it 38 00:01:49.120 --> 00:01:51.359 was trained on. It's different from older AI that just, 39 00:01:51.519 --> 00:01:55.079 you know, classifies or identifies things. Right. Generative models learn 40 00:01:55.120 --> 00:01:58.280 the underlying patterns in the data, the structure, and then 41 00:01:58.400 --> 00:02:01.840 use that knowledge to produce completely novel instances. 42 00:02:01.640 --> 00:02:03.719 It's like training it on faces and it makes a 43 00:02:03.760 --> 00:02:04.400 new face. 44 00:02:04.560 --> 00:02:07.719 Exactly, faces of people who don't actually exist, but they 45 00:02:07.799 --> 00:02:11.879 look incredibly real.
That ability to create is really the 46 00:02:11.960 --> 00:02:15.960 first critical step towards AI that can eventually act on 47 00:02:16.000 --> 00:02:16.840 its own. That's 48 00:02:16.719 --> 00:02:18.919 a powerful idea. And we've all heard of models like 49 00:02:19.759 --> 00:02:24.439 GPT for text, or maybe DALL-E and Stable Diffusion for images. 50 00:02:24.840 --> 00:02:26.199 What's the magic behind those? 51 00:02:26.439 --> 00:02:29.199 Well, a lot of them, especially the big large language 52 00:02:29.199 --> 00:02:33.599 models like the GPT series, use something called the transformer architecture. Okay. 53 00:02:33.759 --> 00:02:36.000 Think of it as just a very efficient way for 54 00:02:36.080 --> 00:02:40.080 the AI to understand and generate sequential data like language, 55 00:02:40.120 --> 00:02:43.919 like sentences. Makes sense. This architecture lets the models process 56 00:02:44.080 --> 00:02:46.639 just huge amounts of text and then predict what word 57 00:02:46.719 --> 00:02:50.759 is most likely to come next, building up coherent, relevant responses. 58 00:02:50.960 --> 00:02:54.080 It's what drives that incredibly human-like text we see. 59 00:02:54.120 --> 00:02:56.840 Okay, so we have these powerhouse models that can generate stuff, 60 00:02:57.199 --> 00:02:58.680 but how do we get from an AI that just 61 00:02:58.719 --> 00:03:01.639 makes a convincing picture, or writes some text, to one that 62 00:03:01.759 --> 00:03:05.240 actually acts independently, makes decisions, pursues goals? That's the big 63 00:03:05.319 --> 00:03:06.759 leap to agentic systems, right? 64 00:03:06.879 --> 00:03:10.680 That's the fundamental shift, exactly. Agentic systems go beyond just 65 00:03:10.800 --> 00:03:15.159 generating content. They are really designed for active decision-making, planning, 66 00:03:15.560 --> 00:03:20.039 and goal-oriented behavior. They operate with a clear purpose.
67 00:03:20.479 --> 00:03:22.560 And what gives them that sense of purpose? We're talking 68 00:03:22.560 --> 00:03:25.840 concepts like self-governance, agency, 69 00:03:25.599 --> 00:03:31.199 autonomy. Precisely. Self-governance is the agent's ability to operate 70 00:03:31.280 --> 00:03:34.919 based on its own internal principles and goals without needing 71 00:03:34.919 --> 00:03:36.520 a human constantly telling 72 00:03:36.240 --> 00:03:37.120 it what to do. Okay. 73 00:03:37.360 --> 00:03:40.400 Agency is its capacity to act on behalf of someone, 74 00:03:40.439 --> 00:03:44.080 maybe a user or another system. It defines objectives, gets 75 00:03:44.080 --> 00:03:46.840 the information it needs, and takes steps to achieve them. 76 00:03:46.879 --> 00:03:47.639 And the autonomy? 77 00:03:47.759 --> 00:03:51.280 Autonomy is really that ability to operate independently, making decisions, 78 00:03:51.360 --> 00:03:55.520 taking actions without direct human control at every single step. 79 00:03:55.599 --> 00:03:57.319 This is where it gets really interesting. Let's use that 80 00:03:57.319 --> 00:03:59.639 flight booking example from the source material. It really brings 81 00:03:59.639 --> 00:04:02.000 it home. Imagine you want to book a trip, 82 00:04:02.479 --> 00:04:06.960 say San Diego to San Francisco, next Friday to Sunday. Okay. 83 00:04:07.080 --> 00:04:09.240 You start super vague: book me a flight from San 84 00:04:09.280 --> 00:04:11.360 Diego to San Francisco, next Friday to Sunday. 85 00:04:11.439 --> 00:04:14.199 Right, and this AI assistant, which is an LLM-powered agent, 86 00:04:14.680 --> 00:04:17.319 it knows that's not enough detail. It needs more, so 87 00:04:17.360 --> 00:04:19.680 it asks. It might come back with something like, okay, 88 00:04:19.800 --> 00:04:22.399 do you have a preferred airline or are you open 89 00:04:22.439 --> 00:04:24.759 to any? And what class of service were you thinking of?
90 00:04:25.000 --> 00:04:29.839 And you reply, I prefer morning flights, no airline preference, 91 00:04:29.959 --> 00:04:31.319 economy is fine. 92 00:04:31.000 --> 00:04:33.560 And the bot processes that. It says, okay, thanks for 93 00:04:33.600 --> 00:04:37.319 the details. I'll look for morning flights, economy class, across 94 00:04:37.399 --> 00:04:38.959 all airlines for those dates. 95 00:04:38.959 --> 00:04:41.040 Give me just a moment. And then it comes back 96 00:04:41.040 --> 00:04:41.959 with options. 97 00:04:41.600 --> 00:04:44.000 Exactly, and it might say, okay, I found a few options. 98 00:04:44.040 --> 00:04:48.759 Here are the best morning flights, and list maybe option 99 00:04:48.879 --> 00:04:51.680 one on United Alaska for three hundred and twenty five dollars, 100 00:04:51.800 --> 00:04:54.560 option two on Delta Southwest for three hundred and ten dollars. 101 00:04:54.920 --> 00:04:56.160 Which one works best for you? 102 00:04:56.519 --> 00:04:59.800 That exchange, that really shows agency and autonomy in action. 103 00:05:00.199 --> 00:05:04.120 It absolutely does. The AI isn't just generating text responses. 104 00:05:04.160 --> 00:05:08.120 It's actively asking for information, using that info as parameters 105 00:05:08.120 --> 00:05:10.959 for what we can imagine are back-end tools or 106 00:05:11.000 --> 00:05:13.639 APIs, maybe a flight lookup tool, then later a 107 00:05:13.680 --> 00:05:17.199 book-flight tool. It's making decisions based on the conversation flow, 108 00:05:17.240 --> 00:05:20.160 like independently searching for the best options, and it's even 109 00:05:20.199 --> 00:05:22.399 ready to kick off the booking process, maybe send a 110 00:05:22.399 --> 00:05:25.399 payment link. It's genuinely acting on your behalf. 111 00:05:26.000 --> 00:05:28.480 That is a huge shift. It's not just talking to you, 112 00:05:28.600 --> 00:05:31.920 it's acting for you.
Okay, so how do these agents 113 00:05:31.959 --> 00:05:36.319 actually, well, think? How do they learn to manage these tasks? 114 00:05:36.360 --> 00:05:38.720 What does their internal map of the world look like? 115 00:05:38.920 --> 00:05:43.279 Yeah, good question. They need structured ways to store and 116 00:05:43.399 --> 00:05:46.160 organize information. We call this knowledge representation. 117 00:05:46.519 --> 00:05:46.800 Okay. 118 00:05:47.000 --> 00:05:51.519 One really powerful approach is using semantic networks. Imagine a 119 00:05:51.680 --> 00:05:54.800 huge sort of interconnected web of concepts, 120 00:05:54.959 --> 00:05:57.480 kind of like a giant mind map. Yeah, each concept 121 00:05:57.519 --> 00:06:00.639 is a node, like dog or animal or breathes air, 122 00:06:01.199 --> 00:06:05.879 and lines connect them, showing relationships: is a type of, causes, 123 00:06:05.519 --> 00:06:07.720 is part of. So if it knows animals breathe air 124 00:06:07.879 --> 00:06:09.480 and dogs are animals, 125 00:06:09.160 --> 00:06:12.560 it can automatically figure out, or infer, that dogs breathe air. 126 00:06:13.120 --> 00:06:15.480 These networks allow them to connect the dots and derive 127 00:06:15.560 --> 00:06:16.079 new facts. 128 00:06:16.120 --> 00:06:18.560 That's pretty intuitive. What about frames? You mentioned those too. 129 00:06:18.560 --> 00:06:21.480 They sound a bit like digital index cards. 130 00:06:21.800 --> 00:06:23.920 That's actually a great way to put it. Frames are 131 00:06:23.959 --> 00:06:27.519 more structured. Think of a car frame. It has specific 132 00:06:27.600 --> 00:06:32.439 slots or attributes like make, model, year, color. You fill 133 00:06:32.480 --> 00:06:35.199 in the values for each specific 134 00:06:34.759 --> 00:06:37.959 car. So it groups related information together. Exactly.
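NOTE
A minimal Python sketch of the two representations just described: a semantic network that inherits facts along "is a" links, and a frame with fixed slots. The names SemanticNetwork, CarFrame, and the relation labels are illustrative assumptions, not taken from the source material, and the car values are made-up example data.
```python
from dataclasses import dataclass
# Semantic network: nodes joined by labeled edges (hypothetical minimal design).
class SemanticNetwork:
    def __init__(self):
        self.edges = {}  # (node, relation) -> set of target nodes
    def add(self, node, relation, target):
        self.edges.setdefault((node, relation), set()).add(target)
    def ancestors(self, node):
        # Follow "is_a" links transitively, skipping nodes already seen.
        seen, stack = set(), [node]
        while stack:
            for parent in self.edges.get((stack.pop(), "is_a"), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen
    def holds(self, node, relation, target):
        # A fact holds if stated directly or inherited from an "is_a" ancestor.
        return any(target in self.edges.get((n, relation), ())
                   for n in {node} | self.ancestors(node))
# Frame: a fixed set of named slots describing one object.
@dataclass
class CarFrame:
    make: str
    model: str
    year: int
    color: str
net = SemanticNetwork()
net.add("dog", "is_a", "animal")
net.add("animal", "breathes", "air")
print(net.holds("dog", "breathes", "air"))  # True: inherited from "animal"
car = CarFrame(make="Toyota", model="Corolla", year=2020, color="blue")
print(car)
```
The network never stores "dogs breathe air" directly; the fact is derived by walking the "is_a" link, which is the inference step the hosts describe.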
135 00:06:38.000 --> 00:06:42.399 It mirrors how we humans often conceptualize things, grouping properties 136 00:06:42.399 --> 00:06:44.079 together into a single unit. 137 00:06:44.360 --> 00:06:49.759 And for situations where you absolutely need precision, like mathematically precise? 138 00:06:49.519 --> 00:06:53.199 That's where logic-based representations come in. These use formal 139 00:06:53.240 --> 00:06:57.879 logic, like propositional or first-order logic, to encode facts 140 00:06:57.879 --> 00:06:58.399 and rules. 141 00:06:58.480 --> 00:07:00.079 Like in math class? Pretty 142 00:06:59.839 --> 00:07:02.399 much. You might represent all humans are mortal in 143 00:07:02.439 --> 00:07:05.399 a strict mathematical way. This rigor is super important in 144 00:07:05.439 --> 00:07:09.879 fields where errors are costly. Think software verification, maybe even 145 00:07:09.959 --> 00:07:14.040 legal analysis. It ensures every conclusion is logically sound. 146 00:07:14.160 --> 00:07:17.040 Okay, so the agent builds this complex internal knowledge map. 147 00:07:17.399 --> 00:07:19.720 How does it then use that map to draw conclusions 148 00:07:19.800 --> 00:07:22.839 or figure out new things? That's reasoning, right? Precisely. 149 00:07:23.120 --> 00:07:26.439 Reasoning is how agents manipulate that knowledge to get insights. 150 00:07:26.600 --> 00:07:29.759 One type is deductive reasoning. This is very top-down. 151 00:07:29.839 --> 00:07:33.399 Top-down, meaning you start with general rules or premises, 152 00:07:33.800 --> 00:07:37.040 and you arrive at specific conclusions that must be true 153 00:07:37.120 --> 00:07:41.360 if the premises are true. The classic example: all men 154 00:07:41.360 --> 00:07:41.920 are mortal, 155 00:07:42.120 --> 00:07:46.399 Socrates is a man, therefore Socrates is mortal. Exactly. 156 00:07:46.519 --> 00:07:50.000 It's logically inescapable.
You see this in math, logic proofs, 157 00:07:50.199 --> 00:07:53.600 verifying software, anywhere certainty is key. 158 00:07:53.879 --> 00:07:57.040 It's like a guaranteed logical chain. But what about when 159 00:07:57.079 --> 00:08:00.399 things aren't so certain? When agents need to find patterns 160 00:08:00.519 --> 00:08:01.920 or make educated guesses? 161 00:08:02.040 --> 00:08:03.920 That's where inductive reasoning is vital. 162 00:08:04.360 --> 00:08:07.800 This is more bottom-up. So starting with specifics? Right. 163 00:08:07.879 --> 00:08:10.439 You look at specific observations and you try to form 164 00:08:10.480 --> 00:08:13.839 probable generalizations, like the sun has risen every single day 165 00:08:13.920 --> 00:08:15.680 for as long as we know, so it will probably 166 00:08:15.759 --> 00:08:18.759 rise tomorrow. Exactly. It's not a logical certainty, but it's 167 00:08:18.800 --> 00:08:21.959 a very strong probability based on evidence. This is fundamental 168 00:08:21.959 --> 00:08:24.639 to science and especially to machine learning, finding patterns in 169 00:08:24.759 --> 00:08:26.319 data to make predictions. 170 00:08:26.560 --> 00:08:30.879 Okay, deduction for certainty, induction for probability. What about figuring 171 00:08:30.879 --> 00:08:34.120 out the cause of something, like playing detective? 172 00:08:34.159 --> 00:08:37.759 Ah, that's abductive reasoning. It's often called inference to the 173 00:08:37.799 --> 00:08:39.440 best explanation. 174 00:08:38.960 --> 00:08:40.399 Inference to the best explanation. 175 00:08:40.480 --> 00:08:42.679 Yeah, you observe an effect and you try to figure 176 00:08:42.720 --> 00:08:45.240 out the most plausible cause. If you see the lawn 177 00:08:45.279 --> 00:08:49.279 is wet, a good abductive inference is it probably rained 178 00:08:49.360 --> 00:08:49.840 last night. 179 00:08:50.120 --> 00:08:52.320 It's not the only possibility.
Maybe the sprinklers were on, 180 00:08:52.879 --> 00:08:53.799 but rain 181 00:08:53.759 --> 00:08:57.519 is often the simplest, most likely explanation. This is super 182 00:08:57.600 --> 00:09:00.480 useful in fields like medical diagnosis, figuring out the 183 00:09:00.519 --> 00:09:04.039 disease from symptoms, or fault detection, or even forensics. You're 184 00:09:04.039 --> 00:09:05.440 piecing together clues. 185 00:09:05.960 --> 00:09:09.679 Okay, so agents can represent knowledge, they can reason about 186 00:09:09.720 --> 00:09:13.440 it deductively, inductively, abductively. But how do they get better? 187 00:09:13.720 --> 00:09:15.960 How do they adapt over time? That has to involve 188 00:09:16.039 --> 00:09:17.000 learning mechanisms. 189 00:09:17.240 --> 00:09:20.679 Learning is absolutely fundamental for any agent that needs to adapt. 190 00:09:20.840 --> 00:09:23.320 There are several key types. You have supervised learning. 191 00:09:23.440 --> 00:09:26.919 That's learning from labeled examples, right? Like seeing pictures labeled cat 192 00:09:26.799 --> 00:09:31.240 or dog. Exactly, or predicting house prices based on features 193 00:09:31.279 --> 00:09:33.879 where you have the actual prices for your training data, 194 00:09:34.240 --> 00:09:35.480 input-output pairs. 195 00:09:35.720 --> 00:09:38.279 Okay. Then there's unsupervised. 196 00:09:37.799 --> 00:09:41.440 Unsupervised learning is about finding patterns in data that isn't labeled. 197 00:09:42.080 --> 00:09:45.360 Think about grouping customers into segments based on their buying 198 00:09:45.399 --> 00:09:49.360 habits without knowing the segments beforehand. The AI finds the 199 00:09:49.440 --> 00:09:50.519 structure itself. 200 00:09:50.679 --> 00:09:53.000 And reinforcement learning, RL? 201 00:09:53.519 --> 00:09:56.559 That sounds interesting. RL is fascinating. It's learning through trial 202 00:09:56.600 --> 00:09:59.759 and error.
The agent takes actions in an environment, and 203 00:09:59.759 --> 00:10:03.399 it receives rewards or punishments based on the outcomes. 204 00:10:02.879 --> 00:10:05.200 Like training a dog, or a game AI? 205 00:10:05.759 --> 00:10:08.799 Very much like that. Game AI is a classic example. 206 00:10:09.080 --> 00:10:11.759 The AI learns to play chess or Go by playing 207 00:10:11.759 --> 00:10:15.000 millions of games and getting rewarded for winning. Robotics uses 208 00:10:15.039 --> 00:10:17.039 it a lot too, for learning how to walk or 209 00:10:17.080 --> 00:10:18.639 grasp objects. 210 00:10:18.200 --> 00:10:20.000 And lastly, transfer learning. 211 00:10:20.240 --> 00:10:23.679 Transfer learning is really efficient. It's about taking knowledge gained 212 00:10:23.759 --> 00:10:25.960 from one task and applying it to a different but 213 00:10:26.240 --> 00:10:28.679 related task. It means the agent doesn't have to start 214 00:10:28.720 --> 00:10:29.519 from scratch 215 00:10:29.240 --> 00:10:33.399 every time. Okay, knowledge, reasoning, learning: that puts the agent in 216 00:10:33.440 --> 00:10:35.600 a position to actually make choices and figure out what 217 00:10:35.639 --> 00:10:38.919 to do next. How do they handle decision-making and planning? 218 00:10:38.759 --> 00:10:41.840 For decision-making, a key concept is the utility function. 219 00:10:42.240 --> 00:10:46.240 Utility function? Sounds economic. It kind of is. 220 00:10:46.279 --> 00:10:49.320 It's a way to quantify the agent's preferences. It maps 221 00:10:49.360 --> 00:10:54.000 different possible outcomes to numerical values representing how desirable each 222 00:10:54.000 --> 00:10:55.080 outcome is to the agent. 223 00:10:55.320 --> 00:10:57.480 So like our travel agent example. 224 00:10:57.240 --> 00:11:02.240 Exactly. The travel agent's utility function might weigh factors like price, comfort, 225 00:11:02.519 --> 00:11:06.639 travel time, convenience.
Maybe a budget airline has a low 226 00:11:06.679 --> 00:11:09.759 price score but also low comfort. A road trip might 227 00:11:09.799 --> 00:11:11.399 be cheaper overall, but take 228 00:11:11.279 --> 00:11:13.519 longer. And the function helps it choose? Right. 229 00:11:13.799 --> 00:11:16.399 It calculates the total utility for each option based on 230 00:11:16.440 --> 00:11:19.480 the weights assigned to price, comfort, et cetera, and picks 231 00:11:19.519 --> 00:11:22.679 the option with the highest score. It allows for rational 232 00:11:22.759 --> 00:11:26.240 choices based on defined goals, even when those goals conflict, 233 00:11:26.440 --> 00:11:28.399 like cost versus speed. So 234 00:11:28.399 --> 00:11:31.120 it picks the best option according to its values, not 235 00:11:31.200 --> 00:11:31.919 just any option. 236 00:11:32.039 --> 00:11:32.320 Wow. 237 00:11:32.399 --> 00:11:34.559 And once it decides what it wants, it needs a 238 00:11:34.559 --> 00:11:37.080 plan to get there. That's planning algorithms? Exactly. 239 00:11:37.159 --> 00:11:40.080 Planning algorithms figure out the sequence of actions needed to 240 00:11:40.120 --> 00:11:43.200 reach the desired goal state. There are many types: simple 241 00:11:43.240 --> 00:11:46.159 graph searches like finding a route on a map, more 242 00:11:46.159 --> 00:11:49.679 complex heuristic searching. Oh, like the chess programs? Yeah. Or things 243 00:11:49.720 --> 00:11:52.519 like Monte Carlo tree search, which is great for games or 244 00:11:52.559 --> 00:11:56.159 situations with uncertainty. But what's really interesting for these LLM 245 00:11:56.279 --> 00:11:58.720 based agents we're talking about, yes, is that sometimes the 246 00:11:58.840 --> 00:12:02.279 LLM itself can act as the planner. It uses its 247 00:12:02.360 --> 00:12:06.240 language understanding to formulate a plan. And another powerful approach 248 00:12:06.360 --> 00:12:11.720 is hierarchical task network planning, or HTN. HTN.
It breaks 249 00:12:11.759 --> 00:12:16.480 down a big, complex goal, like plan a vacation, into smaller nested 250 00:12:16.519 --> 00:12:21.480 subtasks: find flights, book hotel, plan activities. This hierarchical approach 251 00:12:21.519 --> 00:12:24.759 fits really well with how LLMs process information and handle 252 00:12:24.840 --> 00:12:26.080 complex instructions. 253 00:12:26.159 --> 00:12:28.200 That makes a lot of sense. Okay, we've got a 254 00:12:28.240 --> 00:12:30.320 good handle on the building blocks. Now let's talk about 255 00:12:30.360 --> 00:12:32.600 how these systems really start to shine in practice. 256 00:12:32.720 --> 00:12:33.039 Yeah. 257 00:12:33.080 --> 00:12:37.519 One capability that sounds very human is reflection and introspection, 258 00:12:38.360 --> 00:12:39.720 agents thinking about themselves. 259 00:12:39.840 --> 00:12:42.600 It really is quite human-like. Reflection is the agent's 260 00:12:42.600 --> 00:12:45.519 ability to monitor its own performance and adapt its behavior 261 00:12:45.559 --> 00:12:49.399 based on that monitoring. It's like human metacognition, thinking about 262 00:12:49.399 --> 00:12:50.080 your own thinking. 263 00:12:50.279 --> 00:12:52.159 Why is that so important for an AI agent? 264 00:12:52.519 --> 00:12:55.919 Well, several reasons. It leads to much better decision-making 265 00:12:56.240 --> 00:12:59.759 because the agent can essentially replay past choices and their 266 00:12:59.799 --> 00:13:03.759 outcomes, learning from mistakes and reinforcing successes. 267 00:13:03.360 --> 00:13:05.480 So it learns from its own history. Exactly. 268 00:13:05.799 --> 00:13:09.399 It also enables better adaptation. Think about our travel agent again. 269 00:13:09.440 --> 00:13:13.600 The travel industry changes constantly. Prices fluctuate, new routes appear.
270 00:13:14.159 --> 00:13:17.559 Reflection allows the agent to notice these changes and adjust 271 00:13:17.600 --> 00:13:18.919 its strategies accordingly. 272 00:13:19.120 --> 00:13:21.679 And I imagine there are ethical angles too? Definitely. 273 00:13:22.000 --> 00:13:25.360 Reflection can help ensure the agent's actions stay aligned with 274 00:13:25.440 --> 00:13:28.960 human values or ethical guidelines over time, and it can 275 00:13:29.039 --> 00:13:32.159 even improve how humans interact with the AI, maybe by 276 00:13:32.159 --> 00:13:35.480 allowing the agent to adapt its communication style based on 277 00:13:35.600 --> 00:13:37.639 perceived user frustration or confusion. 278 00:13:37.960 --> 00:13:40.080 How does this actually work under the hood? How is 279 00:13:40.159 --> 00:13:41.320 reflection implemented? 280 00:13:41.679 --> 00:13:44.840 Some key techniques include meta-reasoning, where the agent literally 281 00:13:44.879 --> 00:13:48.559 analyzes its own reasoning process. Did my previous strategy work well? 282 00:13:48.600 --> 00:13:51.960 Why or why not? There's also self-explanation, where the agent 283 00:13:52.000 --> 00:13:55.159 generates explanations for its own decisions. This isn't just for 284 00:13:55.200 --> 00:13:58.279 the user. It helps the agent itself understand and learn 285 00:13:58.279 --> 00:14:01.559 from its choices. It explains to itself, in a way? Yes. 286 00:14:02.080 --> 00:14:05.600 And self-modeling, where the agent updates its internal understanding 287 00:14:05.639 --> 00:14:08.559 of its goals, its capabilities, and the world based on 288 00:14:08.639 --> 00:14:11.200 new experiences and the results of its reflections. 289 00:14:11.240 --> 00:14:14.960 Fascinating. So as agents get smarter about themselves, they also 290 00:14:15.039 --> 00:14:18.120 need to interact with the outside world more effectively. This 291 00:14:18.159 --> 00:14:21.480 brings us to enabling tool use.
Getting agents to use 292 00:14:21.519 --> 00:14:23.080 external resources. Right. 293 00:14:23.320 --> 00:14:26.519 Tool use is fundamental for making these agents truly practical. 294 00:14:26.840 --> 00:14:31.559 It means an LLM agent leveraging things outside itself, like APIs, databases, 295 00:14:31.600 --> 00:14:34.240 software functions, to add to its own abilities. 296 00:14:33.799 --> 00:14:35.279 So it can do more than just what it was 297 00:14:35.320 --> 00:14:36.440 trained on? Exactly. 298 00:14:36.600 --> 00:14:39.120 It allows agents to, as the source material puts it, 299 00:14:39.320 --> 00:14:43.519 transcend intrinsic limitations. They're not stuck with only their internal knowledge. 300 00:14:43.559 --> 00:14:47.879 They can fetch real-time information, perform calculations, interact with 301 00:14:47.960 --> 00:14:50.200 other systems, even control hardware. 302 00:14:50.480 --> 00:14:53.120 How does an AI know how to use, say, a 303 00:14:53.200 --> 00:14:56.799 specific weather API? Does it just figure it out? 304 00:14:56.879 --> 00:15:00.039 Not quite magically, but intelligently. The key is that the 305 00:15:00.080 --> 00:15:03.559 agent needs a good description of the tool. A description? Yeah, 306 00:15:03.639 --> 00:15:06.360 usually provided by the developer. It needs to know the 307 00:15:06.360 --> 00:15:10.279 tool's purpose, what kind of input it expects, what parameters 308 00:15:10.320 --> 00:15:13.240 it takes. Often this is written right into the code 309 00:15:13.320 --> 00:15:14.799 using something called a docstring. 310 00:15:15.120 --> 00:15:15.399 Okay. 311 00:15:15.440 --> 00:15:18.519 Once the LLM understands what the tool does and how 312 00:15:18.519 --> 00:15:21.519 to call it, it can intelligently decide when using that 313 00:15:21.559 --> 00:15:24.000 tool is the right step to achieve its current goal.
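NOTE
A minimal Python sketch of a tool described by its docstring, as discussed above. The function get_weather, its parameters, and the helper describe_tool are hypothetical names chosen for illustration; the tool body is a stub that calls no real weather service. The sketch only shows how a developer-written docstring and signature could be collected into a description that an LLM might be given.
```python
import inspect
def get_weather(city: str, unit: str = "celsius") -> str:
    """Return a short weather report for a city.
    Args:
        city: Name of the city, for example "San Francisco".
        unit: Temperature unit, "celsius" or "fahrenheit".
    """
    # Stub: a real tool would call a weather service here.
    return f"Weather for {city} requested in {unit} (stub, no live data)."
def describe_tool(func) -> dict:
    # Collect what a model would need to read: name, purpose, parameters.
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {
            name: {
                "type": getattr(param.annotation, "__name__", str(param.annotation)),
                "required": param.default is inspect.Parameter.empty,
            }
            for name, param in signature.parameters.items()
        },
    }
spec = describe_tool(get_weather)
print(spec["name"])
print(spec["parameters"])
```
The point of the design is that the description lives next to the code, so the text the model reads and the function it ends up calling are less likely to drift apart.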
314 00:15:24.200 --> 00:15:27.080 So an agent could use a weather API for forecasts, 315 00:15:27.519 --> 00:15:30.600 connect to a payment system for a transaction, maybe query 316 00:15:30.600 --> 00:15:32.960 a database for specific information, 317 00:15:32.799 --> 00:15:36.519 or even interact with hardware interfaces in a robotics context. 318 00:15:36.720 --> 00:15:38.000 The possibilities are huge. 319 00:15:38.080 --> 00:15:40.360 Yeah, the significance seems massive, then. It really is. 320 00:15:40.600 --> 00:15:44.639 Tool use is what lets agents tackle complex real-world problems. 321 00:15:44.799 --> 00:15:47.240 Think about a healthcare agent using up-to-the-minute 322 00:15:47.320 --> 00:15:51.200 medical databases or interacting with diagnostic tools. It's a complete 323 00:15:51.279 --> 00:15:51.919 game changer. 324 00:15:52.120 --> 00:15:54.559 Okay, so we have individual agents that can reflect and 325 00:15:54.679 --> 00:15:57.960 use tools, but the real power often comes from teamwork, 326 00:15:58.120 --> 00:16:01.799 right? Even for AIs. Let's talk about multi-agent systems, 327 00:16:01.879 --> 00:16:02.519 or MAS. 328 00:16:02.919 --> 00:16:09.000 Yes. MAS are where you have multiple autonomous agents interacting, cooperating, 329 00:16:09.120 --> 00:16:12.919 maybe coordinating to achieve goals that might be too complex 330 00:16:12.960 --> 00:16:16.480 for any single agent. It's about distributed problem-solving. And 331 00:16:16.440 --> 00:16:18.519 there are ways to organize these teams of agents? 332 00:16:18.600 --> 00:16:22.360 Definitely. One really effective model mentioned in our sources is 333 00:16:22.440 --> 00:16:25.960 the coordinator-worker-delegator model, or CWD. CWD. 334 00:16:26.080 --> 00:16:29.000 Okay, break that down for us. Coordinator, worker, delegator. 335 00:16:29.080 --> 00:16:31.720 Right. The coordinator is like the project manager.
It oversees 336 00:16:31.759 --> 00:16:35.559 the whole workflow, sets priorities, tracks progress towards the main goal. 337 00:16:35.679 --> 00:16:37.679 You got it. The boss, sort of? Yeah. 338 00:16:38.039 --> 00:16:40.919 Then you have the workers. These are specialized agents, each 339 00:16:41.039 --> 00:16:44.399 expert at a specific task. In our travel example, you 340 00:16:44.480 --> 00:16:47.440 might have a flight booking worker, a hotel booking worker, 341 00:16:47.480 --> 00:16:51.440 maybe a data analyst worker looking for deals. Specialists. Exactly. 342 00:16:51.720 --> 00:16:55.960 And finally, the delegator. This agent sits between the coordinator 343 00:16:56.000 --> 00:16:58.559 and the workers. It takes the high-level plan from 344 00:16:58.559 --> 00:17:01.840 the coordinator and breaks it down into concrete tasks, assigning 345 00:17:01.879 --> 00:17:04.279 them to the right workers and managing resources. 346 00:17:04.400 --> 00:17:08.079 Okay, let's apply CWD to the travel example again. User 347 00:17:08.119 --> 00:17:09.079 asks for a trip. 348 00:17:09.559 --> 00:17:12.759 Right. The coordinator agent receives the request and forms a 349 00:17:12.799 --> 00:17:17.160 high-level plan: book flights, book hotel, find activities for 350 00:17:17.200 --> 00:17:18.240 San Francisco trip. 351 00:17:18.400 --> 00:17:19.279 Then the delegator? 352 00:17:19.359 --> 00:17:23.079 The delegator takes that plan and creates specific tasks. Task 353 00:17:23.119 --> 00:17:27.559 one: find morning economy flights, SD to SF, next Fri-Sun, 354 00:17:27.960 --> 00:17:32.200 assigned to flight worker. Task two: find three-star hotel 355 00:17:32.319 --> 00:17:35.960 near downtown SF for those dates, assigned to hotel worker, 356 00:17:36.279 --> 00:17:39.480 and so on. Maybe it assigns tasks to an analyst 357 00:17:39.519 --> 00:17:42.359 worker to check for package deals or a reflector agent 358 00:17:42.440 --> 00:17:43.720 to review the plan's
359 00:17:43.400 --> 00:17:45.279 logic. And the workers just do their jobs? 360 00:17:45.319 --> 00:17:48.480 The workers execute their specialized tasks, possibly in parallel, and 361 00:17:48.559 --> 00:17:52.920 report results back up. The delegator or coordinator integrates everything. That 362 00:17:52.920 --> 00:17:55.480 sounds incredibly efficient, much better than one agent trying to 363 00:17:55.559 --> 00:17:56.279 juggle everything. 364 00:17:56.359 --> 00:18:00.680 It really highlights the benefits: efficiency through parallel processing, specialization 365 00:18:00.839 --> 00:18:04.079 leading to higher-quality results, and distributed control making the 366 00:18:04.119 --> 00:18:05.079 system more robust. 367 00:18:05.319 --> 00:18:08.319 And for this team to work, communication must be key. 368 00:18:08.440 --> 00:18:11.480 Absolutely critical. They need standardized ways to talk to each 369 00:18:11.519 --> 00:18:16.160 other: protocols for coordination, like how to prioritize tasks, mechanisms 370 00:18:16.200 --> 00:18:19.680 for sharing knowledge effectively, and maybe even ways to negotiate 371 00:18:19.720 --> 00:18:23.519 if conflicts arise between agents' goals or resource needs. 372 00:18:23.920 --> 00:18:27.960 This is all incredibly powerful stuff, but it also brings 373 00:18:28.039 --> 00:18:31.799 up some really significant questions about trust, safety, and ethics. 374 00:18:32.319 --> 00:18:34.799 This AI frontier needs careful navigation. 375 00:18:34.960 --> 00:18:38.119 That's paramount, honestly. If users don't trust these systems, they 376 00:18:38.160 --> 00:18:41.279 just won't be adopted, or worse, they'll be misused. Trust 377 00:18:41.319 --> 00:18:45.160 isn't just one thing. It covers reliability, transparency, knowing the 378 00:18:45.200 --> 00:18:47.160 AI aligns with your expectations and 379 00:18:47.160 --> 00:18:49.119 values. And lack of trust leads
380 00:18:49.000 --> 00:18:52.759 to skepticism, resistance, maybe even people trying to work around 381 00:18:52.799 --> 00:18:54.359 the system, negating its benefits. 382 00:18:54.400 --> 00:18:56.480 So what are some of the big risks or challenges 383 00:18:56.519 --> 00:18:59.079 we really need to grapple with as these agentic systems 384 00:18:59.119 --> 00:19:02.400 become more capable and widespread? They can act now, which 385 00:19:02.440 --> 00:19:03.079 feels different. 386 00:19:03.200 --> 00:19:08.640 It is different. The risks get amplified. Take misinformation and hallucinations. 387 00:19:09.359 --> 00:19:12.759 If a simple chatbot makes something up, it's annoying. If 388 00:19:12.799 --> 00:19:17.720 an agentic system hallucinates, say, incorrect flight details or faulty instructions 389 00:19:17.720 --> 00:19:20.200 for a physical task, and then acts on that 390 00:19:20.200 --> 00:19:22.880 information, that could have real consequences. 391 00:19:22.400 --> 00:19:27.039 Serious consequences, because it might make bookings, spend money, or 392 00:19:27.079 --> 00:19:32.400 control machinery based on flawed data, potentially without immediate human oversight. 393 00:19:33.119 --> 00:19:34.480 Then there's data privacy. 394 00:19:34.559 --> 00:19:36.400 We hear about data breaches all the time. 395 00:19:36.279 --> 00:19:39.480 Right, but here it's not just about accidental inclusion of 396 00:19:39.519 --> 00:19:42.160 personal info in training data, although that's still a risk. 397 00:19:42.599 --> 00:19:47.119 Agentic systems might actively gather, process, and potentially misuse sensitive 398 00:19:47.160 --> 00:19:48.960 data while performing tasks. Like 399 00:19:48.920 --> 00:19:52.519 our travel assistant figuring out confidential business travel plans? 400 00:19:52.200 --> 00:19:55.680 Exactly, or memorizing personal details shared in conversation, which some 401 00:19:55.839 --> 00:19:59.279