WEBVTT 1 00:00:00.040 --> 00:00:04.559 Okay, let's unpack this. Today we're diving into a really 2 00:00:04.559 --> 00:00:10.359 fascinating resource: this comprehensive workshop all about statistics and calculus, 3 00:00:10.839 --> 00:00:12.720 but specifically using 4 00:00:12.519 --> 00:00:15.919 Python. Right, and our mission really is to pull out 5 00:00:15.960 --> 00:00:17.160 the most important bits 6 00:00:16.960 --> 00:00:20.359 of knowledge. Exactly, and show how Python can take these 7 00:00:20.359 --> 00:00:24.039 concepts, which, let's be honest, can seem pretty intimidating. 8 00:00:23.559 --> 00:00:27.440 Oh, definitely. Math, stats, they have that reputation. And 9 00:00:27.440 --> 00:00:31.760 turn them into genuinely powerful, practical tools, tools you can 10 00:00:31.879 --> 00:00:33.000 use to understand the world. 11 00:00:33.079 --> 00:00:35.280 And what's great, I think, is how Python makes it 12 00:00:35.320 --> 00:00:38.640 not just accessible or efficient, but actually kind of engaging. 13 00:00:38.759 --> 00:00:42.439 You know, it really does feel different, almost fun sometimes. 14 00:00:43.240 --> 00:00:44.840 So to get there, we kind of start at the 15 00:00:44.840 --> 00:00:47.039 beginning: the foundations of Python itself. 16 00:00:47.119 --> 00:00:47.840 Building blocks. 17 00:00:47.880 --> 00:00:53.560 Yeah, the toolkit. So basic data structures first: strings, lists, tuples, dictionaries. 18 00:00:53.600 --> 00:00:54.520 These are how you hold your 19 00:00:54.399 --> 00:00:57.000 information. And dictionaries are a good example, right, with their 20 00:00:57.079 --> 00:00:59.840 key-value pairs. The workshop mentions using them for something 21 00:00:59.840 --> 00:01:01.240 like a shopping cart calculation. 22 00:01:02.439 --> 00:01:05.680 And if you look for an item, a key, that 23 00:01:05.840 --> 00:01:06.599 isn't in your 24 00:01:06.439 --> 00:01:09.239 dictionary, you get that KeyError. Right, which
25 00:01:09.079 --> 00:01:11.879 isn't just an error message, it's Python telling you, hey, 26 00:01:11.920 --> 00:01:14.640 this isn't here. It forces you to think about handling those 27 00:01:14.640 --> 00:01:16.239 missing items properly. 28 00:01:16.040 --> 00:01:17.840 Which is critical for reliable code. 29 00:01:18.000 --> 00:01:18.120 Right. 30 00:01:18.400 --> 00:01:21.760 Then, beyond just storing data, you need to control how 31 00:01:21.760 --> 00:01:22.640 the program runs. 32 00:01:22.680 --> 00:01:26.719 Control flow. Yeah, your if, elif, else for decisions, 33 00:01:27.239 --> 00:01:29.120 your for loops for doing things repeatedly. 34 00:01:29.239 --> 00:01:31.959 And Python's readability is a big plus here. It feels 35 00:01:32.280 --> 00:01:34.400 quite intuitive compared to some other languages. 36 00:01:34.519 --> 00:01:37.319 It does. Now this leads into something really powerful: functions 37 00:01:38.079 --> 00:01:39.799 and recursion. 38 00:01:39.959 --> 00:01:44.560 Ah yes, functions: packaging up logic, input, output, but really 39 00:01:44.920 --> 00:01:46.599 breaking down big problems. Exactly. 40 00:01:46.719 --> 00:01:50.480 And recursion, where a function calls itself. That can seem 41 00:01:50.519 --> 00:01:51.519 a bit mind-bending at 42 00:01:51.400 --> 00:01:54.799 first. It can, but the Sudoku solver example in the 43 00:01:54.799 --> 00:01:58.159 workshop is perfect for illustrating it. How so? Well, think 44 00:01:58.159 --> 00:02:01.519 about solving Sudoku manually. You try a number. Maybe it leads 45 00:02:01.560 --> 00:02:03.599 to a dead end, so you backtrack. Right, you erase 46 00:02:03.640 --> 00:02:06.359 it and try something else. Recursion in Python can work 47 00:02:06.439 --> 00:02:08.919 just like that. If a path doesn't work out, the 48 00:02:08.960 --> 00:02:13.439 function effectively returns False, signaling it needs to back up 49 00:02:13.520 --> 00:02:17.520 and try a different possibility.
It explores the solution space. 50 00:02:17.560 --> 00:02:21.280 Very elegant. But writing code is one thing. Making sure 51 00:02:21.319 --> 00:02:23.319 it works and stays working is another. 52 00:02:23.680 --> 00:02:28.280 Debugging, absolutely crucial. The workshop mentions simple things like, you know, 53 00:02:28.400 --> 00:02:32.080 just using print statements to see variable values. Print debugging, 54 00:02:32.120 --> 00:02:35.560 like, classic approach. It still works. But also more advanced 55 00:02:35.560 --> 00:02:38.439 tools like pdb, the Python debugger. 56 00:02:38.080 --> 00:02:40.360 That lets you step through the code line 57 00:02:40.039 --> 00:02:44.360 by line. Exactly: pause execution, inspect everything, find exactly where 58 00:02:44.360 --> 00:02:45.479 things go off the rails. 59 00:02:45.840 --> 00:02:50.120 And we can't forget version control, Git and GitHub. 60 00:02:50.000 --> 00:02:53.039 Non-negotiable, really, especially for anything more than a tiny 61 00:02:53.080 --> 00:02:55.000 script or if you're working with others. 62 00:02:55.039 --> 00:02:57.240 It's like a safety net and a collaboration hub rolled 63 00:02:57.280 --> 00:02:59.280 into one. You track changes, you can go back in time. 64 00:02:59.360 --> 00:03:01.800 You set up your little repository, link it to GitHub, 65 00:03:01.840 --> 00:03:04.719 and then just git push your changes. Keeps everything organized. 66 00:03:04.919 --> 00:03:08.199 Okay, so foundations are set. Now, how do we actually 67 00:03:08.240 --> 00:03:11.879 start wrestling with data? This is where Python's analytical libraries 68 00:03:11.919 --> 00:03:12.319 come in. 69 00:03:12.400 --> 00:03:16.319 Right, and the main workhorse for anything numerical or scientific 70 00:03:16.439 --> 00:03:17.159 is NumPy. 71 00:03:17.240 --> 00:03:19.800 NumPy arrays. They're different from standard Python
72 00:03:19.560 --> 00:03:22.719 lists. Very different, much more flexible, especially for multi-dimensional 73 00:03:22.800 --> 00:03:26.759 data. Think spreadsheets, images, 3D simulations. NumPy 74 00:03:26.960 --> 00:03:28.960 handles that structure naturally. 75 00:03:28.599 --> 00:03:31.759 And the speed. The workshop had that comparison. 76 00:03:31.800 --> 00:03:34.719 Oh yeah, the vectorized operations, it's night and day. A 77 00:03:34.800 --> 00:03:37.759 regular for loop doing multiplication might take, what was it, 78 00:03:37.879 --> 00:03:38.599 half a second? 79 00:03:38.840 --> 00:03:40.479 About 0.5423 seconds. 80 00:03:40.520 --> 00:03:43.000 Yeah, and the NumPy vectorized version? 81 00:03:42.919 --> 00:03:46.039 0.0005 seconds, a tiny fraction. 82 00:03:46.240 --> 00:03:49.520 It's just fundamentally faster because it processes entire arrays at 83 00:03:49.560 --> 00:03:52.000 once using highly optimized C code underneath. 84 00:03:52.159 --> 00:03:55.560 That kind of speedup changes what's even possible to analyze. 85 00:03:55.879 --> 00:03:57.960 Huge data sets become manageable. 86 00:03:57.560 --> 00:04:02.120 Totally. And a key point for doing analysis: reproducibility. Setting 87 00:04:02.159 --> 00:04:05.759 the random seed with np.random.seed(123), 88 00:04:05.639 --> 00:04:08.319 so even if you use random numbers, you get 89 00:04:08.319 --> 00:04:11.159 the same random sequence each time you run it. Exactly. 90 00:04:11.360 --> 00:04:14.240 Ensures your results are consistent and someone else can reproduce 91 00:04:14.240 --> 00:04:16.120 your work. Critical for science. 92 00:04:16.399 --> 00:04:20.480 Okay, so NumPy handles the raw numbers, but often 93 00:04:20.839 --> 00:04:23.879 data comes in tables, like spreadsheets. 94 00:04:23.160 --> 00:04:25.920 And that's where pandas comes in.
Pandas DataFrames are 95 00:04:25.920 --> 00:04:27.959 the go-to for tabular data. 96 00:04:27.680 --> 00:04:32.720 So you can load data, look at rows, columns, manipulate things. 97 00:04:32.519 --> 00:04:36.959 Yep, initialize a DataFrame, access data, rename columns to 98 00:04:36.959 --> 00:04:40.319 be clearer, fill in missing values, sort the data to 99 00:04:40.360 --> 00:04:43.399 see trends, all standard operations. 100 00:04:42.959 --> 00:04:44.800 And it has that handy describe as well. 101 00:04:45.079 --> 00:04:48.040 Describe is great for a quick overview. For numerical columns, 102 00:04:48.040 --> 00:04:51.519 it gives you count, mean, standard deviation, min, max, 103 00:04:51.399 --> 00:04:55.720 quartiles, a quick statistical summary. What about non-numerical, like 104 00:04:55.920 --> 00:04:56.879 text data? 105 00:04:56.720 --> 00:04:59.160 It handles that too. It'll show things like the number 106 00:04:59.160 --> 00:05:01.439 of unique entries, the most frequent one. For stats like 107 00:05:01.519 --> 00:05:04.480 mean that don't apply, it shows NaN, not a number. 108 00:05:04.279 --> 00:05:06.560 Makes sense. And if you need numbers for, say, a 109 00:05:06.600 --> 00:05:09.639 machine learning model, there was mention of one-hot encoding. 110 00:05:09.800 --> 00:05:12.279 Right, that's a common way to turn categorical features like 111 00:05:12.279 --> 00:05:15.519 color red, color blue into numerical columns, usually zeros 112 00:05:15.600 --> 00:05:16.000 and ones. 113 00:05:16.199 --> 00:05:19.279 But it adds more columns, right? That's the drawback. Exactly. 114 00:05:19.360 --> 00:05:22.199 It increases the dimensionality of your data, which can sometimes 115 00:05:22.279 --> 00:05:24.040 make things more complex. It's a trade-off. 116 00:05:24.439 --> 00:05:26.839 Okay, so we have the data wrangled. How do we 117 00:05:26.879 --> 00:05:27.639 see what's going on?
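NOTE A minimal sketch of the one-hot encoding idea discussed above, written in plain Python rather than pandas so it runs without extra packages. The function name `one_hot`, its `prefix` parameter, and the sample colors are illustrative assumptions, not taken from the workshop; in pandas the same job is normally done with `pd.get_dummies`.

```python
def one_hot(values, prefix):
    """Return one 0/1 column per distinct category found in values."""
    categories = sorted(set(values))
    return {
        f"{prefix}_{category}": [1 if value == category else 0 for value in values]
        for category in categories
    }

# Hypothetical sample data: one categorical feature with three categories.
colors = ["red", "blue", "red", "green"]
encoded = one_hot(colors, prefix="color")

for column, flags in encoded.items():
    print(column, flags)
```

One input column becomes three here, which is the dimensionality cost the speakers mention: the column count grows with the number of distinct categories.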
118 00:05:27.839 --> 00:05:32.399 Visualization. Matplotlib and seaborn are the key libraries here, turning 119 00:05:32.480 --> 00:05:34.120 numbers into pictures. 120 00:05:33.800 --> 00:05:37.680 Scatter plots, line graphs, bar charts, the usual 121 00:05:37.560 --> 00:05:41.519 suspects. All those, yeah. Grouped bar charts for comparing categories 122 00:05:41.600 --> 00:05:46.120 side by side, histograms to see distributions. You can tweak 123 00:05:46.199 --> 00:05:49.759 histograms too, like setting density=True to compare shapes even 124 00:05:49.759 --> 00:05:53.199 if sample sizes differ, or changing the number of bins. 125 00:05:53.040 --> 00:05:55.519 And heat maps. I always find those interesting. 126 00:05:55.319 --> 00:05:58.439 Very useful, especially for correlation matrices. You can instantly see 127 00:05:58.439 --> 00:06:01.680 which variables tend to move together. It's a great visual shortcut. 128 00:06:01.839 --> 00:06:04.040 So the workshop puts this into practice with a real 129 00:06:04.120 --> 00:06:06.720 data set, the Apple App Store games. 130 00:06:06.959 --> 00:06:09.920 Yes, a practical example, and it highlights the importance of 131 00:06:10.000 --> 00:06:13.000 data prep. You know, cleaning things up first. 132 00:06:12.920 --> 00:06:16.040 Like changing column names, setting the ID as the index, 133 00:06:16.680 --> 00:06:18.120 dropping columns that aren't 134 00:06:17.959 --> 00:06:21.480 useful. Right, like the URL or icon URL. And dealing 135 00:06:21.480 --> 00:06:24.560 with missing data is huge. The subtitle column had like 136 00:06:24.759 --> 00:06:27.079 eleven thousand missing values. Wow. 137 00:06:27.519 --> 00:06:29.439 And the user ratings had a lot missing too. 138 00:06:29.319 --> 00:06:32.759 Over nine thousand missing average user rating values. So a 139 00:06:32.800 --> 00:06:36.079 key step was filtering, only keeping games with at least 140 00:06:36.199 --> 00:06:37.079 thirty ratings.
141 00:06:37.240 --> 00:06:40.000 Why thirty? Just to have enough data for stats to 142 00:06:40.040 --> 00:06:40.680 be meaningful? 143 00:06:40.879 --> 00:06:44.600 Basically, yes. It's a common threshold for technical reasons, to 144 00:06:44.720 --> 00:06:46.399 ensure some reliability in the averages. 145 00:06:46.480 --> 00:06:49.079 And after all that cleaning and filtering, what did they find? 146 00:06:49.560 --> 00:06:52.480 One really interesting finding was that the distribution of average 147 00:06:52.560 --> 00:06:56.240 user ratings looked almost identical for free games versus paid games. 148 00:06:56.279 --> 00:06:58.879 Really? So paying doesn't necessarily mean people like the 149 00:06:58.839 --> 00:07:01.240 game more. It seems not, at least in this 150 00:07:01.319 --> 00:07:05.639 data set. Suggests maybe game quality itself or user experience 151 00:07:05.759 --> 00:07:07.720 is the dominant factor, not the price tag. 152 00:07:07.879 --> 00:07:11.000 Fascinating. Okay, so we've tamed the data. Now let's get 153 00:07:11.000 --> 00:07:15.079 into the statistical side: drawing deeper insights, making predictions. 154 00:07:14.600 --> 00:07:17.319 Right, moving beyond just describing the data 155 00:07:17.040 --> 00:07:21.360 we have. Which brings up that distinction: descriptive versus inferential statistics. 156 00:07:21.680 --> 00:07:25.720 Yeah. Descriptive is summarizing what you see: average, spread, things 157 00:07:25.759 --> 00:07:29.560 like that. Inferential is using your sample to say something 158 00:07:29.600 --> 00:07:33.759 about the bigger picture or about unseen data, making inferences. 159 00:07:33.959 --> 00:07:37.240 And a lot of that involves probability, dealing with randomness. 160 00:07:37.360 --> 00:07:39.839 It does.
And the interesting thing is, while one random 161 00:07:39.879 --> 00:07:43.800 event is unpredictable, like one coin flip, heads or tails, 162 00:07:43.800 --> 00:07:46.639 who knows, a lot of random events become surprisingly predictable. 163 00:07:46.720 --> 00:07:49.360 Flip that coin one thousand times and you're almost certainly 164 00:07:49.360 --> 00:07:50.800 going to get around five hundred heads. 165 00:07:50.920 --> 00:07:55.079 The workshop used a die-tossing example. Yeah, a million times. 166 00:07:54.879 --> 00:07:58.639 Yeah, to show calculating probability from relative frequency. P(odd number) 167 00:07:58.639 --> 00:08:01.000 came out around 0.5001, very close 168 00:08:01.040 --> 00:08:04.160 to the theoretical 0.5. P(less than five) was 169 00:08:04.160 --> 00:08:06.920 about 0.666, again very close to four 170 00:08:06.920 --> 00:08:07.720 sixths, or two thirds. 171 00:08:07.879 --> 00:08:12.000 This predictability leads nicely into the roulette example explaining expected value. 172 00:08:12.079 --> 00:08:14.399 It's a classic. You bet one dollar on red, say. 173 00:08:14.519 --> 00:08:16.800 You win one dollar if it lands red, lose one 174 00:08:16.839 --> 00:08:17.920 dollar if black or green. 175 00:08:18.279 --> 00:08:21.040 But there are those two green spaces, zero and double 176 00:08:21.160 --> 00:08:22.319 zero. Exactly. 177 00:08:22.720 --> 00:08:25.839 They tip the odds slightly in the casino's favor. Over 178 00:08:25.959 --> 00:08:29.680 many bets, the expected value for the gambler is slightly negative, 179 00:08:29.720 --> 00:08:32.840 about minus two point seven cents per dollar bet. 180 00:08:32.879 --> 00:08:35.720 And that small negative amount for the player is the 181 00:08:35.720 --> 00:08:36.960 casino's profit margin. 182 00:08:37.120 --> 00:08:41.039 Precisely, that's the house edge built right into the probabilities.
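NOTE A small worked sketch of the expected-value calculation for the one-dollar bet on red described above. The pocket counts (18 red, 18 black) are the standard wheel layout and are an assumption here, as is the helper name `expected_value_red_bet`; the transcript gives neither. Under that assumption, a wheel with both green pockets (0 and 00) has an expected value of about minus 5.3 cents per dollar, while the minus 2.7 cents quoted in the conversation is what a wheel with a single green pocket gives.

```python
from fractions import Fraction

def expected_value_red_bet(red, black, green, stake=1):
    """Expected profit of an even-money bet on red: win stake on red, lose it otherwise."""
    total = red + black + green
    p_win = Fraction(red, total)
    p_lose = Fraction(black + green, total)
    return stake * p_win - stake * p_lose

# Assumed layouts: 18 red and 18 black pockets, plus one or two green pockets.
double_zero = expected_value_red_bet(red=18, black=18, green=2)  # greens: 0 and 00
single_zero = expected_value_red_bet(red=18, black=18, green=1)  # green: 0 only

print(f"two green pockets: {float(double_zero):.4f} dollars per dollar bet")
print(f"one green pocket:  {float(single_zero):.4f} dollars per dollar bet")
```

Using `Fraction` keeps the probabilities exact, so the house edge appears as a clean ratio (1/19 or 1/37) before it is converted to a decimal.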
183 00:08:41.159 --> 00:08:44.720 This idea of large numbers leading to predictable averages sounds 184 00:08:44.799 --> 00:08:46.080 like the central limit theorem. 185 00:08:46.159 --> 00:08:50.679 You got it. The CLT, hugely important concept. It basically says, 186 00:08:50.919 --> 00:08:54.039 if your sample size is large enough, usually thirty or 187 00:08:54.080 --> 00:08:55.639 more is a rule of thumb, then 188 00:08:55.559 --> 00:08:58.480 the distribution of the sample means will look like a 189 00:08:58.519 --> 00:08:59.399 normal bell 190 00:08:59.240 --> 00:09:02.519 curve. Exactly, even if the original data source isn't normally 191 00:09:02.559 --> 00:09:05.720 distributed at all. It could be uniform, skewed, whatever. The 192 00:09:05.799 --> 00:09:08.840 averages from large samples will tend towards normality. 193 00:09:09.000 --> 00:09:12.360 The workshop had that example drawing samples from a uniform distribution. 194 00:09:12.559 --> 00:09:15.720 Yeah, ten thousand samples. The histogram of the sample averages 195 00:09:15.759 --> 00:09:18.559 looked almost perfectly like a bell curve, fitting the normal 196 00:09:18.600 --> 00:09:21.919 distribution predicted by the CLT. It's quite striking visually. 197 00:09:22.200 --> 00:09:26.080 Okay, so sample means follow a normal distribution if the 198 00:09:26.120 --> 00:09:29.840 sample is big enough. But any single sample mean might 199 00:09:29.879 --> 00:09:32.879 still be off from the true population mean. How do 200 00:09:32.919 --> 00:09:34.399 we account for that uncertainty? 201 00:09:34.639 --> 00:09:37.399 That's where confidence intervals come in. They give you a range, 202 00:09:37.559 --> 00:09:39.240 not just a single point estimate. 203 00:09:39.440 --> 00:09:42.320 Like in election polls, they often report a margin of error. 204 00:09:42.360 --> 00:09:46.159 Exactly.
A ninety five percent confidence interval means we're ninety 205 00:09:46.159 --> 00:09:49.240 five percent confident that the true population value lies within 206 00:09:49.279 --> 00:09:53.039 this range. The workshop example mentioned a small poll: four 207 00:09:53.039 --> 00:09:55.519 to six people out of ten might vote for someone. 208 00:09:55.679 --> 00:09:58.960 The interval reflects the uncertainty due to the small sample size. 209 00:09:59.080 --> 00:10:03.799 Got it. So intervals quantify uncertainty. What about testing specific claims? 210 00:10:03.919 --> 00:10:05.080 Hypothesis testing? 211 00:10:05.200 --> 00:10:08.360 Right. This is about formally testing if a statistic you 212 00:10:08.440 --> 00:10:11.440 observe is significantly different from what you'd expect under some 213 00:10:11.519 --> 00:10:12.320 default assumption. 214 00:10:12.519 --> 00:10:13.919 It has three parts, doesn't it? Yep. 215 00:10:14.320 --> 00:10:18.000 First the hypotheses: the null hypothesis, H0, which is 216 00:10:18.000 --> 00:10:21.879 the default or no-effect assumption, and the alternative hypothesis, Ha, 217 00:10:21.960 --> 00:10:23.600 which is what you're trying to find evidence for. 218 00:10:23.919 --> 00:10:27.240 Like Richard the baker. H0 is that his factory still 219 00:10:27.240 --> 00:10:28.600 makes fifteen thousand loaves. 220 00:10:28.879 --> 00:10:35.159 Correct, H0: the mean equals fifteen thousand. Ha might be not equal to fifteen thousand, 221 00:10:35.840 --> 00:10:38.559 or maybe greater than fifteen thousand if he hopes the new equipment 222 00:10:38.679 --> 00:10:43.799 increased output. Okay, hypotheses first. Then? Then you calculate a 223 00:10:43.840 --> 00:10:47.039 test statistic based on your data, and finally, the P value. 224 00:10:47.159 --> 00:10:49.120 The P value, that tells you... It's the 225 00:10:49.120 --> 00:10:52.320 probability of seeing your data, or something even more extreme, 226 00:10:52.519 --> 00:10:55.840 if the null hypothesis were actually true.
A small P 227 00:10:56.080 --> 00:10:59.240 value suggests your data is unlikely under the null, providing 228 00:10:59.240 --> 00:11:01.320 evidence for the alternative. Makes sense. 229 00:11:01.559 --> 00:11:04.639 Now there was that important warning about correlation and causation. 230 00:11:04.840 --> 00:11:08.600 Ah, yes. The communities data set activity found higher test 231 00:11:08.600 --> 00:11:11.399 scores in groups with more Internet access. The P value 232 00:11:11.440 --> 00:11:13.600 was small, indicating a significant difference. 233 00:11:13.720 --> 00:11:15.360 So more internet equals better 234 00:11:15.120 --> 00:11:18.559 scores? Not necessarily. That's the crucial point. Correlation does not 235 00:11:18.679 --> 00:11:19.639 imply causation. 236 00:11:19.840 --> 00:11:22.120 There could be another factor involved. Exactly, 237 00:11:21.840 --> 00:11:25.279 a lurking variable, like the overall wealth or socioeconomic status 238 00:11:25.279 --> 00:11:28.159 of a community, could be driving both higher internet access 239 00:11:28.200 --> 00:11:31.200 and higher test scores. You can't conclude causation just from 240 00:11:31.200 --> 00:11:33.559 the correlation. Always have to be careful. 241 00:11:33.399 --> 00:11:36.919 A vital lesson. And you mentioned machine learning models like 242 00:11:36.960 --> 00:11:39.480 linear regression. They fit in here too. 243 00:11:39.639 --> 00:11:43.759 Yeah, they're essentially another form of inferential statistics. You build 244 00:11:43.759 --> 00:11:47.320 a model on known data to make predictions about unseen data. 245 00:11:47.320 --> 00:11:48.200 It's all connected. 246 00:11:48.559 --> 00:11:51.639 Okay, let's shift to calculus, but again through the lens 247 00:11:51.720 --> 00:11:55.159 of Python, making it practical. We talked about functions earlier. Right, 248 00:11:55.000 --> 00:11:58.639 and that core rule: one input, only one output.
The 249 00:11:58.720 --> 00:12:02.519 vertical line test helps visualize that. A circle fails it, 250 00:12:02.679 --> 00:12:05.720 so y isn't a simple function of x for a circle. 251 00:12:05.440 --> 00:12:08.320 And functions can be transformed: shifted, 252 00:12:08.039 --> 00:12:12.039 scaled. Yep. Adding a constant shifts vertically; adding inside, like 253 00:12:12.200 --> 00:12:17.279 f of x plus c, shifts horizontally; multiplying stretches or shrinks. Python's 254 00:12:17.320 --> 00:12:20.720 plotting makes seeing these transformations really intuitive. 255 00:12:20.320 --> 00:12:22.960 And Python helps solve equations too, even tricky ones. 256 00:12:23.039 --> 00:12:25.879 Oh, definitely. Simple linear, like three x minus five equals six, 257 00:12:25.919 --> 00:12:27.519 Python can solve that easily, though you can do it 258 00:12:27.519 --> 00:12:30.279 by hand. But for polynomials like x cubed minus seven x 259 00:12:30.279 --> 00:12:33.360 squared plus fifteen x minus nine, that looks harder. Python 260 00:12:33.399 --> 00:12:36.320 can help factor it, find the roots, in this case 261 00:12:36.639 --> 00:12:40.200 x equals one and x equals three. Libraries like SymPy can 262 00:12:40.240 --> 00:12:45.159 even handle symbolic math, solving systems of nonlinear equations algebraically. 263 00:12:45.279 --> 00:12:48.080 Wow. What about sequences and series? They popped up too. 264 00:12:48.120 --> 00:12:52.519 Yeah, arithmetic and geometric sequences. They have direct applications in finance, 265 00:12:52.600 --> 00:12:56.039 like calculating compound interest for retirement 266 00:12:55.519 --> 00:12:57.960 savings. 401(k) calculations. 267 00:12:57.360 --> 00:13:00.639 Exactly, or even modeling things like bacterial growth, which often 268 00:13:00.679 --> 00:13:02.039 follows a geometric sequence. 269 00:13:02.159 --> 00:13:04.799 Useful stuff. And a quick mention of trigonometry and
270 00:13:04.879 --> 00:13:08.840 vectors. Right: sine, cosine, tangent for angles, the Pythagorean theorem 271 00:13:08.919 --> 00:13:12.919 for right triangles, and vectors for quantities with magnitude and direction. 272 00:13:13.200 --> 00:13:16.480 The dot product came up for finding the angle between vectors. 273 00:13:16.559 --> 00:13:19.039 Yep, if the dot product is zero, the vectors are 274 00:13:19.159 --> 00:13:22.080 orthogonal, perpendicular. Useful in physics and graphics. 275 00:13:22.200 --> 00:13:26.120 Okay, now the core calculus concepts, derivatives and integrals, made 276 00:13:26.120 --> 00:13:29.039 practical with Python. Derivatives first: rate of 277 00:13:29.039 --> 00:13:32.679 change. Instantaneous rate of change, how fast something is changing 278 00:13:32.679 --> 00:13:36.360 at a specific point. Traditionally, finding derivatives involves a lot 279 00:13:36.360 --> 00:13:38.279 of algebra, limit rules. 280 00:13:38.000 --> 00:13:40.600 The tedious algebraic manipulations. 281 00:13:39.960 --> 00:13:44.240 Exactly, but Python lets you do it numerically. You approximate 282 00:13:44.279 --> 00:13:46.879 the slope using a tiny change in x, like h 283 00:13:46.960 --> 00:13:50.039 equals 0.000001. You 284 00:13:50.080 --> 00:13:53.159 calculate f of x plus h minus f of x, over h. So 285 00:13:53.080 --> 00:13:55.440 you get the slope, the rate of change, without the 286 00:13:55.639 --> 00:13:57.480 complex algebra. Pretty much. 287 00:13:57.360 --> 00:13:58.879 And once you have the slope at a point you 288 00:13:58.879 --> 00:14:01.720 can find the equation of the tangent line there. Very 289 00:14:01.759 --> 00:14:03.919 powerful for optimization and analysis. 290 00:14:04.000 --> 00:14:07.159 Okay. And integrals, the opposite, kind of adding things up. 291 00:14:07.279 --> 00:14:10.759 Conceptually, yes, adding up areas or volumes by slicing them 292 00:14:10.759 --> 00:14:14.440 into many tiny pieces.
Old methods like Riemann sums use 293 00:14:14.559 --> 00:14:17.240 rectangles but weren't very accurate with few slices. 294 00:14:17.519 --> 00:14:20.559 Python uses trapezoids, the trap_integral function, right? 295 00:14:20.879 --> 00:14:24.240 Using trapezoids gives a better approximation, and because Python can 296 00:14:24.279 --> 00:14:28.039 handle thousands or millions of slices easily, the numerical integration 297 00:14:28.120 --> 00:14:32.200 becomes incredibly accurate. The workshop example showed just five trapezoids 298 00:14:32.200 --> 00:14:34.159 getting the error down to three percent. And this lets 299 00:14:34.000 --> 00:14:37.559 you calculate volumes of complex shapes, solids of revolution. 300 00:14:37.440 --> 00:14:40.279 Exactly, like rotating a curve to make a bowl shape, 301 00:14:40.279 --> 00:14:44.159 a paraboloid, or solving optimization problems like finding the maximum 302 00:14:44.200 --> 00:14:46.600 volume cone you can fit inside a sphere. 303 00:14:46.600 --> 00:14:49.440 But the real power seems to be in differential equations. 304 00:14:49.480 --> 00:14:52.720 Absolutely. These describe situations where the rate of change of 305 00:14:52.759 --> 00:14:55.840 something depends on its current value. Finding the function itself 306 00:14:55.840 --> 00:14:59.600 can be very hard or even impossible algebraically. 307 00:14:59.080 --> 00:15:04.960 But Python offers numerical methods: Euler's method, Runge-Kutta, RK4. 308 00:15:05.200 --> 00:15:08.799 Yes, these are algorithmic approaches. You start with an initial 309 00:15:08.799 --> 00:15:11.960 condition and step forward in small time increments, using the 310 00:15:11.960 --> 00:15:15.039 derivative information to predict the next value. It's like building 311 00:15:15.120 --> 00:15:16.919 the solution step by step. 312 00:15:16.679 --> 00:15:19.559 And this opens up modeling for tons of real world things. 313 00:15:19.639 --> 00:15:22.480 Oh, a huge range.
The workshop listed quite a few. 314 00:15:22.679 --> 00:15:25.799 Let's recap some. Interest calculations, how money grows. 315 00:15:25.639 --> 00:15:29.320 Yep, modeling compound interest: one thousand dollars growing to one 316 00:15:29.360 --> 00:15:32.240 million dollars in about eighty six years at eight percent. 317 00:15:32.080 --> 00:15:34.480 Population growth, like Kenya's doubling 318 00:15:34.159 --> 00:15:38.080 time. Right, modeling exponential growth, or maybe logistic growth if 319 00:15:38.080 --> 00:15:41.600 there are limiting factors, how policy changes might affect growth 320 00:15:41.360 --> 00:15:44.519 rates. Radioactive decay, carbon-14 321 00:15:44.600 --> 00:15:48.960 dating. Exactly. The half-life calculation is a classic differential 322 00:15:49.000 --> 00:15:51.840 equation problem used to date artifacts. 323 00:15:52.120 --> 00:15:55.240 Newton's law of cooling, like figuring out time of death. 324 00:15:55.360 --> 00:15:59.120 That's a famous application, yes. Or just modeling how any 325 00:15:59.159 --> 00:16:02.440 object cools down or warms up towards the ambient temperature. 326 00:16:02.559 --> 00:16:04.720 Mixture problems, salt in a tank. 327 00:16:04.679 --> 00:16:08.159 Yeah, tracking the concentration of a substance as fluids flow 328 00:16:08.240 --> 00:16:10.200 in and out, common in chemical engineering. 329 00:16:10.240 --> 00:16:13.679 Projectile motion, calculating a ball's trajectory. Mm-hmm. 330 00:16:13.799 --> 00:16:18.120 Python can constantly recalculate velocity and position, accounting for gravity, 331 00:16:18.200 --> 00:16:22.120 air resistance, much more realistically than simple formulas. 332 00:16:21.639 --> 00:16:23.240 And even predator-prey scenarios. 333 00:16:23.440 --> 00:16:26.360 A fox chasing a rabbit, yes, showing exactly where the 334 00:16:26.399 --> 00:16:29.679 fox intercepts the rabbit, y equals 23.99 335 00:16:29.759 --> 00:16:32.039 in the example. It models pursuit curves.
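NOTE A minimal sketch of Euler's method applied to the interest example recapped above. It assumes continuously compounded growth, dy/dt = rate * y, which is an assumption on my part but matches the quoted figure of about eighty six years (compounding once a year would take roughly ninety). The function name `years_to_target` and the step size `dt` are illustrative, not taken from the workshop.

```python
def years_to_target(principal, rate, target, dt=0.001):
    """Step dy/dt = rate * y forward with Euler's method until y reaches target."""
    balance = principal
    years = 0.0
    while balance < target:
        # Euler step: next value = current value + derivative * time step
        balance += rate * balance * dt
        years += dt
    return years

# $1,000 growing at 8 percent until it reaches $1,000,000.
elapsed = years_to_target(principal=1_000, rate=0.08, target=1_000_000)
print(f"about {elapsed:.1f} years")
```

The result comes out a little over 86 years. Shrinking `dt` trades run time for accuracy, which is the brute-force recalculation the speakers describe.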
336 00:16:32.039 --> 00:16:34.600 So the big advantage is avoiding the complex algebra and 337 00:16:34.679 --> 00:16:36.279 just letting Python crunch the numbers. 338 00:16:36.360 --> 00:16:39.919 Essentially, yes. Modeling using Python and running simulations has saved 339 00:16:40.000 --> 00:16:42.000 us a lot of algebra and still got us very 340 00:16:42.039 --> 00:16:45.480 accurate answers. You can use brute force by recalculating thousands 341 00:16:45.519 --> 00:16:46.000 of times. 342 00:16:46.120 --> 00:16:48.559 Very cool. And finally, a brief look at matrices and 343 00:16:48.639 --> 00:16:49.559 Markov chains. 344 00:16:49.759 --> 00:16:53.480 Right. Matrices are fundamental in linear algebra, AI, machine learning, 345 00:16:53.519 --> 00:16:58.000 and Markov chains model systems transitioning between states based on probabilities. 346 00:16:58.080 --> 00:17:00.559 The example was a text predictor. Yeah, using 347 00:17:00.360 --> 00:17:04.759 the probability of one word following another, state transitions, to generate 348 00:17:04.480 --> 00:17:05.400 new text. Yeah. 349 00:17:05.480 --> 00:17:08.519 A basic but illustrative example of Markov chains in action. 350 00:17:09.319 --> 00:17:12.319 So, wrapping this all up, what's the big takeaway here? 351 00:17:12.559 --> 00:17:15.839 I think it's that Python, with these incredible libraries, really 352 00:17:15.880 --> 00:17:19.079 acts like a universal translator for math and stats, 353 00:17:18.799 --> 00:17:21.960 taking abstract concepts and making them tools for solving real 354 00:17:22.000 --> 00:17:22.920 problems. Exactly. 355 00:17:22.920 --> 00:17:27.039 Whether it's finance, biology, physics, social science, you can model 356 00:17:27.200 --> 00:17:30.839 complex systems, make predictions, understand dynamics, 357 00:17:30.359 --> 00:17:33.480 without necessarily needing a PhD in advanced mathematics to do 358 00:17:33.519 --> 00:17:34.839 the calculations by hand. Right.
359 00:17:35.000 --> 00:17:38.599 It democratizes the ability to use these powerful techniques. You 360 00:17:38.720 --> 00:17:41.160 leverage the computational power to get insights. 361 00:17:41.200 --> 00:17:44.319 It lets you ask "what if" and get remarkably accurate 362 00:17:44.319 --> 00:17:46.759 answers through simulation and numerical methods. 363 00:17:46.839 --> 00:17:50.319 It really does shift the focus from algebraic manipulation to 364 00:17:50.440 --> 00:17:52.559 understanding the concepts and applying them. 365 00:17:52.839 --> 00:17:56.480 So here's something to think about. What problem, maybe something 366 00:17:56.519 --> 00:17:59.920 that seemed mathematically impossible or just way too complex before? 367 00:18:00.599 --> 00:18:03.920 What might you approach differently now, knowing that Python could 368 00:18:03.960 --> 00:18:05.160 be your computational guide?
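NOTE A minimal sketch of the two numerical calculus ideas from the conversation: the small-step slope estimate and the trapezoid rule for areas. The names `derivative`, `trapezoid_area`, and `square`, and the choice of x squared as the test function, are illustrative assumptions; this is not the workshop's own code and the function it integrated is not stated in the transcript.

```python
def derivative(f, x, h=1e-6):
    """Forward-difference slope estimate: (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h

def trapezoid_area(f, a, b, n):
    """Approximate the area under f from a to b using n equal-width trapezoids."""
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        left = a + i * width
        # Area of one trapezoid: average of the two heights times the width.
        total += 0.5 * (f(left) + f(left + width)) * width
    return total

def square(x):
    return x * x

slope_at_3 = derivative(square, 3.0)                  # exact slope is 6
area_coarse = trapezoid_area(square, 0.0, 1.0, 5)     # exact area is 1/3
area_fine = trapezoid_area(square, 0.0, 1.0, 10_000)

print(slope_at_3, area_coarse, area_fine)
```

With only five trapezoids the area estimate is already within a couple of percent of one third, and raising `n` closes most of the remaining gap, which is the effect the speakers describe.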