WEBVTT

1
00:00:00.040 --> 00:00:02.600
<v Speaker 1>Okay, let's unpack this. We've got some great source material

2
00:00:02.680 --> 00:00:06.480
<v Speaker 1>here for a deep dive into, well, really the engine

3
00:00:06.559 --> 00:00:11.039
<v Speaker 1>room of modern computer graphics. We're talking GLSL, the OpenGL

4
00:00:11.119 --> 00:00:12.679
<v Speaker 1>Shading Language.

5
00:00:12.199 --> 00:00:16.440
<v Speaker 2>And how it basically ripped up the old rule book

6
00:00:16.480 --> 00:00:17.960
<v Speaker 2>for the rendering pipeline.

7
00:00:18.079 --> 00:00:21.559
<v Speaker 1>Yeah, exactly. And this info comes from excerpts of GLSL

8
00:00:21.679 --> 00:00:24.440
<v Speaker 1>Essentials by Jacobo Rodriguez.

9
00:00:23.920 --> 00:00:25.960
<v Speaker 2>Right, a really solid book on the topic.

10
00:00:26.079 --> 00:00:28.600
<v Speaker 1>And it's worth saying upfront this isn't really for total beginners.

11
00:00:28.640 --> 00:00:28.719
<v Speaker 2>Is.

12
00:00:28.719 --> 00:00:32.159
<v Speaker 1>It's aimed more at people who've maybe dabbled in computer

13
00:00:32.200 --> 00:00:32.920
<v Speaker 1>graphics before.

14
00:00:33.079 --> 00:00:35.399
<v Speaker 2>Yeah, folks who want to get up to speed with

15
00:00:35.479 --> 00:00:39.399
<v Speaker 2>modern OpenGL, or maybe make that jump from the old way,

16
00:00:39.479 --> 00:00:41.079
<v Speaker 2>the fixed pipeline.

17
00:00:40.560 --> 00:00:43.600
<v Speaker 1>to the programmable pipeline, which the book calls, and I

18
00:00:43.640 --> 00:00:46.640
<v Speaker 1>love this phrase, the biggest revolution in real-time graphics

19
00:00:46.640 --> 00:00:47.600
<v Speaker 1>programming history.

20
00:00:47.640 --> 00:00:51.240
<v Speaker 2>It's not an exaggeration really. It unlocks so much visual potential.

21
00:00:51.399 --> 00:00:54.159
<v Speaker 1>And the author, Jacobo Rodriguez, he wasn't just writing about it.

22
00:00:54.159 --> 00:00:58.840
<v Speaker 1>He was like building tools for GLSL shaders super early on.

23
00:00:59.039 --> 00:01:01.240
<v Speaker 2>Yeah, even before some of the big hardware guys had

24
00:01:01.280 --> 00:01:03.560
<v Speaker 2>their own tools ready. So he's got that real hands-

25
00:01:03.560 --> 00:01:05.719
<v Speaker 2>on pioneer perspective, definitely.

26
00:01:06.040 --> 00:01:08.519
<v Speaker 1>So our goal today digging into this for you is

27
00:01:08.560 --> 00:01:11.560
<v Speaker 1>to pull out those core ideas. We want to explain

28
00:01:11.599 --> 00:01:14.480
<v Speaker 1>the architecture, what the different shaders actually

29
00:01:14.159 --> 00:01:16.920
<v Speaker 2>do, and show why this programmable stuff is such a

30
00:01:16.959 --> 00:01:20.799
<v Speaker 2>game changer. Think of this as your essential guide to

31
00:01:20.959 --> 00:01:22.599
<v Speaker 2>programmable graphics concepts.

32
00:01:22.719 --> 00:01:26.000
<v Speaker 1>Perfect. So where do we start? Maybe the absolute basics:

33
00:01:26.519 --> 00:01:28.799
<v Speaker 1>CPU versus GPU?

34
00:01:29.040 --> 00:01:33.079
<v Speaker 2>Good place. Yeah, it really boils down to how they think, how

35
00:01:33.120 --> 00:01:37.079
<v Speaker 2>they process things. Your CPU, your main processor, is great

36
00:01:37.200 --> 00:01:40.879
<v Speaker 2>at doing tasks one after another, sequential stuff.

37
00:01:40.599 --> 00:01:43.599
<v Speaker 1>Scalar processing, right? For general tasks.

38
00:01:43.959 --> 00:01:47.120
<v Speaker 2>Exactly. But the GPU, it's a different beast. It's all about

39
00:01:47.120 --> 00:01:51.120
<v Speaker 2>parallel power, vectorial. It can chew through hundreds, maybe thousands

40
00:01:51.439 --> 00:01:53.799
<v Speaker 2>of similar calculations all at the same time.

41
00:01:53.719 --> 00:01:56.040
<v Speaker 1>Using all those little specialized cores it has. Not great

42
00:01:56.040 --> 00:01:58.560
<v Speaker 1>for everything, but for graphics math unbeatable.

43
00:01:58.640 --> 00:02:01.400
<v Speaker 2>Yeah, and that parallel muscle is what drives the graphics

44
00:02:01.400 --> 00:02:02.879
<v Speaker 2>rendering pipeline.

45
00:02:02.680 --> 00:02:05.480
<v Speaker 1>Which the book describes as like a pipe where we insert

46
00:02:05.519 --> 00:02:08.560
<v Speaker 1>some data into one end (vertices, textures, shaders) and they

47
00:02:08.560 --> 00:02:10.199
<v Speaker 1>start to travel through some small machines.

48
00:02:10.319 --> 00:02:13.240
<v Speaker 2>It's a good analogy: an assembly line for pixels, almost.

49
00:02:13.439 --> 00:02:16.680
<v Speaker 2>Each stage does its specific job on the data flowing through.

50
00:02:16.680 --> 00:02:19.919
<v Speaker 1>And for a long time that assembly line was completely fixed,

51
00:02:20.159 --> 00:02:21.919
<v Speaker 1>same steps, same order, every time.

52
00:02:22.000 --> 00:02:26.479
<v Speaker 2>Predictable, sure, but really inflexible. That started changing, what, early

53
00:02:26.520 --> 00:02:29.400
<v Speaker 2>two thousands, two thousand and four, maybe.

54
00:02:29.319 --> 00:02:31.800
<v Speaker 1>Yeah, around then the pipeline started opening up. You could

55
00:02:31.800 --> 00:02:36.039
<v Speaker 1>replace some of those fixed stages with small, low level

56
00:02:36.080 --> 00:02:38.520
<v Speaker 1>programs: early shaders.

57
00:02:38.360 --> 00:02:42.000
<v Speaker 2>But writing those was kind of a nightmare initially. Different

58
00:02:42.000 --> 00:02:45.560
<v Speaker 2>hardware vendors had completely different assembly languages.

59
00:02:45.199 --> 00:02:48.280
<v Speaker 1>Right. Moving code between, say, an NVIDIA card and an

60
00:02:48.319 --> 00:02:50.199
<v Speaker 1>ATI card was a major headache.

61
00:02:49.919 --> 00:02:52.520
<v Speaker 2>Which led pretty quickly to a need for a standard,

62
00:02:52.639 --> 00:02:56.240
<v Speaker 2>something common, and that's where GLSL came in, around 2004.

63
00:02:55.960 --> 00:02:59.000
<v Speaker 1>A high-level shading language, like C but

64
00:02:59.479 --> 00:03:02.439
<v Speaker 1>designed for GPUs and meant to work across different platforms.

65
00:03:02.520 --> 00:03:06.360
<v Speaker 2>Initially, GLSL let you program two key parts, transforming the

66
00:03:06.400 --> 00:03:08.120
<v Speaker 2>3D geometry.

67
00:03:07.759 --> 00:03:09.919
<v Speaker 1>That became the vertex shader.

68
00:03:09.840 --> 00:03:12.360
<v Speaker 2>Yep. And then figuring out the color for each pixel.

69
00:03:12.080 --> 00:03:13.680
<v Speaker 1>The fragment shader.

70
00:03:13.719 --> 00:03:17.199
<v Speaker 2>Exactly. Those were the first two big programmable stages. Later on,

71
00:03:17.319 --> 00:03:20.080
<v Speaker 2>things like the geometry shader and the compute shader got added,

72
00:03:20.400 --> 00:03:22.400
<v Speaker 2>pushing what the GPU could do even further.

73
00:03:22.639 --> 00:03:26.520
<v Speaker 1>Okay, so that's the pipeline's evolution. What about GLSL itself,

74
00:03:26.879 --> 00:03:28.360
<v Speaker 1>the language, what's it like?

75
00:03:28.800 --> 00:03:31.199
<v Speaker 2>It feels a lot like C or C++,

76
00:03:31.719 --> 00:03:34.199
<v Speaker 2>which is nice and familiar for many programmers. But there

77
00:03:34.240 --> 00:03:37.000
<v Speaker 2>are some pretty big differences under the hood.

78
00:03:37.039 --> 00:03:40.520
<v Speaker 1>And the book flags one huge one right away: GLSL does

79
00:03:40.560 --> 00:03:41.360
<v Speaker 1>not have pointers.

80
00:03:41.439 --> 00:03:45.319
<v Speaker 2>That's a biggie. Yeah, no messing with memory addresses directly.

81
00:03:45.680 --> 00:03:48.439
<v Speaker 2>It simplifies things a lot, avoids a ton of potential bugs,

82
00:03:48.439 --> 00:03:51.520
<v Speaker 2>and honestly makes it easier to manage thousands of threads

83
00:03:51.599 --> 00:03:53.360
<v Speaker 2>running in parallel safely.

84
00:03:53.680 --> 00:03:58.159
<v Speaker 1>Makes sense. So syntactically, it's C-like: semicolons, curly braces for blocks,

85
00:03:58.639 --> 00:04:01.479
<v Speaker 1>standard variable scoping. But the data types are where the

86
00:04:01.520 --> 00:04:02.719
<v Speaker 1>real graphics power comes in.

87
00:04:02.800 --> 00:04:05.919
<v Speaker 2>Absolutely, you have your standard bool, int, and float, but then

88
00:04:05.919 --> 00:04:09.199
<v Speaker 2>you get specialized types: types for handling textures, called samplers,

89
00:04:09.560 --> 00:04:12.039
<v Speaker 2>and, crucially, built-in vectors like vec2,

90
00:04:12.120 --> 00:04:14.560
<v Speaker 1>vec3, vec4, for 2D, 3D, 4D

91
00:04:14.639 --> 00:04:15.599
<v Speaker 1>values.

92
00:04:15.240 --> 00:04:17.240
<v Speaker 2>And matrices: mat2, mat3, mat4. These are

93
00:04:17.319 --> 00:04:18.399
<v Speaker 2>native types.

94
00:04:18.399 --> 00:04:21.839
<v Speaker 1>And doing math with these vectors and matrices, it's, like, super optimized.

95
00:04:22.519 --> 00:04:25.120
<v Speaker 1>The book says they map right to the hardware and cost

96
00:04:25.199 --> 00:04:27.720
<v Speaker 1>way less than doing the same math on the CPU.

97
00:04:27.439 --> 00:04:32.879
<v Speaker 2>Massively less. It's fundamental to GPU performance. Think about calculating

98
00:04:32.879 --> 00:04:36.519
<v Speaker 2>a cross product between two 3D vectors. On a CPU,

99
00:04:36.600 --> 00:04:40.040
<v Speaker 2>that's several multiplications and subtractions. In GLSL...

100
00:04:40.120 --> 00:04:41.800
<v Speaker 1>It's cross(vecA, vecB).

101
00:04:41.879 --> 00:04:44.800
<v Speaker 2>Often, yeah, it can be a single hardware instruction, because the

102
00:04:44.839 --> 00:04:47.079
<v Speaker 2>GPU is built for that kind of linear algebra.

103
00:04:47.279 --> 00:04:50.160
<v Speaker 1>Wow, okay, that explains a lot about the speed.

104
00:04:50.600 --> 00:04:52.600
<v Speaker 2>Definitely. And there's another cool vector thing: swizzling.

105
00:04:52.759 --> 00:04:56.079
<v Speaker 1>Ah yeah, swizzling. Super handy. Lets you rearrange or pick out

106
00:04:56.079 --> 00:04:57.480
<v Speaker 1>components of a vector to make a new one, right?

107
00:04:57.360 --> 00:05:01.680
<v Speaker 2>Exactly, using things like .xyzw or .rgba.

108
00:05:02.079 --> 00:05:04.879
<v Speaker 2>You can grab the red, green, blue components like

109
00:05:04.879 --> 00:05:08.800
<v Speaker 2>myColor.rgb, or you could reverse them, myColor.bgr,

110
00:05:09.040 --> 00:05:11.000
<v Speaker 2>or even make a vector of just the alpha value,

111
00:05:11.160 --> 00:05:12.360
<v Speaker 2>myColor.aaa.

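NOTE
Editor's sketch of the swizzles and the cross() call mentioned above, written
as a minimal fragment shader. The names myColor and fragColor and the literal
values are placeholders for illustration, not taken from the book.
```glsl
#version 330 core
out vec4 fragColor;
void main() {
    vec4 myColor = vec4(0.2, 0.4, 0.6, 0.8);  // placeholder RGBA value
    vec3 rgb = myColor.rgb;                   // first three components
    vec3 bgr = myColor.bgr;                   // same components, reversed order
    vec3 aaa = myColor.aaa;                   // alpha repeated three times
    // Built-in cross product of the x and y axes; the result is the z axis.
    vec3 n = cross(vec3(1.0, 0.0, 0.0), vec3(0.0, 1.0, 0.0));
    fragColor = vec4(bgr * aaa + 0.0 * (rgb + n), 1.0);
}
```
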
112
00:05:12.480 --> 00:05:15.399
<v Speaker 1>It's a really compact syntax for manipulating vector data. What

113
00:05:15.439 --> 00:05:19.319
<v Speaker 1>else does the language have? Standard stuff like variable initializers?

114
00:05:18.839 --> 00:05:22.839
<v Speaker 2>Yep, initializers, explicit casting if you need to convert between

115
00:05:22.920 --> 00:05:27.680
<v Speaker 2>types, like int to float, to handle precision, and standard comments.

116
00:05:27.800 --> 00:05:30.759
<v Speaker 1>And control flow? If/else, loops?

117
00:05:30.480 --> 00:05:33.720
<v Speaker 2>All there. If/else, switch/case, which the book notes can

118
00:05:33.759 --> 00:05:36.480
<v Speaker 2>sometimes be nicely optimized by the driver, so worth using

119
00:05:36.879 --> 00:05:39.360
<v Speaker 2>and loops: for, while, do-while.

120
00:05:39.519 --> 00:05:42.079
<v Speaker 1>You can group variables together using structures, just like in C.

121
00:05:42.439 --> 00:05:46.279
<v Speaker 1>Give them a custom type name, like struct Material { vec3

122
00:05:46.360 --> 00:05:48.839
<v Speaker 1>color; float shininess; }.

123
00:05:48.839 --> 00:05:52.120
<v Speaker 2>And arrays for multiple items of the same type accessed

124
00:05:52.120 --> 00:05:54.680
<v Speaker 2>with square brackets, and you can often get the length

125
00:05:54.680 --> 00:05:55.279
<v Speaker 2>with .length() there.

126
00:05:55.360 --> 00:05:57.800
<v Speaker 1>And functions for reusable code blocks?

127
00:05:57.879 --> 00:06:02.240
<v Speaker 2>Yep. Functions are key for organization. But remember, no pointers. So

128
00:06:02.240 --> 00:06:04.120
<v Speaker 2>how do you pass data back out of a function?

129
00:06:04.360 --> 00:06:07.600
<v Speaker 1>Ah, right, you use those qualifiers: in, out, inout.

130
00:06:07.680 --> 00:06:10.600
<v Speaker 2>Exactly. in is the default, pass by value. out means the

131
00:06:10.680 --> 00:06:13.480
<v Speaker 2>function will write to that variable, and inout means

132
00:06:13.519 --> 00:06:15.480
<v Speaker 2>it can read and write. It's kind of like pass

133
00:06:15.480 --> 00:06:17.279
<v Speaker 2>by reference, but without actual pointers.

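NOTE
Editor's sketch of the in / out / inout parameter qualifiers described above.
The function splitColor and its arguments are hypothetical names chosen for
illustration; the copy-in and copy-out behaviour is the point.
```glsl
#version 330 core
out vec4 fragColor;
// "in" (the default) copies the argument in. "out" is written by the function
// and copied back to the caller. "inout" is copied in and copied back out.
void splitColor(in vec4 color, out vec3 rgb, inout float alpha) {
    rgb = color.rgb;
    alpha *= color.a;
}
void main() {
    vec3 rgb;            // filled in by splitColor through the out parameter
    float alpha = 0.5;   // read and updated through the inout parameter
    splitColor(vec4(1.0, 0.5, 0.25, 0.8), rgb, alpha);
    fragColor = vec4(rgb, alpha);
}
```
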
134
00:06:17.319 --> 00:06:19.759
<v Speaker 1>Okay, a different way of thinking about function parameters. And

135
00:06:19.800 --> 00:06:21.519
<v Speaker 1>there's a preprocessor too?

136
00:06:20.839 --> 00:06:25.160
<v Speaker 2>Yeah, standard C-style preprocessor stuff. #version is crucial.

137
00:06:25.199 --> 00:06:27.720
<v Speaker 2>You have to declare which GLSL version you're writing for

138
00:06:27.879 --> 00:06:31.199
<v Speaker 2>right at the top. Then you have #define for macros,

139
00:06:31.600 --> 00:06:35.279
<v Speaker 2>#if, #ifdef for conditional compilation. Pretty standard.

140
00:06:35.319 --> 00:06:38.839
<v Speaker 1>So this GLSL code runs on the GPU. How does

141
00:06:38.839 --> 00:06:42.319
<v Speaker 1>my main program running on the CPU actually feed data

142
00:06:42.560 --> 00:06:45.959
<v Speaker 1>into these shaders like the position of a light or

143
00:06:46.040 --> 00:06:47.439
<v Speaker 1>the main camera's view matrix?

144
00:06:47.680 --> 00:06:50.439
<v Speaker 2>That's where uniform variables come in. These are variables you

145
00:06:50.480 --> 00:06:53.959
<v Speaker 2>declare in your shader code, marked with the uniform keyword.

146
00:06:53.920 --> 00:06:56.120
<v Speaker 1>But their values aren't set in the shader. They come from

147
00:06:56.120 --> 00:06:57.160
<v Speaker 1>outside.

148
00:06:57.319 --> 00:07:01.120
<v Speaker 2>Exactly. Your CPU-side application code uses OpenGL API calls like

149
00:07:01.240 --> 00:07:03.560
<v Speaker 2>glUniformMatrix4fv to send up a

150
00:07:03.560 --> 00:07:06.120
<v Speaker 2>4x4 matrix, or glUniform3f for a

151
00:07:06.160 --> 00:07:08.759
<v Speaker 2>three-component vector, to set the value of these uniforms.

152
00:07:08.720 --> 00:07:11.839
<v Speaker 1>And once set, they're constant for, like, that whole draw call, right?

153
00:07:11.879 --> 00:07:13.360
<v Speaker 1>Read-only inside the shader?

154
00:07:13.160 --> 00:07:15.879
<v Speaker 2>Correct. Global and read-only for that shader execution. This

155
00:07:16.000 --> 00:07:19.160
<v Speaker 2>is the main pipe for getting dynamic data in: light positions, colors,

156
00:07:19.519 --> 00:07:22.680
<v Speaker 2>time, transformation matrices, texture units.

157
00:07:22.480 --> 00:07:25.920
<v Speaker 1>All that stuff. For textures, specifically, you declare a uniform

158
00:07:25.959 --> 00:07:28.360
<v Speaker 1>sampler2D or similar in the shader.

159
00:07:28.240 --> 00:07:30.319
<v Speaker 2>And then on the CPU side you bind a texture

160
00:07:30.360 --> 00:07:33.120
<v Speaker 2>to a texture unit and tell the shader uniform which

161
00:07:33.199 --> 00:07:35.399
<v Speaker 2>unit to use via glUniform1i.

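NOTE
Editor's sketch of the shader side of the uniform mechanism described above.
The uniform names lightColor and diffuseMap and the input texCoord are
assumptions for illustration. The comments name the OpenGL calls the hosts
mention as the CPU-side counterparts.
```glsl
#version 330 core
uniform vec3 lightColor;       // CPU side: glUniform3f(location, r, g, b)
uniform sampler2D diffuseMap;  // CPU side: glUniform1i(location, textureUnitIndex)
in vec2 texCoord;              // interpolated value from the vertex shader
out vec4 fragColor;
void main() {
    // Uniforms are read-only here and keep the same value for the whole draw call.
    fragColor = texture(diffuseMap, texCoord) * vec4(lightColor, 1.0);
}
```
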
162
00:07:35.439 --> 00:07:37.759
<v Speaker 1>Got it. That makes the connection between CPU and

163
00:07:37.800 --> 00:07:41.920
<v Speaker 1>GPU much clearer. Okay, let's trace the data flow. First

164
00:07:41.959 --> 00:07:44.600
<v Speaker 1>stop: the vertex shader.

165
00:07:44.879 --> 00:07:48.000
<v Speaker 2>Right. The first programmable stage your vertex data hits. It's a

166
00:07:48.040 --> 00:07:51.439
<v Speaker 2>per-vertex operation. That means the code you write here runs

167
00:07:51.600 --> 00:07:54.639
<v Speaker 2>once and only once, for each vertex you send to

168
00:07:54.639 --> 00:07:55.279
<v Speaker 2>the graphics card.

169
00:07:55.319 --> 00:07:58.079
<v Speaker 1>So if I have a model with, say, ten thousand vertices,

170
00:07:58.519 --> 00:08:01.519
<v Speaker 1>this shader runs ten thousand times every single frame.

171
00:08:01.639 --> 00:08:05.560
<v Speaker 2>That's the idea. Its primary job: transforming vertices, usually taking

172
00:08:05.600 --> 00:08:08.839
<v Speaker 2>the vertex position from its local model space, applying the

173
00:08:08.839 --> 00:08:11.600
<v Speaker 2>model, view, and projection matrices.

174
00:08:11.199 --> 00:08:14.279
<v Speaker 1>The classic projection * view * model * vertexPosition formula.

175
00:08:13.959 --> 00:08:16.759
<v Speaker 2>Exactly, to figure out where that vertex ends up

176
00:08:16.759 --> 00:08:18.920
<v Speaker 2>in clip space, the step before screen coordinates.

177
00:08:19.040 --> 00:08:20.120
<v Speaker 1>What does it take as input?

178
00:08:20.360 --> 00:08:23.879
<v Speaker 2>Its main inputs are the vertex attributes, the vertex data

179
00:08:24.000 --> 00:08:27.360
<v Speaker 2>like position, maybe a normal vector, texture coordinates, vertex color.

180
00:08:27.160 --> 00:08:30.680
<v Speaker 1>Defined in your vertex buffer objects on the CPU side?

181
00:08:30.759 --> 00:08:34.519
<v Speaker 2>Yep. And those uniform variables we just discussed, like the transformation matrices themselves.

182
00:08:34.120 --> 00:08:35.720
<v Speaker 1>And its main output?

183
00:08:36.240 --> 00:08:40.240
<v Speaker 2>The one mandatory output is the final transformed vertex position.

184
00:08:40.720 --> 00:08:42.840
<v Speaker 2>You have to write this to the special built in

185
00:08:43.000 --> 00:08:44.240
<v Speaker 2>variable gl_Position.

186
00:08:44.360 --> 00:08:45.879
<v Speaker 1>But it can output other things too.

187
00:08:45.879 --> 00:08:48.679
<v Speaker 2>Yes, and this is super important. It can output other

188
00:08:48.840 --> 00:08:53.440
<v Speaker 2>values using out variables. These become interpolators.

189
00:08:53.519 --> 00:08:54.200
<v Speaker 1>Interpolators? What do they do?

190
00:08:54.720 --> 00:08:58.240
<v Speaker 2>They're values that get, well, interpolated across the surface

191
00:08:58.240 --> 00:09:00.799
<v Speaker 2>of the primitive, like the triangle being drawn, after

192
00:09:00.879 --> 00:09:03.720
<v Speaker 2>the vertex shader runs, but before the fragment shader runs.

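NOTE
Editor's sketch of the vertex shader described above: attributes and uniforms
in, gl_Position out, plus one extra out variable that the hardware will
interpolate for the fragment shader. The attribute locations and the names
position, color, model, view, projection and vertexColor are assumptions.
```glsl
#version 330 core
layout(location = 0) in vec3 position;  // per-vertex attribute from a vertex buffer
layout(location = 1) in vec3 color;     // per-vertex attribute from a vertex buffer
uniform mat4 model;
uniform mat4 view;
uniform mat4 projection;
out vec3 vertexColor;                   // interpolated across the primitive
void main() {
    vertexColor = color;
    // Mandatory output: the vertex position in clip space.
    gl_Position = projection * view * model * vec4(position, 1.0);
}
```
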
193
00:09:03.759 --> 00:09:05.519
<v Speaker 1>Okay, hang on. So if I have a triangle and

194
00:09:05.600 --> 00:09:10.120
<v Speaker 1>each vertex outputs a different color, the fragment shader

195
00:09:10.120 --> 00:09:11.559
<v Speaker 1>doesn't just get one of those colors?

196
00:09:11.720 --> 00:09:14.360
<v Speaker 2>No, it gets a smoothly blended color based on where

197
00:09:14.399 --> 00:09:17.759
<v Speaker 2>the fragment is inside the triangle. It interpolates the colors

198
00:09:17.759 --> 00:09:18.759
<v Speaker 2>from the three vertices.

199
00:09:18.879 --> 00:09:21.840
<v Speaker 1>Ah, okay, that's how you get smooth color gradients

200
00:09:21.840 --> 00:09:25.960
<v Speaker 1>across the surface, or how texture coordinates smoothly map across

201
00:09:25.960 --> 00:09:27.759
<v Speaker 1>a triangle even though you only define them at the corners.

202
00:09:27.720 --> 00:09:31.720
<v Speaker 2>Exactly. Same for normal vectors, which is crucial

203
00:09:31.759 --> 00:09:35.960
<v Speaker 2>for smooth lighting. The book shows examples: just transforming position,

204
00:09:36.320 --> 00:09:39.440
<v Speaker 2>using a matrix uniform to scale or deform the geometry.

205
00:09:39.600 --> 00:09:43.039
<v Speaker 1>Right, it says, see how, just by applying the proper transform

206
00:09:43.080 --> 00:09:45.919
<v Speaker 1>matrix, we get the desired deformation.

207
00:09:45.639 --> 00:09:49.080
<v Speaker 2>And examples passing color or texture coordinates through using those

208
00:09:49.080 --> 00:09:52.559
<v Speaker 2>out variables, which become in variables in the next stage.

209
00:09:52.600 --> 00:09:58.159
<v Speaker 1>It also sets up lighting theory, mentions the Phong model: ambient, diffuse, specular light.

210
00:09:57.759 --> 00:10:00.600
<v Speaker 2>And the key vectors needed: the surface normal, the

211
00:10:00.679 --> 00:10:03.000
<v Speaker 2>light direction, the view direction. It shows how you could

212
00:10:03.039 --> 00:10:07.080
<v Speaker 2>calculate lighting here at each vertex that's per vertex lighting

213
00:10:07.600 --> 00:10:08.639
<v Speaker 2>or Gouraud shading.

214
00:10:09.000 --> 00:10:12.000
<v Speaker 1>So the vertex shader positions the geometry and can set

215
00:10:12.039 --> 00:10:16.559
<v Speaker 1>up data like color or normals for interpolation. What happens next?

216
00:10:16.440 --> 00:10:20.080
<v Speaker 2>The data, including those interpolated values, flows to the fragment

217
00:10:20.120 --> 00:10:22.519
<v Speaker 2>shader, sometimes called the pixel shader.

218
00:10:22.240 --> 00:10:25.759
<v Speaker 1>And its job is basically coloring things in?

219
00:10:25.519 --> 00:10:28.679
<v Speaker 2>Pretty much. It's a per-fragment operation. After the hardware figures

220
00:10:28.679 --> 00:10:31.279
<v Speaker 2>out which pixels are covered by a triangle, this shader

221
00:10:31.440 --> 00:10:34.080
<v Speaker 2>runs for each of those potential pixels or fragments to

222
00:10:34.120 --> 00:10:35.240
<v Speaker 2>determine its final color.

223
00:10:35.440 --> 00:10:37.399
<v Speaker 1>And the scale here is even bigger, right? The book

224
00:10:37.440 --> 00:10:40.559
<v Speaker 1>warns it can run millions of times per frame.

225
00:10:40.960 --> 00:10:44.080
<v Speaker 2>Easily. Think about a high-resolution screen. That's why optimizing fragment

226
00:10:44.120 --> 00:10:46.600
<v Speaker 2>shaders is often more critical than optimizing vertex shaders.

227
00:10:47.000 --> 00:10:49.279
<v Speaker 2>Performance here directly hits your fill rate.

228
00:10:49.720 --> 00:10:50.679
<v Speaker 1>What are its inputs?

229
00:10:50.759 --> 00:10:54.159
<v Speaker 2>It gets uniform variables from the application, including those samplers

230
00:10:54.159 --> 00:10:58.120
<v Speaker 2>for accessing textures using built-in functions like texture().

231
00:10:57.720 --> 00:11:00.720
<v Speaker 1>And critically, it gets the interpolated data from the vertex

232
00:11:00.759 --> 00:11:03.120
<v Speaker 1>shader via in variables.

233
00:11:02.600 --> 00:11:07.240
<v Speaker 2>Right, the interpolated texture coordinates, interpolated vertex colors, maybe that interpolated normal vector.

234
00:11:06.720 --> 00:11:08.320
<v Speaker 1>And its output?

235
00:11:08.559 --> 00:11:11.039
<v Speaker 2>The main output is the final color for that fragment,

236
00:11:11.200 --> 00:11:14.080
<v Speaker 2>usually an RGBA vec4 written to an out variable

237
00:11:14.200 --> 00:11:17.840
<v Speaker 2>that's linked to the framebuffer, like out vec4 frameBufferColor.

238
00:11:18.440 --> 00:11:21.799
<v Speaker 2>You can also optionally write depth to gl_FragDepth.

239
00:11:22.240 --> 00:11:25.440
<v Speaker 2>But there are limits. The book notes a fragment shader

240
00:11:25.519 --> 00:11:28.919
<v Speaker 2>generally can't read other pixel colors from the screen it's

241
00:11:28.960 --> 00:11:29.679
<v Speaker 2>currently drawing to.

242
00:11:29.960 --> 00:11:33.720
<v Speaker 1>Right, that's why complex effects sometimes need multiple rendering passes.

243
00:11:34.279 --> 00:11:37.320
<v Speaker 1>Render something to a texture first, then use that texture

244
00:11:37.360 --> 00:11:38.200
<v Speaker 1>in a later pass.

245
00:11:38.480 --> 00:11:41.519
<v Speaker 2>And there's the discard keyword. Very useful.

246
00:11:41.559 --> 00:11:42.080
<v Speaker 1>What does that do?

247
00:11:42.279 --> 00:11:45.320
<v Speaker 2>It just tells the GPU to stop processing this specific

248
00:11:45.399 --> 00:11:48.360
<v Speaker 2>fragment completely. The framebuffer won't be updated at all

249
00:11:48.399 --> 00:11:50.600
<v Speaker 2>for that pixel. It's how you make parts of a

250
00:11:50.639 --> 00:11:54.399
<v Speaker 2>triangle transparent or create cutout effects based on a texture's

251
00:11:54.480 --> 00:11:55.759
<v Speaker 2>alpha channel for instance.

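NOTE
Editor's sketch of the discard-based cutout described above. The 0.5 alpha
threshold and the names diffuseMap, texCoord and frameBufferColor are
assumptions for illustration.
```glsl
#version 330 core
uniform sampler2D diffuseMap;
in vec2 texCoord;               // interpolated from the vertex shader
out vec4 frameBufferColor;
void main() {
    vec4 texel = texture(diffuseMap, texCoord);
    if (texel.a < 0.5) {
        discard;                // this fragment never reaches the framebuffer
    }
    frameBufferColor = texel;
}
```
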
252
00:11:55.840 --> 00:11:59.799
<v Speaker 1>Okay, the examples show this progression well: a simple solid

253
00:11:59.799 --> 00:12:02.320
<v Speaker 1>color mesh using a uniform.

254
00:12:02.120 --> 00:12:06.039
<v Speaker 2>Then using interpolated vertex colors for a smooth gradient.

255
00:12:05.799 --> 00:12:08.960
<v Speaker 1>Then using interpolated texture coordinates to look up color from

256
00:12:09.000 --> 00:12:09.919
<v Speaker 1>a texture map.

257
00:12:09.559 --> 00:12:13.000
<v Speaker 2>And then the big one: Phong lighting per pixel.

258
00:12:13.240 --> 00:12:16.120
<v Speaker 1>Ah. So instead of calculating lighting at the vertices and

259
00:12:16.200 --> 00:12:17.960
<v Speaker 1>interpolating the resulting color.

260
00:12:17.759 --> 00:12:20.919
<v Speaker 2>You interpolate the data needed for lighting, like the normal vector,

261
00:12:21.039 --> 00:12:23.919
<v Speaker 2>maybe the position, and then you perform the full lighting

262
00:12:23.960 --> 00:12:27.919
<v Speaker 2>calculation inside the fragment shader for every single fragment.

263
00:12:27.639 --> 00:12:30.799
<v Speaker 1>Which usually looks much better, right, especially for specular highlights?

264
00:12:30.399 --> 00:12:34.799
<v Speaker 2>Way better. The book says, notice the specular reflections

265
00:12:35.080 --> 00:12:38.559
<v Speaker 2>and the smooth darkening. You get much more accurate results

266
00:12:38.559 --> 00:12:41.759
<v Speaker 2>because you're calculating with more precise per fragment data.

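NOTE
Editor's sketch of per-fragment Phong lighting as described above, assuming a
single point light, world-space inputs, and a shininess uniform greater than
zero. All uniform and variable names and the 0.1 ambient factor are
hypothetical; the book's own listing may differ.
```glsl
#version 330 core
uniform vec3 lightPosition;
uniform vec3 cameraPosition;
uniform vec3 baseColor;
uniform float shininess;        // assumed to be greater than zero
in vec3 worldPosition;          // interpolated from the vertex shader
in vec3 worldNormal;            // interpolated from the vertex shader
out vec4 fragColor;
void main() {
    vec3 n = normalize(worldNormal);  // interpolated normals are not unit length
    vec3 l = normalize(lightPosition - worldPosition);
    vec3 v = normalize(cameraPosition - worldPosition);
    vec3 r = reflect(-l, n);          // reflect() expects the incident direction
    float diffuse = max(dot(n, l), 0.0);
    float specular = diffuse > 0.0 ? pow(max(dot(r, v), 0.0), shininess) : 0.0;
    vec3 ambient = 0.1 * baseColor;
    fragColor = vec4(ambient + diffuse * baseColor + specular * vec3(1.0), 1.0);
}
```
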
267
00:12:42.080 --> 00:12:46.159
<v Speaker 1>And that final example combining texture color per pixel lighting

268
00:12:46.679 --> 00:12:50.279
<v Speaker 1>and maybe using the texture's alpha channel to mask the specularity.

269
00:12:51.039 --> 00:12:53.000
<v Speaker 1>That really shows how you layer things up for realism.

270
00:12:52.840 --> 00:12:55.720
<v Speaker 2>Exactly. It's where all the pieces come

271
00:12:55.759 --> 00:12:58.120
<v Speaker 2>together to decide that final pixel color you see.

272
00:12:58.159 --> 00:13:01.000
<v Speaker 1>Okay, vertex shader shapes it, fragment shader colors it.

273
00:13:01.240 --> 00:13:04.759
<v Speaker 1>What's next? The book mentions a geometry shader.

274
00:13:05.279 --> 00:13:08.559
<v Speaker 2>Right. This one sits after the vertex shader, but before clipping

275
00:13:08.559 --> 00:13:12.159
<v Speaker 2>and rasterization. Its main purpose is pretty wild. It can

276
00:13:12.200 --> 00:13:14.240
<v Speaker 2>actually create new primitives.

277
00:13:14.000 --> 00:13:17.120
<v Speaker 1>New geometry on the fly? So it doesn't just modify vertices,

278
00:13:17.480 --> 00:13:19.279
<v Speaker 1>it can make more?

279
00:13:19.360 --> 00:13:22.799
<v Speaker 2>Exactly. That's the key difference. A vertex shader processes one vertex

280
00:13:22.840 --> 00:13:26.320
<v Speaker 2>at a time. A geometry shader receives an entire input

281
00:13:26.399 --> 00:13:29.519
<v Speaker 2>primitive like a point, a line, or a full triangle.

282
00:13:29.720 --> 00:13:32.960
<v Speaker 1>Even if you send a triangle strip, it gets individual triangles?

283
00:13:33.000 --> 00:13:36.159
<v Speaker 2>Yep. The hardware breaks down strips and fans into individual primitives

284
00:13:36.200 --> 00:13:38.679
<v Speaker 2>before they hit the geometry shader, so it gets the

285
00:13:38.679 --> 00:13:42.879
<v Speaker 2>whole primitive, sometimes even info about adjacent primitives. And then, crucially,

286
00:13:42.919 --> 00:13:46.279
<v Speaker 2>it can produce new vertices and emit new primitives using

287
00:13:46.320 --> 00:13:49.799
<v Speaker 2>commands like EmitVertex() and EndPrimitive().

288
00:13:50.159 --> 00:13:52.480
<v Speaker 1>Wow. So you could feed it one point and have it

289
00:13:52.559 --> 00:13:56.440
<v Speaker 1>output, say, a quad or even a more complex shape?

290
00:13:56.480 --> 00:14:00.399
<v Speaker 2>Precisely. That's its power. It can amplify or change the

291
00:14:00.440 --> 00:14:01.759
<v Speaker 2>geometry dramatically.

292
00:14:01.919 --> 00:14:04.279
<v Speaker 1>That sounds computationally expensive, though.

293
00:14:04.480 --> 00:14:07.080
<v Speaker 2>It can be, yeah. You have to be mindful of how much geometry

294
00:14:07.080 --> 00:14:11.120
<v Speaker 2>you're generating. Some other details: the input primitive type and

295
00:14:11.200 --> 00:14:14.440
<v Speaker 2>the output primitive type don't have to match. You tell

296
00:14:14.480 --> 00:14:17.840
<v Speaker 2>the shader what kind of primitive to expect, layout(points)

297
00:14:17.879 --> 00:14:21.039
<v Speaker 2>in, and what kind it will output, layout(triangle_strip,

298
00:14:21.039 --> 00:14:24.679
<v Speaker 2>max_vertices = 4) out, including the maximum number of vertices

299
00:14:24.720 --> 00:14:25.200
<v Speaker 2>it might output.

300
00:14:25.080 --> 00:14:27.480
<v Speaker 1>And how does data get passed to and from it?

301
00:14:27.559 --> 00:14:28.759
<v Speaker 1>You mentioned interface blocks.

302
00:14:28.919 --> 00:14:32.720
<v Speaker 2>Yeah, for geometry shaders and later stages, simple in/out variables

303
00:14:32.720 --> 00:14:36.000
<v Speaker 2>aren't enough. Interface blocks are like named structs that group

304
00:14:36.120 --> 00:14:38.879
<v Speaker 2>variables together for input and output. The input from the

305
00:14:38.960 --> 00:14:41.440
<v Speaker 2>vertex shader comes in a built-in block called gl_in,

306
00:14:41.519 --> 00:14:43.919
<v Speaker 2>which is an array holding the data for each vertex

307
00:14:43.919 --> 00:14:44.919
<v Speaker 2>of the input primitive.

308
00:14:45.000 --> 00:14:47.519
<v Speaker 1>Okay, the book has a simple pass-through example first.

309
00:14:47.559 --> 00:14:50.879
<v Speaker 2>Right, it just takes the input primitive, loops through

310
00:14:50.879 --> 00:14:54.399
<v Speaker 2>its vertices in gl_in, calls EmitVertex() for each one,

311
00:14:54.480 --> 00:14:57.720
<v Speaker 2>and then EndPrimitive() to basically output the exact same

312
00:14:57.759 --> 00:14:59.639
<v Speaker 2>primitive. Just shows the basic structure.

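NOTE
Editor's sketch of a pass-through geometry shader along the lines described
above, assuming triangle input and forwarding only the position. The book's
own listing may differ in detail.
```glsl
#version 330 core
layout(triangles) in;                          // one whole triangle per invocation
layout(triangle_strip, max_vertices = 3) out;  // emits at most three vertices
void main() {
    // gl_in is the built-in input array: one entry per vertex of the primitive.
    for (int i = 0; i < gl_in.length(); i++) {
        gl_Position = gl_in[i].gl_Position;
        EmitVertex();
    }
    EndPrimitive();
}
```
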
313
00:15:00.000 --> 00:15:03.679
<v Speaker 1>The cool example is the crowd of butterflies. That sounds

314
00:15:03.720 --> 00:15:05.879
<v Speaker 1>like a classic geometry shader use case.

315
00:15:06.000 --> 00:15:10.320
<v Speaker 2>It really is. Imagine you want to draw thousands of butterflies.

316
00:15:10.840 --> 00:15:14.000
<v Speaker 2>Sending the full mesh for every butterfly from the CPU

317
00:15:14.080 --> 00:15:16.559
<v Speaker 2>would be a lot of data. You just send points.

318
00:15:16.759 --> 00:15:21.000
<v Speaker 2>Each point represents a butterfly's position. The geometry shader receives

319
00:15:21.000 --> 00:15:22.320
<v Speaker 2>a point primitive.

320
00:15:21.919 --> 00:15:24.320
<v Speaker 1>And then it generates the butterfly mash exactly.

321
00:15:24.720 --> 00:15:27.919
<v Speaker 2>Inside the shad you calculate the vertex positions needed to

322
00:15:28.000 --> 00:15:31.320
<v Speaker 2>draw, say, two triangle strips for the wings, maybe apply

323
00:15:31.399 --> 00:15:35.120
<v Speaker 2>some rotation or flapping animation based on time passed as a uniform,

324
00:15:35.320 --> 00:15:38.120
<v Speaker 2>and then you call EmitVertex for all those calculated wing

325
00:15:38.200 --> 00:15:41.639
<v Speaker 2>vertices and EndPrimitive twice, once for each wing strip.
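
NOTE
Editor's sketch, not taken from the book: a minimal CPU-side Python emulation of the point-to-wings expansion just described, assuming each wing is a 4-vertex triangle strip. The names butterfly_strips, size and flap_speed, and the sine-based flapping model, are hypothetical. In a real geometry shader each appended vertex would be an EmitVertex call and each finished strip an EndPrimitive call.
```python
import math
def butterfly_strips(center, time, size=1.0, flap_speed=4.0):
    """Expand one point into two 4-vertex triangle strips (one per wing)."""
    cx, cy, cz = center
    # Hypothetical flapping model: the wing angle follows a sine of time.
    angle = 0.5 * math.pi * abs(math.sin(flap_speed * time))
    strips = []
    for side in (-1.0, 1.0):  # left wing, then right wing
        reach = side * size * math.cos(angle)  # horizontal extent of the wing tip
        lift = size * math.sin(angle)          # vertical lift of the wing tip
        strip = [  # each entry stands in for one EmitVertex call
            (cx, cy, cz - size / 2),
            (cx + reach, cy + lift, cz - size / 2),
            (cx, cy, cz + size / 2),
            (cx + reach, cy + lift, cz + size / 2),
        ]
        strips.append(strip)  # stands in for EndPrimitive
    return strips
```
Calling butterfly_strips((0, 0, 0), 0.0) returns two strips of four vertices each, with the wings lying flat because the flap angle is zero at time zero.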

326
00:15:41.879 --> 00:15:45.399
<v Speaker 1>So the GPU is generating complex geometry from simple point data.

327
00:15:45.559 --> 00:15:49.080
<v Speaker 1>It's really clever, offloading the CPU significantly.

328
00:15:49.320 --> 00:15:51.799
<v Speaker 2>Precisely. It shows how you can use the GPU to offload

329
00:15:51.799 --> 00:15:55.799
<v Speaker 2>the CPU in some rendering tasks, especially for things involving

330
00:15:55.960 --> 00:15:58.960
<v Speaker 2>lots of repeated, procedurally generated geometry.

331
00:15:59.159 --> 00:16:03.559
<v Speaker 1>Amazing. One more shader type mentioned, the compute shader. This

332
00:16:03.600 --> 00:16:06.879
<v Speaker 1>one sounds different, outside the rendering pipeline rules.

333
00:16:07.000 --> 00:16:09.919
<v Speaker 2>Yeah, this one breaks the mold. Compute shaders let you use

334
00:16:09.960 --> 00:16:14.120
<v Speaker 2>the GPU's massive parallel processing power for, well, pretty much

335
00:16:14.159 --> 00:16:20.679
<v Speaker 2>anything: generic computations. This is GPGPU, general-purpose computation on GPUs.

336
00:16:20.320 --> 00:16:23.679
<v Speaker 1>So not tied to drawing triangles or pixels, just raw

337
00:16:23.720 --> 00:16:25.120
<v Speaker 1>computation.

338
00:16:25.200 --> 00:16:27.799
<v Speaker 2>Exactly. The execution model is different. You don't think in terms

339
00:16:27.840 --> 00:16:30.919
<v Speaker 2>of vertices or fragments. You dispatch work in work groups,

340
00:16:30.919 --> 00:16:34.080
<v Speaker 2>and each group contains many parallel work items, or threads.

341
00:16:34.120 --> 00:16:35.399
<v Speaker 1>How do they know what data to work on?

342
00:16:35.519 --> 00:16:38.799
<v Speaker 2>They get built-in IDs like gl_GlobalInvocationID,

343
00:16:38.960 --> 00:16:41.720
<v Speaker 2>which tells each thread its unique index within the overall

344
00:16:41.759 --> 00:16:44.759
<v Speaker 2>computation grid. You use these IDs to figure out which

345
00:16:44.759 --> 00:16:47.240
<v Speaker 2>piece of data to read or write. Work items within

346
00:16:47.279 --> 00:16:50.720
<v Speaker 2>a group can communicate and synchronize using shared local memory.
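
NOTE
Editor's sketch, not taken from the book: how the global ID mentioned above relates to the other built-in IDs. Per axis, gl_GlobalInvocationID equals gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID. The Python function name and the 16x16 group size below are illustrative assumptions.
```python
def global_invocation_id(work_group_id, work_group_size, local_invocation_id):
    """Per axis: gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID."""
    return tuple(g * s + l for g, s, l in zip(work_group_id, work_group_size, local_invocation_id))
# Hypothetical dispatch: 16x16 threads per group, each thread addressing one pixel.
pixel = global_invocation_id((2, 1, 0), (16, 16, 1), (3, 5, 0))
print(pixel)  # (35, 21, 0)
```
Thread (3, 5) of group (2, 1) therefore reads or writes pixel (35, 21), which is how each work item finds its own piece of data.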

347
00:16:50.840 --> 00:16:54.120
<v Speaker 1>What's the advantage of using GLSL compute shaders over something

348
00:16:54.159 --> 00:16:56.320
<v Speaker 1>like CUDA or OpenCL?

349
00:16:56.639 --> 00:17:00.200
<v Speaker 2>The big plus is integration. Because it's part of OpenGL,

350
00:17:00.279 --> 00:17:05.240
<v Speaker 2>a compute shader has seamless access to all your OpenGL resources, textures, buffers,

351
00:17:05.319 --> 00:17:09.880
<v Speaker 2>using familiar GLSL types and functions. It's really powerful for, say,

352
00:17:10.400 --> 00:17:14.400
<v Speaker 2>running a physics simulation on the GPU and then directly

353
00:17:14.519 --> 00:17:18.400
<v Speaker 2>using the results to render geometry or doing complex image

354
00:17:18.400 --> 00:17:19.400
<v Speaker 2>processing effects.

355
00:17:19.680 --> 00:17:22.279
<v Speaker 1>The book mentions it's asynchronous, though. Like, you dispatch

356
00:17:22.359 --> 00:17:25.960
<v Speaker 1>the compute job with glDispatchCompute and the CPU code

357
00:17:26.000 --> 00:17:27.240
<v Speaker 1>continues immediately.

358
00:17:27.319 --> 00:17:31.559
<v Speaker 2>That's right, which means synchronization is absolutely critical. You need

359
00:17:31.599 --> 00:17:33.839
<v Speaker 2>to make sure the compute shader has finished writing its

360
00:17:33.839 --> 00:17:37.279
<v Speaker 2>results before another shader or the CPU tries to read them.

361
00:17:37.319 --> 00:17:40.119
<v Speaker 2>How do you do that? Using functions like glMemoryBarrier.

362
00:17:40.279 --> 00:17:42.880
<v Speaker 2>It tells the GPU to ensure certain types of memory

363
00:17:42.880 --> 00:17:47.400
<v Speaker 2>operations are complete before proceeding. Getting synchronization right is key

364
00:17:47.519 --> 00:17:48.880
<v Speaker 2>for correctness and performance.

365
00:17:49.119 --> 00:17:51.599
<v Speaker 1>What kind of examples did the book show for compute?

366
00:17:51.759 --> 00:17:54.559
<v Speaker 2>One was basically rendering to a texture, using each work

367
00:17:54.599 --> 00:17:57.240
<v Speaker 2>item to calculate the color for a specific pixel in

368
00:17:57.279 --> 00:18:01.839
<v Speaker 2>an output image buffer. It could be an image filter, procedural generation, whatever.

369
00:18:01.920 --> 00:18:05.279
<v Speaker 2>And the other was pure number crunching, taking two big

370
00:18:05.359 --> 00:18:09.599
<v Speaker 2>arrays of numbers, having each work item add corresponding elements together,

371
00:18:09.880 --> 00:18:11.960
<v Speaker 2>and writing the result to a third array.
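
NOTE
Editor's sketch, not taken from the book: a sequential Python emulation of the array-addition example, assuming a 1D dispatch with one work item per element. The name dispatch_add and the default group size of 64 are hypothetical. On a GPU the two loops would run in parallel rather than one iteration after another.
```python
def dispatch_add(a, b, local_size=64):
    """Emulate a 1D compute dispatch where each work item adds one element pair."""
    n = len(a)
    out = [0.0] * n
    num_groups = (n + local_size - 1) // local_size  # round up to cover every element
    for group in range(num_groups):          # on a GPU, groups run in parallel
        for local in range(local_size):      # as do the work items in each group
            i = group * local_size + local   # this item's global index
            if i < n:                        # guard: the last group may overshoot
                out[i] = a[i] + b[i]
    return out
```
The bounds check matters because the number of work groups is rounded up, so the last group can contain work items with no element to process.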

372
00:18:12.519 --> 00:18:15.920
<v Speaker 1>Zero graphics involved, just using the GPU as a massively

373
00:18:15.960 --> 00:18:18.359
<v Speaker 1>parallel math coprocessor.

374
00:18:17.759 --> 00:18:20.720
<v Speaker 2>Exactly, and the book highlights that for tasks like that,

375
00:18:20.799 --> 00:18:24.640
<v Speaker 2>you can potentially make mathematical algorithms two orders of magnitude

376
00:18:24.920 --> 00:18:27.000
<v Speaker 2>faster than doing it simply on the CPU.

377
00:18:27.279 --> 00:18:30.799
<v Speaker 1>Wow, one hundred times faster, just by running it

378
00:18:30.839 --> 00:18:32.720
<v Speaker 1>on the GPU using a compute shader.

379
00:18:32.799 --> 00:18:34.279
<v Speaker 2>That's the promise of GPGPU.

380
00:18:34.400 --> 00:18:37.119
<v Speaker 1>Yeah. Okay, so wrapping this all up, we've journeyed through

381
00:18:37.119 --> 00:18:41.759
<v Speaker 1>this programmable pipeline: vertex shaders transforming points.

382
00:18:41.440 --> 00:18:43.759
<v Speaker 2>Fragment shaders, coloring pixels.

383
00:18:43.319 --> 00:18:45.640
<v Speaker 1>Geometry shaders creating new shapes on the fly.

384
00:18:45.680 --> 00:18:49.359
<v Speaker 2>And compute shaders breaking free for general purpose parallel tasks.

385
00:18:49.519 --> 00:18:52.279
<v Speaker 1>It really feels like understanding these stages and how to

386
00:18:52.319 --> 00:18:55.400
<v Speaker 1>program them with GLSL is like getting the keys to

387
00:18:55.440 --> 00:18:57.119
<v Speaker 1>the kingdom for modern graphics.

388
00:18:57.240 --> 00:19:02.000
<v Speaker 2>It absolutely is. You're directly instructing these incredibly powerful, specialized processors.

389
00:19:02.039 --> 00:19:04.759
<v Speaker 2>You're tapping into performance for graphics and computation that's just

390
00:19:04.920 --> 00:19:07.039
<v Speaker 2>impossible on a CPU alone.

391
00:19:07.079 --> 00:19:10.440
<v Speaker 1>And for you listening, this tech is behind almost everything

392
00:19:10.640 --> 00:19:15.440
<v Speaker 1>visually impressive you see today, games, simulations, visual effects.

393
00:19:15.599 --> 00:19:19.200
<v Speaker 2>It's the foundation. Knowing how these shaders work gives you

394
00:19:19.240 --> 00:19:22.920
<v Speaker 2>the power to understand and eventually create those effects yourself.

395
00:19:23.279 --> 00:19:26.000
<v Speaker 1>So thinking about the big picture for you based on

396
00:19:26.039 --> 00:19:27.680
<v Speaker 1>this deep dive, what's the takeaway?

397
00:19:27.960 --> 00:19:30.359
<v Speaker 2>The takeaway is that the GPU isn't just a dumb

398
00:19:30.359 --> 00:19:36.799
<v Speaker 2>display device anymore. It's a programmable powerhouse. Vertex, fragment, geometry, compute.

399
00:19:37.160 --> 00:19:40.880
<v Speaker 2>Each offers a unique way to leverage that power.

400
00:19:40.680 --> 00:19:43.359
<v Speaker 1>Which leads to that final, really provocative thought from the book's conclusion.

401
00:19:43.359 --> 00:19:46.319
<v Speaker 1>It basically says, with the mechanisms that this book provides,

402
00:19:46.599 --> 00:19:48.480
<v Speaker 1>any visual effect that you may have seen in a

403
00:19:48.559 --> 00:19:51.559
<v Speaker 1>video game or even in a CG movie can be achieved.

404
00:19:51.839 --> 00:19:55.960
<v Speaker 2>That's a huge claim, but you know, fundamentally, the potential

405
00:19:56.039 --> 00:19:59.200
<v Speaker 2>is there. The complex, beautiful visuals we see, they're built

406
00:19:59.200 --> 00:20:02.920
<v Speaker 2>by cleverly combining and programming these different shader stages.

407
00:20:03.079 --> 00:20:06.119
<v Speaker 1>So this knowledge is your starting point, that power, that

408
00:20:06.160 --> 00:20:08.559
<v Speaker 1>potential the book talks about. It's there for you to

409
00:20:08.640 --> 00:20:10.359
<v Speaker 1>explore and build upon. Go experiment,
