WEBVTT

1
00:00:00.080 --> 00:00:01.919
<v Speaker 1>You know, I was looking at some Python code the

2
00:00:01.919 --> 00:00:05.360
<v Speaker 1>other day, just a simple script really to sort a

3
00:00:05.400 --> 00:00:08.679
<v Speaker 1>list of names, and it just struck me how much

4
00:00:08.960 --> 00:00:13.160
<v Speaker 1>magic is happening there. I type sort and the world

5
00:00:13.240 --> 00:00:18.120
<v Speaker 1>just arranges itself. It feels seamless, it feels, well, free.

6
00:00:18.239 --> 00:00:20.399
<v Speaker 2>Free is a very dangerous word in computing.

7
00:00:20.239 --> 00:00:23.559
<v Speaker 1>Exactly, and that's the fundamental disconnect we're really tackling today.

8
00:00:23.640 --> 00:00:27.160
<v Speaker 1>We live in this golden era of high level abstraction,

9
00:00:27.440 --> 00:00:31.640
<v Speaker 1>you know, Java, Swift, Python, Ruby, where we are deliberately

10
00:00:31.679 --> 00:00:35.159
<v Speaker 1>shielded from the machine. It's comfortable, it is comfortable, but

11
00:00:35.280 --> 00:00:38.039
<v Speaker 1>the source material we're covering today argues that this shield

12
00:00:38.119 --> 00:00:39.640
<v Speaker 1>is actually a blindfold.

13
00:00:39.799 --> 00:00:44.119
<v Speaker 2>It's a provocative stance. We're diving into Write Great Code,

14
00:00:44.399 --> 00:00:48.119
<v Speaker 2>Volume 1: Understanding the Machine, by Randall Hyde, and his

15
00:00:48.200 --> 00:00:51.039
<v Speaker 2>premise is, well, it's uncomfortable for a lot of modern developers.

16
00:00:51.399 --> 00:00:53.960
<v Speaker 2>He essentially says, if you don't know what the hardware

17
00:00:54.039 --> 00:00:57.119
<v Speaker 2>is doing with your variables, you aren't writing great code.

18
00:00:57.159 --> 00:00:59.119
<v Speaker 2>You're just writing code that works by accident.

19
00:00:59.359 --> 00:01:02.960
<v Speaker 1>Code that works by accident. That stings a little.

20
00:01:03.119 --> 00:01:05.560
<v Speaker 2>It should. And look, Hyde isn't saying we need to

21
00:01:05.599 --> 00:01:08.239
<v Speaker 2>go back to writing assembly for everything. He's not a masochist,

22
00:01:09.319 --> 00:01:13.159
<v Speaker 2>but he is saying that if you want performance, if

23
00:01:13.159 --> 00:01:16.280
<v Speaker 2>you want that top tier efficiency, you have to be

24
00:01:16.319 --> 00:01:20.519
<v Speaker 2>able to mentally map your high level syntax down to

25
00:01:20.560 --> 00:01:21.760
<v Speaker 2>the low level reality.

26
00:01:22.359 --> 00:01:25.079
<v Speaker 1>So our mission for this deep dive is to well,

27
00:01:25.079 --> 00:01:27.040
<v Speaker 1>it's to tear down that abstraction layer. We're going to

28
00:01:27.079 --> 00:01:29.760
<v Speaker 1>look at how the machine actually thinks, not how we

29
00:01:29.799 --> 00:01:32.599
<v Speaker 1>want it to think. Exactly. And it starts with something

30
00:01:32.640 --> 00:01:36.319
<v Speaker 1>that seems philosophical but is actually purely mechanical. The difference

31
00:01:36.319 --> 00:01:37.840
<v Speaker 1>between a number and a representation.

32
00:01:38.079 --> 00:01:40.959
<v Speaker 2>Right, this is Chapter one stuff, but it trips people

33
00:01:41.040 --> 00:01:43.280
<v Speaker 2>up all the time. In the human world, if I

34
00:01:43.319 --> 00:01:45.319
<v Speaker 2>write the number one hundred on a piece of paper,

35
00:01:45.519 --> 00:01:48.439
<v Speaker 2>the meaning is fixed. It's one hundred items.

36
00:01:48.000 --> 00:01:50.439
<v Speaker 1>One hundred apples, one hundred dollars, simple.

37
00:01:50.159 --> 00:01:53.400
<v Speaker 2>Exactly, It's an abstract quantity. But inside a machine there

38
00:01:53.400 --> 00:01:57.079
<v Speaker 2>are no abstract quantities. There are only representations. Hyde uses this

39
00:01:57.120 --> 00:02:00.439
<v Speaker 2>great example. If you see the symbols one, zero, zero

40
00:02:01.040 --> 00:02:04.280
<v Speaker 2>inside a computer's memory, what quantity is that?

41
00:02:04.560 --> 00:02:06.680
<v Speaker 1>My first instinct is, well, it's one hundred.

42
00:02:06.640 --> 00:02:09.840
<v Speaker 2>And that's your decimal bias showing. What if I tell

43
00:02:09.879 --> 00:02:11.120
<v Speaker 2>you that representation is

44
00:02:11.039 --> 00:02:14.560
<v Speaker 1>in binary? Binary one zero zero. Okay, that's the quantity

45
00:02:14.599 --> 00:02:16.479
<v Speaker 1>four. Precisely. And if I tell you it's

46
00:02:16.360 --> 00:02:19.039
<v Speaker 2>hexadecimal? Hexadecimal, then it's two hundred and fifty-

47
00:02:18.840 --> 00:02:23.039
<v Speaker 1>six. Same symbols, three totally different values. See, the machine

48
00:02:23.080 --> 00:02:25.319
<v Speaker 1>doesn't care about the ink on the page. It cares

49
00:02:25.319 --> 00:02:27.439
<v Speaker 1>about the interpretation of the bit pattern.
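
NOTE
A minimal sketch in Python (not taken from the episode or the book) of the
point above: the same three symbols stand for different quantities depending
on the base used to interpret them. The variable names are illustrative.
```python
# The same three symbols, read under three different bases.
symbols = "100"
as_binary = int(symbols, 2)     # interpreted as base 2
as_decimal = int(symbols, 10)   # interpreted as base 10
as_hex = int(symbols, 16)       # interpreted as base 16
print(as_binary, as_decimal, as_hex)  # 4 100 256
```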

50
00:02:27.520 --> 00:02:31.080
<v Speaker 2>And that matters because the algorithms depend on that specific

51
00:02:31.199 --> 00:02:33.319
<v Speaker 2>internal structure to work efficiently.

52
00:02:33.560 --> 00:02:36.960
<v Speaker 1>Yes, if you treat everything as abstract math, you miss

53
00:02:37.000 --> 00:02:39.560
<v Speaker 1>out on all the shortcuts the hardware offers.

54
00:02:39.479 --> 00:02:42.080
<v Speaker 2>And the hardware is rigid. I mean, we have to

55
00:02:42.080 --> 00:02:45.000
<v Speaker 2>talk about the physical reality here. Why are we stuck

56
00:02:45.000 --> 00:02:47.639
<v Speaker 2>with binary? Why zeros and ones? Why not you know,

57
00:02:47.960 --> 00:02:48.719
<v Speaker 2>zero through nine?

58
00:02:48.840 --> 00:02:51.319
<v Speaker 1>It just comes down to reliability. At the hardware level,

59
00:02:51.479 --> 00:02:56.199
<v Speaker 1>you're dealing with electricity, voltage, and it is incredibly difficult

60
00:02:56.280 --> 00:03:00.479
<v Speaker 1>to build a circuit that can reliably distinguish between ten

61
00:03:00.520 --> 00:03:01.919
<v Speaker 1>different voltage levels like.

62
00:03:01.840 --> 00:03:03.960
<v Speaker 2>point one volts, point two volts, point three.

63
00:03:04.120 --> 00:03:06.439
<v Speaker 1>Yeah, and do that billions of times a second without

64
00:03:06.439 --> 00:03:07.520
<v Speaker 1>making a single mistake.

65
00:03:07.599 --> 00:03:09.800
<v Speaker 2>Too much noise on the line, way too much noise.

66
00:03:10.159 --> 00:03:14.240
<v Speaker 2>But it's very very easy to distinguish between high voltage

67
00:03:14.479 --> 00:03:17.080
<v Speaker 2>and low voltage, on and off, saturation and cutoff.

68
00:03:17.159 --> 00:03:17.840
<v Speaker 1>That's binary.

69
00:03:17.960 --> 00:03:20.680
<v Speaker 2>That's binary. It's the only way to build reliable circuits

70
00:03:20.719 --> 00:03:22.199
<v Speaker 2>at the scale we operate on today.

71
00:03:22.360 --> 00:03:25.120
<v Speaker 1>Okay, so we're stuck with binary because of physics. But

72
00:03:25.840 --> 00:03:28.560
<v Speaker 1>and I think I speak for all humans here, binary

73
00:03:28.639 --> 00:03:31.039
<v Speaker 1>is just terrible to read. If I have to debug

74
00:03:31.039 --> 00:03:33.759
<v Speaker 1>a memory dump and it's just pages of one one zero zero,

75
00:03:33.840 --> 00:03:35.680
<v Speaker 1>one zero one zero, I'm going to quit

76
00:03:35.520 --> 00:03:38.360
<v Speaker 2>my job. Which is exactly why we have hexadecimal. A

77
00:03:38.400 --> 00:03:40.800
<v Speaker 2>lot of new programmers think hex is just some kind

78
00:03:40.840 --> 00:03:44.280
<v Speaker 2>of computer nerd numbers, but it serves a very specific

79
00:03:44.360 --> 00:03:48.000
<v Speaker 2>structural purpose. It bridges the gap between our brains and

80
00:03:48.039 --> 00:03:51.080
<v Speaker 2>the binary circuits. How so? It's all about the nibble.

81
00:03:51.400 --> 00:03:53.520
<v Speaker 2>A nibble is a group of four bits. If you

82
00:03:53.520 --> 00:03:55.960
<v Speaker 2>look at all the possible combinations of four bits from

83
00:03:56.039 --> 00:03:59.159
<v Speaker 2>0000 to 1111, how many possibilities is that?

84
00:03:59.159 --> 00:04:00.240
<v Speaker 1>That would be sixteen.

85
00:04:00.439 --> 00:04:03.879
<v Speaker 2>Sixteen possibilities, and hexadecimal is base sixteen. It has digits

86
00:04:03.960 --> 00:04:07.280
<v Speaker 2>zero through nine and then A through F. That's sixteen digits.

87
00:04:07.599 --> 00:04:11.400
<v Speaker 2>So one single hex digit represents exactly four bits of binary.

88
00:04:11.479 --> 00:04:13.599
<v Speaker 1>Ah, so it's a perfect one to one mapping, a

89
00:04:13.599 --> 00:04:16.240
<v Speaker 1>perfect mapping. So it's not just a random choice. It's

90
00:04:16.279 --> 00:04:20.000
<v Speaker 1>more like a compression algorithm for our eyes. Instead of

91
00:04:20.000 --> 00:04:21.360
<v Speaker 1>writing one one one one, I can

92
00:04:21.199 --> 00:04:24.879
<v Speaker 2>just write F. Exactly. It lets us chunk binary into

93
00:04:24.920 --> 00:04:27.680
<v Speaker 2>readable pieces. That's why you see it everywhere in low

94
00:04:27.759 --> 00:04:28.480
<v Speaker 2>level debugging.
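
NOTE
A minimal sketch in Python (not from the book) of the nibble-to-hex mapping
described above. The helper name to_hex_by_nibbles is hypothetical, and it
assumes the input length is a multiple of four.
```python
# Each hex digit corresponds to exactly one 4-bit nibble.
for value in range(16):
    print(f"{value:04b} = {value:X}")
def to_hex_by_nibbles(bits: str) -> str:
    """Convert a binary string (length a multiple of 4) one nibble at a time."""
    nibbles = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(f"{int(nibble, 2):X}" for nibble in nibbles)
print(to_hex_by_nibbles("1111"))              # F
print(to_hex_by_nibbles("1100101011111110"))  # CAFE
```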

95
00:04:29.279 --> 00:04:31.839
<v Speaker 1>Now, speaking of debugging and performance, there was a section

96
00:04:31.879 --> 00:04:34.519
<v Speaker 1>in the book that honestly surprised me. We've talked about

97
00:04:34.560 --> 00:04:37.439
<v Speaker 1>how the machine sees numbers, but we often have to

98
00:04:37.480 --> 00:04:40.120
<v Speaker 1>get numbers into the machine from a user. Right, the

99
00:04:40.199 --> 00:04:43.319
<v Speaker 1>user types one, two, three on their keyboard, and Hyde

100
00:04:43.360 --> 00:04:45.800
<v Speaker 1>points out this hidden cost that I think most of

101
00:04:45.879 --> 00:04:47.399
<v Speaker 1>us just ignore.

102
00:04:47.240 --> 00:04:50.920
<v Speaker 2>The I/O conversion cost. Oh yeah, this is a classic hidden bottleneck.

103
00:04:51.040 --> 00:04:54.480
<v Speaker 1>We see a line like cin in C++

104
00:04:54.560 --> 00:04:56.759
<v Speaker 1>or input() in Python, and we think, okay, the user

105
00:04:56.800 --> 00:04:58.680
<v Speaker 1>types a number, the computer gets the number.

106
00:04:58.800 --> 00:05:00.879
<v Speaker 2>That is the illusion. The computer doesn't get a number,

107
00:05:01.000 --> 00:05:03.600
<v Speaker 2>It gets a keystroke. It gets an ASCII character code.

108
00:05:03.680 --> 00:05:05.920
<v Speaker 1>Right, so if I type one, two, three, the computer

109
00:05:06.000 --> 00:05:09.959
<v Speaker 1>receives three separate characters: '1', '2', and '3'.

110
00:05:09.839 --> 00:05:13.199
<v Speaker 2>And converting those characters into a single binary integer that

111
00:05:13.240 --> 00:05:17.879
<v Speaker 2>the CPU can actually do math with that is shockingly expensive.

112
00:05:17.920 --> 00:05:19.519
<v Speaker 1>Walk us through that. Why is it so heavy?

113
00:05:19.639 --> 00:05:22.040
<v Speaker 2>Okay? Well, think about the algorithm. You take the character

114
00:05:22.120 --> 00:05:25.720
<v Speaker 2>'1'. First, you have to subtract the ASCII offset,

115
00:05:26.040 --> 00:05:28.720
<v Speaker 2>usually forty-eight, to get the actual numeric value of one.

116
00:05:29.360 --> 00:05:32.879
<v Speaker 2>Then you look at the next character, '2'. To merge them,

117
00:05:33.040 --> 00:05:35.000
<v Speaker 2>you have to take your current total, which is one,

118
00:05:35.279 --> 00:05:36.439
<v Speaker 2>and multiply it by ten.

119
00:05:36.600 --> 00:05:39.560
<v Speaker 1>And multiplication is not a cheap instruction for the CPU,

120
00:05:39.720 --> 00:05:40.199
<v Speaker 1>not at all.

121
00:05:40.360 --> 00:05:43.040
<v Speaker 2>It's heavy. So you multiply by ten, then you add

122
00:05:43.040 --> 00:05:45.600
<v Speaker 2>the new digit. Now you have twelve. Then you get three,

123
00:05:45.680 --> 00:05:48.360
<v Speaker 2>but you have to multiply that whole previous total by

124
00:05:48.360 --> 00:05:50.040
<v Speaker 2>ten again and add the three.
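
NOTE
A minimal sketch in Python of the digit-by-digit conversion just described:
subtract the ASCII offset, multiply the running total by ten, add the digit.
The function name parse_decimal is hypothetical, and real library routines
also handle signs, whitespace, and overflow, which this sketch does not.
```python
def parse_decimal(text: str) -> int:
    """Convert a string of ASCII digits to an int, one character at a time."""
    total = 0
    for ch in text:
        digit = ord(ch) - ord("0")    # ord("0") is 48, the ASCII offset
        if not 0 <= digit <= 9:
            raise ValueError(f"not a decimal digit: {ch!r}")
        total = total * 10 + digit    # one multiply and one add per character
    return total
print(parse_decimal("123"))  # 123
```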

125
00:05:50.120 --> 00:05:52.199
<v Speaker 1>So if you're reading a million lines of data from

126
00:05:52.240 --> 00:05:56.519
<v Speaker 1>a CSV file, you're running that multiplication loop millions.

127
00:05:56.079 --> 00:05:59.720
<v Speaker 2>of times. Millions. And that's just for input. Outputting

128
00:05:59.720 --> 00:06:02.160
<v Speaker 2>it back to the screen can be even worse. Why

129
00:06:02.279 --> 00:06:06.519
<v Speaker 2>worse? Because that requires division by ten to separate the digits,

130
00:06:07.240 --> 00:06:10.040
<v Speaker 2>and division is often the slowest math operation a modern

131
00:06:10.120 --> 00:06:13.879
<v Speaker 2>CPU can perform. Hyde points out that this text-to-

132
00:06:13.920 --> 00:06:18.240
<v Speaker 2>number conversion is often the single biggest bottleneck in a program.
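
NOTE
A minimal sketch in Python of the output direction described above: peeling
off decimal digits by repeated division by ten. The function name
to_decimal_text is hypothetical, and the sketch assumes non-negative input.
```python
def to_decimal_text(value: int) -> str:
    """Convert a non-negative int to its decimal string by repeated division."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        value, remainder = divmod(value, 10)       # one division per output digit
        digits.append(chr(ord("0") + remainder))   # add the ASCII offset back
    return "".join(reversed(digits))               # digits come out lowest first
print(to_decimal_text(123))  # 123
```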

133
00:06:17.800 --> 00:06:20.480
<v Speaker 1>And developers just ignore it because the function call looks

134
00:06:20.480 --> 00:06:23.279
<v Speaker 1>so simple. They do. That is a great takeaway. Don't

135
00:06:23.319 --> 00:06:25.879
<v Speaker 1>just print variables for fun inside a tight loop. You're

136
00:06:25.959 --> 00:06:29.519
<v Speaker 1>torturing the CPU. Okay, let's move on to the anatomy

137
00:06:29.560 --> 00:06:32.160
<v Speaker 1>of this data. We threw around the word nibble earlier,

138
00:06:32.240 --> 00:06:34.040
<v Speaker 1>which is cute, but we need to talk about the

139
00:06:34.079 --> 00:06:34.959
<v Speaker 1>heavy hitters, right?

140
00:06:35.000 --> 00:06:38.560
<v Speaker 2>The container sizes. A nibble is four bits, a byte

141
00:06:38.600 --> 00:06:41.279
<v Speaker 2>is eight bits, and it's crucial to remember the byte

142
00:06:41.319 --> 00:06:43.800
<v Speaker 2>is usually the smallest addressable unit of.

143
00:06:43.800 --> 00:06:46.120
<v Speaker 1>memory, meaning you can't just ask the CPU

144
00:06:45.839 --> 00:06:48.120
<v Speaker 2>for bit three. No, you grab the whole byte, and

145
00:06:48.120 --> 00:06:49.839
<v Speaker 2>then you have to find bit three yourself.

146
00:06:49.920 --> 00:06:52.240
<v Speaker 1>And then we scale up in the context of this book,

147
00:06:52.240 --> 00:06:55.519
<v Speaker 1>which is very x86-focused. A word is

148
00:06:55.600 --> 00:06:56.839
<v Speaker 1>sixteen bits.

149
00:06:56.879 --> 00:06:59.920
<v Speaker 2>Correct, and a double word, or dword, is thirty-

150
00:07:00.000 --> 00:07:02.920
<v Speaker 2>two bits. A quad word is sixty four bits.

151
00:07:03.079 --> 00:07:05.360
<v Speaker 1>The scale of these things is just wild. A byte

152
00:07:05.439 --> 00:07:08.519
<v Speaker 1>gets you what, two hundred and fifty six values, that's it.

153
00:07:08.839 --> 00:07:11.680
<v Speaker 2>But a thirty-two-bit dword gets you over four billion.

154
00:07:11.759 --> 00:07:13.920
<v Speaker 1>It's that exponential growth.

155
00:07:13.680 --> 00:07:17.800
<v Speaker 2>Exactly. And within these containers, bit numbering is standardized.

156
00:07:18.120 --> 00:07:21.079
<v Speaker 2>Bit zero is your low order bit, the least significant.

157
00:07:21.079 --> 00:07:22.639
<v Speaker 2>The highest number is your high order bit.
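
NOTE
A minimal sketch in Python of the container sizes and bit numbering discussed
above. The sizes follow the x86 naming used in the episode. The helper name
get_bit is hypothetical.
```python
# Number of distinct values each x86 container can hold: 2 ** bits.
SIZES = [("nibble", 4), ("byte", 8), ("word", 16), ("dword", 32), ("qword", 64)]
for name, bits in SIZES:
    print(f"{name:6} {bits:2} bits  {2 ** bits:,} values")
def get_bit(value: int, position: int) -> int:
    """Return bit `position` of value, where bit 0 is the low-order bit."""
    return (value >> position) & 1
byte = 0b00001000
print(get_bit(byte, 3))  # 1
print(get_bit(byte, 0))  # 0
```
Memory hands back the whole byte; the shift and mask in get_bit are the
"find bit three yourself" step.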

158
00:07:22.720 --> 00:07:24.480
<v Speaker 1>And if you mix those up, you're in for a

159
00:07:24.519 --> 00:07:25.519
<v Speaker 1>world of pain.

160
00:07:25.600 --> 00:07:27.560
<v Speaker 2>A very bad time. Interpreting your data.

161
00:07:27.600 --> 00:07:31.000
<v Speaker 1>Speaking of having a bad time, let's talk about negative numbers.

162
00:07:31.279 --> 00:07:32.800
<v Speaker 1>This is one of those things where I just assume

163
00:07:32.839 --> 00:07:36.199
<v Speaker 1>the computer knows the number's negative, but it's just zeros

164
00:07:36.199 --> 00:07:38.600
<v Speaker 1>and ones. There's no minus sign in memory.

165
00:07:38.759 --> 00:07:41.519
<v Speaker 2>This is one of the most elegant hacks in computer science.

166
00:07:41.680 --> 00:07:44.360
<v Speaker 2>If you are designing a computer from scratch, you might think, okay,

167
00:07:44.439 --> 00:07:47.360
<v Speaker 2>let's use the first bit as a sign zero for positive,

168
00:07:47.360 --> 00:07:48.120
<v Speaker 2>one for negative.

169
00:07:48.279 --> 00:07:49.279
<v Speaker 1>That seems logical.

170
00:07:49.399 --> 00:07:53.040
<v Speaker 2>It is logical, but hardware hates special cases. If you

171
00:07:53.079 --> 00:07:56.079
<v Speaker 2>did that, you'd need separate circuits for addition and subtraction.

172
00:07:56.480 --> 00:08:00.360
<v Speaker 2>You'd need special logic to handle positive zero and negative zero.

173
00:08:00.600 --> 00:08:01.240
<v Speaker 2>It's a mess.

174
00:08:01.319 --> 00:08:03.879
<v Speaker 1>So instead we use two's complement.

175
00:08:03.560 --> 00:08:06.199
<v Speaker 2>Right, and two's complement is pure genius because it turns

176
00:08:06.199 --> 00:08:07.319
<v Speaker 2>subtraction into addition.

177
00:08:07.399 --> 00:08:09.600
<v Speaker 1>Okay, I looked at the recipe in the book: invert

178
00:08:09.639 --> 00:08:12.480
<v Speaker 1>all bits and add one. It sounds like a sorcery spell.

179
00:08:12.920 --> 00:08:14.480
<v Speaker 1>Why does adding one make it work?

180
00:08:14.600 --> 00:08:17.040
<v Speaker 2>Okay. Imagine a mechanical odometer in an old car.

181
00:08:17.480 --> 00:08:20.360
<v Speaker 2>It's set at zero zero zero zero. If you roll

182
00:08:20.360 --> 00:08:21.959
<v Speaker 2>it backward one mile, what does it show?

183
00:08:22.399 --> 00:08:27.360
<v Speaker 1>It rolls over to nine nine nine nine. Exactly.

184
00:08:26.959 --> 00:08:29.759
<v Speaker 2>The system wraps around. In the computer's binary world, if

185
00:08:29.759 --> 00:08:32.440
<v Speaker 2>you are at zero zero zero zero and you subtract one,

186
00:08:32.679 --> 00:08:36.759
<v Speaker 2>you roll backwards to one one one one, all ones. In a

187
00:08:36.840 --> 00:08:39.440
<v Speaker 2>signed system, we decide to interpret that all-ones state

188
00:08:39.759 --> 00:08:41.919
<v Speaker 2>not as a huge number, but as negative one.

189
00:08:42.000 --> 00:08:43.759
<v Speaker 1>So the invert and add one rule is just a

190
00:08:43.799 --> 00:08:47.039
<v Speaker 1>mathematical shortcut to find that specific bit pattern.
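
NOTE
A minimal sketch in Python of the invert-and-add-one rule, using a 16-bit
width as an assumption. Python integers are unbounded, so the MASK constant
stands in for the fixed register width. The names negate and to_signed are
hypothetical.
```python
BITS = 16
MASK = (1 << BITS) - 1    # 0xFFFF, keeps results inside 16 bits
def negate(pattern: int) -> int:
    """Two's complement negation: invert all bits, then add one."""
    return ((pattern ^ MASK) + 1) & MASK
def to_signed(pattern: int) -> int:
    """Interpret a 16-bit pattern as a signed two's complement value."""
    return pattern - (1 << BITS) if pattern & (1 << (BITS - 1)) else pattern
minus_five = negate(5)
print(f"{minus_five:016b}")       # 1111111111111011
print(to_signed(minus_five))      # -5
print((5 + minus_five) & MASK)    # 0, plain addition rolls over to zero
print(f"{(0 - 1) & MASK:016b}")   # 1111111111111111, the odometer rolled back
```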

191
00:08:47.200 --> 00:08:50.799
<v Speaker 2>Precisely, it calculates the rollover value. It aligns the negative

192
00:08:50.879 --> 00:08:52.840
<v Speaker 2>numbers so that if you add five and negative five,

193
00:08:53.120 --> 00:08:55.679
<v Speaker 2>the binary actually adds up to zero. It essentially rolls

194
00:08:55.679 --> 00:08:58.600
<v Speaker 2>the odometer back to all zeros, naturally. The CPU doesn't

195
00:08:58.600 --> 00:09:01.080
<v Speaker 2>even know it's doing subtraction. It's just adding. That is

196
00:09:01.000 --> 00:09:04.320
<v Speaker 1>elegant. But Hyde warns us about the edge case from hell,

197
00:09:04.720 --> 00:09:06.320
<v Speaker 1>the number that cannot be negated.

198
00:09:06.399 --> 00:09:10.120
<v Speaker 2>Ah, yes, the minimum negative number. In a sixteen bit system,

199
00:09:10.279 --> 00:09:12.679
<v Speaker 2>your range goes from plus thirty two thousand, seven hundred

200
00:09:12.679 --> 00:09:15.240
<v Speaker 2>and sixty seven down to negative thirty two thousand, seven

201
00:09:15.320 --> 00:09:16.080
<v Speaker 2>hundred and sixty eight.

202
00:09:16.159 --> 00:09:17.120
<v Speaker 1>Wait, those don't match.

203
00:09:17.240 --> 00:09:19.399
<v Speaker 2>They don't because zero takes up one of the positive

204
00:09:19.399 --> 00:09:21.840
<v Speaker 2>slots effectively, so the range is lopsided. You have one

205
00:09:21.840 --> 00:09:23.519
<v Speaker 2>more negative number than positive numbers.

206
00:09:23.519 --> 00:09:25.879
<v Speaker 1>So if I try to negate negative thirty two thousand,

207
00:09:25.919 --> 00:09:26.960
<v Speaker 1>seven hundred and sixty eight.

208
00:09:27.080 --> 00:09:29.879
<v Speaker 2>There is no plus thirty two thousand, seven hundred and

209
00:09:29.879 --> 00:09:31.759
<v Speaker 2>sixty eight to turn it into. It doesn't exist in

210
00:09:31.799 --> 00:09:35.360
<v Speaker 2>sixteen bits, so the operation overflows, and in two's complement,

211
00:09:35.519 --> 00:09:38.200
<v Speaker 2>due to the math, it actually wraps right back round

212
00:09:38.559 --> 00:09:41.279
<v Speaker 2>to negative thirty two thousand, seven hundred and sixty eight.

213
00:09:41.480 --> 00:09:45.759
<v Speaker 1>That is terrifying. So negating x could return the same

214
00:09:45.600 --> 00:09:48.279
<v Speaker 2>number? Only for that one specific value. And if your

215
00:09:48.279 --> 00:09:52.000
<v Speaker 2>code relies on using absolute values to sanitize inputs, say

216
00:09:52.039 --> 00:09:54.840
<v Speaker 2>you're calculating distance and you assume it must be positive

217
00:09:55.039 --> 00:09:56.440
<v Speaker 2>positive, that one number will crash

218
00:09:56.200 --> 00:09:58.080
<v Speaker 1>your system. Or just create a logic bomb.

219
00:09:58.200 --> 00:09:58.759
<v Speaker 2>Exactly.
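
NOTE
A minimal sketch in Python of the edge case above, simulating 16-bit
wraparound with a mask because Python integers never overflow on their own.
The names negate16 and checked_abs16 are hypothetical. The guard shown is one
possible defense, not necessarily the one the book recommends.
```python
BITS = 16
MASK = (1 << BITS) - 1
INT16_MIN = -(1 << (BITS - 1))    # -32768
def to_signed(pattern: int) -> int:
    """Interpret a 16-bit pattern as a signed two's complement value."""
    return pattern - (1 << BITS) if pattern & (1 << (BITS - 1)) else pattern
def negate16(value: int) -> int:
    """Negate the way a 16-bit two's complement register would, with wraparound."""
    return to_signed((-value) & MASK)
def checked_abs16(value: int) -> int:
    """Absolute value that refuses the one input with no 16-bit positive twin."""
    if value == INT16_MIN:
        raise OverflowError("abs(-32768) does not fit in 16 bits")
    return abs(value)
print(negate16(5))          # -5
print(negate16(INT16_MIN))  # -32768, the value negates to itself
```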

220
00:09:58.919 --> 00:10:01.360
<v Speaker 1>That is exactly the kind of low level detail that

221
00:10:01.480 --> 00:10:05.519
<v Speaker 1>high level languages hide until it bites you. Now, related

222
00:10:05.600 --> 00:10:07.440
<v Speaker 1>to this is sign extension. Let's say I have that

223
00:10:07.519 --> 00:10:10.120
<v Speaker 1>number, negative five, in a tiny eight-bit byte, and

224
00:10:10.159 --> 00:10:12.960
<v Speaker 1>I want to move it into a big, spacious sixteen.

225
00:10:12.559 --> 00:10:14.360
<v Speaker 2>bit word. Very common operation.

226
00:10:14.639 --> 00:10:17.720
<v Speaker 1>My instinct is to just pad the extra space with zeros.

227
00:10:17.679 --> 00:10:19.600
<v Speaker 2>If you do that, you break the number. Remember, in

228
00:10:19.720 --> 00:10:22.480
<v Speaker 2>two's complement, negative numbers have a one in the high

229
00:10:22.519 --> 00:10:24.519
<v Speaker 2>order bit. If you add zeros in front of it,

230
00:10:24.799 --> 00:10:26.440
<v Speaker 2>that one is no longer the high order bit.

231
00:10:26.519 --> 00:10:29.039
<v Speaker 1>You've just turned a small negative number into a generic

232
00:10:29.080 --> 00:10:29.840
<v Speaker 1>positive number.

233
00:10:29.919 --> 00:10:31.639
<v Speaker 2>Right, so you have to copy the sign

234
00:10:31.399 --> 00:10:32.679
<v Speaker 1>bit. Smeared across the top.

235
00:10:33.000 --> 00:10:36.120
<v Speaker 2>Yes, you smear that sign bit across all the new

236
00:10:36.240 --> 00:10:39.759
<v Speaker 2>upper positions. That preserves the negativity, so to speak. It

237
00:10:39.840 --> 00:10:42.600
<v Speaker 2>keeps the odometer rolled over correctly in the larger container.
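
NOTE
A minimal sketch in Python contrasting zero extension with sign extension
when widening negative five from 8 bits to 16 bits. The function names are
hypothetical. On real hardware this is typically a single instruction.
```python
def zero_extend_8_to_16(byte: int) -> int:
    """Pad the upper 8 bits with zeros (correct for unsigned values only)."""
    return byte & 0xFF
def sign_extend_8_to_16(byte: int) -> int:
    """Copy bit 7, the sign bit, into all 8 new upper bit positions."""
    byte &= 0xFF
    return (byte | 0xFF00) if byte & 0x80 else byte
def to_signed16(pattern: int) -> int:
    """Interpret a 16-bit pattern as a signed two's complement value."""
    return pattern - 0x10000 if pattern & 0x8000 else pattern
minus_five_8 = -5 & 0xFF    # 0xFB, negative five as an 8-bit pattern
padded = zero_extend_8_to_16(minus_five_8)
smeared = sign_extend_8_to_16(minus_five_8)
print(f"{padded:016b}", to_signed16(padded))    # 0000000011111011 251
print(f"{smeared:016b}", to_signed16(smeared))  # 1111111111111011 -5
```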

238
00:10:42.600 --> 00:10:45.759
<v Speaker 1>All right, let's get into the wizardry: bitwise operations. This

239
00:10:45.840 --> 00:10:48.240
<v Speaker 1>is where I feel like the real hacker stuff happens.

240
00:10:48.279 --> 00:10:52.120
<v Speaker 1>We have logic gates: AND, OR, XOR, NOT.

241
00:10:52.440 --> 00:10:55.039
<v Speaker 2>These are the fundamental building blocks of the CPU, but

242
00:10:55.159 --> 00:10:58.440
<v Speaker 2>software developers can use them for some incredible optimizations.

243
00:10:58.519 --> 00:11:01.039
<v Speaker 1>Let's look at the AND operation. It compares

244
00:11:01.080 --> 00:11:03.799
<v Speaker 1>two bits and only returns one if both are one.

245
00:11:04.440 --> 00:11:06.559
<v Speaker 1>The book mentions a trick for checking if a number

246
00:11:06.600 --> 00:11:07.440
<v Speaker 1>is odd or even.

247
00:11:07.519 --> 00:11:10.240
<v Speaker 2>using this. This is a classic. Usually people use the

248
00:11:10.240 --> 00:11:13.879
<v Speaker 2>modulo operator, x % 2. If it's zero, it's even.

249
00:11:14.440 --> 00:11:15.879
<v Speaker 2>But remember what we said about division.

250
00:11:15.919 --> 00:11:16.879
<v Speaker 1>The machine hates it.

251
00:11:16.879 --> 00:11:19.919
<v Speaker 2>It's slow, right, But if you look at binary, any

252
00:11:19.919 --> 00:11:22.399
<v Speaker 2>odd number ends with a one, any even number ends

253
00:11:22.399 --> 00:11:24.559
<v Speaker 2>with a zero. It's that simple. Okay, So if you

254
00:11:24.559 --> 00:11:25.519
<v Speaker 2>simply do x AND

255
00:11:25.559 --> 00:11:27.120
<v Speaker 1>one, you're just isolating that last bit.

256
00:11:27.279 --> 00:11:30.120
<v Speaker 2>You're just checking that last bit, instantaneously. If the

257
00:11:30.159 --> 00:11:33.440
<v Speaker 2>result is one, it's odd, if zero it's even. It's

258
00:11:33.480 --> 00:11:35.360
<v Speaker 2>massively faster than a division operation.
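
NOTE
A minimal sketch in Python of the odd/even test using AND with one. The
function name is_odd is hypothetical. The sketch demonstrates that the two
tests agree, not how fast either one runs on a given machine.
```python
def is_odd(x: int) -> bool:
    """Test the low-order bit: 1 means odd, 0 means even."""
    return (x & 1) == 1
for n in (0, 7, 10, -3):
    assert is_odd(n) == (n % 2 == 1)    # same answer as the modulo test
    print(n, "odd" if is_odd(n) else "even")
```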

259
00:11:35.600 --> 00:11:37.600
<v Speaker 1>That is cool. And there's another trick with modulo, right?

260
00:11:38.000 --> 00:11:39.960
<v Speaker 1>If you want to cycle a counter, say from zero

261
00:11:40.039 --> 00:11:41.919
<v Speaker 1>to thirty one, and then loop back to zero.

262
00:11:41.720 --> 00:11:45.159
<v Speaker 2>Yes, a modulo counter. Normally you do (x + 1) %

263
00:11:45.200 --> 00:11:48.039
<v Speaker 2>thirty two. Again, division is expensive, but because thirty two

264
00:11:48.120 --> 00:11:50.279
<v Speaker 2>is a power of two, we can use a mask.

265
00:11:50.720 --> 00:11:53.600
<v Speaker 2>Thirty two in binary is a one followed by five zeros.

266
00:11:53.639 --> 00:11:57.480
<v Speaker 2>The number thirty one is just five ones: zero, zero, zero,

267
00:11:57.559 --> 00:11:58.759
<v Speaker 2>one, one, one, one, one.

268
00:11:59.240 --> 00:12:01.480
<v Speaker 1>So if we AND a counter with thirty one,

269
00:12:01.559 --> 00:12:04.000
<v Speaker 2>It forces all the upper bits to zero. It effectively

270
00:12:04.120 --> 00:12:07.200
<v Speaker 2>chops off any value greater than thirty one. So x

271
00:12:07.200 --> 00:12:09.120
<v Speaker 2>AND thirty one gives you the exact same

272
00:12:09.159 --> 00:12:11.440
<v Speaker 2>result as x % 32, but in terms

273
00:12:11.480 --> 00:12:14.000
<v Speaker 2>of speed it's a Ferrari versus a bicycle.
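
NOTE
A minimal sketch in Python of the wrapping counter described above, for
non-negative counters and a power-of-two size. The names SIZE, MASK, and seen
are illustrative.
```python
SIZE = 32          # must be a power of two for the mask trick to work
MASK = SIZE - 1    # 31 == 0b11111, five ones
counter = 0
seen = []
for _ in range(70):
    seen.append(counter)
    counter = (counter + 1) & MASK    # wraps from 31 back to 0
# The mask and the modulo agree for every non-negative input checked here.
assert all(((x + 1) & MASK) == ((x + 1) % SIZE) for x in range(1000))
print(seen[30:35])  # [30, 31, 0, 1, 2]
```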

274
00:12:14.200 --> 00:12:17.120
<v Speaker 1>And this leads us right into shifting. We talked about

275
00:12:17.120 --> 00:12:21.000
<v Speaker 1>how expensive multiplication and division are, but shifting bits left

276
00:12:21.039 --> 00:12:23.679
<v Speaker 1>or right is practically free for the CPU right.

277
00:12:23.679 --> 00:12:26.399
<v Speaker 2>If you shift a binary number to the left, moving

278
00:12:26.440 --> 00:12:28.679
<v Speaker 2>all the bits one slot over and adding a zero at

279
00:12:28.679 --> 00:12:30.320
<v Speaker 2>the end, you've just multiplied by two.

280
00:12:30.360 --> 00:12:32.200
<v Speaker 1>Shift it left again, you've multiplied by four. And

281
00:12:32.240 --> 00:12:35.200
<v Speaker 2>shifting right divides by two. Simple enough, with a caveat:

282
00:12:35.320 --> 00:12:36.840
<v Speaker 2>you have to use the right kind of shift. There's

283
00:12:36.879 --> 00:12:39.159
<v Speaker 2>a logical shift right which just fills in with zeros.

284
00:12:39.200 --> 00:12:41.120
<v Speaker 2>That's fine for unsigned numbers. But if you have a

285
00:12:41.120 --> 00:12:42.200
<v Speaker 2>negative number.

286
00:12:42.120 --> 00:12:44.639
<v Speaker 1>Oh right, the sign bit. If you fill with zeros,

287
00:12:44.679 --> 00:12:46.559
<v Speaker 1>you lose the negative sign. Exactly.

288
00:12:46.600 --> 00:12:49.399
<v Speaker 2>So you need an arithmetic shift, which preserves the sign bit.

289
00:12:49.480 --> 00:12:51.840
<v Speaker 2>It copies the sign bit as it shifts. If you

290
00:12:51.919 --> 00:12:55.000
<v Speaker 2>use the wrong one, your negative number suddenly becomes a

291
00:12:55.080 --> 00:12:57.720
<v Speaker 2>huge positive number and your math breaks completely.
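
NOTE
A minimal sketch in Python of logical versus arithmetic right shifts on a
simulated 16-bit value. Python has no fixed-width integers, so the masking is
an assumption standing in for a real register, and the function names are
hypothetical.
```python
BITS = 16
MASK = (1 << BITS) - 1
def to_signed(pattern: int) -> int:
    """Interpret a 16-bit pattern as a signed two's complement value."""
    return pattern - (1 << BITS) if pattern & (1 << (BITS - 1)) else pattern
def logical_shift_right(pattern: int, count: int) -> int:
    """Shift right and fill the vacated upper bits with zeros."""
    return (pattern & MASK) >> count
def arithmetic_shift_right(pattern: int, count: int) -> int:
    """Shift right and fill the vacated upper bits with copies of the sign bit."""
    return (to_signed(pattern & MASK) >> count) & MASK
minus_eight = -8 & MASK    # 0xFFF8
print(3 << 1, 3 << 2)                                     # 6 12
print(to_signed(arithmetic_shift_right(minus_eight, 1)))  # -4
print(to_signed(logical_shift_right(minus_eight, 1)))     # 32764
```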

292
00:12:57.799 --> 00:12:59.360
<v Speaker 1>Now I want to push back on something. In the

293
00:12:59.559 --> 00:13:05.600
<v Speaker 1>Packed Data chapter, Hyde talks about squashing a date, month, day,

294
00:13:05.919 --> 00:13:08.200
<v Speaker 1>year into a single sixteen bit word.

295
00:13:08.480 --> 00:13:11.639
<v Speaker 2>It's a classic optimization. Four bits for the month, five

296
00:13:11.679 --> 00:13:13.639
<v Speaker 2>bits for the day, seven bits for the year.

297
00:13:13.840 --> 00:13:17.080
<v Speaker 1>Sure, it saves space, but RAM is cheap today. I

298
00:13:17.080 --> 00:13:19.440
<v Speaker 1>have thirty two gigs on my laptop. Why would I

299
00:13:19.480 --> 00:13:22.039
<v Speaker 1>burn brain cycles trying to fit a date into two

300
00:13:22.080 --> 00:13:24.879
<v Speaker 1>bytes when I can just use integers. Why make my

301
00:13:24.960 --> 00:13:26.960
<v Speaker 1>code unreadable just to save a few bytes?

302
00:13:27.200 --> 00:13:30.120
<v Speaker 2>That is the billion dollar question. RAM is cheap is

303
00:13:30.120 --> 00:13:33.039
<v Speaker 2>the mantra of modern development. But here's the counter argument.

304
00:13:33.240 --> 00:13:34.320
<v Speaker 2>Cache is expensive.

305
00:13:34.559 --> 00:13:35.720
<v Speaker 1>The CPU cache, right?

306
00:13:35.759 --> 00:13:38.600
<v Speaker 2>The CPU is incredibly fast. Main memory, your thirty two

307
00:13:38.600 --> 00:13:41.720
<v Speaker 2>GB of RAM, is incredibly slow by comparison. It's like

308
00:13:41.720 --> 00:13:43.720
<v Speaker 2>a library on the other side of town. The cache

309
00:13:43.840 --> 00:13:45.960
<v Speaker 2>is the bookshelf right next to your desk. If you

310
00:13:46.200 --> 00:13:50.080
<v Speaker 2>use big, bloated integers for everything, you fill up that

311
00:13:50.120 --> 00:13:53.320
<v Speaker 2>bookshelf with junk. You get cache misses where the CPU

312
00:13:53.399 --> 00:13:56.440
<v Speaker 2>has to sit idle, twiddling its thumbs, waiting to fetch

313
00:13:56.559 --> 00:13:57.840
<v Speaker 2>more data from across town.

314
00:13:58.200 --> 00:14:01.200
<v Speaker 1>So packing the data isn't just about saving hard drive space.

315
00:14:01.360 --> 00:14:04.159
<v Speaker 1>It's about keeping more data close to the CPU to

316
00:14:04.240 --> 00:14:05.919
<v Speaker 1>keep it fed. Exactly.

317
00:14:06.679 --> 00:14:09.879
<v Speaker 2>However, and this is the trade-off Hyde emphasizes, you

318
00:14:10.000 --> 00:14:12.399
<v Speaker 2>pay a tax every time you want to read.

319
00:14:12.240 --> 00:14:15.240
<v Speaker 1>that packed data. Because the CPU can't read middle bits.

320
00:14:15.440 --> 00:14:18.360
<v Speaker 2>No, it can't just look at bits five through nine.

321
00:14:18.399 --> 00:14:20.919
<v Speaker 2>It has to fetch the whole word, load a mask

322
00:14:21.120 --> 00:14:23.039
<v Speaker 2>to zero out the other bits, and then shift the

323
00:14:23.039 --> 00:14:25.480
<v Speaker 2>bits over to the right to read the value that's

324
00:14:25.519 --> 00:14:27.639
<v Speaker 2>three or four instructions just to read the day.

325
00:14:27.720 --> 00:14:30.519
<v Speaker 1>variable. So it's a balance. Packed data uses fewer cache

326
00:14:30.559 --> 00:14:33.720
<v Speaker 1>lines but requires more instructions to unpack, which.

327
00:14:33.559 --> 00:14:36.320
<v Speaker 2>Means if you are just moving data around like a

328
00:14:36.320 --> 00:14:39.919
<v Speaker 2>network router moving a packet, pack it tight, save the bandwidth.

329
00:14:40.159 --> 00:14:42.879
<v Speaker 2>But if you're doing heavy calculations on that data, maybe

330
00:14:42.960 --> 00:14:44.960
<v Speaker 2>keep it unpacked so the CPU doesn't have to fight

331
00:14:45.039 --> 00:14:46.279
<v Speaker 2>to read it every single time.

332
00:14:46.559 --> 00:14:49.720
<v Speaker 1>That is a crucial insight. It's not about one right way,

333
00:14:49.799 --> 00:14:52.159
<v Speaker 1>it's about the right way for the constraints.

334
00:14:51.720 --> 00:14:54.679
<v Speaker 2>You have, and that applies to everything we've discussed, whether

335
00:14:54.720 --> 00:14:57.080
<v Speaker 2>it's choosing a data type using a bit wise trick,

336
00:14:57.360 --> 00:15:00.279
<v Speaker 2>or deciding between binary and decimal representations.

337
00:15:01.080 --> 00:15:03.879
<v Speaker 1>We've covered a massive amount of ground, from the physical

338
00:15:03.960 --> 00:15:07.480
<v Speaker 1>voltage to the odometer of two's complement. But I want

339
00:15:07.480 --> 00:15:09.879
<v Speaker 1>to leave our listeners with one final thought from the

340
00:15:09.879 --> 00:15:11.519
<v Speaker 1>book regarding scaled numerics.

341
00:15:11.919 --> 00:15:13.559
<v Speaker 2>This is the one that changes how you look at

342
00:15:13.559 --> 00:15:14.320
<v Speaker 2>a bank statement.

343
00:15:14.679 --> 00:15:17.679
<v Speaker 1>We rely so heavily on float and double types for

344
00:15:17.759 --> 00:15:22.039
<v Speaker 1>decimal numbers, but Hyde challenges us: do we really need them?

345
00:15:22.320 --> 00:15:26.320
<v Speaker 2>Floating point is inherently imprecise. It's an approximation. You add

346
00:15:26.320 --> 00:15:28.559
<v Speaker 2>point one and point two in a computer and often you

347
00:15:28.600 --> 00:15:31.919
<v Speaker 2>get point three zero zero zero zero zero zero zero.

348
00:15:31.759 --> 00:15:34.639
<v Speaker 1>zero four. Which is a complete nightmare for financial software.

349
00:15:34.679 --> 00:15:36.840
<v Speaker 1>You can't just lose pennies in the rounding errors.

350
00:15:36.840 --> 00:15:39.519
<v Speaker 2>So the provocative thought is this, can you switch your

351
00:15:39.559 --> 00:15:43.120
<v Speaker 2>mindset to integers? Instead of storing a dollar and fifty cents,

352
00:15:43.240 --> 00:15:44.639
<v Speaker 2>you store one hundred and fifty pennies.

353
00:15:44.919 --> 00:15:47.080
<v Speaker 1>You essentially move the decimal point yourself.

354
00:15:47.399 --> 00:15:50.200
<v Speaker 2>You manage the decimal point in your head, or rather

355
00:15:50.320 --> 00:15:54.039
<v Speaker 2>in your code's logic. But the machine does pure, fast,

356
00:15:54.240 --> 00:15:58.000
<v Speaker 2>precise integer math. It's a technique called fixed point or

357
00:15:58.039 --> 00:16:01.639
<v Speaker 2>scaled numerics. It forces you to really understand your data's

358
00:16:01.720 --> 00:16:04.840
<v Speaker 2>range and precision before you write a single line of code.
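
NOTE
A minimal sketch in Python of the scaled-integer idea above: store money as
whole cents and place the decimal point only when formatting. The names
to_cents and format_cents are hypothetical, and the sketch assumes
non-negative amounts with a fixed scale of 100.
```python
# Binary floating point cannot represent 0.1 or 0.2 exactly.
print(0.1 + 0.2)          # 0.30000000000000004
print(0.1 + 0.2 == 0.3)   # False
def to_cents(dollars: int, cents: int) -> int:
    """Scale a dollars-and-cents amount to a whole number of cents."""
    return dollars * 100 + cents
def format_cents(total: int) -> str:
    """Put the decimal point back only at display time."""
    dollars, cents = divmod(total, 100)
    return f"${dollars}.{cents:02d}"
price = to_cents(1, 50)                                 # 150 cents
total = price * 3 + to_cents(0, 10) + to_cents(0, 20)   # exact integer math
print(total, format_cents(total))                       # 480 $4.80
```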

359
00:16:04.639 --> 00:16:06.519
<v Speaker 1>And honestly, that feels like the theme of the whole

360
00:16:06.519 --> 00:16:07.759
<v Speaker 1>book: intentionality.

361
00:16:07.960 --> 00:16:11.399
<v Speaker 2>Intentionality. Don't just let the compiler make the decisions for you.

362
00:16:11.399 --> 00:16:12.000
<v Speaker 2>You make the.

363
00:16:11.960 --> 00:16:15.360
<v Speaker 1>Decisions, and that is how you write great code. Thanks

364
00:16:15.360 --> 00:16:17.279
<v Speaker 1>for diving deep with us today, Happy coding.
