WEBVTT

1
00:00:00.080 --> 00:00:03.600
<v Speaker 1>Welcome back to another deep dive. Today we're opening up

2
00:00:03.640 --> 00:00:06.639
<v Speaker 1>a topic that I think a lot of us in

3
00:00:06.679 --> 00:00:08.880
<v Speaker 1>the trenches have a bit of a love-hate relationship with.

4
00:00:09.000 --> 00:00:09.519
<v Speaker 2>Oh, definitely.

5
00:00:09.679 --> 00:00:12.240
<v Speaker 1>We are talking about vSphere seven point X, and I

6
00:00:12.320 --> 00:00:15.759
<v Speaker 1>know the immediate reaction for you listening is probably great.

7
00:00:16.480 --> 00:00:19.519
<v Speaker 1>Another update, another version number to track. But we've been

8
00:00:19.640 --> 00:00:22.679
<v Speaker 1>poring over the official cert guide for exam

9
00:00:22.800 --> 00:00:26.039
<v Speaker 1>2V0-21.20. Yeah, that's by Davis,

10
00:00:26.199 --> 00:00:31.199
<v Speaker 1>Baca, and Thomas. And honestly, this feels different. It really

11
00:00:31.199 --> 00:00:32.719
<v Speaker 1>doesn't feel like just a service pack.

12
00:00:33.000 --> 00:00:35.000
<v Speaker 2>It really isn't. I mean, when you actually dig into

13
00:00:35.000 --> 00:00:38.079
<v Speaker 2>the architecture changes laid out in this guide, vSphere seven

14
00:00:38.159 --> 00:00:40.240
<v Speaker 2>represents a massive pivot.

15
00:00:40.359 --> 00:00:40.439
<v Speaker 1>Yeah.

16
00:00:40.600 --> 00:00:44.840
<v Speaker 2>It's the exact moment where VMware stopped just virtualizing servers

17
00:00:44.920 --> 00:00:47.560
<v Speaker 2>and really started enforcing the software-defined data center,

18
00:00:47.840 --> 00:00:51.119
<v Speaker 2>the SDDC. We're moving from a world where we manage

19
00:00:51.119 --> 00:00:54.000
<v Speaker 2>individual boxes to a world where we manage policies and

20
00:00:54.039 --> 00:00:54.960
<v Speaker 2>desired states.

21
00:00:55.119 --> 00:00:58.079
<v Speaker 1>Desired state, that is the buzzword, right? Yeah. But looking

22
00:00:58.119 --> 00:01:02.039
<v Speaker 1>at the source material here, there is some serious engineering

23
00:01:02.079 --> 00:01:04.920
<v Speaker 1>behind that marketing term. Absolutely. We've got the death of

24
00:01:04.959 --> 00:01:08.560
<v Speaker 1>the external Platform Services Controller, finally.

25
00:01:08.359 --> 00:01:12.239
<v Speaker 2>Long overdue. The complete overhaul of how storage is handled with

26
00:01:12.359 --> 00:01:16.040
<v Speaker 2>vSAN and vVols, and some pretty scary warnings about where

27
00:01:16.159 --> 00:01:19.239
<v Speaker 2>you can and cannot install ESXi anymore.

28
00:01:19.599 --> 00:01:22.359
<v Speaker 1>Yeah, it's a lot to unpack. The guide is dense

29
00:01:22.480 --> 00:01:26.200
<v Speaker 1>because the changes are so fundamental. They're basically ripping out

30
00:01:26.519 --> 00:01:29.879
<v Speaker 1>legacy code that has been there for a decade and

31
00:01:29.920 --> 00:01:33.280
<v Speaker 1>replacing it with this modern container-aware architecture.

32
00:01:33.920 --> 00:01:37.480
<v Speaker 2>So let's start with the brain of the beast: vCenter Server. Now,

33
00:01:37.519 --> 00:01:39.760
<v Speaker 2>if you're listening and you were running vSphere six point

34
00:01:39.799 --> 00:01:42.760
<v Speaker 2>zero or six point five, you probably have battle scars

35
00:01:42.799 --> 00:01:46.519
<v Speaker 2>from dealing with the Platform Services Controller, the PSC.

36
00:01:46.519 --> 00:01:47.560
<v Speaker 1>Oh, the topologies.

37
00:01:47.680 --> 00:01:50.959
<v Speaker 2>I remember having to design these incredibly complex topologies with

38
00:01:51.079 --> 00:01:54.719
<v Speaker 2>external PSCs sitting behind third-party load balancers just to

39
00:01:54.760 --> 00:01:57.079
<v Speaker 2>get single sign-on to work across sites. It was

40
00:01:57.120 --> 00:01:58.000
<v Speaker 2>frankly a nightmare.

41
00:01:58.040 --> 00:02:00.920
<v Speaker 1>It was incredibly complex. You had to manage replication agreements

42
00:02:00.959 --> 00:02:03.400
<v Speaker 1>between those PSCs. You had to manage the certificates for

43
00:02:03.480 --> 00:02:04.640
<v Speaker 1>each individual one.

44
00:02:04.760 --> 00:02:07.359
<v Speaker 2>The certificates were the worst. Right. And if the load

45
00:02:07.400 --> 00:02:11.120
<v Speaker 2>balancer misbehaved, your admins literally couldn't log in. But

46
00:02:11.199 --> 00:02:13.599
<v Speaker 2>the good news from the cert guide is that vSphere

47
00:02:13.919 --> 00:02:16.759
<v Speaker 2>seven effectively kills the external PSC.

48
00:02:16.560 --> 00:02:18.680
<v Speaker 1>So it's dead? Like, I don't have to build them anymore?

49
00:02:18.759 --> 00:02:21.960
<v Speaker 2>It is dead. The entire architecture has collapsed back into

50
00:02:22.000 --> 00:02:26.599
<v Speaker 2>a simplified single-appliance model. Okay. All those services, single

51
00:02:26.639 --> 00:02:31.319
<v Speaker 2>sign-on, the license service, the VMware Certificate Authority, they're

52
00:02:31.360 --> 00:02:34.680
<v Speaker 2>all now running natively inside the vCenter Server Appliance,

53
00:02:35.280 --> 00:02:36.080
<v Speaker 2>the VCSA.

54
00:02:36.319 --> 00:02:39.080
<v Speaker 1>Okay, that sounds great for a greenfield deployment, but what

55
00:02:39.199 --> 00:02:41.840
<v Speaker 1>about the poor admin out there who has that complex

56
00:02:41.919 --> 00:02:46.280
<v Speaker 1>six point seven topology with external PSCs. Is the upgrade

57
00:02:46.280 --> 00:02:48.439
<v Speaker 1>path going to be a complete rip and replace scenario?

58
00:02:49.000 --> 00:02:52.840
<v Speaker 2>Surprisingly, no. The guide highlights that the upgrade tool is

59
00:02:52.840 --> 00:02:55.400
<v Speaker 2>actually a converge tool. Oh, interesting. When you run the

60
00:02:55.479 --> 00:02:58.599
<v Speaker 2>vSphere seven installer against your existing environment, it actually

61
00:02:58.680 --> 00:03:02.840
<v Speaker 2>detects those external PSCs, migrates their data and identity into

62
00:03:02.879 --> 00:03:07.159
<v Speaker 2>the new vCenter appliance, and then essentially decommissions the old nodes. Ah,

63
00:03:07.439 --> 00:03:09.560
<v Speaker 2>it converges the topology automatically for you.

64
00:03:09.759 --> 00:03:12.639
<v Speaker 1>That is a huge relief to hear. But there's another

65
00:03:12.639 --> 00:03:17.280
<v Speaker 1>obituary in here. The guide is pretty explicit that vCenter

66
00:03:17.319 --> 00:03:18.479
<v Speaker 1>Server for Windows is

67
00:03:18.479 --> 00:03:22.080
<v Speaker 2>gone. Correct, done. No more installing vCenter on top

68
00:03:22.120 --> 00:03:25.000
<v Speaker 2>of Windows Server. We are strictly in the world of

69
00:03:25.000 --> 00:03:28.240
<v Speaker 2>the Photon OS appliance now. Right. Which is great for security

70
00:03:28.280 --> 00:03:31.520
<v Speaker 2>and patching, but it does mean if you are relying

71
00:03:31.560 --> 00:03:35.120
<v Speaker 2>on Windows specific scripts or agents running locally on your

72
00:03:35.199 --> 00:03:36.240
<v Speaker 2>vCenter box.

73
00:03:36.039 --> 00:03:38.280
<v Speaker 1>Which a lot of people did. Exactly.

74
00:03:38.039 --> 00:03:40.439
<v Speaker 2>That workflow is completely broken. Now you have to adapt.

75
00:03:40.520 --> 00:03:45.199
<v Speaker 1>Okay, So we have this single monolithic appliance now handling everything.

76
00:03:45.199 --> 00:03:48.039
<v Speaker 1>It's the brain, the heart, and the nervous system. But

77
00:03:48.080 --> 00:03:51.599
<v Speaker 1>if that appliance dies, we are flying blind. I know

78
00:03:51.719 --> 00:03:55.319
<v Speaker 1>vCenter High Availability existed before, but the guide makes

79
00:03:55.319 --> 00:03:58.360
<v Speaker 1>it sound like the architecture under the hood has really changed.

80
00:03:58.879 --> 00:04:00.479
<v Speaker 1>How are we keeping this thing alive?

81
00:04:00.599 --> 00:04:03.639
<v Speaker 2>So vCenter HA in version seven is very slick, but

82
00:04:03.680 --> 00:04:06.560
<v Speaker 2>you have to understand it requires specific networking. It uses

83
00:04:06.599 --> 00:04:09.159
<v Speaker 2>a three-node cluster. Okay. You've got an active node,

84
00:04:09.199 --> 00:04:10.960
<v Speaker 2>a passive node, and a witness node.

85
00:04:10.879 --> 00:04:12.919
<v Speaker 1>Walk us through the replication there. Is this just doing a

86
00:04:12.960 --> 00:04:14.159
<v Speaker 1>standard storage mirror?

87
00:04:14.280 --> 00:04:17.120
<v Speaker 2>No, no, it's much more intelligent than that. The active node,

88
00:04:17.319 --> 00:04:19.879
<v Speaker 2>the one you are actually logged into and using, is

89
00:04:19.959 --> 00:04:23.879
<v Speaker 2>replicating data to the passive node through two distinct channels.

90
00:04:23.879 --> 00:04:27.800
<v Speaker 2>Two channels? Right. It uses native PostgreSQL replication for the database,

91
00:04:28.319 --> 00:04:31.240
<v Speaker 2>ensuring that all your inventory and events are synced transactionally.

92
00:04:32.079 --> 00:04:35.920
<v Speaker 2>And then it uses a separate file level replication basically

93
00:04:36.079 --> 00:04:39.000
<v Speaker 2>rsync, to keep the configuration files in check.

94
00:04:39.279 --> 00:04:42.040
<v Speaker 1>And the witness, is that just a third copy of

95
00:04:42.040 --> 00:04:43.199
<v Speaker 1>the database?

96
00:04:42.879 --> 00:04:45.040
<v Speaker 2>Not at all. The witness is just a tiebreaker. It's

97
00:04:45.079 --> 00:04:48.000
<v Speaker 2>a very lightweight clone. It doesn't hold the database or

98
00:04:48.040 --> 00:04:48.519
<v Speaker 2>the files.

99
00:04:48.600 --> 00:04:49.319
<v Speaker 1>No? What does it do?

100
00:04:49.839 --> 00:04:53.040
<v Speaker 2>Its only job is to provide quorum. If your network

101
00:04:53.120 --> 00:04:55.439
<v Speaker 2>hiccups and the active and passive nodes lose sight of

102
00:04:55.439 --> 00:04:57.839
<v Speaker 2>each other, you run the massive risk of a split

103
00:04:57.879 --> 00:05:00.920
<v Speaker 2>brain scenario where both think they are the master.

104
00:05:00.439 --> 00:05:02.399
<v Speaker 1>Right, which corrupts everything. Exactly.

105
00:05:02.720 --> 00:05:06.240
<v Speaker 2>The witness basically casts the deciding vote on who actually

106
00:05:06.040 --> 00:05:08.959
<v Speaker 1>owns the cluster. And the guide mentions some strict requirements

107
00:05:08.959 --> 00:05:11.639
<v Speaker 1>for this, right? You can't just throw these nodes anywhere

108
00:05:11.680 --> 00:05:12.399
<v Speaker 1>on your network.

109
00:05:12.639 --> 00:05:16.120
<v Speaker 2>No, you really can't. You need a dedicated vCenter HA

110
00:05:16.279 --> 00:05:20.639
<v Speaker 2>network interface on each node, and the latency requirement is strict.

111
00:05:20.720 --> 00:05:23.759
<v Speaker 2>How strict? Less than ten milliseconds between the active and

112
00:05:23.800 --> 00:05:26.720
<v Speaker 2>passive nodes. If you try to stretch this across a

113
00:05:26.800 --> 00:05:30.319
<v Speaker 2>laggy WAN link, the PostgreSQL replication will time out

114
00:05:30.480 --> 00:05:31.800
<v Speaker 2>and the cluster will just fail.

115
00:05:31.920 --> 00:05:34.920
<v Speaker 1>Good to know. So we've bulletproofed the brain, but the

116
00:05:34.959 --> 00:05:39.279
<v Speaker 1>brain is useless if the body, meaning the ESXi hosts themselves,

117
00:05:39.360 --> 00:05:40.120
<v Speaker 1>are crumbling.

118
00:05:40.240 --> 00:05:41.319
<v Speaker 2>That's true.

119
00:05:41.240 --> 00:05:44.319
<v Speaker 1>And that brings us to a somewhat controversial change in the

120
00:05:44.319 --> 00:05:46.759
<v Speaker 1>guide regarding how we actually boot these servers.

121
00:05:46.800 --> 00:05:48.399
<v Speaker 2>Oh yeah, the boot media.

122
00:05:48.600 --> 00:05:51.399
<v Speaker 1>For years in my home lab, and honestly even in

123
00:05:51.399 --> 00:05:55.000
<v Speaker 1>some production environments, I just slapped ESXi on a generic

124
00:05:55.079 --> 00:05:58.040
<v Speaker 1>eight gig USB stick or an SD card. It was cheap,

125
00:05:58.079 --> 00:06:00.959
<v Speaker 1>it worked. But reading this guide, it sounds like VMware

126
00:06:01.040 --> 00:06:03.000
<v Speaker 1>is declaring war on USB boot drives.

127
00:06:03.079 --> 00:06:05.319
<v Speaker 2>War might be a strong word, but they are definitely

128
00:06:05.360 --> 00:06:08.399
<v Speaker 2>waving a massive red flag for you. The issue isn't

129
00:06:08.439 --> 00:06:10.879
<v Speaker 2>just the capacity, it's the partition structure.

130
00:06:10.959 --> 00:06:12.040
<v Speaker 1>Okay, break that down.

131
00:06:12.319 --> 00:06:14.680
<v Speaker 2>In previous versions, that boot drive really just held the

132
00:06:14.879 --> 00:06:17.959
<v Speaker 2>hypervisor image, which is tiny. It loads into memory and

133
00:06:18.000 --> 00:06:23.079
<v Speaker 2>it's done. But vSphere seven introduces a completely new partition layout.

134
00:06:23.480 --> 00:06:26.120
<v Speaker 2>The big one you need to know about is ESX-OSData.

135
00:06:26.519 --> 00:06:29.560
<v Speaker 1>ESX-OSData. That sounds ominous. What goes in there?

136
00:06:29.680 --> 00:06:33.240
<v Speaker 2>It's consolidation. It takes the old scratch partition, the locker

137
00:06:33.240 --> 00:06:36.319
<v Speaker 2>for VMware Tools, and the core dump location and puts

138
00:06:36.360 --> 00:06:39.800
<v Speaker 2>them in one place. But here is the kicker. It

139
00:06:39.879 --> 00:06:42.199
<v Speaker 2>is formatted with VMFS-L.

140
00:06:42.480 --> 00:06:45.680
<v Speaker 1>VMFS-L? Like the file system used in vSAN?

141
00:06:45.360 --> 00:06:48.879
<v Speaker 2>Exactly like that. It's a high performance filesystem designed specifically

142
00:06:48.920 --> 00:06:52.800
<v Speaker 2>for frequent reads and writes. This partition stores system logs, traces,

143
00:06:52.800 --> 00:06:54.759
<v Speaker 2>and live database entries for the host itself.

144
00:06:54.800 --> 00:06:56.120
<v Speaker 1>Okay, I see where this is going.

145
00:06:56.240 --> 00:06:58.600
<v Speaker 2>Right. If you put that kind of heavy IO load

146
00:06:58.639 --> 00:07:01.519
<v Speaker 2>on a cheap consumer grade USB stick or an SD card,

147
00:07:01.800 --> 00:07:04.079
<v Speaker 2>you're going to burn out those NAND flash cells in

148
00:07:04.120 --> 00:07:05.759
<v Speaker 2>a matter of months, maybe even weeks.

149
00:07:05.839 --> 00:07:07.680
<v Speaker 1>So the hypervisor literally

150
00:07:07.360 --> 00:07:10.199
<v Speaker 2>writes the drive to death. Yes. And because of this,

151
00:07:10.360 --> 00:07:12.600
<v Speaker 2>if the installer detects you are booting from a

152
00:07:12.680 --> 00:07:16.519
<v Speaker 2>low-quality USB device, it creates the ESX-OSData partition,

153
00:07:16.879 --> 00:07:17.920
<v Speaker 2>but it runs it in

154
00:07:17.759 --> 00:07:19.480
<v Speaker 1>degraded mode. Degraded mode?

155
00:07:19.600 --> 00:07:21.759
<v Speaker 2>Yeah, it tries to limit the writes to save the drive,

156
00:07:22.279 --> 00:07:26.480
<v Speaker 2>but you lose functionality and your logs are seriously at risk.

157
00:07:27.079 --> 00:07:29.639
<v Speaker 1>So what is the actual recommendation from the cert guide?

158
00:07:29.720 --> 00:07:31.879
<v Speaker 1>Are we going back to spinning rust for boot drives?

159
00:07:32.120 --> 00:07:36.079
<v Speaker 2>The guide firmly recommends a local persistent disk. That means

160
00:07:36.079 --> 00:07:39.639
<v Speaker 2>an HDD or an SSD of at least thirty two gigabytes.

161
00:07:40.160 --> 00:07:42.800
<v Speaker 2>That gives you enough room for the boot banks and

162
00:07:42.879 --> 00:07:45.800
<v Speaker 2>a fully functional ESX-OSData partition.

163
00:07:45.519 --> 00:07:47.680
<v Speaker 1>And if you absolutely have to use USB?

164
00:07:47.879 --> 00:07:50.040
<v Speaker 2>If you must use USB, you have to pair it

165
00:07:50.079 --> 00:07:53.519
<v Speaker 2>with a local disk to offload that scratch partition, or

166
00:07:53.560 --> 00:07:55.000
<v Speaker 2>you are living on borrowed time.

167
00:07:55.120 --> 00:07:56.720
<v Speaker 1>That is going to catch a lot of people off

168
00:07:56.759 --> 00:07:59.399
<v Speaker 1>guard during a hardware refresh. Speaking of things that catch

169
00:07:59.399 --> 00:08:02.839
<v Speaker 1>people off guard: DNS and NTP. Classics. I feel

170
00:08:02.839 --> 00:08:04.639
<v Speaker 1>like we talk about this every year, but the guide

171
00:08:04.720 --> 00:08:06.079
<v Speaker 1>is just relentless about it this time.

172
00:08:06.120 --> 00:08:08.560
<v Speaker 2>It has to be. In vSphere seven, the

173
00:08:08.600 --> 00:08:12.240
<v Speaker 2>dependencies are hard-coded. Take DNS, for example. When you

174
00:08:12.240 --> 00:08:16.160
<v Speaker 2>are deploying that new VCSA, the installer pauses and actually

175
00:08:16.240 --> 00:08:19.079
<v Speaker 2>performs a reverse lookup on the IP address you provided.

176
00:08:19.319 --> 00:08:22.240
<v Speaker 2>If it cannot resolve that IP back to the fully

177
00:08:22.279 --> 00:08:27.240
<v Speaker 2>qualified domain name the FQDN, the installation literally fails. It

178
00:08:27.279 --> 00:08:29.120
<v Speaker 2>doesn't warn you, it just stops.

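NOTE
Editor's sketch, not spoken in the episode. A minimal pre-flight check for the reverse-lookup requirement described above, assuming Python on an admin workstation. This is not the VCSA installer's own logic: ptr_matches_fqdn and system_reverse_lookup are hypothetical helper names, and the resolver is passed in as a parameter so the check can be exercised without live DNS.
```python
import socket
from typing import Callable
def ptr_matches_fqdn(ip: str, fqdn: str, reverse_lookup: Callable[[str], str]) -> bool:
    """Return True if the PTR record for ip resolves back to fqdn."""
    try:
        resolved = reverse_lookup(ip)
    except OSError:
        # No PTR record or DNS unreachable: treat it as a failed pre-check.
        return False
    # Compare case-insensitively and ignore a trailing root dot.
    return resolved.rstrip(".").lower() == fqdn.rstrip(".").lower()
def system_reverse_lookup(ip: str) -> str:
    # socket.gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
    return socket.gethostbyaddr(ip)[0]
```
For a live check, pass system_reverse_lookup as the third argument along with the planned appliance IP and FQDN.
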
179
00:08:29.279 --> 00:08:30.040
<v Speaker 1>Zero tolerance.

180
00:08:30.160 --> 00:08:33.000
<v Speaker 2>Zero tolerance. And NTP is even more critical because of

181
00:08:33.080 --> 00:08:36.440
<v Speaker 2>SSO. How so? The authentication tokens, the SAML tokens used

182
00:08:36.440 --> 00:08:39.279
<v Speaker 2>between vCenter and the hosts, they're all timestamped.

183
00:08:39.600 --> 00:08:41.960
<v Speaker 2>If your host drifts more than a few minutes away

184
00:08:41.960 --> 00:08:45.200
<v Speaker 2>from the vCenter time, those tokens are rejected. Oh man.

185
00:08:45.320 --> 00:08:48.840
<v Speaker 2>Suddenly your backups fail, vMotion fails, and you literally

186
00:08:48.919 --> 00:08:50.120
<v Speaker 2>can't log into the host.

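NOTE
Editor's sketch, not spoken in the episode. A minimal illustration of the clock-skew rule described above, assuming you already have a host timestamp and a vCenter timestamp from your own tooling. The speakers only say "a few minutes", so the five-minute tolerance below is an assumption for illustration, and both function names are hypothetical.
```python
from datetime import datetime, timedelta
# ASSUMPTION: illustrative tolerance only; the episode does not give an exact figure.
ASSUMED_TOLERANCE = timedelta(minutes=5)
def clock_skew(host_time: datetime, vcenter_time: datetime) -> timedelta:
    """Absolute difference between the two clocks."""
    return abs(host_time - vcenter_time)
def token_would_be_accepted(host_time: datetime, vcenter_time: datetime,
                            tolerance: timedelta = ASSUMED_TOLERANCE) -> bool:
    """True when the skew is within the tolerance, so timestamped tokens still validate."""
    return clock_skew(host_time, vcenter_time) <= tolerance
```
Running a check like this against every host before an upgrade is one way to catch drift before logins start failing.
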
187
00:08:50.279 --> 00:08:52.879
<v Speaker 1>So for everyone listening, check your PTR records and your

188
00:08:52.879 --> 00:08:55.200
<v Speaker 1>time servers before you even download the ISO.

189
00:08:55.360 --> 00:08:57.840
<v Speaker 2>The unsexy work that saves the deployment.

190
00:08:57.639 --> 00:09:01.080
<v Speaker 1>Let's shift gears to something a little sexier: storage. This

191
00:09:01.159 --> 00:09:04.480
<v Speaker 1>seems to be where the software-defined part of SDDC

192
00:09:04.679 --> 00:09:07.840
<v Speaker 1>really kicks into overdrive. Absolutely. We still have the classics,

193
00:09:07.960 --> 00:09:11.279
<v Speaker 1>VMFS and NFS. Are they just legacy support now or

194
00:09:11.320 --> 00:09:12.919
<v Speaker 1>have they actually improved in this version?

195
00:09:13.000 --> 00:09:15.919
<v Speaker 2>Oh, they've definitely evolved. VMFS six is the absolute standard

196
00:09:15.919 --> 00:09:20.120
<v Speaker 2>now, and the big thing it handles is automatic UNMAP.

197
00:09:20.440 --> 00:09:21.679
<v Speaker 1>UNMAP? Explain how that works.

198
00:09:21.879 --> 00:09:24.039
<v Speaker 2>In the old days, if you deleted one hundred gigs

199
00:09:24.039 --> 00:09:27.480
<v Speaker 2>of data inside a Windows VM, the underlying storage array

200
00:09:27.679 --> 00:09:30.440
<v Speaker 2>had no idea that space was free. It stayed marked

201
00:09:30.440 --> 00:09:34.320
<v Speaker 2>as used. VMFS six automatically sends commands down to the

202
00:09:34.440 --> 00:09:35.960
<v Speaker 2>array to reclaim that space.

203
00:09:36.360 --> 00:09:38.840
<v Speaker 1>That's a huge space saver. And what about NFS. I

204
00:09:38.919 --> 00:09:43.399
<v Speaker 1>usually associate NFS with holding ISO files, not running high-performance workloads.

205
00:09:43.480 --> 00:09:46.600
<v Speaker 2>Right. But vSphere seven pushes NFS four point one, which

206
00:09:46.679 --> 00:09:49.320
<v Speaker 2>is a massive leap over NFS three. The two big

207
00:09:49.320 --> 00:09:51.639
<v Speaker 2>features you get are multipathing and Kerberos.

208
00:09:51.759 --> 00:09:53.240
<v Speaker 1>Okay, multipathing makes sense.

209
00:09:53.320 --> 00:09:56.120
<v Speaker 2>Yeah, NFS three relied on a single TCP session. If

210
00:09:56.120 --> 00:09:59.200
<v Speaker 2>that link got saturated, you were bottlenecked. NFS four point

211
00:09:59.240 --> 00:10:03.200
<v Speaker 2>one supports true multipathing across multiple links, and Kerberos means

212
00:10:03.200 --> 00:10:05.240
<v Speaker 2>we can finally encrypt that storage traffic on the

213
00:10:05.159 --> 00:10:07.559
<v Speaker 1>wire, which is a huge compliance requirement for a lot

214
00:10:07.559 --> 00:10:10.240
<v Speaker 1>of organizations now. Exactly. Now, I saw an acronym in

215
00:10:10.279 --> 00:10:13.000
<v Speaker 1>the guide that was new to me: HPP, the

216
00:10:13.039 --> 00:10:16.360
<v Speaker 1>High-Performance Plug-in. For the last decade, we've relied on NMP,

217
00:10:16.559 --> 00:10:19.320
<v Speaker 1>the Native Multipathing Plug-in. Why do we need a

218
00:10:19.360 --> 00:10:20.000
<v Speaker 1>new one now?

219
00:10:20.200 --> 00:10:23.279
<v Speaker 2>This is entirely driven by the rise of NVMe,

220
00:10:23.360 --> 00:10:28.279
<v Speaker 2>Non-Volatile Memory Express. Think about NMP as a traffic cop

221
00:10:28.519 --> 00:10:32.000
<v Speaker 2>that was designed for traffic in the nineteen nineties: spinning disks.

222
00:10:32.159 --> 00:10:32.399
<v Speaker 1>Right.

223
00:10:32.720 --> 00:10:35.279
<v Speaker 2>It has these complex locks and queues that made total

224
00:10:35.320 --> 00:10:38.600
<v Speaker 2>sense when drives were slow. But modern NVMe flash

225
00:10:38.919 --> 00:10:42.799
<v Speaker 2>is so incredibly fast that the software stack NMP itself

226
00:10:43.000 --> 00:10:44.399
<v Speaker 2>actually became the bottleneck.

227
00:10:44.519 --> 00:10:46.559
<v Speaker 1>The software couldn't click the send button fast enough for

228
00:10:46.559 --> 00:10:47.440
<v Speaker 1>the hardware.

229
00:10:47.080 --> 00:10:50.840
<v Speaker 2>Precisely. So VMware wrote the HPP specifically for NVMe and

230
00:10:50.960 --> 00:10:55.320
<v Speaker 2>NVMe over Fabrics. It removes those legacy locks and optimizes

231
00:10:55.360 --> 00:10:59.240
<v Speaker 2>the whole I/O path to handle millions of IOPS without killing

232
00:10:59.279 --> 00:11:00.000
<v Speaker 2>your CPU.

233
00:11:00.360 --> 00:11:02.960
<v Speaker 1>So, if you're buying an all-flash NVMe array today,

234
00:11:03.200 --> 00:11:04.480
<v Speaker 1>you need to be using HPP.

235
00:11:04.720 --> 00:11:06.720
<v Speaker 2>If you aren't, you're just wasting the money you spent

236
00:11:06.799 --> 00:11:07.279
<v Speaker 2>on that array.

237
00:11:07.559 --> 00:11:11.320
<v Speaker 1>This moves us nicely into the "no more LUNs" conversation:

238
00:11:11.759 --> 00:11:14.840
<v Speaker 1>vVols, or Virtual Volumes. The concept has been around for

239
00:11:14.840 --> 00:11:16.600
<v Speaker 1>a while, but the guide really treats it as a

240
00:11:16.600 --> 00:11:19.720
<v Speaker 1>primary citizen in version seven. For those who haven't deployed it,

241
00:11:19.799 --> 00:11:21.240
<v Speaker 1>what is the actual mechanism here?

242
00:11:21.519 --> 00:11:25.200
<v Speaker 2>So vVols changes the entire relationship between vCenter and the

243
00:11:25.240 --> 00:11:28.399
<v Speaker 2>storage array. It introduces a component called the VASA

244
00:11:28.200 --> 00:11:32.159
<v Speaker 1>provider. vSphere APIs for Storage Awareness, right?

245
00:11:32.320 --> 00:11:35.240
<v Speaker 2>This acts as a translator. Instead of the array presenting

246
00:11:35.279 --> 00:11:39.600
<v Speaker 2>a dumb ten-terabyte block of space, a LUN, the

247
00:11:40.240 --> 00:11:44.200
<v Speaker 2>VASA provider tells vCenter, hey, I can do replication, I

248
00:11:44.200 --> 00:11:47.159
<v Speaker 2>can do deduplication, and I can do encryption. And then

249
00:11:47.240 --> 00:11:51.120
<v Speaker 2>vCenter pushes that policy down to the individual VM level. Exactly.

250
00:11:51.320 --> 00:11:53.840
<v Speaker 2>When you create a VM, the array creates a specific

251
00:11:54.000 --> 00:11:57.120
<v Speaker 2>virtual volume just for that VM's disk. If you need

252
00:11:57.159 --> 00:12:00.360
<v Speaker 2>to snapshot that VM, the array snapshots only that volume.

253
00:12:00.360 --> 00:12:02.600
<v Speaker 2>You aren't snapshotting a whole LUN with twenty other

254
00:12:02.679 --> 00:12:03.399
<v Speaker 2>VMs on it.

255
00:12:03.399 --> 00:12:06.679
<v Speaker 1>It gives you granular control that matches the application, not

256
00:12:06.759 --> 00:12:09.759
<v Speaker 1>the hardware limitations. Exactly. But if we really want to

257
00:12:09.799 --> 00:12:12.559
<v Speaker 1>talk about true software-defined storage, we have to talk

258
00:12:12.559 --> 00:12:15.039
<v Speaker 1>about vSAN. This really feels like the heart of the

259
00:12:15.120 --> 00:12:18.840
<v Speaker 1>modern VMware stack. Conceptually, we're taking local disks in the

260
00:12:18.879 --> 00:12:21.639
<v Speaker 1>servers and pooling them across the network. But the devil

261
00:12:21.679 --> 00:12:24.759
<v Speaker 1>is always in the details, specifically regarding disk groups.

262
00:12:24.919 --> 00:12:27.480
<v Speaker 2>The disk group is the fundamental building block of vSAN.

263
00:12:28.080 --> 00:12:30.919
<v Speaker 2>Each host participating needs at least one, and the strict

264
00:12:31.000 --> 00:12:33.600
<v Speaker 2>rule laid out in the guide is one cache device

265
00:12:33.960 --> 00:12:36.120
<v Speaker 2>and one or more capacity devices

266
00:12:35.720 --> 00:12:38.639
<v Speaker 1>per group. And that cache device is

267
00:12:38.399 --> 00:12:42.080
<v Speaker 2>non-negotiable? Completely non-negotiable. It must be flash. Even in

268
00:12:42.120 --> 00:12:44.840
<v Speaker 2>a hybrid cluster where your capacity tier is made of

269
00:12:44.919 --> 00:12:48.440
<v Speaker 2>cheap spinning disks, that cache tier has to be

270
00:12:48.440 --> 00:12:51.720
<v Speaker 2>high-performance SSD. Why is that? Because it absorbs one hundred

271
00:12:51.720 --> 00:12:54.000
<v Speaker 2>percent of the write operations. It acts as a buffer.

272
00:12:54.600 --> 00:12:58.200
<v Speaker 2>And here's the critical part. If that cache drive fails,

273
00:12:58.720 --> 00:13:00.679
<v Speaker 2>the entire disk group goes offline.

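NOTE
Editor's sketch, not spoken in the episode. A minimal validator for the disk group rule as the speakers state it: one flash cache device plus one or more capacity devices. The Device class and disk_group_problems function are hypothetical names for illustration and do not call any vSAN API.
```python
from dataclasses import dataclass
from typing import List
@dataclass
class Device:
    name: str
    is_flash: bool
    role: str  # "cache" or "capacity"
def disk_group_problems(devices: List[Device]) -> List[str]:
    """Return a list of rule violations; an empty list means the group is valid."""
    problems = []
    cache = [d for d in devices if d.role == "cache"]
    capacity = [d for d in devices if d.role == "capacity"]
    if len(cache) != 1:
        problems.append(f"expected exactly 1 cache device, found {len(cache)}")
    if any(not d.is_flash for d in cache):
        problems.append("cache device must be flash")
    if len(capacity) < 1:
        problems.append("need at least 1 capacity device")
    return problems
```
A hybrid group with an SSD cache and spinning capacity disks passes; a group whose cache device is a spinning disk does not.
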
274
00:13:00.720 --> 00:13:03.320
<v Speaker 1>Wow. Now the guide gets into some math that I

275
00:13:03.360 --> 00:13:06.360
<v Speaker 1>think is really important for anyone doing capacity planning. It

276
00:13:06.399 --> 00:13:09.799
<v Speaker 1>talks about RAID five and RAID six erasure coding. Traditionally,

277
00:13:09.879 --> 00:13:13.399
<v Speaker 1>if I wanted redundancy, I used RAID one mirroring. I

278
00:13:13.440 --> 00:13:15.600
<v Speaker 1>have one hundred gigs of data. I need two hundred

279
00:13:15.639 --> 00:13:17.000
<v Speaker 1>gigs of disk space.

280
00:13:17.039 --> 00:13:19.840
<v Speaker 2>Right, a two hundred percent overhead. That gets incredibly expensive

281
00:13:19.840 --> 00:13:24.320
<v Speaker 2>when you're buying enterprise flash drives. Erasure coding changes the algorithm.

282
00:13:24.360 --> 00:13:26.960
<v Speaker 2>It stripes the data across the hosts with parity

283
00:13:26.600 --> 00:13:29.159
<v Speaker 1>bits. Like old-school hardware RAID five?

284
00:13:29.120 --> 00:13:32.159
<v Speaker 2>Very similar logic, but distributed across the network instead of

285
00:13:32.159 --> 00:13:35.000
<v Speaker 2>a backplane. With RAID five erasure coding, you need

286
00:13:35.039 --> 00:13:38.000
<v Speaker 2>a minimum of four hosts. It uses a three plus

287
00:13:38.000 --> 00:13:41.159
<v Speaker 2>one calculation, so your overhead drops from two X down

288
00:13:41.200 --> 00:13:43.399
<v Speaker 2>to about one point three three X. That's significant.

289
00:13:43.519 --> 00:13:45.480
<v Speaker 2>You get the exact same level of protection, but you

290
00:13:45.559 --> 00:13:49.120
<v Speaker 2>save a massive amount of raw storage capacity.

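NOTE
Editor's sketch, not spoken in the episode. The capacity arithmetic above worked through in code, assuming the layouts as the speakers describe them: RAID one as one data copy plus one mirror, RAID five as three data parts plus one parity part. raw_capacity_needed is a hypothetical helper name.
```python
from fractions import Fraction
def raw_capacity_needed(usable_gb: int, data_parts: int, redundancy_parts: int) -> float:
    """Raw GB consumed to store usable_gb with the given data-to-redundancy layout."""
    multiplier = Fraction(data_parts + redundancy_parts, data_parts)
    return float(usable_gb * multiplier)
raid1_raw = raw_capacity_needed(100, 1, 1)  # mirroring: 2x, so 200.0 GB raw
raid5_raw = raw_capacity_needed(100, 3, 1)  # 3+1 erasure coding: 4/3x, roughly 133.3 GB raw
saved_gb = raid1_raw - raid5_raw            # raw capacity saved per 100 GB of data
```
The same function covers other layouts by changing the two part counts, which is handy when comparing policies during capacity planning.
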
291
00:13:49.200 --> 00:13:51.279
<v Speaker 1>That is a huge cost difference at scale. But it's

292
00:13:51.279 --> 00:13:53.639
<v Speaker 1>only available on all-flash configurations, right?

293
00:13:53.679 --> 00:13:58.120
<v Speaker 2>Correct. The parity calculation requires significant CPU and random I/O performance.

294
00:13:58.720 --> 00:14:02.519
<v Speaker 2>Spinning disks simply cannot keep up with the

295
00:14:02.639 --> 00:14:07.200
<v Speaker 2>read-modify-write penalty of erasure coding without totally tanking your VM performance.

296
00:14:07.279 --> 00:14:10.639
<v Speaker 1>Got it. Now, one of the most fascinating topologies in

297
00:14:10.679 --> 00:14:14.279
<v Speaker 1>the guide is the stretched cluster. This is the scenario

298
00:14:14.279 --> 00:14:16.879
<v Speaker 1>where you have two data centers, Site A and Site B,

299
00:14:17.240 --> 00:14:20.080
<v Speaker 1>and you want them to essentially act as one giant cluster.

300
00:14:20.679 --> 00:14:22.960
<v Speaker 1>But I've always wondered about the split brain problem here.

301
00:14:23.279 --> 00:14:25.960
<v Speaker 1>If the fiber line between the buildings gets cut, how

302
00:14:25.960 --> 00:14:27.840
<v Speaker 1>do you stop them from fighting over who is the

303
00:14:27.879 --> 00:14:28.600
<v Speaker 1>active site.

304
00:14:28.639 --> 00:14:31.679
<v Speaker 2>That's the classic Two Generals' Problem, and vSAN solves this

305
00:14:31.759 --> 00:14:34.279
<v Speaker 2>with a witness host. You place this witness in a

306
00:14:34.279 --> 00:14:36.840
<v Speaker 2>third location, a totally different fault domain from Site A

307
00:14:36.919 --> 00:14:41.759
<v Speaker 2>and Site B. Okay. And it holds the metadata components of the vSAN

308
00:14:41.399 --> 00:14:43.159
<v Speaker 1>objects. So it acts as the referee?

309
00:14:43.399 --> 00:14:47.360
<v Speaker 2>Exactly. Imagine Site A and Site B lose connection. Site

310
00:14:47.360 --> 00:14:49.480
<v Speaker 2>A looks at the witness and says, hey, can you

311
00:14:49.480 --> 00:14:52.600
<v Speaker 2>see me? The witness says yes. Site A then knows

312
00:14:52.639 --> 00:14:55.240
<v Speaker 2>it has a quorum. It has two out of three votes,

313
00:14:55.360 --> 00:14:58.600
<v Speaker 2>so it stays online. And Site B? Site B can't

314
00:14:58.639 --> 00:15:01.679
<v Speaker 2>see the witness or Site A, so it mathematically knows

315
00:15:01.679 --> 00:15:04.080
<v Speaker 2>it has lost the vote. It immediately shuts down its

316
00:15:04.159 --> 00:15:05.840
<v Speaker 2>VMs to prevent any data corruption.

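NOTE
Editor's sketch, not spoken in the episode. A simplified model of the two-out-of-three vote described above, assuming one vote each for Site A, Site B, and the witness. Real vSAN tracks votes per object component, so treat this only as an illustration of the majority rule; both function names are hypothetical.
```python
def has_quorum(votes_visible: int, total_votes: int = 3) -> bool:
    """True when a partition holds a strict majority of all votes."""
    return 2 * votes_visible > total_votes
def site_stays_online(sees_peer_site: bool, sees_witness: bool) -> bool:
    # ASSUMPTION: one vote each for Site A, Site B, and the witness.
    votes_visible = 1 + int(sees_peer_site) + int(sees_witness)
    return has_quorum(votes_visible)
# Inter-site link cut: Site A still reaches the witness, Site B reaches nobody.
site_a_online = site_stays_online(sees_peer_site=False, sees_witness=True)
site_b_online = site_stays_online(sees_peer_site=False, sees_witness=False)
```
Because a strict majority is required, two isolated partitions can never both pass the check, which is what rules out split brain in this model.
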
317
00:15:06.000 --> 00:15:08.720
<v Speaker 1>And since the witness only holds metadata, it can run

318
00:15:08.720 --> 00:15:10.360
<v Speaker 1>on a pretty small connection, right?

319
00:15:10.200 --> 00:15:12.320
<v Speaker 2>Yeah, it's tiny. You can run it inside a small cloud

320
00:15:12.360 --> 00:15:15.200
<v Speaker 2>instance or just an old server in a remote office.

321
00:15:15.399 --> 00:15:18.759
<v Speaker 2>It's not storing the actual VMDK data, just the state

322
00:15:18.799 --> 00:15:19.440
<v Speaker 2>of the data.

323
00:15:19.559 --> 00:15:22.840
<v Speaker 1>This all leads to the overarching philosophy of vSphere seven,

324
00:15:22.919 --> 00:15:27.080
<v Speaker 1>which is SPBM, Storage Policy Based Management. It feels like

325
00:15:27.120 --> 00:15:30.360
<v Speaker 1>we are finally moving away from the old gold, silver,

326
00:15:30.480 --> 00:15:32.360
<v Speaker 1>bronze LUN mindset.

327
00:15:32.600 --> 00:15:36.279
<v Speaker 2>That is absolutely the goal. In the past, the infrastructure

328
00:15:36.320 --> 00:15:38.600
<v Speaker 2>dictated the policy. You'd say, I have a fast LUN,

329
00:15:38.639 --> 00:15:42.000
<v Speaker 2>so put the database there. With SPBM, the application dictates

330
00:15:42.000 --> 00:15:42.679
<v Speaker 2>the infrastructure.

331
00:15:42.799 --> 00:15:44.000
<v Speaker 1>How does that look in practice?

332
00:15:44.120 --> 00:15:47.679
<v Speaker 2>You create a policy in vCenter, say, mission critical: encryption

333
00:15:47.879 --> 00:15:51.120
<v Speaker 2>enabled, RAID 1 protection. You just assign that policy

334
00:15:51.159 --> 00:15:51.679
<v Speaker 2>directly to the VM.

335
00:15:51.639 --> 00:15:54.480
<v Speaker 1>And vCenter acts as the broker.

336
00:15:54.600 --> 00:15:56.759
<v Speaker 2>Exactly. vCenter looks at your vSAN datastore or your

337
00:15:56.799 --> 00:16:02.639
<v Speaker 2>vVols and checks: can I satisfy this requirement? If yes,

338
00:16:03.279 --> 00:16:06.639
<v Speaker 2>it places the VM. If, say, six months later, a

339
00:16:06.720 --> 00:16:10.120
<v Speaker 2>drive fails and the VM is no longer protected,

340
00:16:10.240 --> 00:16:12.279
<v Speaker 2>vCenter flags it as noncompliant.

341
00:16:12.399 --> 00:16:15.000
<v Speaker 1>That's a big shift. It changes the admin's job from

342
00:16:15.000 --> 00:16:17.879
<v Speaker 1>provisioning storage to monitoring compliance.

343
00:16:18.080 --> 00:16:19.399
<v Speaker 2>It's all about desired state.

344
00:16:19.519 --> 00:16:21.000
<v Speaker 1>Before we wrap up, we have to touch on the

345
00:16:21.039 --> 00:16:24.279
<v Speaker 1>integration of modern apps. The guide mentions First Class Disks

346
00:16:24.279 --> 00:16:27.440
<v Speaker 1>and Kubernetes support. I think a lot of infrastructure admins

347
00:16:27.480 --> 00:16:30.120
<v Speaker 1>hear Kubernetes and just tune out, thinking it's purely a

348
00:16:30.159 --> 00:16:33.759
<v Speaker 1>developer problem. But vSphere seven makes it an infrastructure problem.

349
00:16:33.799 --> 00:16:36.440
<v Speaker 2>It really does. The concept of the First Class Disk,

350
00:16:36.519 --> 00:16:39.279
<v Speaker 2>or FCD, is crucial here. In the past, a virtual

351
00:16:39.320 --> 00:16:41.840
<v Speaker 2>disk was always a child of a virtual machine. If

352
00:16:41.879 --> 00:16:43.840
<v Speaker 2>you deleted the VM, the disk died with it.

353
00:16:43.639 --> 00:16:45.960
<v Speaker 1>Which is fine for a traditional server, but

354
00:16:46.159 --> 00:16:48.200
<v Speaker 1>really bad for a container.

355
00:16:48.279 --> 00:16:50.639
<v Speaker 2>Exactly. Containers are ephemeral. They spin up and die in seconds,

356
00:16:51.120 --> 00:16:53.720
<v Speaker 2>but the data they generate, like a database file, needs

357
00:16:53.759 --> 00:16:57.440
<v Speaker 2>to persist. A First Class Disk is a managed storage

358
00:16:57.480 --> 00:16:59.960
<v Speaker 2>object that exists completely independently of any VM.

359
00:16:59.879 --> 00:17:01.519
<v Speaker 1>So it just floats out there.

360
00:17:01.679 --> 00:17:05.079
<v Speaker 2>Yes, and Kubernetes can request storage. vSphere creates an

361
00:17:05.119 --> 00:17:07.720
<v Speaker 2>FCD, and that disk can be attached to and detached from

362
00:17:07.759 --> 00:17:10.720
<v Speaker 2>different container worker nodes as needed. It allows you to

363
00:17:10.799 --> 00:17:13.839
<v Speaker 2>run stateful apps on the exact same vSphere platform you

364
00:17:13.960 --> 00:17:15.799
<v Speaker 2>use for your traditional Windows servers.

365
00:17:16.119 --> 00:17:19.240
<v Speaker 1>So bringing it all together, vSphere seven isn't just a facelift.

366
00:17:19.359 --> 00:17:22.799
<v Speaker 1>It is a fundamental re architecture. We've killed the external

367
00:17:22.839 --> 00:17:26.279
<v Speaker 1>PSC and simplified the management plane. We've tightened the screws

368
00:17:26.279 --> 00:17:30.319
<v Speaker 1>on hardware reliability with VMFS-L boot partitions and those strict

369
00:17:30.400 --> 00:17:34.039
<v Speaker 1>DNS requirements. And we've moved storage from a static hardware

370
00:17:34.119 --> 00:17:39.000
<v Speaker 1>mapping to a dynamic, policy-driven engine with vSAN and SPBM.

371
00:17:39.359 --> 00:17:40.079
<v Speaker 2>It's really

372
00:17:39.839 --> 00:17:43.680
<v Speaker 2>about vSphere becoming the universal platform. Whether it's a legacy

373
00:17:43.680 --> 00:17:46.400
<v Speaker 2>SQL Server or a cloud native microservice, the goal is

374
00:17:46.400 --> 00:17:49.759
<v Speaker 2>to manage them with the exact same policies, the same availability,

375
00:17:50.000 --> 00:17:51.039
<v Speaker 2>and the same security.

376
00:17:51.160 --> 00:17:52.960
<v Speaker 1>I want to leave you with one final thought from

377
00:17:53.000 --> 00:17:56.799
<v Speaker 1>the guide: something called Lifecycle Manager. We didn't have

378
00:17:56.839 --> 00:17:59.119
<v Speaker 1>time to deep dive into it today, but it applies

379
00:17:59.200 --> 00:18:03.200
<v Speaker 1>that same desired state logic to the hardware firmware itself.

380
00:18:03.599 --> 00:18:06.720
<v Speaker 1>It can actually push BIOS and HBA driver updates to

381
00:18:06.720 --> 00:18:09.240
<v Speaker 1>the physical servers to match a policy you set.

382
00:18:09.400 --> 00:18:12.000
<v Speaker 2>It's an absolute game changer. It means you aren't just

383
00:18:12.160 --> 00:18:16.720
<v Speaker 2>patching ESXi anymore. You are patching the actual metal underneath

384
00:18:16.720 --> 00:18:18.519
<v Speaker 2>it, all directly from vCenter.

385
00:18:18.839 --> 00:18:22.039
<v Speaker 1>Right, and if vSphere is managing the physical firmware, the

386
00:18:22.079 --> 00:18:25.839
<v Speaker 1>storage controller, and the application policy, are we looking at

387
00:18:25.880 --> 00:18:28.839
<v Speaker 1>the end of the traditional hardware maintenance window as we

388
00:18:28.880 --> 00:18:30.720
<v Speaker 1>know it? It feels like we are getting closer to

389
00:18:30.759 --> 00:18:32.759
<v Speaker 1>the dream of a truly fluid infrastructure.

390
00:18:32.839 --> 00:18:35.039
<v Speaker 2>We are definitely getting there. The hardware is basically just

391
00:18:35.079 --> 00:18:36.319
<v Speaker 2>becoming code at this point.

392
00:18:36.480 --> 00:18:39.440
<v Speaker 1>Plenty to think about before your next upgrade. Check those

393
00:18:39.440 --> 00:18:42.839
<v Speaker 1>boot partitions, verify your DNS, and maybe start playing with

394
00:18:43.000 --> 00:18:45.400
<v Speaker 1>SPBM in your lab. Thanks for joining us on this

395
00:18:45.480 --> 00:18:46.799
<v Speaker 1>deep dive into vSphere seven.

396
00:18:46.839 --> 00:18:47.680
<v Speaker 2>So glad to be here.
