Labworks 2:5-8 – Get-Me -ConvergedSwitching -For “Hyper-V” | Now-Please

Hello Labworks fans, detractors and partisans alike, hope you had a nice Easter / Resurrection / Agnostic Spring Celebration weekend.

Last time on Labworks 2:1-4, we looked at some of the awesome teaming options Microsoft gave us with Server 2012 via its multiplexor driver. We also made the required configuration adjustments on our switch for jumbo frames & VLAN trunking, then we built ourselves some port channel interfaces flavored with LACP.

I think the multiplexor driver/protocol is one of the great (unsung?) enhancements of Server 2012/R2 because it’s a sort of pre-virtualization abstraction layer (That is to say, your NICs are abstracted & standardized via this driver before we build our important virtual switches) and because it’s a value & performance multiplier you can use on just about any modern NIC, from the humble RealTek to the Mighty Intel Server 10GbE.

But I’m getting too excited here; let’s get back to the curriculum and get started shall we?

Goals

5.  Understand what Microsoft’s multiplexor driver/LBFO has done to our NICs

6. Build our Virtual Machine Switch for maximum flexibility & performance

7. The vEthernets are Coming

8. Next Steps: Jumbo frames from End-to-end and performance tuning

Schematic:

Lab 2 - Daisetta Labs overview

2:5 Understand what Microsoft’s Multiplexor driver/LBFO has done to our NICs

So as I said above, the best way to think about the multiplexor driver & Microsoft’s Load Balancing/Failover tech is by viewing it as a pre-virtualization abstraction layer for your NICs. Let’s take a look.

Our Network Connections screen doesn’t look much different yet, save for one new decked-out icon labeled “Daisetta-Team:”

daisettateam

Meanwhile, this screen is still showing the four NICs we joined into a team in Labworks 2:3, so what gives?

A click on the properties of any of those NICs (save for the RealTek) reveals what’s happened:

Egads! My Intel NIC has been neutered by LBFO

The LBFO process unbinds many (though not all) settings, configurations, protocols and certain driver elements from your physical NICs, then binds the fabulous Multiplexor driver/protocol to the NIC as you see in the screenshot above.

In the dark days of 2008 R2 & Windows Core, when we had to walk uphill to school both ways in the snow, I had to download and run a command-line tool called nvspbind to get this kind of information.

Fortunately for us in 2012 & R2, we have some simple cmdlets:

daisettateam3
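If you'd rather not squint at my screenshots, the same story is a couple of cmdlets away. A minimal sketch, assuming your team is named "Daisetta-Team" like mine and one of the members shows up as "Ethernet 4":

# Show the team LBFO built and which physical NICs are members
Get-NetLbfoTeam
Get-NetLbfoTeamMember -Team "Daisetta-Team"

# Show what's still bound to a member NIC; most protocols report Enabled = False,
# with little besides the multiplexor protocol left turned on
Get-NetAdapterBinding -Name "Ethernet 4" | Sort-Object Enabled -Descending | Format-Table ComponentID, DisplayName, Enabled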

So notice Microsoft has essentially stripped “Ethernet 4” of all that would have made it special & unique amongst my 4x1GbE NICs; where I might have tagged a VLAN onto that Intel GbE, the multiplexor has stripped that option out. And if I had statically assigned an IP address to this interface, it would be gone too: TCP/IPv4 & v6 are no longer bound to the NIC itself, so it can't hold an IP address of its own.

And the awesome thing is you can do this across NICs, even NICs made by separate vendors. I could, for example, mix the sacred NICs (Intel) with the profane NICs (RealTek)…it don’t matter, all NICs are invited to the LBFO party.

No extra licensing costs here either; if you own a Server 2012 or 2012 R2 license, you get this for free, which is all kinds of kick ass as this bit of tech has allowed me in many situations to delay hardware spend. Why go for 10GbE NICs & Switches when I can combine some old Broadcom NICs, leverage LACP on the switch, and build 6×1 or 8x1GbE Converged LACP teams?

LBFO even adds up all the NICs you’ve given it and teases you with a calculated LinkSpeed figure, which we’re going to hold it to in the next step:

4Gb/s LACP team sounds great, but is it really 4Gb/s?
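The team also shows up as just another NetAdapter, so you don't have to take the GUI's word for that number. Assuming the tNIC inherited the team name:

# The teamed interface advertises the summed LinkSpeed of its members
Get-NetAdapter -Name "Daisetta-Team" | Format-Table Name, InterfaceDescription, LinkSpeed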

2:6 Build our Virtual Machine Switch for maximum flexibility & performance

If we just had the multiplexor protocol & LBFO available to us, it’d be great for physical server performance & durability. But if you’re deploying Hyper-V, you get to have your LBFO cake and eat it too, by putting a virtual switch atop the team.

This is all very easy to do in Hyper-V manager. Simply right click your server, select Virtual Switch Manager, make sure the Multiplexor driver is selected as the NIC, and press OK.

Bob’s your Uncle:

daisettaconverged1

But let’s go a bit deeper and do this via powershell, where we get some extra options & control:

PS C:\users\jeff.DAISETTALABS> New-VMSwitch -NetAdapterInterfaceDescription "Microsoft Network Adapter Multiplexor Driver" -AllowManagementOS 1 -MinimumBandwidthMode Weight -Name "Daisetta-Converged"

Let’s go through each of these:

  • New-VMSwitch : the cmdlet we’re invoking to build the switch. Run get-help new-vmswitch for a rundown of the cmdlet’s structure & options
  • -NetAdapterInterfaceDescription : here we’re telling Windows which NIC to build the VM Switch on top of. Get the precise name from Get-NetAdapter and enclose it in quotes
  • -AllowManagementOS 1 : Recall the diagram above. This boolean switch (1 yes, 0 no) tells Windows to create the VM Switch & plug the Host/Management Operating System into said Switch. You may or may not want this; in the lab I say yes; at work I’ve used no.
  • -MinimumBandwidthMode Weight : We lay out the rules for how the switch will apportion some of the 4Gb/s of bandwidth available to it. By using “Weight,” we’re telling the switch we’ll assign some values later (see the sketch just after this list)
  • -Name : Name your switch
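Since we chose Weight mode, here's the sort of thing you can come back and do once the host vEthernets exist (we build those in 2:7 below). A minimal sketch, not gospel; the weight values are made up purely for illustration:

# Reserve a relative share of the converged switch's bandwidth for host traffic classes
# (weights are relative shares, not percentages; the numbers below are illustrative only)
Set-VMSwitch -Name "Daisetta-Converged" -DefaultFlowMinimumBandwidthWeight 30
Set-VMNetworkAdapter -ManagementOS -Name "LM" -MinimumBandwidthWeight 20
Set-VMNetworkAdapter -ManagementOS -Name "CSV" -MinimumBandwidthWeight 10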

A few seconds later, and congrats Mr. Hyper-V admin, you have built a converged virtual switch!

2:7 The vEthernets are Coming

Now that we’ve built our converged virtual switch, we need to plug some things into it. And that starts on the physical host.

If you’re building a Hyper-V cluster or stand-alone Hyper-V host with VMs on networked storage, you’ll approach vEthernet adpaters differently than if you’re building Hyper-V for VMs on attached/internal storage or on SMB 3.0 share storage. In the former, you’re going to need storage vEthernet adpters; in the latter you won’t need as many vEthernets unless you’re going multi-channel SMB 3.0, which we’ll cover in another labworks session.

I’m going to show you the iSCSI + Failover Clustering model.

In traditional Microsoft Failover Clustering for Virtual Machines, we need a minimum of five discrete networks. Here’s how that shakes out in the Daisetta Lab:

[table]

Network Name, VLAN ID, Purpose, Notes

Management, 1, Host & VM management network, You can separate the two if you like

CSV, 14, Host cluster communication and coordination, Important for clustering Hyper-V hosts

LM, 15, Live Migration network, For when you must send VMs from the broke host to the host with the most

iSCSI 1-3, 11-13, Storage, Somewhat controversial but supported

[/table]

Now you should be connecting the dots: remember, in Labworks 2:1 we built a trunked port-channel on our Cisco 2960S for the sole purpose of these vEthernet adapters & our converged switch.

So, we’re going to attach tagged vethernet adapters to our host via powershell. Pay attention here to the “-managementOS” tag; though our Converged switch is for virtual machines, we’re using it for our physical host as well.

You can script this out of course (and VMM does that for you), but if you just want to copy/paste, do it in this order:

  • Add the vEthernets
add-vmnetworkadapter -managementos -name CSV -switchname Daisetta-converged
add-vmnetworkadapter -managementos -name iSCSI-1 -switchname Daisetta-converged
add-vmnetworkadapter -managementos -name iSCSI-2 -switchname Daisetta-converged
add-vmnetworkadapter -managementos -name iSCSI-3 -switchname Daisetta-converged
add-vmnetworkadapter -managementos -name LM -switchname Daisetta-converged
  • Tag those vEthernets!
Set-VMNetworkAdapterVlan -ManagementOS -Access -VlanId 15 -VMNetworkAdapterName LM
Set-VMNetworkAdapterVlan -ManagementOS -Access -VlanId 14 -VMNetworkAdapterName CSV
Set-VMNetworkAdapterVlan -ManagementOS -Access -VlanId 13 -VMNetworkAdapterName iSCSI-3
Set-VMNetworkAdapterVlan -ManagementOS -Access -VlanId 12 -VMNetworkAdapterName iSCSI-2
Set-VMNetworkAdapterVlan -ManagementOS -Access -VlanId 11 -VMNetworkAdapterName iSCSI-1
  • Now set IPs
New-NetIPAddress -IPAddress 172.16.14.12 -InterfaceAlias "vEthernet (CSV)" -AddressFamily IPv4 -PrefixLength 24
 
New-NetIPAddress -IPAddress 172.16.15.12 -InterfaceAlias "vEthernet (LM)" -AddressFamily IPv4 -PrefixLength 24
New-NetIPAddress -IPAddress 172.16.13.12 -InterfaceAlias "vEthernet (iSCSI-3)" -AddressFamily IPv4 -PrefixLength 24
New-NetIPAddress -IPAddress 172.16.12.12 -InterfaceAlias "vEthernet (iSCSI-2)" -AddressFamily IPv4 -PrefixLength 24
New-NetIPAddress -IPAddress 172.16.11.12 -InterfaceAlias "vEthernet (iSCSI-1)" -AddressFamily IPv4 -PrefixLength 24
 

Notice we didn’t include a Gateway in the New-NetIPAddress cmdlet; that’s because when we built our Virtual Switch with the “-managementOS 1” switch attached, Windows automatically provisioned a vEthernet adapter for us, which either got an IP via DHCP or took an apipa address.

So now we have our vEthernets and their appropriate VLAN tags:

daisettaconverged2
Ignore the DMZ vEthernet for now. Notice Daisetta-Converged, our VM Switch, is seen as a VMNetworkAdapter and is untagged. In my lab, this interface functions as my Host Management interface. In a production scenario, you’ll probably use separate vEthernet adapters for Host Management and not expose the switch itself to the management OS.
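If you want that production pattern instead, the sketch looks something like this; "Mgmt" is a hypothetical adapter name, and VLAN 1 is just my management VLAN from the table above:

# Build the switch without exposing it to the management OS...
New-VMSwitch -NetAdapterInterfaceDescription "Microsoft Network Adapter Multiplexor Driver" -AllowManagementOS 0 -MinimumBandwidthMode Weight -Name "Daisetta-Converged"

# ...then give the host its own tagged vEthernet for management
Add-VMNetworkAdapter -ManagementOS -Name "Mgmt" -SwitchName "Daisetta-Converged"
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "Mgmt" -Access -VlanId 1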


2:8 Next Steps: Jumbo Frames from End-to-End & Performance Tuning

So if you’ve made it this far, congrats. If you do nothing else, you now have a converged Hyper-V virtual switch, tagged vEthernets on your host, and a virtualized infrastructure that’s ready for VMs.

But there’s more you can do; stay tuned for the next labworks post where we’ll get into jumbo frames & performance tuning this baby so she can run with all the bandwidth we’ve given her.

Links/Knowledge/Required Reading Used in this Post:

[table]
Resource, Author, Summary
New-VMSwitch Technet, Microsoft, Always good to have Technet reference
Building a Converged Fabric with Server 2012, Hans “The Hyper-Dutchman” Vredevoort, A 2012 post which helped me when I was struggling through 2008 R2 to 2012 Hyper-V migration

Hyper-V 3.0 Converged Networks with Force 10 and DCB, Dell, Neat Wiki & diagram with iSCSI as separate virtual switch but with DCB

[/table]


Live Migration Performance in Theory & Practice -or- In Which I take on Aidan Finn

Aidan Finn, upstanding Irishman, apparent bear-cub puncher, hobbyist photog, MVP all-star, and one of my favorite Hyper-V bloggers (seriously, he’s good, and along with DidierV & the Hyper-Dutchman has probably saved my vAss more times than I can vCount) appeared on one of my favorite podcasts last week, RunAs Radio, with Canuck Richard Campbell.

Which is all sorts of awesome as these are a few of my favorite things piled on top of each other (Finn on RunAs).

The subject? Hyper-V, scale out file servers (SoFS) in 2012 R2, SMB 3.0 multichannel and Microsoft storage networking, which are just about my favoritest subjects in the whole wide world. I mean what are the odds that one of my favorite Hyper-V bloggers would appear on one of my favorite tech podcasts? Remote. And talk about storage networking tech, Redmond-style, during that podcast?

Where Perfmon is king, you will find Hyper-V bloggers like DidierV, who gets to play with 10G RDMA NICs

All that and an adorable Irish brogue?

This is Instant nerdgasm territory here people; if you’re into these black arts as I am, it’s a must-listen.

Anyway, Finn reminded me of his famous powershell demos in which he demonstrates all the options we Hyper-V admins have at our disposal now when it comes to Live Migrating VMs from host to host.

And believe me, we have so many now it’s almost embarrassing, especially if you cut your teeth on Hyper-V 2.0 in 2008 R2, where successfully Live Migrating VMs off a host (or draining one during production) involved a few right clicks, chicken sacrifice, Earth-Jupiter-Moon alignment, a reliable Geiger counter by your side and a tolerance for Pucker Factor Values greater than 10* **.

Nowadays, we can:

  • Live Migrate VMs between hosts in a cluster (.vhdx parked in a Cluster Shared Volume, VM config, RAM & CPU on a host….block storage, the Coke Classic option)
  • Live Migrate VMs parked on SMB 3.0 shares, just like you NFS jockeys do
  • Shared-nothing Live Migration, either storage + VM, just storage, or just VM!
    • A for instance:  from my Dell Latitude i7 ultrabook with Windows 8.1 and client hyper-v installed (natch), I can storage Live Migrate a .vhdx off my skinny but fast 256GB SSD to a spacious SMB share at work, then drop it back on my laptop at the end of the day, all via Scheduled Task or powershell with no downtime for the VM
    •  With Server 2012/2012 R2 you get all those options + SMB 3.0 multichannel

Not only that, but we have some cool new toys with which to make the cost of Live Migrating a VM to the host with the most a little less painful:

  • Standard TCP/IP : I like this because I’m old school and anything that stresses the network and LACP is fun because it makes the network guy sweat
  • Compression: Borrow spare cycles from the host CPU, compress the VM’s RAM, and Live Migrate your way out of a tight spot
  • SMB via Remote Direct Memory Access : the holy of holies in Live Migration. As Finn points out, this bit of tech can scale beyond the bandwidth capabilities of the PCIe 3 bus. SMB 3.0 + RDMA makes you hate your Northbridge
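If you’d rather not click through Hyper-V Settings to flip between those three, the host-level knob is a one-liner on 2012 R2 (the SMB option only pays off if you actually have the RDMA gear, of course):

# Choose how Live Migrations are performed: TCPIP, Compression, or SMB
Set-VMHost -VirtualMachineMigrationPerformanceOption Compression

# And check what the host is currently set to
Get-VMHost | Format-List VirtualMachineMigrationPerformanceOption, MaximumVirtualMachineMigrations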

Finn*** of course provided some Live Migration start:finish times resulting from the various methods above, which I then, of course, interpreted as Finn daring me personally over the radio to try and beat those times in my humble Daisetta Lab.

Now this is just for fun people; not a Labworks-style list of repeatable results, so let’s not nerd-out on how my testing methodology isn’t sound & I’m a stupidhead, ok?

Anyway, Sysinternals has a nice little tool to redline the RAM in your Windows VM. I don’t know how Finn does it, but I don’t have workloads (yet!) in the Lab that would fill 4GB of RAM with non-random data on a VM, so off to the cmd we go:

You type this (haven’t played with all the switches yet) in this navy blue screen:

ramtest
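For the curious: the Sysinternals tool that fits this description is, I’m pretty sure, Testlimit, and an invocation along these lines (my guess at a reasonable run, not a transcription of the screenshot) will leak-and-touch most of a 4GB guest’s RAM:

# Leak and touch private memory in 256MB chunks, 14 of them (~3.5GB), to dirty the guest's RAM
.\Testlimit64.exe -d 256 -c 14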

And then this happens and the somewhat pink graph goes full pink:

ramthevm

Then we press this button to test Live Migration w/ compression, as the Daisetta Lab doesn’t have fancy RDMA NICs like certain well-connected Irish Hyper-V bloggers:

Wish I had some RDMA NICs :sadface:

Which makes this blue celeste denim Azure colored line get all spikey:

Oh! My NUMA! Second spike somewhat higher than first. Why?

all of which results in wicked-fast Live Migrations & really cool orange-colored charts in my totally non-random, non-scientific, but highly enjoyable laboratory experiment:

ramthevm2

Still, in the end, I like my TCP/IP uncompressed Live Migrations because 1) sackcloth & ashes, and 2) I didn’t go to the trouble of building a multiplexed LACP team (with a virtual switch on top!) just to let the Cat5es in my attic have an easy day at the office:

livemigration1gbe

But at work: yes. I love this compression stuff and echo Finn’s observations on how Hyper-V doesn’t slam your host CPUs beyond what the host & its VM fleet could bear.

Anyway, did I beat Finn’s Live Migration times in this fun little test? Will the Irish MVP have to admit he’s not so esteemed after all and surrender his Hyper_V_MVP_badge.gif to me?

Of course I did and yes he will.

But not really.

[table caption=”Daisetta Lab LM vs Finn’s Powershell LM Scripts – 4GB VM” width=”500″ colwidth=”20|100|50″ colalign=”left|left|left|left|left”]
Who,TCP/IP LM,Compressed LM,RDMA & SMB 3 LM,Notes
Finn,78 seconds, 15 seconds,6.8 seconds, “Mr. I once moved a VM with 56GB of RAM in 35 seconds probably has a few Xeons”
D-Lab,38 seconds,Like 12 or something,Whose ass do I need to kiss to get RDMA/iWARP?, But seriously my VM RAM was probably not random
[/table]

Finn notes in his posts that he’s dedicating an entire 1GbE NIC for his Live Migration demos, whereas I’m embracing the converged switch model and haven’t even played with bandwidth or QoS settings on my Hyper-V switch.

How do my VMware colleagues & friends measure this stuff & think about vMotion performance & reliability? I know NFS can scale & perform, but I’m ignorant of the nuances of v3 vs v4, how it works on the host and the Distributed vSwitch, and your “shared-nothing” storage vMotion. And what’s this I hear that vSphere won’t begin a vMotion without knowing it will complete? How’s that determined?

I mean I could spend an hour or two googling it, or you could, I don’t know, post a comment and save me the time and spread some of your knowledge 😀

I’m jazzed about SMB 3.0, but there are only a handful of storage vendors who have support for the new stack, and among them, as Finn points out, Microsoft is #1 storage vendor for SMB 3 fans, with NetApp probably in 2nd place.


* Just kidding, it wasn’t that bad. Most days. 

** Pucker Factor Value can be measured by querying obscure wmi class win32_pfv

*** Finn is a consultant. So you can hire him. I have no relationship with him other than admiration for his scripting skillz

Labworks 2:1-4 : Converged Hyper-V Switching like a boss

Greetings Labworks fans, today we’re going to learn how to build converged Hyper-V switches, switches so cool they’re nearly identical to the ones available to enterprise users with their fancy System Center licenses.

If you’re coming from a VMware mindset, a Hyper-V converged switch is probably most similar to Distributed vSwitches, though admittedly I’m a total n00b on VMware, so take that statement with a grain of salt. The idea here is to build an advanced switching fabric on your Hyper-V hosts that is fault-tolerant & performance-oriented, and like a Distributed vSwitch, common among your physical hosts and your guests. 

This is one of my favorite topics because I have a serious & problematic love affair with LACP and a Tourette’s-like urge to team things up & jumbo everything, but you don’t need an LACP-capable switch or jumbo frames to enjoy Converged Switching goodness.

Let’s dive in, shall we?

Goals

  1. Prepare the physical switch for Jumbo Frames
  2. Understand LBFO: Microsoft’s Load Balancing/Fail Over teaming technology introduced in Server 2012
  3. Enable LACP on the Switch and on the Server
  4. Build the Switch on the Team & Next Steps

Required Tools ‘n Tech:

  • Server 2012 or 2012 R2…sorry Windows 8.1 Professional/Enterprise fans…LBFO is not available for 8.1. I know, I feel your pain. But the naked Hyper-V 3.0 Hypervisor (Core only) is free, so what are you waiting for?
  • A switch, preferably gigabit. LACP not required but a huge performance multiplier
  • NICs: As in plural. You need at least two. Yes, you can use your Keepin’ it RealTek NICs…Hyper-V doesn’t care that your NICs aren’t server-grade, but I advise against consumer NICs for production!

Schematic

State of the Lab as of today. Ag_node_1 is new, with a core i7 Haswell (Yay!), ag_node_2 is the same, still running CSVs off my ZFS box, and check it out, bottom right: a new host, SMB1:

Lab 2 - Daisetta Labs overview

SMB1 Detail:

 

labworks 2

2:1 Prepare the Physical Switch for Jumbo Frames

You can skip this section if all you have at your disposal is a dumb switch.

Commands below are from a Cisco 2960S. Commands are similar on the new SG300 & 500 series Cisco switches. PowerConnect 5548 switches from Dell aren’t terribly different either, though I seem to recall you have to enable jumbo MTU on each port as well as on the switch.

First we’re going to want to turn on Jumbo Frames, system-wide, which usually requires a reload of your switch, so schedule for a maintenance window!

daisettalabs.net(config)#system mtu jumbo 9198

You can run a show system mtu after the reload to be sure the switch is ready for the corpulent frames you will soon send its way:

daisettalabs.net#show system mtu

System MTU size is 1514 bytes
System Jumbo MTU size is 9198 bytes
System Alternate MTU size is 1514 bytes
Routing MTU size is 1514 bytes

2:2 Load Balancing & Failover

Load Balancing & Failover, or LBFO as it’s known, was the #1 feature I was looking forward to in Server 2012.

And boy did Microsoft deliver.

LBFO is a driver/framework that takes whatever NICs you have, “teams” them, applies a mature & resilient multiplexor driver to them, and gives you redundancy & performance in just a few clicks or powershell cmdlets. Let’s do GUI for the team, and later on, we’ll use Powershell to build a switch on that team.

Sidenote: Don’t bother applying IP addresses, VLANs to your LBFO-destined physical NICs at this point. Do bother installing your manufacturer’s latest driver, or hacking one on as I’ve had to do with my new ag_node_1 Intel NIC. (SideSideNote: as this blogger states, Intel can eat a bag of d**** for dropping so many NICs from Server 2012 support. Broadcom, for all the hassles I’ve had with them, still updates drivers on four year old cards!)

On SMB1 from the above schematic, I’ve got five gigabit NICs. One is a RealTek on the motherboard, and the other four are Intel, ports 1-4 on a PCIe quad-gigabit network card (an i350 x4, I believe).

nics1

The RealTek NIC has a static IP and is my management interface for the purposes of this labworks. We’ll only be teaming the four Intel NICs here. Be sure to leave at least one of your NICs out of the LBFO team unless you are sitting in front of your server console; you can always add it in later.

Launch Server Manager in the GUI and click on “All Servers,” then right click on SMB1 and select Configure NIC Teaming:

nics2

A new window will emerge, titled NIC Teaming.

In the NIC Teaming window, notice on the right the five GbE adapters you have and their status (Green Arrow). Click on “Tasks” and select “New Team” (Red Arrow):

nics3

The New Team window is where all the magic happens. Let’s pause for a moment and go to our switch.

On my old 2960S, we’re building LACP-flavored port channels by using the “channel-group _ mode active” command, which tells the switch to use the genuine-article LACP/802.3ad protocol rather than the older Cisco-proprietary Port Aggregation Protocol (PAgP) system, which is activated by running “channel-group _ mode auto.”

However, if you have a newer switch, perhaps a nice little SG300 or something similar, PAgP is dead and not available to you, but the process for LACP looks like the old PAgP command: “channel-group _ mode auto” will turn on LACP.

Here’s the 2960s process. Note that my Intel NICs are plugged into Gig 1/0/20-23, with spanning-tree portfast enabled (which we’ll change once our Converged virtual switch is built):

daisettalabs.net#show run int gig 1/0/20
Building configuration...

Current configuration : 63 bytes
!
interface GigabitEthernet1/0/20
spanning-tree portfast

daisettalabs.net#conf t
Enter configuration commands, one per line. End with CNTL/Z.
daisettalabs.net(config)#int range gig 1/0/20-23
daisettalabs.net(config-if-range)#description SMB1 TEAM
daisettalabs.net(config-if-range)#speed 1000
daisettalabs.net(config-if-range)#duplex full
daisettalabs.net(config-if-range)#channel-group 3 mode active
daisettalabs.net(config-if-range)#switchport mode trunk
daisettalabs.net(config-if-range)#
daisettalabs.net(config-if-range)#do wr
Building configuration...
[OK]

Presto! That wasn’t so hard was it?

Note that I’ve trunked all four interfaces; that’s important in Hyper-V Converged switching. We’ll need to trunk po3 as well. 

Let’s take a look at our new port channel:

daisettalabs.net(config-if-range)#do show run int po3
Building configuration…

Current configuration : 54 bytes
!
interface Port-channel3
switchport mode trunk
end

daisettalabs.net(config-if-range)#

Now let’s check the state of the port channel:

daisettalabs.net#show etherchannel summary
Flags: D - down P - bundled in port-channel
I - stand-alone s - suspended
H - Hot-standby (LACP only)
R - Layer3 S - Layer2
U - in use f - failed to allocate aggregator

M - not in use, minimum links not met
u - unsuitable for bundling
w - waiting to be aggregated
d - default port

Number of channel-groups in use: 3
Number of aggregators: 3

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1 Po1(SU) LACP Gi1/0/1(P) Gi1/0/2(P) Gi1/0/3(P)
2 Po2(SU) LACP Gi1/0/11(D) Gi1/0/13(P) Gi1/0/14(P) Gi1/0/15(P) Gi1/0/16(P) 
3 Po3(SD) LACP Gi1/0/19(s) Gi1/0/20(D) Gi1/0/21(s) Gi1/0/22(s) Gi1/0/23(D)

po3 is in total disarray, but not for long. Back on SMB1, it’s time to team those NICs:

nic5

I’m a fan of naming-conventions even if this screenshot doesn’t show it; All teams on all hosts have the same “Daisetta-Team” name, and I usually rename NICs as well, but honestly, you could go mad trying to understand why Windows names NICs the way it does (Seriously. It’s a Thing). There’s no /dev/eth0 for us in MIcroosft-land, it’s always something obscure and strange and out-of-sequence, which is part of the reason why Converged Switching & LBFO kick ass; who cares what your interfaces are named so long as they are identically configured?

If you don’t have an LACP-capable switch, you’ll select “Switch Independent” here.

As for Load Balancing modes: in Server 2012, you get Address Hash (source/dest MAC, IP or TCP ports), or Hyper-V Port, which is sort of a round-robin approach (VM1 goes to one port in the team, VM2 to the other).

I prefer the new (with 2012 R2) Dynamic mode, which combines address hashing for outbound traffic with Hyper-V Port-style distribution for inbound. More color on those choices & what they mean for you in the References section at the bottom.

Press ok, sit back, and watch my gifcam shot:

Mmmm, taste the convergence.
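For the record, the whole GUI dance above collapses into a single cmdlet if you’d rather script it. A sketch using my team name, LACP, and the 2012 R2 Dynamic algorithm; swap the member names for whatever Get-NetAdapter calls your four NICs:

# Build an LACP team with Dynamic load balancing out of the four Intel NICs
New-NetLbfoTeam -Name "Daisetta-Team" -TeamMembers "Ethernet 2","Ethernet 3","Ethernet 4","Ethernet 5" -TeamingMode Lacp -LoadBalancingAlgorithm Dynamic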

2:4 Build a Switch on top of that team & Next Steps

If you’ve ever built a switch for Hyper-V, you’ll find building the converged switch immediately familiar, save for one technicality: you’re going to build a switch on top of that multiplexor driver you just created!

Sounds scary? Perhaps. I’ll go into some of the intricacies and gotchas and show some cool powershell bits ‘n bobs on the next episode of Labworks.

Eventually we’re going to dangle all sorts of things off this virtual switch-atop-a-multiplexor-driver!

nic6

 

Links/Knowledge/Required Reading Used in this Post:

[table]
Resource, Author, Summary
Windows Server 2012 LBFO Whitepaper, Microsoft, Must-have though a bit dated at this point
Etherchannel Considerations, Jeremy Stretch at Packetlife.net, Great overview on Cisco aggregation tech including LACP and PAgp

VLAN Tricks with NICS – Teaming & Hyper-V, Keith Mayer, LBFO + VLANs – Hyper-V = still a win

[/table]

Labworks #1: Building a durable, performance-oriented ZFS box for Hyper-V, VMware

Welcome to my first Labworks post in which I test, build & validate a ZFS storage solution for my home Hyper-V & VMware lab.

Be sure to check out the followup lab posts on this same topic in the table below!

[table]

Labworks Chapter, Section, Subject, Title & URL

Labworks 1:, 1, Storage, Building a Durable and Performance-Oriented ZFS Box for Hyper-V & VMware

,2-3, Storage, I Heart the ARC & Let’s Pull Some Drives!

[/table]

Labworks #1: Building a durable, performance-oriented ZFS box for Hyper-V, VMware

Primary Goal: To build a durable and performance-oriented storage array using Sun’s fantastic, 128-bit, high-integrity Zettabyte File System for use with lab Hyper-V CSVs & Windows clusters, VMware ESXi 5.5, and other hypervisors.

 

The ARC: My RAM makes your SSD look like a couple of old, wheezing 15k drives

Secondary Goal: Leverage consumer-grade SSDs to increase/multiply performance by using them as ZFS Intent Log (ZIL) write-cache and L2ARC read cache

Bonus: The Windows 7 PC in the living room that’s running Windows Media Center with CableCARD & HDHomeRun was running out of DVR disk space; it can’t record to SMB shares, but it can record to iSCSI LUNs.

Technologies used: iSCSI, MPIO, LACP, Jumbo Frames, IOMETER, SQLIO, ATTO, Robocopy, CrystalDiskMark, FreeBSD, NAS4Free, Windows Server 2012 R2, Hyper-V 3.0, Converged switch, VMware, standard switch, Cisco SG300

Schematic: 

Click for larger.

Hardware Notes:
[table]
System, Motherboard, Class, CPU, RAM, NIC, Hypervisor
Node-1, Asus Z87-K, Consumer, Haswell i-5, 24GB, 2x1GbE Intel I305, Hyper-V
Node-2, Biostar HZZMU3, Consumer, Ivy Bridge i-7, 24GB, 2x1GbE Broadcom BC5709C, Hyper-V
Node-3, MSI 760GM-P23, Consumer, AMD FX-6300, 16GB, 2x1GbE Intel i305, ESXi 5.5
san2, Gigabyte GA-F2A88XM-D3H, Consumer, AMD A8-5500, 24GB, 4x1GbE Broadcom BC5709C, NAS4Free
sw01, Cisco SG300-10 Port, Small Business, n/a, n/a, 10x1GbE, n/a
[/table]

Array Setup:

I picked the Gigabyte board above because it’s got an outstanding eight SATA 6Gbit ports, all running on the native AMD A88x Bolton-D4 chipset, which, it turns out, isn’t supported well in Illumos (see Lab Notes below).

I added to that a cheap $20 Marvell 9128-based two-port SATA 6Gbit PCIe card, which hosts the boot volume & the SanDisk SSD.

[table]

Disk Type, Quantity, Size, Format, Speed, Function

WD Red 2.5″ with NASWARE, 6, 1TB, 4KB AF, SATA 3 5400RPM, Zpool Members

Samsung 840 EVO SSD, 1, 128GB, 512byte, ~250MB/s read, L2ARC Read Cache

SanDisk Ultra Plus II SSD, 1, 128GB, 512byte, ~250MB/s read & write?, ZIL

Seagate 2.5″ Momentus, 1, 500GB, 512byte, ~80MB/s read/write, Boot/swap/system

[/table]

Performance Tests:

I’m not finished with all the benchmarking, which is notoriously difficult to get right, but here’s a taste. Expect a followup soon.

All shots below involved lzp2 compression on SAN2

SQLIO Short Test: 

sqlio lab 1 short test
Obviously seeing the benefit of ZFS compression & ARC at the front end. IOPS become more realistic toward the middle and right as read cache is exhausted. Consistently around 150-240MB/s though, the limit of two 1GbE cables.

 

ATTO standard run:

atto
I’ve got a big write problem somewhere. Is it the ZIL, which don’t seem to be performing under BSD as they did under Nexenta? Something else? Could also be related to the Test Volume being formatted NTFS 64kb. Still trying to figure it out

 

NFS Tests:

None so far. From a VMware perspective, I want to rebuild the Standard switch as a distributed switch now that I’ve got a VCenter appliance running. But that’s not my priority at the moment.

Durability Tests:

Pulled two drives (the limit on RAIDZ2) under normal conditions. Put them back in, saw some alerts about the “administrator pulling drives” and the zpool being in a degraded state. My CSVs remained online, however. Following a short zpool online command, both drives rejoined the pool and the degraded error went away.

Fun shots:

Because it’s not all about repeatable lab experiments. Here’s a Gifcam shot from Node-1 as it completely saturates both 2x1GbE Intel NICs:

test

and some pretty blinking lights from the six 2.5″ drives:

0303141929-MOTION

Lab notes & Lessons Learned:

First off, I’d like to buy a beer for the unknown technology enthusiast/lab guy who uttered these sage words of wisdom, which I failed to heed:

You buy cheap, you buy twice

Listen to that man, would you? Because going consumer, while tempting, is not smart. Learn from my mistakes: if you have to buy, buy server boards.

Secondly, I prefer NexentaStor to NAS4Free with ZFS, but like others, I worry about and have been stung by Open Solaris/Illumos hardware support. Most of that is my own fault, cf. the note above, but still: does Illumos have a future? I’m hopeful; NexentaStor is going to appear at next month’s Storage Field Day 5, which is a good sign, and version 4.0 is due out anytime.

The Illumos/Nexenta command structure is much more intuitive to me than FreeBSD. In place of your favorite *nix commands, Nexenta employs some great verb-noun show commands, and dtrace, the excellent diagnostic/performance tool included in Solaris, is baked right into Nexenta. In NAS4Free/FreeBSD 9.1, you’ve got to add a few packages to get the equivalent stats for the ARC, L2ARC and ZFS, and adding dtrace involves a make & kernel modification, something I haven’t been brave enough to try yet.

Next: Jumbo Frames for the win. From Node-1, the desktop in my office, my Core i5-4670K CPU would regularly hit 35-50% utilization during my standard SQLIO benchmark before I configured jumbo frames from end to end. Now, after enabling jumbo frames on the Intel NICs, the Hyper-V converged switch, the SG300 and the ZFS box, utilization peaks at 15-20% during the same SQLIO test, and the benchmarks have shown an increase as well. Unfortunately in FreeBSD world, adding jumbo frames is something you have to do on the interface & routing table, and it doesn’t persist across reboots for me, though that may be due to a driver issue on the Broadcom card.
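For what it’s worth, the Windows side of that jumbo frame work is mostly a couple of lines of PowerShell. “*JumboPacket” is the standardized registry keyword, but drivers vary on the accepted values, and the adapter names below are placeholders for whatever yours are called; check Get-NetAdapterAdvancedProperty first:

# Bump a physical NIC to jumbo frames (value/keyword support varies by driver)
Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -RegistryKeyword "*JumboPacket" -RegistryValue 9014

# The host-side vEthernets riding the converged switch need the same treatment
Set-NetAdapterAdvancedProperty -Name "vEthernet (iSCSI-1)" -RegistryKeyword "*JumboPacket" -RegistryValue 9014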

The Western Digital 2.5″ drives aren’t stellar performers and they aren’t cheap, but boy are they quiet, well-built, and run cool, asking politely for only 1 watt under load. I’ve returned the hot, loud & failure prone HGST 3.5″ 2 TB drives I borrowed from work; it’s too hard to put them in a chassis that’s short-depth.

Lastly, ZFS’ adaptive replacement cache, which I’ve enthused over a lot in recent weeks, is quite the value & performance-multiplier. I’ve tested Windows Server 2012 R2 Storage Appliance’s tiered storage model, and while I was impressed with it’s responsiveness, ReFS, and ability to pool storage in interesting ways, nothing can compete with ZFS’ ARC model. It’s simply awesome; deceptively-simple, but awesome.

Lesson is that if you’re going to lose an entire box to storage in your lab, your chosen storage system better use every last ounce of that box, including its RAM, to serve storage up to you. 2012 R2 doesn’t, but I’m hopeful soon that it may (Update 1 perhaps?)

Here’s a cool screenshot from Nexenta, my last build before I re-did everything, showing ARC-hits following a cold boot of the array (top), and a few days later, when things are really cooking for my Hyper-V VMs stored, which are getting tagged with ZFS’ “Most Frequently Used” category and thus getting the benefit of fast RAM & L2ARC:

cache

Next Steps:

  • Find out why my writes suck so bad.
  • Test Nas4Free’s NFS performance
  • Test SMB 3.0 from a virtual machine inside the ZFS box
  • Sell some stuff so I can buy a proper SLC SSD drive for the ZIL
  • Re-build the rookie Standard Switch into a true Distributed Switch in ESXi

Links/Knowledge/Required Reading Used in this Post:

[table]
Resource, Author, Summary
Three Example Home Lab Storage Designs using SSDs and Spinning Disk, Chris Wahl, Good piece on different lab storage models
ZFS, Wikipedia, Great overview of ZFS history and features
Activity of the ZFS Arc, Brendan Gregg, Excellent overview of ZFS’ RAM-as-cache
Hybrid Storage Pool Performance, Brendan Gregg, Details ZFS performance
FreeBSD Jumbo Frames, NixCraft, Applying MTU correctly
Hyper-V vEthernet Jumbo Frames, Darryl Van der Peijl, Great little powershell script to keep you out of regedit
Nexenta Community Edition 3.1.5, NexentaStor, My personal preference for a Solaris-derived ZFS box
Nas4Free, Nas4Free.org, FreeBSD-based ZFS; works with more hardware
[/table]