Just how hard is it in 2015 to order & deploy a cheap commodity internet circuit to connect a remote office/branch office (ROBO) to the rest of your corporate WAN via the internet? ((Commodity = business class internet, something less reliable but orders of magnitude less expensive than a traditional private line, T1, or managed MPLS circuit. Commodity also means fat, dumb internet pipe, a product that cable internet companies consider an existential threat))
Pretty damned hard.
Why so difficult, Jeff?!? you’re thinking. I stand up tunnels and tear them down all day long, I route/switch in my sleep, and verily I say unto you that my packets always find their way home, tags intact, whether on the WAN, between switch closets in the campus, or between nodes in the datacenter!
Verily they do indeed, and I salute you, you herder of stray packets!
It’s not that the technology connecting core to branch is hard or difficult, no, what I’m bitching about today is connecting the branch site to the internet in the first place.
It’s layer 1, stupid.
Truly, ordering internet service for a small or even medium-sized branch office is one of the most painful exercises in modern IT.
Here, let me show you:
You Bing/Google various iterations of “Lake Winnepesaukah ISPs,” “Punxsutawney Packet Delivery,” “Broadband Service in Topeka,” “Ethernet over Copper + Albuquerque,” “Business Cable Internet – Pompano Beach, FL” and such. Dismissing the spam URL results on pages 1-12, you eventually arrive at Comcast, Time Warner, Charter née Spectrum Business, or whatever little coax fiefdom has carved out a franchise at the edge of your business. You visit their website, click “Business” and fight your way through pop-ups and interstitials to a page that says it can verify service at your branch office’s address.
Right, you think, I’ll just Tab-tab my way through this form, input my branch office address here, punch that green submit button there, and get these nasty Layer 1 bits out of the way. But this isn’t the old days of 2009 when you could order a circuit online or at least verify service…oh no, no sir, this is the future…this is 2015. In 2015, you see, the cable providers demand an audience with you, so that they can add value.
Pay the Last Mile Toll: So you surrender your digits and wait for a phone call. When it rings 36-72 hours later, you’re determined to keep it short. What you want is a simple yes/no on service at your ROBO, or an install date, but what you get is a salesperson who can’t spell TCP/IP and wants to sell you substandard VoIP & TV. “Will you be uploading or downloading with this internet connection?” is just one of the questions you’ll suffer through to mollify the last mile gatekeepers standing between you and #PacketGlory on the WAN.
At long last, install day arrives: You’ve drop-shipped the edge router/overlay device, you’ve coordinated with the L-con, and the CableCo tech is on site at your ROBO to install your circuit. Hallelujah, you think, as you wait for the tunnel to come up. But it never does, because between your awesome zero-touch edge device & your datacenter lies some crazy bespoke 2Wire gateway device that NATs or offers up a free wifi connection to the public on your dime. Another phone call, another fight to get those things turned off.
Nuts to all that, I say.
This is America, Jack, and the great thing about America is choice. Even when you don’t have choice (and you don’t in the case of cable franchises & municipalities), all you may need is line of sight to one of these things:
That’s right. Fixed wireless, baby. I’m hot on fixed wireless in 2015. It’s everything CableCo isn’t. It’s:
Friction free: In place of the coax fiefdoms and gatekeepers, the 1-800 numbers, and the aggressive salespeople, there’s just Joe, a real engineer at a local fixed wireless ISP. Joe’s great because Joe’s local, and Joe takes your order, gives you his mobile, installs the antenna at your branch, and hands you a blue wire with three static IPs.
Super-fast to deploy. You want internet at your ROBO? Well guess what? It’s already there, you just need the equipment to catch it.
More reliable than it used to be: Now of course this all depends on the application you’re trying to deliver to your ROBO, but I’ll say this: Fixed Wireless has improved. You don’t need to fear (as much) a freak snowstorm, a confused flock of Canada Geese, or rain. For a small ROBO, a fixed wireless connection might be enough to serve as the primary WAN link. For larger ROBOs, I think the technology is mature enough to serve as a secondary WAN link, or even your primary Internet circuit. ((Routing business traffic over the expensive wired link and internet over the cheap fixed wireless link is a recipe I’d recommend all day long and twice on Sundays ))
As Secure as Anything Else These Days: How difficult would it be to perform a man in the middle attack via interception of a fixed wireless connection? I’m not sure, to be honest, but if you aren’t encrypting your data before it leaves your datacenter, you have a whole lot more to worry about than a blackhat with a laptop, a stick, and a microwave antenna.
Cost competitive: I’ve deployed a couple of fixed wireless connections and I find the cost to be very competitive with traditional cable company offerings. Typically you’ll pay about $200 for the antenna install, but unlike the fee Comcast would charge you to install their modem, I think this is justified as it involves real labor and a certain amount of risk.
Regional/Hyper-local but still innovative: For whatever reason, fixed wireless ISPs have proven resistant to the same market forces that killed off your local dial-up/DSL ISP. Yet this isn’t a stagnant industry; quite the opposite in fact, with players like Ubiquiti Networks releasing new products.
I’ve been working on the WAN a lot lately and I’ve deployed two fixed wireless circuits at ROBOs. If you’ve got similar ROBO WAN pains, you should have a look at fixed wireless. You might be surprised!
Devoted readers of Agnostic Computing.com, I write today to implore you to set your powershell scripts to Signed, get your Windows Key + R trigger fingers ready, and prep your forests and domains for a functional upgrade because today ladies and gentlemen, today, we get a new Windows.
There’s some excitement in Microsoft Country again.
No one knows what it’ll be called. Windows 9 is the front-runner, but late-breaking rumors say big MS could throw us for a loop too and name it Windows TH (Threshold?!?! the pundits echo) or just plain old Windows.
I say they should name it Windows TNS: Windows The New Shiny. Because among the rumors I’ve enjoyed hearing most is the one that Microsoft may offer a sort of Windows 365 subscription for fanbois like me, a continuously morphing and changing OS, just like my O365 experience has been. New Shiny Windows every month…well maybe I’d tell ConfigMan to delay updates for a week or so, just to shake the bugs loose. But still. A subscription OS would be great.
But that’s a long-shot and probably not a very strong selling point for today’s event, which is, as everyone has noted, focused entirely on enterprise computing.
You see, Microsoft is trying desperately to court Enterprise IT people, to bring us back into the fold, targeting this entire event today at IT people like me who were aghast & horrified two years ago when they first installed Windows 8 in a VM.
“No. No. To get to the start screen, hover your mouse in the lower corner. The lower corner, not the charms bar. There it is. Click that. Ahh shit, you missed it. Try again.” was how the conversation went throughout IT departments in ‘Merica.
As I’ve written before, the experience of Windows 8 & Server 2012 was so shocking and painful, it sent me running and crying into the Mac OS X camp, and then into ChromeBook fantasyland.
But I got over it. I overcame, and I figured out how to move all that nonsense touch stuff away when Windows 8.1/Server 2012 R2 debuted about a year ago.
Apparently other IT pros haven’t, and are still sticking to Windows 7 as if it’s the greatest thing since Active Directory. Thus today’s event.
To them I say: get with the program, or get left behind. Windows 8 did suck, but 8.1 & 2012 R2 were fine recoveries. If you decided to punt on learning about Windows 8.1/2012 R2, you missed a whole bunch of incredible advancements that are only going to improve with Windows TNS. Have fun catching up on this:
Baked in Hyper-V. Free on Windows 8.1 Pro and up. A virtual lab on every desktop.
Tiered Storage Spaces in Windows Server 2012 R2: yet another software abstraction framework, but for your storage! You missed out on this too!
An awesome networking stack, totally rewritten: Native support for teaming, network function virtualization, Layer 3 routing protocols via PowerShell…oh my. I’d hate to be you stuck with a Server 2008 R2 box, running your old tired batch files, your dated vbs scripts and ipconfig. You missed out on some incredible advancements.
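For the folks still typing ipconfig, here is a rough sketch of what the newer networking cmdlets look like. The team name and adapter names ("LabTeam", "Ethernet 1", "Ethernet 2") are placeholders I made up, and the teaming cmdlet assumes a Server 2012 R2 box with an LACP-capable switch on the other end.

```powershell
# Rough PowerShell equivalents of "ipconfig /all" and "route print"
Get-NetIPConfiguration -Detailed
Get-NetRoute -AddressFamily IPv4

# Native NIC teaming, no vendor software required.
# "LabTeam", "Ethernet 1" and "Ethernet 2" are placeholder names.
New-NetLbfoTeam -Name "LabTeam" `
    -TeamMembers "Ethernet 1", "Ethernet 2" `
    -TeamingMode Lacp `
    -LoadBalancingAlgorithm Dynamic
```

Everything comes back as objects rather than text, so it pipes cleanly into Where-Object or Export-Csv, which is most of the appeal over parsing ipconfig output in a batch file.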
And the great thing is that all this is going to get better, I think (hope). True, we won’t be learning about Windows Server today (Aidan Finn reckons that + nextgen System Center will be next month) but there will be lots of detail about our next Enterprise desktop product, by which you can bet people like me will make inferences for the next server product.
Things are looking up in Microsoft Country. We’ve a ten year head start on Trustworthy Computing (ShellShock couldn’t have had better timing for MS), a highly-modular & secure OS, a mature cloud stack, a SaaS offering second-to-none (O365) and now, today, a new Windows OS.
This is a really lame but (IMHO) effective drawing of what I think of as a modern small/medium business enterprise ‘stack’:
As you can see, just about every element of a modern IT shop is portrayed.
Down at the base of the pyramid, you got your storage. IOPS, RAID, rotational & ssd, snapshots, dedupes, inline compression, site to site storage replication, clones and oh me oh my…all the things we really really love are right here. It’s the Luntastic layer and always will be.
Above that, your compute & Memory. The denser the better, 2U Pizza Boxes don’t grow on trees and the business isn’t going to shell out more $$$ if you get it wrong.
Above that, we have what my networking friends would call the “Underlay network.” Right. Some cat 6, twinax, fiber, whatever. This is where we push some packets, whether to our storage from our compute, northbound out to the world, southbound & down the stack, or east/west across it. Leafs, spines, encapsulation, control & data planes, it’s all here.
And going higher -still in Infrastructure Land mind you- we have the virtualization layer. Yeah baby. This is what it’s all about, this is the layer that saved my career in IT and made things interesting again. This layer is designed to abstract all that is beneath it with two goals in mind: cost savings via efficiency gains & ease of provisioning/use.
And boy, has this layer changed the game, hasn’t it?
So if you’re a virtualization engineer like I am, maybe this is all you care about. I wouldn’t blame you. The infrastructure layer is, after all, the best part of the stack, the only part of the stack that can claim to be #Glorious.
But in my career, I always get roped (willingly or not) into the upper layers of the stack. And so that is where I shall take you, if you let me.
Next up, the Platform layer. This is the layer where that special DBA in your life likes to live. He optimizes his query plans atop your Infrastructure layer, and though he is old-school in the ways of storage, he’s learned to trust you and your fancy QoS .vhdxs, or your incredibly awesome DRS fault-tolerant vCPUs.
Or maybe you don’t have a DBA in your Valentine’s card rotation. Maybe this is the layer at which the devs in your life, whether they are running Eclipse or Visual Studio, make your life hell. They’re always asking for more x (x= memory, storage, compute, IP), and though they’re highly-technical folks, their eyes kind of glaze over when you bring up NVGRE or VXLAN or Converged/Distributed Switching or whatever tech you heart at the layer below.
Then again, maybe you work in this layer. Maybe you’re responsible for building & maintaining session virtualization tech like RDS or XenApp, or maybe you maintain file shares, web farms, or something else.
Point is, the people at this layer are platform builders. To borrow from the automotive industry, platform guys build the car that travels on the road infrastructure guys build. It does no good for either of us if the road is bumpy or the car isn’t reliable, does it? The user doesn’t distinguish between ‘road’ and ‘car’, do they? They just blame IT.
Next up: software & service layer. Our users exist here, and so do we. Maybe for you this layer is about supporting & deploying Android & iPhone handsets and thinking about MDM. Or maybe you spend your day supporting old-school fat client applications, or pushing them out.
And finally, we arrive at the top of the pyramid. User-space. The business.
This is where (and the metaphor really fits, doesn’t it?) the rubber meets the road, ladies and gentlemen. It’s where the business user drives the car (platform) on the road (infrastructure). This is where we sink or swim, where wins are tallied and heroes made, or careers are shattered and the cycle of failure>begets>blame>begets>fear>begets failure begins in earnest.
That’s the stack. And if you’re in IT, you’re in some part of that stack, whether you know it or not.
But the stack is changing. I made a silly graphic for that too. Maybe tomorrow.
from a vm here, through an F5 there, out the traffic shaper and then to the next hop
The Great Unknown, the Slash 8
Truly one packet in its time plays many routes
alas, aggregate, balance or seek diverse routes
the packets do not
Into oblivion go the flows
when the WAN LED no longer glows
Let’s take a step together into a place unfamiliar and dark. A place that is, by all rights, strange and bewildering. A little place I like to think of as just one order of magnitude less rational than the Twilight Zone…a place few understand, and even fewer have mastered. A place just beyond my gateway, a place I really don’t care about except when I do, a place I like to call, the Wide Area Network.
That’s right. Let’s talk about the next hop. The land of BGP and OSPF and NAT and VPNs and QoS and CoS and DSCP and the “Goddamn ASA” and static routes and the “Goddamn firewall” all these words, phrases and acronyms you heard once, but dismissed as just so much babble out of the networking guy’s mouth, the one guy on your team who seems to age faster than all the others.
Hell, if it were up to you, Mr. Storage Networking Engineer, you’d do some LACP trunks or hook MPIO up to that WAN and call it a day, amiright? I mean what’s so complicated here? Of course links go down, that’s why teams (and virtual teams-of-teams!) are so cool!
But alas, all the world’s not a storage array, and all links to it are not teamed GigE interfaces with sub-millisecond latency.
And your business WAN, particularly the links to/from remote sites that comprise the RFC-1918d, encapsulated, virtual private wide-area network your typical mid-sized business with a large footprint depends on, fails far too often.
Or at least it has for me when I look back and survey the glories & wreckage of my 15-year IT career.
Verily I say unto you, the WAN is my White Whale, and I am an IT Ahab.
Here are some of the tools & techniques networking firms, engineers, architects and people way smarter than I have come up with to deal with the multiple pains of the WAN, followed by my snarky, yet honest, hurt, yet hopeful, lust-filled yet realistic view of them:
Multiprotocol Label Switching (MPLS): The go-to solution for WAN pain, particularly for businesses that can’t/won’t employ a networking wonk equal to Mr. Ivan Pepelnjak. MPLS is a god-send for some firms, but it’s very costly. To really get value out of an MPLS strategy, you almost have to couple it with a session virtualization or in-datacenter-computing model (XenApp, RDS, VDI etc). Why? While MPLS makes the WAN as reliable and as accessible as your LAN, it doesn’t defeat latency. And latency is a hard thing to explain. Go on. Try it. On your spouse or significant other.
MPLS part two: And just so that I can get it off my chest…when the primary link at a branch site does go down, why do MPLS providers have such a hard time failing over to a secondary? I mean for real guys? Just keep the secondary WAN/VPN link up, or do something fancy with VRRP or VARP or something. Without a failover link, a downed-MPLS is worth less than a regular commodity internet circuit.
MPLS part three: In previous roles, I worried that maintenance of the MPLS became an end unto itself. I can see how this would happen, and I’ve been guilty of it myself; sometimes IT guys think in IP addresses, when they should have an eye to the future and think in FQDN, as the former is and shall forever be not routable, while the latter is the future. Underlining this point is the argument (well-supported in 2014, I think) that MPLS is, at best, a transitional technology. Build your business on it if you have to, but don’t tie anything to it, in other words. Sure it’s cloud-compatible, but so is dial up.
Inline Compression/dedupe: As a storage networking nerd, I Heart me some Riverbed and SilverPeak. But those are tools on the WAN that, in my experience, are just one CapEx ask too much. I’ve never actually used one of them. Love the idea, can never justify the cost. Open source alternatives? There’s really none (Except for this brave guy), speaking, perhaps, to how sophisticated and well-engineered these devices are, which justifies their cost but also makes them unobtainable for SMB shops.
Pertino and the like: I’ve been a fan of Pertino since I first started using this “Cloud VPN” product, which I likened more to a Layer 2 switch in the sky than a traditional VPN service. It’s some great tech; not clear that it can scale to 100s and 100s of users though. But very promising nonetheless, especially for really small but geographically-diverse environments.
Link aggregation + VPN all in one device: If you’re going to go hub & spoke because MPLS costs too much, or you can’t quite do full-cloud yet, this is a promising strategy, and one I’ll soon be testing out. I know I’m not alone in the WAN-is-my-white-whale meme because companies like Peplink, Talari Networks, and even Cisco are still building products that address WAN problems. I have used Peplink before; was impressed, would use again, want one in my home with a second internet line, A+++++. The only thing that scuttled wider adoption in my last role was voice, a particularly difficult problem to sort out when you slap some good ol’ LACP-style magic onto your WAN ills. These devices, ranging from a few hundred bucks to several thousand, are almost too good to be true, as they tell the IT Pro that yes, he can have his cheap but rapidly deployable commodity internet circuits aggregated into one high-speed, fault-tolerant link, and yes, that “unbreakable VPN” (as Peplink dubs it) can connect back to the HQ. Doesn’t defeat latency, true, but it sure makes the ASA look old-hat, doesn’t it?
Cloud: The default winner, of course. But OpEx is hard to quantify. Sure, I guess I could up and move my datacenter assets to a CDN and let the network take care of the rest, or I could stand up a VM in a datacenter close to my users. But replication to on-prem assets/sources can be difficult, and, in some ways, in a really wide WAN, don’t we start worrying about version control, that what the New York branch is looking at is the same as the Seattle branch? Even so, I’m down with it, just need to fully comprehend it first.
Hello Labworks fans, detractors and partisans alike, hope you had a nice Easter / Resurrection / Agnostic Spring Celebration weekend.
Last time on Labworks 2:1-4, we looked at some of the awesome teaming options Microsoft gave us with Server 2012 via its multiplexor driver. We also made the required configuration adjustments on our switch for jumbo frames & VLAN trunking, then we built ourselves some port channel interfaces flavored with LACP.
I think the multiplexor driver/protocol is one of the great (unsung?) enhancements of Server 2012/R2 because it’s a sort of pre-virtualization abstraction layer (That is to say, your NICs are abstracted & standardized via this driver before we build our important virtual switches) and because it’s a value & performance multiplier you can use on just about any modern NIC, from the humble RealTek to the Mighty Intel Server 10GbE.
But I’m getting too excited here; let’s get back to the curriculum and get started shall we?
5. Understand what Microsoft’s multiplexor driver/LBFO has done to our NICs
6. Build our Virtual Machine Switch for maximum flexibility & performance
7. The vEthernets are Coming
8. Next Steps: Jumbo frames from End-to-end and performance tuning
2:5 Understand what Microsoft’s Multiplexor driver/LBFO has done to our NICs
So as I said above, the best way to think about the multiplexor driver & Microsoft’s Load Balancing/Failover tech is by viewing it as a pre-virtualization abstraction layer for your NICs. Let’s take a look.
Our Network Connections screen doesn’t look much different yet, save for one new decked-out icon labeled “Daisetta-Team:”
Meanwhile, this screen is still showing the four NICs we joined into a team in Labworks 2:3, so what gives?
A click on the properties of any of those NICs (save for the RealTek) reveals what’s happened:
The LBFO process unbinds many (though not all) settings, configurations, protocols and certain driver elements from your physical NICs, then binds the fabulous Multiplexor driver/protocol to the NIC as you see in the screenshot above.
In the dark days of 2008 R2 & Windows core, when we had to walk uphill to school both ways in the snow, I had to download and run a cmd tool called nvspbind to get this kind of information.
Fortunately for us in 2012 & R2, we have some simple cmdlets:
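Here is a minimal sketch of the sort of binding query I mean, assuming the adapter names used in this lab ("Ethernet 4" as a team member, "Daisetta-Team" as the team interface). Substitute whatever Get-NetAdapter shows on your host.

```powershell
# List every protocol/driver component on a physical team member
# and whether it is still enabled (bound) after LBFO took over
Get-NetAdapterBinding -Name "Ethernet 4"

# Same query against the team interface, where the IP stack now lives
Get-NetAdapterBinding -Name "Daisetta-Team"

# Only the components that are still enabled on the physical NIC
Get-NetAdapterBinding -Name "Ethernet 4" | Where-Object Enabled
```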
So notice Microsoft has essentially stripped “Ethernet 4” of all that would have made it special & unique amongst my 4x1GbE NICs; where I might have thought to tag a VLAN onto that Intel GbE, the multiplexor has stripped that option out. If I had statically assigned an IP address to this interface, TCP/IP v4 & v6 are now no longer bound to the NIC itself and thus are incapable of having an IP address.
And the awesome thing is you can do this across NICs, even NICs made by separate vendors. I could, for example, mix the sacred NICs (Intel) with the profane NICs (RealTek)…it don’t matter, all NICs are invited to the LBFO party.
No extra licensing costs here either; if you own a Server 2012 or 2012 R2 license, you get this for free, which is all kinds of kick ass as this bit of tech has allowed me in many situations to delay hardware spend. Why go for 10GbE NICs & Switches when I can combine some old Broadcom NICs, leverage LACP on the switch, and build 6x1 or 8x1GbE Converged LACP teams?
LBFO even adds up all the NICs you’ve given it and teases you with a calculated LinkSpeed figure, which we’re going to hold it to in the next step:
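A hedged sketch of how to pull that figure yourself, assuming the team and its team interface are both named "Daisetta-Team" as in this lab:

```powershell
# Team-level view: teaming mode, load balancing algorithm, status
Get-NetLbfoTeam -Name "Daisetta-Team"

# Which physical NICs are members of the team
Get-NetLbfoTeamMember -Team "Daisetta-Team"

# The team interface itself, including the summed LinkSpeed
Get-NetAdapter -Name "Daisetta-Team" |
    Format-List Name, InterfaceDescription, LinkSpeed
```

Keep in mind that LinkSpeed is the sum of the members; depending on the load balancing mode, a single flow may still ride one member at a time.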
2:6 Build our Virtual Machine Switch for maximum flexibility & performance
If we just had the multiplexor protocol & LBFO available to us, it’d be great for physical server performance & durability. But if you’re deploying Hyper-V, you get to have your LBFO cake and eat it too, by putting a virtual switch atop the team.
This is all very easy to do in Hyper-V Manager. Simply right click your server, select Virtual Switch Manager, make sure the Multiplexor driver is selected as the NIC, and press OK.
Bob’s your Uncle:
But let’s go a bit deeper and do this via powershell, where we get some extra options & control:
New-VMSwitch : the cmdlet we’re invoking to build the switch. Run Get-Help New-VMSwitch for a rundown of the cmdlet’s structure & options
-NetAdapterInterfaceDescription : here we’re telling Windows which NIC to build the VM Switch on top of. Get the precise description from Get-NetAdapter (the InterfaceDescription column) and enclose it in quotes
-AllowManagementOS 1 : Recall the diagram above. This boolean switch (1 yes, 0 no) tells Windows to create the VM Switch & plug the Host/Management Operating System into said Switch. You may or may not want this; in the lab I say yes; at work I’ve used No.
-MinimumBandwidthMode Weight : We lay out the rules for how the switch will apportion some of the 4Gb/s bandwidth available to it. By using “Weight,” we’re telling the switch we’ll assign some values later
-Name : Name your switch
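Put together, the parameters above look something like this. Treat it as a sketch: the switch name "ConvergedSwitch" is a placeholder of mine, and the interface description is what the first LBFO team typically reports, so confirm yours with Get-NetAdapter before running it.

```powershell
# Confirm the exact InterfaceDescription of the team interface first
Get-NetAdapter | Format-Table Name, InterfaceDescription, LinkSpeed

# Build the converged switch on top of the LBFO team.
# "ConvergedSwitch" is a placeholder name.
New-VMSwitch -Name "ConvergedSwitch" `
    -NetAdapterInterfaceDescription "Microsoft Network Adapter Multiplexor Driver" `
    -AllowManagementOS $true `
    -MinimumBandwidthMode Weight
```

One design note: the bandwidth mode is chosen when the switch is created, so if you think you will ever want weights, it is worth picking Weight now rather than rebuilding the switch later.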
A few seconds later, and congrats Mr. Hyper-V admin, you have built a converged virtual switch!
2:7 The vEthernets are Coming
Now that we’ve built our converged virtual switch, we need to plug some things into it. And that starts on the physical host.
If you’re building a Hyper-V cluster or stand-alone Hyper-V host with VMs on networked storage, you’ll approach vEthernet adapters differently than if you’re building Hyper-V for VMs on attached/internal storage or on SMB 3.0 share storage. In the former, you’re going to need storage vEthernet adapters; in the latter you won’t need as many vEthernets unless you’re going multi-channel SMB 3.0, which we’ll cover in another labworks session.
I’m going to show you the iSCSI + Failover Clustering model.
In traditional Microsoft Failover Clustering for Virtual Machines, we need a minimum of five discrete networks. Here’s how that shakes out in the Daisetta Lab:
Network Name, VLAN ID, Purpose, Notes
Management, 1, Host & VM management network, You can separate the two if you like
CSV, 14, Host cluster communication & coordination, Important for clustering Hyper-V hosts
LM, 15, Live Migration network, When you must send VMs from broke host to host with the most LM is there for you
iSCSI 1-3, 11-13, Storage, Somewhat controversial but supported
Now you should be connecting the dots: remember in Labworks 2:1, we built a trunked port-channel on our Cisco 2960S for the sole purpose of these vEthernet adapters & our converged switch.
So, we’re going to attach tagged vEthernet adapters to our host via powershell. Pay attention here to the “-ManagementOS” switch; though our Converged switch is for virtual machines, we’re using it for our physical host as well.
You can script this out of course (and VMM does that for you), but if you just want to copy paste, do it in this order:
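A sketch of that order (create the vEthernet, tag it, then give it an IP), using the VLAN IDs from the table above. The switch name, the vEthernet names and every IP address here are placeholders I invented for illustration; use your own addressing plan.

```powershell
# Placeholder switch name; use whatever you passed to New-VMSwitch
$switch = "ConvergedSwitch"

# VLAN IDs come from the table above; names and IPs are made up
$nets = @(
    @{ Name = "CSV";    Vlan = 14; IP = "10.0.14.10" },
    @{ Name = "LM";     Vlan = 15; IP = "10.0.15.10" },
    @{ Name = "iSCSI1"; Vlan = 11; IP = "10.0.11.10" },
    @{ Name = "iSCSI2"; Vlan = 12; IP = "10.0.12.10" },
    @{ Name = "iSCSI3"; Vlan = 13; IP = "10.0.13.10" }
)

foreach ($n in $nets) {
    # 1. Create a vEthernet for the host (management OS) on the switch
    Add-VMNetworkAdapter -ManagementOS -Name $n.Name -SwitchName $switch

    # 2. Tag it with its VLAN in access mode
    Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName $n.Name `
        -Access -VlanId $n.Vlan

    # 3. Assign a static IP; no default gateway on these networks
    New-NetIPAddress -InterfaceAlias "vEthernet ($($n.Name))" `
        -IPAddress $n.IP -PrefixLength 24
}
```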
Notice we didn’t include a Gateway in the New-NetIPAddress cmdlet; that’s because when we built our Virtual Switch with the “-AllowManagementOS 1” switch attached, Windows automatically provisioned a vEthernet adapter for us, which either got an IP via DHCP or took an APIPA address. That management vEthernet is where the host’s default gateway belongs, so the others don’t need one.
So now we have our vEthernets and their appropriate VLAN tags:
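If you would rather check from the console than from a screenshot, something along these lines should list them. It assumes only that the vEthernets were created with -ManagementOS as described above.

```powershell
# All host-side vEthernet adapters and the switch each one is plugged into
Get-VMNetworkAdapter -ManagementOS |
    Format-Table Name, SwitchName, MacAddress, Status

# VLAN mode and VLAN ID for each host vEthernet
Get-VMNetworkAdapterVlan -ManagementOS
```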
2:8 Next Steps: Jumbo frames from End-to-end and performance tuning
So if you’ve made it this far, congrats. If you do nothing else, you now have a converged Hyper-V virtual switch, tagged vEthernets on your host, and a virtualized infrastructure that’s ready for VMs.
But there’s more you can do; stay tuned for the next labworks post where we’ll get into jumbo frames & performance tuning this baby so she can run with all the bandwidth we’ve given her.
Aidan Finn, upstanding Irishman, apparent bear-cub puncher, hobbyist photog, MVP all-star and one of my favorite Hyper-V bloggers (seriously, he’s good, and along with DidierV & the Hyper-Dutchman has probably saved my vAss more times than I can vCount) appeared on one of my favorite podcasts last week, RunAs Radio with Canuck Richard Campbell.
Which is all sorts of awesome as these are a few of my favorite things piled on top of each other (Finn on RunAs).
The subject? Hyper-V, scale out file servers (SoFS) in 2012 R2, SMB 3.0 multichannel and Microsoft storage networking, which are just about my favoritest subjects in the whole wide world. I mean what are the odds that one of my favorite Hyper-V bloggers would appear on one of my favorite tech podcasts? Remote. And talk about storage networking tech, Redmond-style, during that podcast?
All that and an adorable Irish brogue?
This is Instant nerdgasm territory here people; if you’re into these black arts as I am, it’s a must-listen.
Anyway, Finn reminded me of his famous powershell demos in which he demonstrates all the options we Hyper-V admins have at our disposal now when it comes to Live Migrating VMs from host to host.
And believe me, we have so many now it’s almost embarrassing, especially if you cut your teeth on Hyper-V 2.0 in 2008 R2, where successfully Live Migrating VMs off a host (or draining one during production) involved a few right clicks, chicken sacrifice, Earth-Jupiter-Moon alignment, a reliable Geiger counter by your side and a tolerance for Pucker Factor Values greater than 10* **.
Nowadays, we can:
Live Migrate VMs between hosts in a cluster (.vhdx parked in a Cluster Shared Volume, VM config, RAM & CPU on a host….block storage, the Coke Classic option)
Live Migrate VMs parked on SMB 3.0 shares, just like you NFS jockeys do
Shared-nothing Live Migration, either storage + VM, just storage, or just VM!
A for instance: from my Dell Latitude i7 ultrabook with Windows 8.1 and client Hyper-V installed (natch), I can storage Live Migrate a .vhdx off my skinny but fast 256GB SSD to a spacious SMB share at work, then drop it back on my laptop at the end of the day, all via Scheduled Task or powershell with no downtime for the VM.
With Server 2012/2012 R2 you get all those options + SMB 3.0 multichannel
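For the command-line inclined, here is a rough sketch of what those options look like as cmdlets. The VM name, host name and paths ("LabVM", "HV02", the D: path and the \\fileserver share) are all hypothetical, and each command assumes Live Migration is already enabled and authorized between the hosts involved.

```powershell
# Shared-nothing Live Migration: move the VM and its storage together
Move-VM -Name "LabVM" -DestinationHost "HV02" `
    -IncludeStorage -DestinationStoragePath "D:\VMs\LabVM"

# VM only: the .vhdx stays put on an SMB 3.0 share both hosts can reach
Move-VM -Name "LabVM" -DestinationHost "HV02"

# Storage only: relocate the VM's files while it keeps running
Move-VMStorage -VMName "LabVM" `
    -DestinationStoragePath "\\fileserver\vms\LabVM"

# Clustered VM on a CSV: the move goes through the cluster instead
Move-ClusterVirtualMachineRole -Name "LabVM" -Node "HV02" -MigrationType Live
```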
Not only that, but we have some cool new toys with which to make the cost of Live Migrating a VM to the host with the most a little less painful:
Standard TCP/IP : I like this because I’m old school and anything that stresses the network and LACP is fun because it makes the network guy sweat
Compression: Borrow spare cycles from the host CPU, compress the VM’s RAM, and Live Migrate your way out of a tight spot
SMB via Remote Direct Memory Access : the holy of holies in Live Migration. As Finn points out, this bit of tech can scale beyond the bandwidth capabilities of the PCIe 3 bus. SMB 3.0 + RDMA makes you hate your Northbridge
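Switching between those three is a host-level setting in 2012 R2. A minimal sketch, run on each Hyper-V host that will send or receive migrations:

```powershell
# Turn on incoming/outgoing Live Migration for this host
Enable-VMMigration

# Pick the transport: TCPIP, Compression, or SMB
Set-VMHost -VirtualMachineMigrationPerformanceOption Compression

# Verify what the host is currently set to
Get-VMHost | Format-List VirtualMachineMigrationEnabled,
    VirtualMachineMigrationPerformanceOption,
    MaximumVirtualMachineMigrations
```

The SMB option only gets you RDMA if the NICs on both ends support it; otherwise it is plain SMB 3.0, possibly multichannel.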
Finn*** of course provided some Live Migration start:finish times resulting from the various methods above, which I then, of course, interpreted as Finn daring me personally over the radio to try and beat those times in my humble Daisetta Lab.
Now this is just for fun people; not a Labworks-style list of repeatable results, so let’s not nerd-out on how my testing methodology isn’t sound & I’m a stupidhead, ok?
Anyway, Sysinternals has a nice little tool to redline the RAM in your Windows VM. I don’t know how Finn does it, but I don’t have workloads (yet!) in the Lab that would fill 4GB of RAM with non-random data on a VM, so off to the cmd we go:
You type this (haven’t played with all the switches yet) in this navy blue screen:
And then this happens and the somewhat pink graph goes full pink:
Then we press this button to test Live Migration w/ compression, as the Daisetta Lab doesn’t have fancy RDMA NICs like certain well-connected Irish Hyper-V bloggers:
Which makes this blue, celeste, denim, Azure-colored line get all spiky:
All of which results in wicked-fast Live Migrations & really cool orange-colored charts in my totally non-random, non-scientific but highly enjoyable laboratory experiment.
Still, in the end, I like my TCP/IP uncompressed Live Migrations because 1) sackcloth & ashes, and 2) I didn’t go to the trouble of building a multiplexed LACP team -with a virtual switch on top!- just to let the Cat5es in my attic have an easy day at the office:
But at work: yes. I love this compression stuff and echo Finn’s observations on how Hyper-V doesn’t slam your host CPUs beyond what the host & its VM fleet could bear.
Anyway, did I beat Finn’s Live Migration times in this fun little test? Will the Irish MVP have to admit he’s not so esteemed after all and surrender his Hyper_V_MVP_badge.gif to me?
Of course I did and yes he will.
But not really.
[table caption=”Daisetta Lab LM vs Finn’s Powershell LM Scripts – 4GB VM” width=”500″ colwidth=”20|100|50″ colalign=”left|left|left|left|left”]
Who,TCP/IP LM,Compressed LM,RDMA & SMB 3 LM,Notes
Finn,78 seconds, 15 seconds,6.8 seconds, “Mr. I once moved a VM with 56GB of RAM in 35 seconds probably has a few Xeons”
D-Lab,38 seconds,Like 12 or something,Whose ass do I need to kiss to get RDMA/iWarp?, But seriously my VM RAM was probably not random
Finn notes in his posts that he’s dedicating an entire 1GbE NIC for his Live Migration Demos, whereas I’m embracing the converged switch model and haven’t even played with bandwidth or QoS settings on my Hyper-V switch.
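For the curious, here is a back-of-the-napkin Python sketch that turns those start:finish times into an effective transfer rate. It assumes the full 4GB of RAM (treated as 4 GiB) crosses the wire exactly once, which ignores dirty-page re-copies and, for the compressed runs, the fact that fewer bytes actually travel. The function name is mine and has nothing to do with Finn's scripts.

```python
GIB = 1024 ** 3  # bytes in one GiB


def effective_rate_mbit(ram_gib, seconds):
    """Effective Live Migration rate in Mbit/s, assuming RAM is copied once."""
    return ram_gib * GIB * 8 / seconds / 1e6


# Times taken from the table above (4GB VM). D-Lab's compressed time is
# left out because "like 12 or something" is not a measurement.
results = {
    "Finn TCP/IP": 78,
    "Finn compressed": 15,
    "Finn RDMA/SMB 3": 6.8,
    "D-Lab TCP/IP": 38,
}

if __name__ == "__main__":
    for name, secs in results.items():
        rate = effective_rate_mbit(4, secs)
        print(f"{name}: {rate:.0f} Mbit/s effective")
```

By this rough math the D-Lab's 38 seconds works out to about 900 Mbit/s, or roughly what one gigabit link can carry.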
How do my VMware colleagues & friends measure this stuff & think about vMotion performance & reliability? I know NFS can scale & perform, but am ignorant on the nuances of v3 vs v4, how it works on the host and Distributed vSwitch and your “Shared nothing” storage vMotion. And what’s this I hear that vSphere won’t begin a vMotion without knowing it will complete? How’s that determined?
I mean I could spend an hour or two googling it, or you could, I don’t know, post a comment and save me the time and spread some of your knowledge 😀
I’m jazzed about SMB 3.0, but there are only a handful of storage vendors who have support for the new stack, and among them, as Finn points out, Microsoft is the #1 storage vendor for SMB 3 fans, with NetApp probably in 2nd place.
* Just kidding, it wasn’t that bad. Most days.
** Pucker Factor Value can be measured by querying obscure wmi class win32_pfv
*** Finn is a consultant. So you can hire him. I have no relationship with him other than admiration for his scripting skillz
Greetings Labworks fans, today we’re going to learn how to build converged Hyper-V switches, switches so cool they’re nearly identical to the ones available to enterprise users with their fancy System Center licenses.
If you’re coming from a VMware mindset, a Hyper-V converged switch is probably most similar to Distributed vSwitches, though admittedly I’m a total n00b on VMware, so take that statement with a grain of salt. The idea here is to build an advanced switching fabric on your Hyper-V hosts that is fault-tolerant & performance-oriented, and like a Distributed vSwitch, common among your physical hosts and your guests.
This is one of my favorite topics because I have a serious & problematic love-affair with LACP and a Tourette’s-like urge to team things up & jumbo, but you don’t need an LACP-capable switch or jumbo frames to enjoy Converged Switching goodness.
Let’s dive in, shall we?
Prepare the physical switch for Jumbo Frames
Understand LBFO: Microsoft’s Load Balancing/Fail Over teaming technology introduced in Server 2012
Enable LACP on the Switch and on the Server
Build the Switch on the Team & Next Steps
Required Tools ‘n Tech:
Server 2012 or 2012 R2…sorry Windows 8.1 Professional/Enterprise fans…LBFO is not available for 8.1. I know, I feel your pain. But the naked Hyper-V 3.0 Hypervisor (Core only) is free, so what are you waiting for?
A switch, preferably gigabit. LACP not required but a huge performance multiplier
NICs: As in plural. You need at least two. Yes, you can use your Keepin’ it RealTek NICs..Hyper-V doesn’t care that your NICs aren’t server-grade, but I advise against consumer-NICs for production!!
State of the Lab as of today. Ag_node_1 is new, with a core i7 Haswell (Yay!), ag_node_2 is the same, still running CSVs off my ZFS box, and check it out, bottom right: a new host, SMB1:
2:1 Prepare the Physical Switch for Jumbo Frames
You can skip this section if all you have at your disposal is a dumb switch.
Commands below are off of a Cisco 2960s. Commands are similar on the new SG300 & 500 series Cisco switches. PowerConnect 5548 switches from Dell aren’t terribly different either, though I seem to recall you have to enable jumbo mtu on each port as well as the switch.
First we’re going to want to turn on Jumbo Frames, system-wide, which usually requires a reload of your switch, so schedule for a maintenance window!
daisettalabs.net(config)#system mtu jumbo 9198
You can run a show system mtu after the reload to be sure the switch is ready for the corpulent frames you will soon send its way:
daisettalabs.net#show system mtu
System MTU size is 1514 bytes
System Jumbo MTU size is 9198 bytes
System Alternate MTU size is 1514 bytes
Routing MTU size is 1514 bytes
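If you would rather not eyeball that output on every switch, a minimal Python sketch like the one below can check it for you. It only parses text in the shape shown above; getting the text off the switch (SSH, console capture, whatever you use) is left out, and the 9000-byte threshold and function names are my own assumptions.

```python
import re

# Sample text copied from the 'show system mtu' output above.
SAMPLE = """\
System MTU size is 1514 bytes
System Jumbo MTU size is 9198 bytes
System Alternate MTU size is 1514 bytes
Routing MTU size is 1514 bytes
"""

MTU_LINE = re.compile(r"^(.+?) MTU size is (\d+) bytes$", re.MULTILINE)


def parse_system_mtu(text):
    """Map each MTU label (e.g. 'System Jumbo') to its size in bytes."""
    return {label.strip(): int(size) for label, size in MTU_LINE.findall(text)}


def jumbo_ready(text, minimum=9000):
    """True if the reported jumbo MTU is at least `minimum` bytes.

    The 9000-byte default is an assumed threshold, not a Cisco rule.
    """
    return parse_system_mtu(text).get("System Jumbo", 0) >= minimum


if __name__ == "__main__":
    print(parse_system_mtu(SAMPLE))
    print("jumbo ready:", jumbo_ready(SAMPLE))
```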
2:2 Load Balancing & Failover
Load Balancing & Failover, or LBFO as it’s known, was the #1 feature I was looking forward to in Server 2012.
And boy did Microsoft deliver.
LBFO is a driver/framework that takes whatever NICs you have, “teams” them, applies a mature & resilient multiplexor driver to them, and gives you redundancy & performance in just a few clicks or powershell cmdlets. Let’s do GUI for the team, and later on, we’ll use Powershell to build a switch on that team.
Sidenote: Don’t bother applying IP addresses or VLANs to your LBFO-destined physical NICs at this point. Do bother installing your manufacturer’s latest driver, or hacking one on as I’ve had to do with my new ag_node_1 Intel NIC. (SideSideNote: as this blogger states, Intel can eat a bag of d**** for dropping so many NICs from Server 2012 support. Broadcom, for all the hassles I’ve had with them, still updates drivers on four year old cards!)
On SMB1 from the above schematic, I’ve got five gigabit NICs. One is a RealTek on the motherboard, and the other four are Intel; 1-4 on a PCIe Quad Gigabit network card, i350 x4 I believe.
The RealTek NIC has a static IP and is my management interface for the purposes of this labworks. We’ll only be teaming the four Intel NICs here. Be sure to leave at least one of your NICs out of the LBFO team unless you are sitting in front of your server console; you can always add it in later.
Launch Server Manager in the GUI and click on “All Servers,” then right click on SMB1 and select Configure NIC Teaming:
A new window will emerge, titled NIC Teaming.
In the NIC Teaming window, notice on the right the five GbE adapters you have and their status (Green Arrow). Click on “Tasks” and select “New Team” (Red Arrow):
The New Team window is where all the magic happens. Let’s pause for a moment and go to our switch.
On my old 2960s, we’re building LACP-flavored port channels by using the “channel-group _ mode active” command, which tells the switch to use the genuine-article LACP/802.3ad protocol rather than the older Cisco proprietary Port Aggregation Protocol (PAgP) system, which is activated by running “channel-group _ mode auto.”
However, if you have a newer switch, perhaps a nice little SG 300 or something similar, PAgP is dead and not available to you, but the process for LACP is like the old PAgP command: “channel-group _ mode auto” will turn on LACP.
Here’s the 2960s process. Note that my Intel NICs are plugged into Gig 1/0/20-23, with spanning-tree portfast enabled (which we’ll change once our Converged virtual switch is built):
daisettalabs.net#show run int gig 1/0/20
Current configuration : 63 bytes
Enter configuration commands, one per line. End with CNTL/Z.
daisettalabs.net(config)#int range gig 1/0/20-23
daisettalabs.net(config-if-range)#description SMB1 TEAM
daisettalabs.net(config-if-range)#channel-group 3 mode active
daisettalabs.net(config-if-range)#switchport mode trunk
Presto! That wasn’t so hard was it?
Note that I’ve trunked all four interfaces; that’s important in Hyper-V Converged switching. We’ll need to trunk po3 as well.
Let’s take a look at our new port channel:
daisettalabs.net(config-if-range)#do show run int po3
Current configuration : 54 bytes
switchport mode trunk
Now let’s check the state of the port channel:
daisettalabs.net#show etherchannel summary
Flags: D - down P - bundled in port-channel
I - stand-alone s - suspended
H - Hot-standby (LACP only)
R - Layer3 S - Layer2
U - in use f - failed to allocate aggregator
M - not in use, minimum links not met
u - unsuitable for bundling
w - waiting to be aggregated
d - default port
Number of channel-groups in use: 3
Number of aggregators: 3
Group Port-channel Protocol Ports
------+-------------+-----------+-----------------------------------------------
1 Po1(SU) LACP Gi1/0/1(P) Gi1/0/2(P) Gi1/0/3(P)
2 Po2(SU) LACP Gi1/0/11(D) Gi1/0/13(P) Gi1/0/14(P) Gi1/0/15(P) Gi1/0/16(P)
3 Po3(SD) LACP Gi1/0/19(s) Gi1/0/20(D) Gi1/0/21(s) Gi1/0/22(s) Gi1/0/23(D)
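Those flag letters get old fast when you have more than a few port channels, so here is a small Python sketch that reads the group lines from the summary above and reports which members are bundled (P) and which are not. It assumes each group sits on a single line, as it does in my paste; real output can wrap long port lists, which this does not handle. The dictionary keys are my own naming.

```python
import re

# Group lines copied from the 'show etherchannel summary' output above.
SAMPLE = """\
1 Po1(SU) LACP Gi1/0/1(P) Gi1/0/2(P) Gi1/0/3(P)
2 Po2(SU) LACP Gi1/0/11(D) Gi1/0/13(P) Gi1/0/14(P) Gi1/0/15(P) Gi1/0/16(P)
3 Po3(SD) LACP Gi1/0/19(s) Gi1/0/20(D) Gi1/0/21(s) Gi1/0/22(s) Gi1/0/23(D)
"""

GROUP_LINE = re.compile(r"^\s*(\d+)\s+(Po\d+)\((\w+)\)\s+(\S+)\s+(.*)$")
PORT = re.compile(r"(\S+?)\((\w)\)")


def parse_summary(text):
    """Return one dict per port channel with its bundled and unbundled ports."""
    groups = []
    for line in text.splitlines():
        match = GROUP_LINE.match(line)
        if not match:
            continue  # skip the flag legend, headers, and separators
        number, name, flags, protocol, ports = match.groups()
        members = PORT.findall(ports)
        groups.append({
            "group": int(number),
            "name": name,
            "in_use": "U" in flags,  # U = in use, D = down
            "protocol": protocol,
            "bundled": [p for p, flag in members if flag == "P"],
            "not_bundled": [p for p, flag in members if flag != "P"],
        })
    return groups


if __name__ == "__main__":
    for g in parse_summary(SAMPLE):
        state = "up" if g["in_use"] else "DOWN"
        print(g["name"], state, "not bundled:", g["not_bundled"])
```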
po3 is in total disarray, but not for long. Back on SMB1, it’s time to team those NICs:
I’m a fan of naming-conventions even if this screenshot doesn’t show it; all teams on all hosts have the same “Daisetta-Team” name, and I usually rename NICs as well, but honestly, you could go mad trying to understand why Windows names NICs the way it does (Seriously. It’s a Thing). There’s no /dev/eth0 for us in Microsoft-land, it’s always something obscure and strange and out-of-sequence, which is part of the reason why Converged Switching & LBFO kick ass; who cares what your interfaces are named so long as they are identically configured?
If you don’t have an LACP-capable switch, you’ll select “Switch Independent” here.
As for Load Balancing modes: in Server 2012, you get Address Hash (Source/Dest MAC or IP in Layer 3 LACP), or Hyper-V Port, which is sort of a round-robin approach (VM1 goes to one port in the team, VM2 to the other).
I prefer the new (with 2012 R2) Dynamic mode, which hashes outbound flows like Address Hash but rebalances them across team members on the fly, and spreads inbound traffic the way Hyper-V Port does. More color on those choices & what they mean for you in the References section at the bottom.
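To make the difference between Address Hash and Hyper-V Port a bit more concrete, here is a toy Python sketch. It is emphatically not how the LBFO driver is implemented: the CRC32 hash, the modulo placement, and the NIC names are all stand-ins I picked for illustration. The point is only that one mode chooses a team member per address pair, while the other pins each VM's switch port to one member.

```python
import zlib

TEAM = ["NIC1", "NIC2", "NIC3", "NIC4"]  # hypothetical team member names


def address_hash_member(src, dst, team=TEAM):
    """Pick a team member from a hash of the source/destination pair.

    CRC32 is a stand-in here; it is not the hash LBFO actually uses.
    """
    key = f"{src}->{dst}".encode()
    return team[zlib.crc32(key) % len(team)]


def hyperv_port_member(vm_port_index, team=TEAM):
    """Pin each virtual switch port (one per VM NIC) to a single member."""
    return team[vm_port_index % len(team)]


if __name__ == "__main__":
    # The same address pair always lands on the same NIC.
    print(address_hash_member("10.0.0.10", "10.0.0.20"))
    # VM ports are spread across the team in order.
    for port in range(6):
        print(f"VM port {port} ->", hyperv_port_member(port))
```

Either way, a single conversation or a single VM stays on one physical NIC, which is why one stream never goes faster than one link no matter how many NICs you team.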
Press ok, sit back, and watch my gifcam shot:
Mmmm, taste the convergence.
2:4 Build a Switch on top of that team & Next Steps
If you’ve ever built a switch for Hyper-V, you’ll find building the converged switch immediately familiar, save for one technicality: you’re going to build a switch on top of that multiplexor driver you just created!
Sounds scary? Perhaps. I’ll go into some of the intricacies and gotchas and show some cool powershell bits ‘n bobs on the next episode of Labworks.
Eventually we’re going to dangle all sorts of things off this virtual switch-atop-a-multiplexor-driver!
Links/Knowledge/Required Reading Used in this Post:
A few Microsoft bloggers (some prominent, some less so, none that I know of are employed by MS) are doing a bit of crowing today…OpenSSL, VMware, AWS….all #Heartbleed vulnerable while Azure & Windows & Hyper-V are secure! <Nelson>Ha Ha!</Nelson>
I’m new to IT blogging, but one thing I’ve noticed is that it’s dominated by consultants who are selling something other than just software: their skills & knowledge. That goes for Hyper-V bloggers or VMware bloggers, SQL bloggers or Oracle bloggers. And that’s just fine: we all have to find a way to put food on the table, and let’s face facts: blogging IT doesn’t exactly bring in the pageviews, does it? However, making sport out of the other products’ flaws can bring in the hits, and it’s fun.
Me? I’m what you call a “customer” who has always supported Microsoft products, had a love/hate/love relationship with them, a curiosity about the other camps, and a desire to just make it all work together, on time & on budget in service to my employer and my users.
So I blog from that perspective.
And so while it’s tempting to join some of my Win32 colleagues (after all the BSOD & dll.hell jokes are getting old 20 years on) as they take joy in other engineers’ suffering, I say no!
I remind the reader of that great engineer of words, John Donne, who wrote:
No man is an island,
Entire of itself,
Every man is a piece of the continent,
A part of the main.
If a clod be washed away by the sea,
Europe is the less.
As well as if a promontory were.
As well as if a manor of thy friend’s
Or of thine own were:
Any man’s death diminishes me,
Because I am involved in mankind,
And therefore never send to know for whom the bell tolls;
It tolls for thee.
This poem gets me every time; Donne knows his stuff.
No :443 is an island entire of itself, especially in the internet age. And every network is a part of the great /0.
If one datacenter falls, our infrastructure is the less.
Any engineer’s pain diminishes me, because I have been in his shoes*, RDPd or SSHd into the device at 3am, worried about my data and my job, just as he or she is right now.
So to my friends & colleagues in the open source world trying to stem the bloodloss, I ask: do you need a hand?
I’m working from home today and would be happy to help, and I know my way around PuTTY.
*Chinese hackers, the NSA, and other malefactors are of course exempted here
I don’t know if Software Defined Networking is a legitimate thing I should pursue, or just another mine in the IT requisition battlefield I need to be aware of. If it’s something I should pursue, what are the scope, budget, risks, and payoff? And what do I need to buy exactly? With x86 virtualization, it was clear (powerful pizza boxes! Lots of RAM!), with network virtualization…not so much.
I do know this much: the traditional, monolithic & inflexible business WAN causes me pain and suffering.
You know the type, or maybe in your career, you’ve moved on from such environments. But many of us haven’t. Remember this thing? The Business WAN:
Yeah baby. It’s still kicking after all these years…you get yourself some T1s for the branches, 10MegE/100MegE for the tentpole sites, some Cisco routers, OSPF & maybe MPLS to tie it all together with a safe, predictable RFC-1918 ipv4 address scheme and NAT on the ASA edge device. Active Directory is usually built on top with a replication topology matching the sites’ /24s.
And on the seventh day, the young engineer stood back and beheld what he had built, and said, Go forth IT department, and let thy services & value multiply: Exchange, Sharepoint, SMB shares, SANs, QOS policies, print servers, a Squid caching server here, a Peplink there, oh my!
This model is straight out of the 1990s playbook, but it’s still in wide use. In the meantime, a crazy thing happened, the internet came along and for some inscrutable reason, it’s really popular, accessible and useful, and people like it. You thought your advanced Business WAN was hot stuff, but your users feel it’s positively archaic because they have 20 megabits of bandwidth on their tiny 4G LTE phone, limitless storage & bandwidth via the Dropboxes & Googles of the world, and an internet that’s never under maintenance and never tells them no.
This then, is the problem I want SDN to solve: take the stuff my users need that’s on the business WAN and put it where my users are: on the internet. 443 doesn’t work for everything and while cloud is the ultimate home, I’m looking for baby-steps to the cloud, things I can do today with SDN that are low-risk and potentially high-reward.
What do you do hotshot?
Once upon a time in the Microsoft world, there was a thing called Direct Access. This was a software-defined solution that eased access to corporate resources for users on the internet. Without initiating a VPN connection, your C-level could access that stubborn decade-old UNC path from his laptop anywhere on the internet. IPV6 to the rescue!
But it was somewhat painful to install, especially in multi-domain scenarios, and sadly, only worked on Windows, which was great 10 years ago, but we’re not in a world where the PC is increasing in relevance; we’re in a world where the PC is less relevant by the day.
Enter Pertino, which to the cynical is yet another SDN startup from the Valley, but to me, is among the first vendors wearing the SDN badge that actually knows my WAN pain and is building something SDN-related that is eminently practical and immediately applicable.
Pertino bills itself as a Cloud VPN provider, which, I think, doesn’t do it justice. VPN calls to mind dial-up…remote users connecting to your LAN. Pertino is sort of the opposite: this bit of tech allows you to extend your WAN/LAN into the cloud effortlessly.
I’m pretty jazzed on it because I think Pertino, or something like it, could be the next evolution in business WAN networking, especially if your organization is cloud-cautious.
So What is it?
Pertino is essentially an ipv4 & ipv6 overlay technology employing NVGRE-like encapsulation dubbed “Overpass” that works with your existing on-prem equipment and extends securely your Layer 2/Layer 3 LAN assets to the places where your users are: the internet.
It’s so simple too. All you need is a modest 16 megabyte application that, to your users, will remain largely invisible. Once installed, Pertino sits quietly in the Windows system tray or in the background on Android and just generally stays out of the way, which makes it about 10x better than dial-up style VPNs of yesteryear.
While that application is low drama, what’s happening behind the scenes is some serious high stakes vKung-Fu, involving on-demand VMs, virtual switches, control & data planes and encapsulation.
On the Windows side, Pertino creates a simple virtual network interface, hooks onto your existing internet connection and begins a session with a Pertino virtual machine in a datacenter somewhere, in theory, close to your device.
All traffic on that vif is encapsulated via the NVGRE-like Overpass and your Windows client or Android handset is assigned both an ipv4 & ipv6 address. And just like that, you have what is in effect a fully switched LAN on the internet, to the point where an arp lookup sees other MAC addresses of other Pertino-enabled devices wherever they are.
Just think about that for a second. In years past, you’d have to call up your provider and order an exotic Virtual Private Wire Service to extend Layer 2 across a Layer 3 link if you wanted to expose your MAC addresses to the remote end.
Now I’m doing in effect the same thing with a simple Windows application. And I didn’t have to hire a consultant or mess around with NAT or the ASA, which is both comforting in that I like my security blanket, yet terrifying at the same time because it’s ipv6ing its way over my firewall. Pertino is essentially giving me a layer 2 switch….in the cloud.
In the Layer 3 space, your ipv4 address isn’t RFC-1918, but it’s not routable either. You can have any ipv4 address you like as long as it’s 50.203.x.x. Pertino is using Overpass encapsulation here to isolate tenants & customers from each other, reminding me of the way Microsoft uses NVGRE in Hyper-V & Azure.
After installing Pertino, you’re going to look at the output of ipconfig or ifconfig and say, “Wait. They gave me an entire /24?” Indeed, it seems you have a /24 but it’s not routable or unique. Another Pertino customer is probably getting the same /24. That’s what’s cool about encapsulation.
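You can sanity-check the "not RFC-1918, but a whole /24" observation with Python's standard `ipaddress` module. The address below is made up; I only know the range starts with 50.203, so treat the last two octets as placeholders.

```python
import ipaddress

# The three private ranges defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(addr):
    """True if addr falls inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)


def describe(addr_with_prefix):
    """Return (network, number of addresses, is RFC 1918?) for an interface."""
    iface = ipaddress.ip_interface(addr_with_prefix)
    net = iface.network
    return str(net), net.num_addresses, is_rfc1918(str(iface.ip))


if __name__ == "__main__":
    # 50.203.10.25/24 is a hypothetical Pertino-assigned address.
    print(describe("50.203.10.25/24"))
    print(describe("192.168.1.10/24"))
```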
On your ipv6 Pertino network, things are a little more hazy. I think ipv6 is also NVGRE-encapsulated and perhaps all customers share the same /64, but while I’m a willing conscript in Tom Hollingsworth’s army of ipv6 boosters, I’m not smart enough to speak to how it works with Pertino. I know this: I can’t ping that ipv6 address from a non-Pertino client, yet the address differs substantially from the one my Windows 8.1 & 2012 R2 clients assign my NIC, which I can’t ping either.
So whatever man. It’s ipv6 and it’s mysterious.
What can you do with this?
Last fall I was hot on Pertino because I envisioned using it at scale in a modern business WAN: imagine being able to kill your expensive, continent-hopping MPLS network without having to revamp your entire infrastructure.
I’m not sure Pertino could do that, but still: the mind races.
As much as I hate to see it because I think it encourages bad behavior (printing), you can do print servers over this
I’ve been using Pertino in the Daisetta Lab and quietly at work for about five months. With it, I’ve done this:
Built a Domain Controller in AWS somewhere in Virginia, joined & promoted it as a DC with the DC in my Daisetta Lab
SMB (CIFS to the non Microsoft crowd) shares via common UNC paths
Remote desktop, ssh
Mounted LUN in remote datacenter via iSCSI and MS iSCSI Initiator and ran IOMETER over Pertino
So you’ve got your fancy Pertino adapters deployed to laptops, mobile phones, iPads, and certain strategic servers, you’re living the SDN dream with only a modest OpEx spend and no rip & replace and your users can finally access CIFS, Sharepoint, and other internal resources from whatever internet connection they have.
How’s this baby perform?
Couple of measurements:
Test Type, Details, Time, Subjective Feeling
SMB File Copy, Copied 104MB file from remote site, 10 minutes 3 seconds, Felt slow but was evening hours at home
SMB File copy, Copied 95MB of random files from remote site, 3 minutes 46 seconds, Felt much faster speed varied 400k to 1mb/s
Latency tests, Simple pings to remote pertino clients, 90ms minimum/300ms max, Was variable but mostly similar to standard VPN performance
RDP, 2560×1440 remote desktop session, Session connected in 10s or so, Better than expected artifacting and compression minor
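To put the two file copies on the same footing, here is a tiny Python sketch that converts size and elapsed time into an average rate. It uses only the numbers from the measurements above; the function names are mine, and "MB" is taken at face value as reported.

```python
def throughput_mb_per_s(size_mb, minutes, seconds):
    """Average copy rate in MB/s from a file size and elapsed time."""
    return size_mb / (minutes * 60 + seconds)


def throughput_mbit_per_s(size_mb, minutes, seconds):
    """Same rate expressed in megabits per second."""
    return throughput_mb_per_s(size_mb, minutes, seconds) * 8


if __name__ == "__main__":
    copies = {
        "104MB single file": (104, 10, 3),
        "95MB random files": (95, 3, 46),
    }
    for label, args in copies.items():
        mb = throughput_mb_per_s(*args)
        mbit = throughput_mbit_per_s(*args)
        print(f"{label}: {mb:.2f} MB/s ({mbit:.2f} Mbit/s)")
```

That comes to roughly 0.17 MB/s for the first copy and 0.42 MB/s for the second, which squares with the 400k low end I saw during the faster run.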
There’s room for improvement here, but I’m on the free tier of Pertino service. The company offers some enhancements for paying customers, but Pertino’s not something you deploy as Tier 1 infrastructure; this is better used to fill gaps between your infrastructure and your users, and as such, I think the performance is acceptable.
It’s at least as fast as what my users call the “stupid VPN,” so what’s not to like?
I’ve been using Pertino now for almost five months. I’d give them an A- for reliability.
I’ve been trying to push this review out for months, but it’s so easy to forget Pertino’s there. 99.9% of the time, it’s invisible and just running in the background, connecting me seamlessly to whatever remote device I need to access.
There have been only two times the network failed me. Once, briefly in January I couldn’t RDP into the home network, and then, last week, there was an hours-long outage affecting many Pertino customers.
Credit to Pertino here: the same day, they blogged about the outage and its cause, and promised to make it better. Essentially a Pertino datacenter went offline, which they can recover from, but the resulting failover process snowballed and a widespread outage resulted:
On the afternoon of April 1st, there was a network outage between a cluster of data plane v-switches and the control plane, which was located in a different cloud datacenter. The disruption was brief, but lasted long enough for the control plane to consider those v-switches at risk. So, the control plane began migrating customers to other v-switches.
However, due to a new bug in the data plane, too many messages were sent from the data plane v-switches to the control plane, increasingly loading it with requests.
It’s been so reliable that after carefully considering the options, I had no problem recommending Pertino to my own parent partition, dad, a radio engineer by trade & reluctant IT consultant, as he often needs to connect small, distant radio stations to each other over IP. Usually he purchases a Zywall appliance, connects the sites together via VPN and with LogMeIn, he can remotely support these small operations.
Pertino is an obvious fit for scenarios like that as well, and it’s probably cheaper.
Back when I first was testing Pertino, they allowed you to install the package on up to three devices for free. It looks like that plan is gone now,
Pertino still offers a free account for IT pros to sink their teeth into: with it, you can add Pertino on up to three devices to see how this works.
After that, the base level pricing for Pertino is $29/month for up to 10 devices. It scales from there, but only modestly: enterprise packaging starts at 40+ devices and you have to contact them to get pricing.
One gets the feeling that perhaps this is aimed at really small SMB markets, but I’m not so sure. If you have 1500+ objects in Active Directory, you surely don’t need Pertino on all of them. Just certain strategic, edge-focused & secured ones: a Domain Controller & SMB server here, a few executive or important mobile user laptops there. You get the idea.
Up to 40 devices can be connected through Pertino for about $90 per month.
All in all, I’ve been pretty impressed with this kit. It’s at once a practical way to get your on-prem services out to your users, dip your toes into some ipv6 waters even if you’re not engineering it yourself, and leverage some real software defined networking (insert definition of that here) in a safe, low-risk way.
In fact, I think you should help me test it a bit more. If you’re a fellow tech blogger with a lab at home and you’re interested in this or suffer from WAN pains too, let’s link up: The Supervisor Module spouse isn’t interested in becoming a test user on my Pertino network, but if you are, shoot me an email. I can add you as a user on my Pertino network, you can join a VM to my Pertino switch, and we can have a sort of home lab Apollo-Soyuz moment and learn something together.
Crazy timing, but within minutes of me posting my review of Pertino’s CloudVPN tech yesterday, Scott Lowe, a well-known rockstar virtualization blogger weighed in with his views of Pertino on Twitter:
The concept behind @PertinoNetworks is cool, but I’m not terribly impressed w/ the implementation thus far. A bit too simplistic, I think.
I hope I don’t appear to be a Pertino shill and no disrespect to Scott intended, but I don’t think there’s anything “simplistic” about what is in effect a layer 2 switch in the cloud.
Okay, let’s suppose it’s a dumb switch.
Still. A dumb switch…in the cloud is something much more than a dumb switch in your rack.
Maybe I’m just easily impressed and Scott should show me how simplistic it is by joining my Pertino network. 😀
When I first signed up for the 3 device free Pertino service in the fall, a customer agent reached out to me to see if I had a good experience. I relayed my experiences, linked to this blog and Pertino executives took notice. They offered me the use of up to 9 devices for free on my Pertino network if I would post an unedited & unreviewed blog about my experiences with the product. No other compensation was given and no Pertino employees viewed the content of this post prior to its publication.
As you’ll recall from part 1, much of my time at work lately has been consumed by planning, testing and executing mass Storage Live Migration of 65+ .vhdx files from our old filer (built by a company that rhymes with PetTap) & its end-of-life 7200 RPM shelves to our new hotness, a Nimble CS260.
Essentially I’ve been a sort of IT Moses, planning the Exodus of my .vhdxs out of harsh, intolerably slow conditions, through some hazards & risks (Storage Migration during production), and into the promised land of speed, 1-5ms latency (rather than 20ms+!!), and user happiness.
Now VMware guys have a ton of really awesome tools to vMotion their .vmdks around their vCenters & vSpheres and their ESXis and now their VSANS too. They can tune their NFS for ludicrous speed and their Distributed Switches, now with Extra Power LACP support, can speak CDP & even tell you what physical port the SFP+ 10GbE adapter is plugged into.
Do I sound envious? Green with it even? Well I am.
In comparison, I got me some System Center Virtual Machine Manager (2012 SP1), Microsoft Failover Cluster mmc snap-in, a 6509e with two awesome x6748 performance blades (February 2008’s centerpiece switch mod in the popular “Hot Switch Racks” pin-up calendar), Hyper-V’s converged fabric design, 8x1GbE LACP teams, Manage Engine’s NetFlow, boat-loads of enthusiasm and a git ‘r done attitude.
And this is what I’ve got to git done:
And it has to be done with zero downtime because we have a 24/6 operational tempo, and I like my Saturdays.
One of my main worries as I’ve tried to quarterback this transition has been the switch. Recall from part 1 how I’m oversubscribed to hell & back on my two 6748s:
I fear the harsh judgment of my networking peers (You’re doing that with that?!?!) so let me just get it out there: yes, I’m essentially using my 6509 & these two blades as a bus for storage IO. In fact, iSCSI traffic accounts for about 90% of all traffic on the switch in any given 24 hour period:
Perhaps I’m doing things with this switch & with iSCSI that no sane engineer would, but I have to say, this has proven to be pretty durable and adequate as far as performance goes. Would I like some refreshing multi-channel SMB 3 file storage, some relief from the block IO blues & Microsoft clustering headaches? Yes of course, but I’ve got to shepherd the VMs to their new home first.
And to do that, I’ve got to master what this switch is doing on an hour by hour basis as my users log in around the clock.
So I pulled some Netflow data together, cracked my knuckles, and got busy with Excel and Pivot tables.
I’m glad I went through this exercise because it changed the .vhdx parade route & timing. What I thought was the busiest part of my infrastructure’s day was wrong, by a large factor. Here’s 8 days worth of Netflow traffic on the iSCSI & Live Migration VLANs, averaged out. Few live migrations were made during this period:
What you see here are the three login storms (Times on the graph are MST, they start early down under) as my EU, North America & Australia/New Zealand users log in to their session virtualization VMs or hit the production SQL databases or run their reports.
I always thought EU punched my stack the hardest; our offices there have as many employees as North America, but only one or two time zones rather than three in North America.
But EU & North America and Australia combined don’t hit my switch fabric as hard as I do. Yes, the monkey on my back is…me. Well, me & the DBA and his incurable devotion to SQL backups in the evening. My crutch is DPM.
I won’t go into too much detail here but this was pretty surprising. At times over the eight days, Netflow would record more than 1 billion packets traversing the switch in one evening hour; the peak “payload” was north of 1 terabyte of iSCSI bytes/hour on some days!
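For a sense of scale, here is a short Python sketch that converts those hourly totals into a sustained rate. It assumes a decimal terabyte (10^12 bytes) and, for the average packet size, that the billion packets and the terabyte landed in the same hour, which Netflow did not promise me. Function names are my own.

```python
def sustained_gbit_per_s(bytes_per_hour):
    """Average rate in Gbit/s for a given number of bytes moved in one hour."""
    return bytes_per_hour * 8 / 3600 / 1e9


def average_packet_bytes(total_bytes, total_packets):
    """Mean bytes per packet over the same interval."""
    return total_bytes / total_packets


if __name__ == "__main__":
    one_tb = 1e12       # assumed decimal terabyte
    one_billion = 1e9   # packets in the busiest evening hour
    print(f"{sustained_gbit_per_s(one_tb):.2f} Gbit/s sustained")
    print(f"{average_packet_bytes(one_tb, one_billion):.0f} bytes per packet")
```

A terabyte an hour averages out to a bit over 2 Gbit/s, sustained for the full hour.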
Now I’m not a networking guy (though I do love Wifi experimenting), but what I saw here concerned me, gave me pause. Between the switch blades, I’ve supposedly got a 40 Gigabit/s backplane, to my Supervisor 720 modules, but is that real 40Gbit/s or marketing 40Gbit/s?
The key question: Am I stressing this 6509e, or does it have more to give?
Show fabric utilization detail said I was consuming only 20% of the switch fabric during my exploratory storage migrations, and that was at peak. 4 gigabit/second per port group.
Is that all you got? the 6509e seemingly taunted me.
But oh my stars, better check the buffers:
ACK! One dropped buffer call or whatever you call it, 7 weeks ago, way before I had the Nimble in place. Still….that’s one drop too many for me.
Stop pushing me so hard, the 6509e pleaded with me.
So I did what any self-respecting & fearful admin would do: call TAC. Show me the way home TAC, get me out of the fix I’m in or at least soothe my worry. Tell me I have nothing to worry about, or tell me I need to buy a Supe 2T to do what I want to do, just give me some certainty!
A few show tech supports, one webex session and one telephonic call with a nice engineer from Costa Rica later, I had some certainty. The config was sound, the single buffer drop was concerning but wasn’t repeating, even when I thought I was stressing the switch.
And I didn’t need to buy a Supe 2T.
On to the Exodus/.vhdx parade.
In all my fretting about the switch, I was forgetting one thing: the feeble filer is old and slow and can’t push that much IO to the Nimble in the first place.
As best I can figure it, I can do about five storage live migrations simultaneously. Beyond that, I fear that luns will drop on the filer.
To the Nimble, it’s no sweat:
Netflow’s view from the same period:
Love it when a plan comes together! I should have this complete within a few days, then I can safely upgrade my filer.