Cloud Praxis #3 : Office 365, Email and the best value in tech

Email.

What’s the first thought that comes to your mind when you read that word?

Exchange 2013 logo: “I seek #ExchangeGlory!!” said no IT blogger ever

If you’re in IT in the Microsoft space, maybe you think of huge mailbox stores, Exchange, Outlook, legal discovery requirements, spam headaches and the pressure & demand that stack places on your infrastructure. Terabytes and terabytes of the stuff, going back years. All up in your stack, DAG on your spindles, CAS on your edge, all load balanced at Layer 4/7 behind a physical or virtual device & wrapped up in a nice legitimate, widely-recognized CA-issued SSL cert. The stuff is everywhere.

I almost forgot. You have to back all that stuff up too. To tape in my case.

Oh, and perhaps you also recall the cold chills & instant sense of dread & fear you’ve felt just about every time an end user has asked (sometimes via email no less) “Is our email down?” I know the feeling.

Like a lot of Microsoft IT pros, I have my share of email war stories. I think email is one of those things in technology that lends itself to a sort of dualism, a sort of Devil on this shoulder, Angel on that shoulder. You can’t say something positive about email without adding a “but….” at the end, and that’s OK. Cognitive dissonance is allowed here; you can believe contrary ideas about email at the same time.

I know I do:

[table]

I love Email because, I hate email because

SMTP is the last great agnostic open communication protocol, SMTP is too open and prone to abuse

Email is democratic and foundational to the internet, Email is fundamentally broken

Email will be around in some form forever, There’s no Tread Left on this Tire

Email is your online identity, Messaging applications are all the rage and so much richer

It’s how businesses communicate and thrive, One man’s business communication is another man’s spam

It’s always there, It goes down sometimes

Spam fighters and blacklists, Spam fighters and blacklists

It justifies Infrastructure Spend, It uses so much of my stack

Exchange is awesome and flexible, I broke Exchange once and fear it

[/table]

Whatever your thoughts on email are, one thing is clear: for Microsoft Infrastructure guys pondering the Microsoft cloud, the path to #InfrastructureGlory clearly travels through Exchange Country. In fact, it’s the first step we’re supposed to take, via Office 365.

I don’t know about you, but I worry about the bandits in Exchange Country. Bandits that may break mail flow, or allow the tidal wave of spam in, prompt my users excessively for passwords, engage in various SSL hijinks, or otherwise change any of the finely-tuned ingredients in the delicate recipe that is my Exchange 2010 stack.

And yet, I bet if you polled Microsoft IT guys like me, you would find that of all the things they want to stick up in the Microsoft Cloud, Exchange & the email stack is probably at the top of the list. Just take it off our plate, Microsoft. Exchange and email are in a weird place in IT: it’s mission-critical and extremely important to have a durable Exchange infrastructure, yet raise your hand if you think Exchange Administration/Engineering is a good career path to take in 2014.

Didn’t think so.

So how do we get there?

I don’t have all the answers yet, but I at least have a good picture of the project, some hands-on experience, and some optimism, all of which means I’m one step closer to #InfrastructureGlory in the cloud.

Hard to build a realistic Exchange Lab 

First of all, recognize this: while it’s easy to build out a lab infrastructure (Cloud Praxis #2) for Active Directory, it’s quite another thing to build out an Exchange lab, as I found out. You can’t do SMTP from home anymore (the spammers ruined that), which means you need resources at work, which might or might not be available. They aren’t in my case, so I struggled for a while.

Maybe you have some resources at work (a few extra public IPs, a walled-off virtual network, some storage) with which you can build out an Exchange lab. If so, evaluate whether that’s going to benefit you and your organization. It might be a black hole of wasted time; it might pay off in a huge way as you wargame your way from on-prem to hybrid, then to cloud, and finally #InfrastructureGlory.

Office 365 Praxis with the E1 Plan

For me and Daisetta Labs.net, I decided I couldn’t adequately simulate my workplace Exchange. So I did the next best thing.

I bought an Office 365 Enterprise E1 subscription.

That’s right baby. Daisetta Labs.net is on the O365 Enterprise E1 plan. It’s an Enterprise of 1 (me!) but an Enterprise-scaled O365 account nonetheless.

And it’s fantastically cheap & easy to do: less than $100 a year.

For that measly amount, you can be an Enterprise of one in O365 and get all this:

  • A real Office 365 Enterprise account with Exchange 2013 and all of its incredibly rich features & options, including PowerShell remoting, which you’ll need in your real O365 migration (see the sketch after this list)
  • That’s private email too...no ad bots gathering data against your profile. Up to you, but I moved my personal stack to O365 (more on that later)
  • Lync 2013. Forget Skype and all the other messengers. You get Lync service, which interfaces with Skype and many others and makes you look like a real pro. Also useful if you have on-prem Lync, though I’m sad to report that, as of this month, Lync 2013 in O365 can’t kill your PBX off…yet.
  • Sharepoint & OneDrive for Business : I’ll admit it, I’ve done my fair share of Sharepoint hating but IT Infrastructurists need to realize Sharepoint is the gateway drug to many things businesses are interested in, like Business Intelligence & SQL, data visualizations and more. Besides, Sharepoint 2013 is not your daddy’s Sharepoint; it can do some neat stuff (not that I can show you, yet).
  • OneDrive for Business, again: If you’re in a Microsoft shop that’s still mostly on-prem, you probably experience Dropbox creep, where your users share documents via Dropbox or other personal online storage solutions. With E1, you can get familiar with OneDrive for Business within the context of Sharepoint & O365 management, dirsync, and all the rest.
  • One Terabyte of OneDrive for Business Storage. Outstanding. This was a recent announcement. It tickles me to think that my data is being deduped by a Windows storage spaces VM somewhere, just like I do on my storage at work.
  • Office Online : full on WAC server baby, with Excel in your Chrome or IE browser. Better, and better looking, than Google Docs.
  • With this plan, you can really test out Office for the iPad. You’ll get read and write to your O365 documents via an iPad, which can help you at work with that one C-level who loves his iPad as much as he loves Excel.
  • DirSync: The very directory synchronization tool you have stressed over at work is available to you with this simple, cheap E1 subscription. And it’s working. I’ve done it. Daisetta Labs.net is dirsynced to O365 from my home lab and I have SSO between my on-prem AD & Office 365. I had deliberately kept my passwords separate between the two, but now they’re in sync.
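Speaking of PowerShell remoting: here’s roughly what connecting to your E1 tenant’s Exchange Online looks like. This is a minimal sketch only; the admin account and tenant name are placeholders, and the endpoint is the one Microsoft documents for this era.

# A sketch of Exchange Online PowerShell remoting against an O365 E1 tenant.
# The credential & tenant below are hypothetical; substitute your own.
$cred = Get-Credential admin@yourtenant.onmicrosoft.com
$session = New-PSSession -ConfigurationName Microsoft.Exchange `
    -ConnectionUri https://ps.outlook.com/powershell/ `
    -Credential $cred -Authentication Basic -AllowRedirection
Import-PSSession $session                                    # pulls the Exchange Online cmdlets into your shell
Get-Mailbox | Select-Object DisplayName, PrimarySmtpAddress  # prove you're really talking to your tenant
Remove-PSSession $session                                    # clean up when you're done

Same cmdlets, same workflow you’ll lean on in a real migration, just pointed at an Enterprise of one.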

Any way you cut it, O365 E1 is an amazingly affordable and very effective way to confront your cloud angst and get comfortable with Office 365. Even if you can’t fully simulate your workplace Exchange stack, you should consider doing this; you will use these same tools (particularly PowerShell remoting, the wizards in O365 & dirsync) at some point, so best to get familiar with them now.

I could have hosted my Daisetta Labs.net domain anywhere; but I have zero regrets putting it in O365 on the E1 plan and committing for 12 months. If you’re an IT pro like me trying to get your infrastructure to the Microsoft cloud, you’d be well-served by doing the same thing I did. You may even want to ditch your personal email account and just go full Office 365…to eat the same dog food we’re going to serve to our users soon.

More to come on this tomorrow; suffice it to say, DaisettaLabs.net is dirsyncing as I write this. I’ll have screenshots, wizard processes and more to show.

Cloud Praxis #1: Advice for Microsoft IT Pros w/ Cloud angst

It’s been a tough year for those of us in IT who engineer, deploy, support & maintain Microsoft technology products.

First, Windows 8 happened, which, as I’ve written about before, sent me into a downward spiral of confusion and despair. Shortly after that but before Windows 8.1, Microsoft killed off Technet subscriptions in the summer of 2013, telling Technet fans they should get used to the idea of MSDN subscriptions. As the fall arrived, Windows 8.1 and 2012 R2 cured my Chrome fever just as Ballmer & Crew were heading out the door.

Next, Microsoft took Satya Nadella out of his office in the Azure-plex and sat him behind the big mahogany CEO desk at One Microsoft Way. I like Nadella, but his selection spelled more gloom for Microsoft Infrastructure IT guys; remember it was Nadella who told the New York Times that Microsoft’s on-prem infrastructure products are old & tired and don’t make money for Microsoft anymore.

And then, this spring…first at BUILD, then TechEd, Microsoft did the unthinkable. They invited the Linux & open source guys into the tent, sat them in the front row next to the developers and handed them drinks and party favors, while more or less making us on-prem Infrastructure guys feel like we were crashing the party.

No new products announced for us at BUILD or TechEd, ostensibly the event built for us. Instead, the TechEdders got Azured on until they were blue in the face, leading Ars’ @DrPizza to observe:


We think it feels pretty shitty Dr. Pizza, that’s how. It feels like we’re about to be made obsolete, that we in the infrastructure side of the IT house are about to be disrupted out of existence by Jeffrey Snover’s cmdlets, Satya’s business sense and something menacingly named the Azure Pack.

And the guys who will replace us are all insufferable devs, Visual Studio jockeys who couldn’t tell you the difference between a spindle and a port-channel, even when threatened with a C#.

Which makes it hurt even more Dr. Pizza, if that is your real name.

But it also feels like a wake-up call and a challenge. A call to end the cynicism and embrace this cloud thing because it’s not going away. In fact, it’s only getting bigger, encroaching more and more each day into the DMZ and onto the LAN, forcing us to reckon with it.

The writing’s on the wall fellow Microsofties. BPOS uptime jokes were funny in 2011 and Azure doesn’t go down anymore because of expired certs. The stack is mature, scalable, and actually pretty awesome (even if they’re still using .vhd for VMs, which is crazy). It’s time we step up, adopt the language & manners of the dev, embrace the cloud vision, and take charge & ownership of our own futures.

I’d argue that learning Microsoft’s cloud is so urgent you should be exploring it and getting experienced with it even if your employer is cloud-shy and can’t commit. Don’t wait on them if that’s the case; do it yourself!

Because, if you don’t, you’ll get left behind. Think of the cloud as an operating system or technology platform and now imagine your resume in two, five, or seven years without any Office 365 or Azure experience on it. Now think of yourself actually scoring an interview, sitting down before the guy you want to work for in 2017 or 2018, and awkwardly telling him you have zero or very little experience in the cloud.

Would you hire that guy? I wouldn’t.

That guy will end up where all failed IT Pros end up: at Geek Squad, repairing consumer laptops & wifi routers and up-selling anti-virus subscriptions until he dies, sad, lonely & wondering where he went wrong.

Don’t be that guy. Aim for #InfrastructureGlory on-prem, hybrid, or in the cloud.

Over the coming days, I’ll show you how I did this on my own in a series of posts titled Cloud Praxis.

[table]

Link, On-prem/Hybrid/Cloud?, Notes

Cloud Praxis #2, On Prem, General guidance on building an AD lab to get started

Cloud Praxis #3, Cloud, Wherein I think about on-prem email and purchase an O365 E1 sub

Cloud Praxis #4, -, Forthcoming; likely DirSync-focused

Cloud Praxis #5, Hybrid, Got 24 days & $100 in Azure credits + a wildcard SSL cert. Floor it!

[/table]

 

Labworks 1:4-7 – The Last Word in ZFS Labworks

Greetings to you Labworks readers, consumers, and conversationalists. Welcome to the last verse of Labworks Chapter 1, which has been all about building a durable and performance-oriented ZFS storage array for Hyper-V and/or VMware.

Let’s review where we’ve been:

[table]

Labworks Chapter, Verse, Subject, Title & URL

Labworks 1:, 1, Storage, Building a Durable and Performance-Oriented ZFS Box for Hyper-V & VMware

, 2-3, Storage, I Heart the ARC & Let’s Pull Some Drives!

[/table]

Today we’re going to circle back to the very end of Labworks 1:1, where I assigned myself some homework: find out why my writes suck so bad. We’re going to talk about a man named ZIL and his sidekick the SLOG and then we’re going to check out some Excel charts and finish by considering ZFS’ sync models.

But first, some housekeeping: SAN2, the ZFS box, has undergone minor modification. You can find the current array setup below. Also, I have a new switch in the Daisetta Lab, and as switching is intimately tied to storage networking & performance, it’s important I detail a little bit about it.

Labworks 1:4 – Small Business SG300 vs Catalyst 2960S

Cisco’s SG-300 & SG-500 series switches are getting some pretty good reviews, especially in a home lab context. I’ve got an SG-300 and really like it, as it offers a solid spectrum of switching options at Layer 2 as well as a nice Layer 3-lite mode, all for a tick under $200. It even has a real web interface if you’re CLI-shy, which I’m not, but some folks are.

Small Business Cisco != Linksys

Sadly for me & the Daisetta Lab, I need more ports than my little SG-300 has to offer. So I’ve removed it from my rack and swapped it for a 2960S-48TS-L from the office, but not just any 2960S.

No, I have spiritual & emotional ties to this 2960s, this exact one. It’s the same 2960s I used in my January storage bakeoff of a Nimble array, the same 2960s on which I broke my Hyper-V & VMware cherry in those painful early days of virtualization, yes, this five year old switch is now in my lab:

The pride of Cisco’s 2009 Desktop Switching series, the 2960s

Sure, it’s not a storage switch; in fact it’s meant for IDFs and end users, and if the guys on that great storage networking podcast from a few weeks back knew I was using this as a storage switch, I’d be finished in this industry for good.

But I love this switch and I’m glad it’s at the top of my rack. I saved 1U, the energy costs of this switch vs two smaller ones are probably a wash, and though I lost Layer 3 Lite, I gained so much more: 48 x 1GbE ports and full LAN-licensed Cisco IOS v15.2, which, agnostic computing goals aside for a moment, just feels so right and so good.

And with the increased amount of full-featured switch ports available to me, I’ve now got LACP teams of three on agnostic_node_1 & 2, jumbo frames from end to end, and the same VLAN layout.

Here’s the updated Labworks schematic and the disk layout for SAN2:

Lab 1-4-5 - Daisetta Labs

[table]

Disk Type, Quantity, Size, Format, Speed, Function

WD Red 2.5″ with NASWARE, 6, 1TB, 4KB AF, SATA 3 5400RPM, Zpool Members

Samsung 840 EVO SSD, 1, 128GB, 512byte, SATA 3, L2ARC Read Cache

Samsung 830 SSD, 1, 128GB, 512byte, SATA 3, L2ARC Read Cache

Seagate 2.5″ Momentus, 1, 500GB, 512byte, 80MB/r/w, Boot/swap/system

[/table]

Labworks 1:5 – A Man named ZIL and his sidekick, the SLOG

Labworks 1:1 was all about building durable & performance-oriented storage for Hyper-V & VMware. And one of the unresolved questions I aimed to solve out of that post was my poor write performance.

Review the hardware table and you’ll feel like I felt. I got me some SSD and some RAM, I provisioned a ZIL, so write-cache that inbound IO already, ZFS, amiright? Show me the IOPS money, Jerry!

Well, about that. I mischaracterized the ZIL and I apologize to readers for the error. Let’s just get this out of the way: The ZFS Intent Log (ZIL) is not a write-cache device as I implied in Labworks 1:1.

ZFS storage layout in excellent Good/Better/Best format, courtesy of Nexenta, which has some outstanding documentation & guides

The ZIL, whether spread out among your rotational disks by ZFS design or applied to a Separate Log Device (a SLOG), is simply a synchronous-writes mechanism: a log designed to ensure data integrity and report (IO ACK) back to the application layer that writes are safe on stable storage. The ZIL & SLOG are also disaster recovery mechanisms/devices; in the event of power loss, the ZIL, or the ZIL functioning on a SLOG device, will ensure that the writes it logged prior to the event are written to your spinners when your disks are back online.

Now there seem to be some differences in how the various implementations of ZFS look at the ZIL/SLOG mechanism.

Nexenta Community Edition, based on Illumos (the open source descendant of Sun’s Solaris), says your SLOG should just be a write-optimized SSD, but even that’s more best practice than hard & fast requirement. Nexenta touts the ZIL/SLOG as a performance multiplier, and their excellent documentation has helpful charts and graphics reinforcing that.

In contrast, the most popular FreeBSD ZFS implementation’s documentation paints the ZIL as likely more trouble than it’s worth. FreeNAS actively discourages you from provisioning a SLOG unless it’s enterprise-grade, accurately pointing out that the ZIL & a SLOG device aren’t write-cache and probably won’t make your writes faster anyway, unless you’re NFS-focused (which I’m proudly, defiantly even, not) or operating a large database at scale.

ZIL me

What’s to account for the difference in documentation & best practice guides? I’m not sure; some of it’s probably related to *BSD vs Illumos implementations of ZFS, some of it’s probably related to different audiences & users of the free tier of these storage systems.

The question for us here is this: Will you benefit from provisioning a SLOG device if you build a ZFS box for Hyper-V and VMWare storage for iSCSI?

I hate sounding like a waffling storage VAR here, but I will: it depends. I’ve run both Nexenta and NAS4Free; when I ran Nexenta, I saw my SLOG being used during random & synchronous write operations. In NAS4Free, the SSD I had dedicated as a SLOG never showed any activity in zfs-stats, gstat or any other IO disk tool I could find.

One could spend weeks of valuable lab time verifying under which conditions a dedicated SLOG device adds performance to your storage array, but I decided to cut bait. Check out some of the links at the bottom for more color on this, but in the meantime, let me leave you with this advice: if you have $80 to spend on your FreeBSD-based ZFS storage, buy an extra 8GB of RAM rather than a tiny, used SLC or MLC device to function as your SLOG. You will almost certainly get more performance out of a larger ARC than by dedicating a disk as your SLOG.
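For reference, here’s roughly what provisioning (and later removing) cache & log devices looks like on a FreeBSD-based ZFS box. This is a sketch; the pool and device names follow this lab’s layout (Alpha-Pool, ada6/ada7 as the SSDs), so substitute your own.

# A sketch, assuming the pool is named Alpha-Pool and ada6/ada7 are the SSDs.
zpool add Alpha-Pool cache ada6      # dedicate an SSD as L2ARC read cache
zpool add Alpha-Pool log ada7        # dedicate an SSD as a SLOG (separate ZIL device)
zpool iostat -v Alpha-Pool 5         # watch per-device IO to see whether the SLOG actually gets used
zpool remove Alpha-Pool ada7         # log & cache devices can be removed again if they don't pay off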

Labworks 1:6 – Great…so, again, why do my writes suck? 

Recall this SQLIO test from Labworks 1:1:

sqlio lab 1 short test

As you can see, read or write, I was hitting a wall at around 235-240 megabytes per second during much of “Short Test”, which is pretty close to the theoretical limit of an LACP team with two GigE NICs.

But as I said above, we don’t have that limit anymore. Whereas there were once 2x1GbE Teams, there are now 3x1GbE. Let’s see what the same test on the same 4KB block/4KB NTFS volume yields now.

SQLIO short test, take two, sort by Random vs Sequential writes & reads:

labworks147

By Jove, what’s going on here? This graph was built off the same SQLIO recipe, but looks completely different from Labworks 1. For one, the writes look much better and the reads look much worse. Yet step back and the patterns are largely the same.

It’s data like this that makes benchmarking, validating & ultimately purchasing storage so tricky. Some would argue with my reliance on SQLIO and those arguments have merit, but I feel SQLIO, which is easy to script/run and automate, can give you some valuable hints into the characteristics of an array you’re considering.
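The exact recipe isn’t reproduced here, but the runs look roughly like the sketch below. The flags and the T:\ test volume are illustrative rather than a prescription, and the test file is assumed to already exist from a first parameter-file run.

# A hedged sketch of a SQLIO sweep against the iSCSI-mounted, 4KB-block NTFS volume (here T:\).
$blockSizesKB = 8, 32, 64
foreach ($kb in $blockSizesKB) {
    & sqlio.exe -kW -frandom     -t4 -o8 "-b$kb" -s60 -BN T:\testfile.dat   # random writes at this block size
    & sqlio.exe -kR -fsequential -t4 -o8 "-b$kb" -s60 -BN T:\testfile.dat   # sequential reads at this block size
}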

Let’s look at the writes question specifically.

Am I really writing 350MB/s to SAN2?

On the one hand, everything I’m looking at says YES: I am a Storage God and I have achieved #StorageGlory inside the humble Daisetta Lab HQ on consumer-level hardware:

  • SAN2 is showing about 115MB/s to each Broadcom interface during the 32KB & 64KB samples
  • Agnostic_Node_1 perfmon shows about the same amount of traffic egressing the three vEthernet adapters
  • The 2960S is reflecting all that traffic; I’m definitely pushing about 350 megabytes per second to SAN2; interface port channel 3 shows TX load at 219 out of 255 and maxing out my LACP team

On the other hand, I am just an IT Mortal and something bothers me:

  • CPU is very high on SAN2 during the 32KB & 64KB runs…so busy it seems like the little AMD CPU is responsible for some of the good performance marks
  • While I’m a fan of the itsy-bitsy 2.5″ Western Digital RED 1TB drives in SAN2, under no theoretical IOPS model is it likely that six of them, in RAIDZ2 (RAID 6 equivalent), can achieve 5,000-10,000 IOPS under traditional storage principles. Each drive by itself is capable of only 75-90 IOPS
  • If something is too good to be true, it probably is

Sr. Storage Engineer Neo feels really frustrated at this point; he can’t figure out why his writes suck, or even if they suck, and so he wanders up to the Oracle to get her take on the situation and comes across this strange Buddha Storage kid.

Labworks 1:7 – The Essence of ZFS & New Storage model

In effect, what we see here is just a sample of the technology & techniques that have been disrupting the storage market for several years now: compression & caching multiply the performance of storage systems beyond what they should be capable of, in certain scenarios.

As the chart above shows, the test2 volume is compressed by SAN2 using lzjb. On top of that, we’ve got the ZFS ARC, L2ARC, and the ZIL in the mix. And then, to make things even more complicated, we have some sync policies ZFS allows us to toggle. They look like this:

sync policy

The sync toggle documentation is out there and you should understand it, as it is crucial to understanding ZFS, but I want to demonstrate the choices as well.

I’ve got three choices + the compression options. Which one of these combinations is going to give me the best performance & durability for my Hyper-V VMs?
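On the ZFS box, toggling those choices is a one-liner per setting. A sketch, using this lab’s zvol name (Alpha-Pool/Test2); substitute your own dataset:

# The three sync policies and the compression options under test, set per dataset/zvol.
zfs set sync=standard Alpha-Pool/Test2      # default: honor the application's sync requests
zfs set sync=always   Alpha-Pool/Test2      # commit every write synchronously (safest, slowest)
zfs set sync=disabled Alpha-Pool/Test2      # ACK writes immediately (fastest, riskiest)
zfs set compression=lzjb Alpha-Pool/Test2   # or lz4, or gzip-9
zfs get sync,compression,compressratio Alpha-Pool/Test2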

SQLIO Short Test Runs 3-6, all PivotTabled up for your enjoyment and ease of digestion:

compressionsync

As is usually the case in storage, IT, and hell, life in general, there are no free lunches here people. This graph tells you what you already know in your heart: the safest storage policy in ZFS-land (Always Sync, that is to say, commit writes to the rotationals post haste as if it was the last day on earth) is also the slowest. Nearly 20 seconds of latency as I force ZFS to commit everything I send it immediately (vs flush it later), which it struggles to do at a measly average speed of 4.4 megabytes/second.

Compression-wise, I thought I’d see a big difference between the various compression schemes, but I don’t. lzjb, lz4, and the ultra-space-saving/high-CPU-cost gzip-9 all turn in about equal results from an IOPS & performance perspective. It’s almost a wash, really, and that’s likely because of the predictable nature of the IO SQLIO is generating.

Labworks 1:Epilogue

Last point: ZFS, as Chris Wahl pointed out, is a sort of virtualization layer atop your storage. Now if you’re a virtualization guy like me or Wahl, that’s easy to grasp; Windows 2012 R2’s Storage Spaces concept is similar in function.

But sometimes in virtualization, it’s good to peel away the abstraction onion and watch what that looks like in practice. ZFS has a number of tools and monitors that look at your Zpool IO, but to really see how ZFS works, I advise you to run gstat. gstat shows what your disks are doing, and if you’ve carefully set up your environment, you ought to be able to see the effects of your settings on each individual spindle.
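If you want to follow along at home, this is the kind of thing I watch on the NAS4Free console (a sketch; the regex filter and pool name match this lab’s layout):

gstat -f 'ada|zvol'           # live per-device busy %, ops/s and throughput: physical disks and zvols side by side
zpool iostat -v Alpha-Pool 2  # the same story from ZFS' point of view, refreshed every 2 seconds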

In this Gifcam, watch ada0-5 (the Western Digitals) as they struggle under load with the “Always Sync” option enabled. Notice that the zvol/Alpha-Pool/Test2 volume (the logical volume construct) is at 100% busy and the ops/s are not very stellar.

Now look at this gstat sample. Under SQLIO load, the zvol is showing 10,000 IOPS, 300+MB/s. But ada0-5, the physical drives, aren’t doing squat for several seconds at a time as SAN2 absorbs & processes all the IO coming at it.

That, friends, is the essence of ZFS.

 Links/Knowledge/Required Reading Used in this Post:

[table]
Resource, Author, Summary

Nexenta’s awesome whitepapers and guides, Nexenta, Find ’em and collect ’em; good stuff on MPIO config and ZFS performance

Comparing SSD vs NoSSD in Nexenta w/NFS, Larry Smith, A fellow ZFS fan with more focus on NFS & VMware

Get the Most out of ZFS SSD, Sebastian “vBagpipes” Laubscher, Sebastian finds a different way to provision the ZIL/SLOG

Nexenta & Scale, Hans DeLeenHeer, Fellow #TFD delegate looks at ZFS tiers in superhero context

SLOG/ZIL Insight, FreeNAS forum, Great forum-focused post on SLOG/ZIL in BSD ZFS

SLOG Blog, Oracle, 2007 post about the ZIL & SLOG heralding storage di

 Zpool and ZIL management, Magnus Strahlert, Excellent how-to guide for ZIL/L2ARC provisioning

[/table]

 

Labworks 2:5-8 – Get-Me -ConvergedSwitching -For “Hyper-V” | Now-Please

Hello Labworks fans, detractors and partisans alike, hope you had a nice Easter / Resurrection / Agnostic Spring Celebration weekend.

Last time on Labworks 2:1-4, we looked at some of the awesome teaming options Microsoft gave us with Server 2012 via its multiplexor driver. We also made the required configuration adjustments on our switch for jumbo frames & VLAN trunking, then we built ourselves some port channel interfaces flavored with LACP.

I think the multiplexor driver/protocol is one of the great (unsung?) enhancements of Server 2012/R2 because it’s a sort of pre-virtualization abstraction layer (That is to say, your NICs are abstracted & standardized via this driver before we build our important virtual switches) and because it’s a value & performance multiplier you can use on just about any modern NIC, from the humble RealTek to the Mighty Intel Server 10GbE.
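For reference, building that team boils down to a single cmdlet. A sketch only: the team and adapter names are this lab’s, and the switch side needs a matching LACP port-channel (see Labworks 2:1).

# A sketch of the LBFO team behind everything that follows; substitute your own NIC names.
New-NetLbfoTeam -Name "Daisetta-Team" `
    -TeamMembers "Ethernet 2","Ethernet 3","Ethernet 4","Ethernet 5" `
    -TeamingMode Lacp -LoadBalancingAlgorithm Dynamic   # Dynamic needs 2012 R2; use HyperVPort on plain 2012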

But I’m getting too excited here; let’s get back to the curriculum and get started shall we?

Goals

5.  Understand what Microsoft’s multiplexor driver/LBFO has done to our NICs

6. Build our Virtual Machine Switch for maximum flexibility & performance

7. The vEthernets are Coming

8. Next Steps: Jumbo frames from End-to-end and performance tuning

Schematic:

Lab 2 - Daisetta Labs overview

2:5 Understand what Microsoft’s Multiplexor driver/LBFO has done to our NICs

So as I said above, the best way to think about the multiplexor driver & Microsoft’s Load Balancing/Failover tech is by viewing it as a pre-virtualization abstraction layer for your NICs. Let’s take a look.

Our Network Connections screen doesn’t look much different yet, save for one new decked-out icon labeled “Daisetta-Team:”

daisettateam

Meanwhile, this screen is still showing the four NICs we joined into a team in Labworks 2:3, so what gives?

A click on the properties of any of those NICs (save for the RealTek) reveals what’s happened:

Egads! My Intel NIC has been neutered by LBFO

The LBFO process unbinds many (though not all) settings, configurations, protocols and certain driver elements from your physical NICs, then binds the fabulous Multiplexor driver/protocol to the NIC as you see in the screenshot above.

In the dark days of 2008 R2 & Windows Core, when we had to walk uphill to school both ways in the snow, I had to download and run a cmd tool called nvspbind to get this kind of information.

Fortunately for us in 2012 & R2, we have some simple cmdlets:

daisettateam3
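The screenshot above boils down to cmdlets like these (a sketch; “Ethernet 4” and “Daisetta-Team” are this lab’s names):

Get-NetLbfoTeam                                # the team, its teaming mode (LACP) and load-balancing algorithm
Get-NetLbfoTeamMember -Team "Daisetta-Team"    # the physical NICs enslaved to the team
Get-NetAdapterBinding -Name "Ethernet 4"       # shows the multiplexor protocol bound, TCP/IP et al. unbound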

So notice Microsoft has essentially stripped “Ethernet 4” of all that would have made it special & unique amongst my 4x1GbE NICs; where I might have thought to tag a VLAN onto that Intel GbE, the multiplexor has stripped that option out. And if I had statically assigned an IP address to this interface, it would be gone too: TCP/IP v4 & v6 are no longer bound to the NIC itself, so it can no longer hold an IP address.

And the awesome thing is you can do this across NICs, even NICs made by separate vendors. I could, for example, mix the sacred NICs (Intel) with the profane NICs (RealTek)…it don’t matter, all NICs are invited to the LBFO party.

No extra licensing costs here either; if you own a Server 2012 or 2012 R2 license, you get this for free, which is all kinds of kick ass as this bit of tech has allowed me in many situations to delay hardware spend. Why go for 10GbE NICs & Switches when I can combine some old Broadcom NICs, leverage LACP on the switch, and build 6×1 or 8x1GbE Converged LACP teams?

LBFO even adds up all the NICs you’ve given it and teases you with a calculated LinkSpeed figure, which we’re going to hold it to in the next step:

A 4Gb/s LACP team sounds great, but is it really 4Gb/s?
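That teased figure is right there in PowerShell, too (a sketch):

Get-NetAdapter -Name "Daisetta-Team" | Format-List Name, LinkSpeed, Status   # should report 4 Gbps for a 4x1GbE team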

2:6 Build our Virtual Machine Switch for maximum flexibility & performance

If we just had the multiplexor protocol & LBFO available to us, it’d be great for physical server performance & durability. But if you’re deploying Hyper-V, you get to have your LBFO cake and eat it too, by putting a virtual switch atop the team.

This is all very easy to do in Hyper-V manager. Simply right click your server, select Virtual Switch Manager, make sure the Multiplexor driver is selected as the NIC, and press OK.

Bob’s your Uncle:

daisettaconverged1

But let’s go a bit deeper and do this via powershell, where we get some extra options & control:

PS C:\users\jeff.DAISETTALABS> new-vmswitch -NetAdapterInterfaceDescription "Microsoft Network Adapter Multiplexor Driver" -AllowManagementOS 1 -MinimumBandwidthMode Weight -Name "Daisetta-Converged"

Let’s go through each of these:

  • New-vmswitch : the cmdlet we’re invoking to build the switch. Run get-help new-vmswitch for a rundown of the cmdlet’s structure & options
  • -NetAdapterInterfaceDescription : here we’re telling Windows which NIC to build the VM Switch on top of. Get the precise name from Get-NetAdapter and enclose it in quotes
  • -AllowManagementOS 1 : Recall the diagram above. This boolean switch (1 yes, 0 no) tells Windows to create the VM Switch & plug the Host/Management Operating System into said switch. You may or may not want this; in the lab I say yes; at work I’ve used no.
  • -MinimumBandwidthMode Weight : We lay out the rules for how the switch will apportion some of the 4Gb/s bandwidth available to it. By using “Weight,” we’re telling the switch we’ll assign some values later (see the sketch below)
  • -Name : Name your switch

A few seconds later, and congrats Mr. Hyper-V admin, you have built a converged virtual switch!
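About those “values later”: once the management-OS vEthernets exist (that’s 2:7, next), the weights get assigned per adapter. A sketch with illustrative numbers, not a recommendation:

Set-VMNetworkAdapter -ManagementOS -Name LM      -MinimumBandwidthWeight 20   # Live Migration gets a healthy slice
Set-VMNetworkAdapter -ManagementOS -Name CSV     -MinimumBandwidthWeight 10
Set-VMNetworkAdapter -ManagementOS -Name iSCSI-1 -MinimumBandwidthWeight 20   # repeat for iSCSI-2 & iSCSI-3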

2:7 The vEthernets are Coming

Now that we’ve built our converged virtual switch, we need to plug some things into it. And that starts on the physical host.

If you’re building a Hyper-V cluster or stand-alone Hyper-V host with VMs on networked storage, you’ll approach vEthernet adapters differently than if you’re building Hyper-V for VMs on attached/internal storage or on SMB 3.0 share storage. In the former, you’re going to need storage vEthernet adapters; in the latter you won’t need as many vEthernets unless you’re going multi-channel SMB 3.0, which we’ll cover in another Labworks session.

I’m going to show you the iSCSI + Failover Clustering model.

In traditional Microsoft Failover Clustering for Virtual Machines, we need a minimum of five discrete networks. Here’s how that shakes out in the Daisetta Lab:

[table]

Network Name, VLAN ID, Purpose, Notes

Management, 1, Host & VM management network, You can separate the two if you like

CSV, 14, Host Cluster & communication and coordination, Important for clustering Hyper-V hosts

LM, 15, Live Migration network, When you must send VMs from the broke host to the host with the most, LM is there for you

iSCSI 1-3, 11-13, Storage, Somewhat controversial but supported

[/table]

Now you should be connecting the dots: remember in Labworks 2:1, we built a trunked port-channel on our Cisco 2960S for the sole purpose of these vEthernet adapters & our converged switch.

So, we’re going to attach tagged vEthernet adapters to our host via PowerShell. Pay attention here to the “-ManagementOS” flag; though our converged switch is for virtual machines, we’re using it for our physical host as well.

You can script this out of course (and VMM does that for you), but if you just want to copy & paste, do it in this order:

  • Add the vEthernets
add-vmnetworkadapter -managementos -name CSV -switchname Daisetta-converged
add-vmnetworkadapter -managementos -name iSCSI-1 -switchname Daisetta-converged
add-vmnetworkadapter -managementos -name iSCSI-2 -switchname Daisetta-converged
add-vmnetworkadapter -managementos -name iSCSI-3 -switchname Daisetta-converged
add-vmnetworkadapter -managementos -name LM -switchname Daisetta-converged
  • Tag those vEthernets!
Set-VMNetworkAdapterVlan -ManagementOS -Access -VlanId 15 -VMNetworkAdapterName LM
Set-VMNetworkAdapterVlan -ManagementOS -Access -VlanId 14 -VMNetworkAdapterName CSV
Set-VMNetworkAdapterVlan -ManagementOS -Access -VlanId 13 -VMNetworkAdapterName iSCSI-3
Set-VMNetworkAdapterVlan -ManagementOS -Access -VlanId 12 -VMNetworkAdapterName iSCSI-2
Set-VMNetworkAdapterVlan -ManagementOS -Access -VlanId 11 -VMNetworkAdapterName iSCSI-1
  • Now set IPs
New-NetIPAddress -IPAddress 172.16.14.12 -InterfaceAlias "vEthernet (CSV)" -AddressFamily IPv4 -PrefixLength 24
New-NetIPAddress -IPAddress 172.16.15.12 -InterfaceAlias "vEthernet (LM)" -AddressFamily IPv4 -PrefixLength 24
New-NetIPAddress -IPAddress 172.16.13.12 -InterfaceAlias "vEthernet (iSCSI-3)" -AddressFamily IPv4 -PrefixLength 24
New-NetIPAddress -IPAddress 172.16.12.12 -InterfaceAlias "vEthernet (iSCSI-2)" -AddressFamily IPv4 -PrefixLength 24
New-NetIPAddress -IPAddress 172.16.11.12 -InterfaceAlias "vEthernet (iSCSI-1)" -AddressFamily IPv4 -PrefixLength 24

Notice we didn’t include a gateway in the New-NetIPAddress cmdlet; that’s because when we built our virtual switch with “-AllowManagementOS 1”, Windows automatically provisioned a vEthernet adapter for us, which either got an IP via DHCP or took an APIPA address.
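A quick sanity check that the vEthernets exist, carry the right VLAN tags, and got their IPs (a sketch):

Get-VMNetworkAdapter -ManagementOS | Format-Table Name, SwitchName   # every vEthernet hanging off Daisetta-Converged
Get-VMNetworkAdapterVlan -ManagementOS                               # the access VLAN tags we just set
Get-NetIPAddress -InterfaceAlias "vEthernet (iSCSI-1)" -AddressFamily IPv4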

So now we have our vEthernets and their appropriate VLAN tags:

daisettaconverged2
Ignore the DMZ vEthernet for now. Notice Daisetta-Converged, our VM Switch, is seen as a VMNetworkAdapter and is untagged. In my lab, this interface functions as my Host Management interface. In a production scenario, you’ll probably use separate vEthernet adapters for Host Management and not expose the switch itself to the management OS.

2:8: Next Steps : Jumbo Frames from end-to-end & Performance Tuning

So if you’ve made it this far, congrats. If you do nothing else, you now have a converged Hyper-V virtual switch, tagged vEthernets on your host, and a virtualized infrastructure that’s ready for VMs.

But there’s more you can do; stay tuned for the next labworks post where we’ll get into jumbo frames & performance tuning this baby so she can run with all the bandwidth we’ve given her.
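Here’s a preview of the host side of that, a sketch only: the “Jumbo Packet” advanced-property name and its values are driver-dependent, and the 2960S ports plus SAN2 have to be configured to match, or you’ll get fragmentation instead of glory.

Get-NetAdapterAdvancedProperty -DisplayName "Jumbo Packet" | Format-Table Name, DisplayValue
Set-NetAdapterAdvancedProperty -Name "vEthernet (iSCSI-1)" -DisplayName "Jumbo Packet" -DisplayValue "9014 Bytes"
ping 172.16.11.10 -f -l 8500   # don't-fragment ping toward the SAN to prove the path end to end (IP is hypothetical)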

Links/Knowledge/Required Reading Used in this Post:

[table]
Resource, Author, Summary
New-VMSwitch Technet, Microsoft, Always good to have Technet reference
Building a Converged Fabric with Server 2012, Hans “The Hyper-Dutchman” Vredevoort, A 2012 post which helped me when I was struggling through 2008 R2 to 2012 Hyper-V migration

Hyper-V 3.0 Converged Networks with Force 10 and DCB, Dell, Neat Wiki & diagram with iSCSI as separate virtual switch but with DCB

[/table]

 

 

Labworks 1:1-2 : I Heart the ARC & Let’s Pull Some Drives!

Last week on Labworks’ debut, Labworks #1 : Building a Durable & Performance Oriented ZFS Box for Hyper-V & VMware, I discussed & shared a few tips, observations & excellent resources for building out a storage layer for your home IT lab using Sun’s Oracle’s the open source community’s Illumos’ the awesome Zettabyte File System via the excellent NAS4Free crew and FreeBSD.

The post has gotten quite a bit of traffic and I hope it’s been helpful to folks. I intended to do the followup posts soon after that, but boy, have I had a tough week in technology. 

Let’s hop to it, shall we?

Labworks 1:1-2 : I Heart the ARC and Let’s Pull Some Drives!

When we left Labworks 1, I assigned myself some homework. Here’s an update on each of those tasks, the grade I’d give myself (I went to Catholic school so I’m characteristically harsh) and some notes on the status:

[table]
Next Step, Completed, Grade, Notes
Find out why my writes suck, Kind of, B-, Replaced Switch & Deep dived the ZIL

Test NAS4Free’s NFS Performance, No, F, One Pink Screen of Death too many

Test SMB 3.0 from a VM inside ZFS box, No, F, Block vs File Bakeoff plans

Sell some stuff, No, C, Other priorities

Rebuild rookie standard switch into distributed, No, F, Can’t build a vSwitch without a VMware host
[/table]

I have updates on all of these items, so if you’re curious stick around, as they’ll be posted in subsequent Labworks. Suffice it to say, there have been some infrastructure changes in the Daisetta Lab, and so here’s an updated physical layout with a Skull & Crossbones over my VMware host, which I put out of its misery last week.

Lab 1a - Daisetta Labs

In the meantime, I wanted to share some of the benefits of ZFS for your Hyper-V or VMware lab.

1:1 – I Heart the ARC

So I covered some of the benchmark results in Labworks 1, but I wanted to get practical this week. Graphs & benchmarks are great, but how does ZFS storage architecture benefit your virtualization lab?

Dramatically.

At least in the case of Hyper-V, Cluster Shared Volumes and dynamic .vhdxs on iSCSI.

To really show how it works, I had to zero out my ARC, empty the L2ARC, and wipe the read/write counters on each physical volume. And to do that, I had to reboot SAN2. My three virtual machines (a Windows 7 VM, a SQL 2014 VM, and a Virtual Machine Management server) had to be shut down, and just to do it right and by the book, I placed both the CSV & LUN mapped to Node-1 into maintenance mode.

And then I started the whole thing back up. Here are the results followed by two animated gifs. Remember, the ARC is your system RAM, so watch how it grows as ZFS starts putting VMs into RAM and L2ARC, my SSD drives:

[table]
ARC Size (Cold Boot), ARC Size after VM Boot, ARC Size +5h, L2ARC Size (Cold Boot), L2ARC Size after VM Boot, L2ARC Size + 5h

7MB, 10GB, 14GB, 900KB, 4.59GB, 6.8GB

[/table]
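If you want to pull these numbers on your own box, one way is straight from the FreeBSD sysctls (a sketch; zfs-stats is the FreeBSD package referenced elsewhere in these posts):

sysctl kstat.zfs.misc.arcstats.size      # current ARC size, in bytes
sysctl kstat.zfs.misc.arcstats.l2_size   # current L2ARC size, in bytes
zfs-stats -A                             # friendlier ARC summary, if the package is installed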

So for you in your lab, or if you’re pondering similar tech at work, what’s this mean?

Boot speed of your VM fleet is the easiest to quantify and the greatest to behold, but that’s just for starters.

ZFS’ ARC & L2ARC shaved over 80% off my VM’s boot times and massively reduced load on rotational disks on the second boot.

Awesome stuff:

[table]
Win7 Cold Boot to Login, Highest ZVol %busy, SSD Read/Write Ops, Win7 2nd Boot to Login, Highest ZVol %busy, SSD Read/Write Ops

121s, 103%, 8/44, 19.9s, 13%, 4/100k

[/table]

The gains here are enormous and hint at the reasons why SSD & caching are so attractive. Done right, the effect is multiplicative in nature; you’re not just adding IOPS when you add an SSD, you’re multiplying storage performance by several orders of magnitude in certain scenarios. And VM boot times are such a scenario where the effect is very dramatic:

[table]

% Improvement in Boot Time, ZVol %Busy Decrease, %ARC Growth, L2ARC Growth

84%, -87%, 43%, 410%

[/table]

This is great news if you’re building lab storage because, as I said in Labworks 1, if you’re going to have to use an entire physical box for storage, best to use every last bit of that box for storage, including RAM. ZFS is the only non-commercial system I know of to give you that ability, and though the investment is steep, the payoff is impressive. 

Now at work, imagine you have a fleet of 50 virtual machines, or 100 or more, and you have to boot or reboot them on a Saturday during your maintenance window. Without some sort of caching mechanism, be it a ZFS ARC & its MRU/MFU algorithms, or some of the newer stuff we saw at #VFD3 including Coho’s system & Atlantis’ ILIO USX, you’re screwed.

Kiss your Saturday goodbye because on old rotational arrays, you’re going to have to stagger your boots, spread it over two Saturdays, or suffer the logarithmic curve of filer entropy & death as more IO begets more IO delay in a vicious cycle of decay that will result in you banging your fists bloody on the filer, begging the storage gods for mercy and relief.

Oh man that was a painful Saturday four years ago.

I wish I could break down these results even further: what percentage of that 19s boot time is due to my .vhdx being stored in SAN2’s ARC, and what percentage is due, if any, to ZFS compression on the volume or by the CPU on the IO ‘stream’ itself, as I’ve got that particular box ticked on CSV1 as well?

That’s important to understand for lab work or your real job because SSD & caching are only half of the reason why the stodgy storage sector has been turned on its head. Step back and survey the new players vs the old, and I think you’ll find that many of the new players are reading & writing data to/from their arrays in more intelligent (or risky, depending on your perspective) ways, by leveraging the CPU to compress inbound IO, or de-duping on the front-end rather than on the back-end or, in the case of a Coho, just handing over the switch & Layer 2 to the array itself in wild yet amazing & extensible ways.

My humble NAS4Free box isn’t near those levels of sophistication, yet I don’t think it’s improper to draw an academic-family-tree-style dotted line between my ZFS lab storage & some of the great new storage products on the market that are using sophisticated caching algorithms & compression/processing to deliver high-performance storage downmarket, so downmarket that I’ve got fast storage in my garage!

Perhaps a future labworks will explore compression vs caching, but for now, let’s take a look at what ZFS is doing during the cold & warm boots of my VMs.

Single Pane O’GifGlass animated shot of the cold boot (truncated):

In the putty window, ada0-5 are HDD, ada6 & 7 are SSD, and ada8 is boot. gstat de-abstracts ZFS & shows you what your disks are doing. Check out how ZFS alternates writes to the two SSDs. Neat stuff.

And the near #StorageGlory Gifcam shot of the entire 19s 2nd boot cycle after my ARC & L2ARC are sufficiently populated:

80% decrease in boot times thanks to the ARC & L2ARC. Now ZFS has some idea of what my most frequently used & most recently used data is, and that algorithm will populate the ARC & L2ARC.

Of course, how often are we rebooting VMs anyway? Fair point.

One could argue the results above, while interesting, have limited applicability in a lab, a small enterprise or even a large one, but consider this: if you deliver applications via session virtualization technologies -XenApp or RDS come to mind- on top of a hypervisor (virtualization within virtualization for the win!), then ZFS and other caching systems will likely ease your pain and get your users to their application faster than you ever could achieve with rotational storage alone. So in my book, it’s something you should master and understand.

Durability Testing

So all this is great. ZFS performs very well for Hyper-V, the ARC/L2ARC paradigm works, and it’s all rather convincing isn’t it? I’ll save some thoughts on writes for a subsequent Labworks, but so far, things are looking up.

Of course you can’t be in IT and not worry about the durability & integrity of your data. As storage guys say, all else is plumbing; when it comes to data and storage, an array has to guarantee integrity.

This is probably the most enjoyable test of all IT testing regimes, if only because it’s so physical, so dramatic, so violent, and so rare. I’m talking about drive pulls & storage failure simulations, the kind of test you only get to do when you’re engaging in a PoC at work, and then, perhaps for SMB guys like me, only once every few years.

As I put it back in January when I was testing a Nimble array at work, “Wreck that array.”

At home of course I can’t afford true n+1 everywhere, let alone waste disks on something approaching the level of reliability of RAID DP, but I can at least test RAIDZ2, ZFS’ equivalent to RAID 6.

Drive Pull test below. Will my CSVs stay online? Click play.
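For the record, the ZFS side of recovering from a pull looks roughly like this (a sketch; pool and device names follow this lab’s layout):

zpool status Alpha-Pool         # shows the pool DEGRADED with the pulled members called out
zpool online Alpha-Pool ada3    # after re-seating the drive, bring the device back into the pool
zpool status -x                 # "all pools are healthy" once any resilvering finishes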

More Labworks results tomorrow!

Labworks #1: Building a durable, performance-oriented ZFS box for Hyper-V, VMware

Welcome to my first Labworks post in which I test, build & validate a ZFS storage solution for my home Hyper-V & VMware lab.

Be sure to check out the followup lab posts on this same topic in the table below!

[table]

Labworks Chapter, Section, Subject, Title & URL

Labworks 1:, 1, Storage, Building a Durable and Performance-Oriented ZFS Box for Hyper-V & VMware

,2-3, Storage, I Heart the ARC & Let’s Pull Some Drives!

[/table]

Labworks  #1: Building a durable, performance-oriented ZFS box for Hyper-V, VMware

Primary Goal: To build a durable and performance-oriented storage array using Sun’s fantastic, 128-bit, high-integrity Zettabyte File System for use with lab Hyper-V CSVs & Windows clusters, VMware ESXi 5.5, and other hypervisors.

 

The ARC: My RAM makes your SSD look like a couple of old, wheezing 15k drives

Secondary Goal: Leverage consumer-grade SSDs to increase/multiply performance by using them as ZFS Intent Log (ZIL) write-cache and L2ARC read cache

Bonus: The Windows 7 PC in the living room that’s running Windows Media Center with CableCARD & HD Home Run was running out of DVR disk space and can’t record to SMB shares but can record to iSCSI LUNs.

Technologies used: iSCSI, MPIO, LACP, Jumbo Frames, IOMETER, SQLIO, ATTO, Robocopy, CrystalDiskMark, FreeBSD, NAS4Free, Windows Server 2012 R2, Hyper-V 3.0, Converged switch, VMware, standard switch, Cisco SG300

Schematic: 


Hardware Notes:
[table]
System, Motherboard, Class, CPU, RAM, NIC, Hypervisor
Node-1, Asus Z87-K, Consumer, Haswell i-5, 24GB, 2x1GbE Intel I305, Hyper-V
Node-2, Biostar HZZMU3, Consumer, Ivy Bridge i-7, 24GB, 2x1GbE Broadcom BC5709C, Hyper-V
Node-3, MSI 760GM-P23, Consumer, AMD FX-6300, 16GB, 2x1GbE Intel i305, ESXi 5.5
san2, Gigabyte GA-F2A88XM-D3H, Consumer, AMD A8-5500, 24GB, 4x1GbE Broadcom BC5709C, NAS4Free
sw01, Cisco SG300-10 Port, Small Business, n/a, n/a, 10x1GbE, n/a
[/table]

Array Setup:

I picked the Gigabyte board above because it’s got an outstanding eight SATA 6Gbit ports, all running on the native AMD A88x Bolton-D4 chipset, which, it turns out, isn’t supported well in Illumos (see Lab Notes below).

I added to that a cheap $20 Marvell 9128se two-port SATA 6Gbit PCIe card, which hosts the boot volume & the SanDisk SSD.

[table]

Disk Type, Quantity, Size, Format, Speed, Function

WD Red 2.5″ with NASWARE, 6, 1TB, 4KB AF, SATA 3 5400RPM, Zpool Members

Samsung 840 EVO SSD, 1, 128GB, 512byte, 250MB/read, L2ARC Read Cache

SanDisk Ultra Plus II SSD, 1, 128GB, 512byte, 250MB/read & 250MB/write?, ZIL

Seagate 2.5″ Momentus, 1, 500GB, 512byte, 80MB/r/w, Boot/swap/system

[/table]

Performance Tests:

I’m not finished with all the benchmarking, which is notoriously difficult to get right, but here’s a taste. Expect a followup soon.

All shots below involved lzjb compression on SAN2

SQLIO Short Test: 

sqlio lab 1 short test
Obviously seeing the benefit of ZFS compression & ARC at the front end. IOPS become more realistic toward the middle and right as read cache is exhausted. Consistently around 150-240MB/s though, the limit of two 1GbE cables.

 

ATTO standard run:

atto
I’ve got a big write problem somewhere. Is it the ZIL, which doesn’t seem to be performing under BSD as it did under Nexenta? Something else? Could also be related to the Test Volume being formatted NTFS 64KB. Still trying to figure it out.

 

NFS Tests:

None so far. From a VMware perspective, I want to rebuild the Standard switch as a distributed switch now that I’ve got a VCenter appliance running. But that’s not my priority at the moment.

Durability Tests:

Pulled two drives -the limit on RAIDZ2- under normal conditions. Put them back in, saw some alerts about the “administrator pulling drives” and the Zpool being in a degraded state. My CSVs remained online, however. Following a short zpool online command, both drives rejoined the pool and the degraded error went away.

Fun shots:

Because it’s not all about repeatable lab experiments. Here’s a Gifcam shot from Node-1 as it completely saturates both 2x1GbE Intel NICs:

test

and some pretty blinking lights from the six 2.5″ drives:

0303141929-MOTION

Lab notes & Lessons Learned:

First off, I’d like to buy a beer for the unknown technology enthusiast/lab guy who uttered these sage words of wisdom, which I failed to heed:

You buy cheap, you buy twice

Listen to that man, would you? Because going consumer, while tempting, is not smart. Learn from my mistakes: if you have to buy, buy server boards.

Secondly, I prefer NexentaStor to NAS4Free with ZFS, but like others, I worry about and have been stung by OpenSolaris/Illumos hardware support. Most of that is my own fault, cf. the note above, but still: does Illumos have a future? I’m hopeful: NexentaStor is going to appear at next month’s Storage Field Day 5, so that’s a good sign, and version 4.0 is due out anytime.

The Illumos/Nexenta command structure is much more intuitive to me than FreeBSD. In place of your favorite *nix commands, Nexenta employs some great, verb-noun show commands, and dtrace, the excellent diagnostic/performance tool included in Solaris is baked right into Nexenta. In NAS4Free/FreeBSD 9.1, you’ve got to add a few packages to get the equivalent stats for the ARC, L2ARC and ZFS, and adding dtrace involves a make & kernel modification, something I haven’t been brave enough to try yet.

Next: Jumbo Frames for the win. From Node-1, the desktop in my office, my Core i5-4670K CPU would regularly hit 35-50% utilization during my standard SQLIO benchmark before I configured jumbo frames from end to end. Now, after enabling jumbo frames on the Intel NICs, the Hyper-V converged switch, the SG-300 and the ZFS box, utilization peaks at 15-20% during the same SQLIO test, and the benchmarks have shown an increase as well. Unfortunately in FreeBSD world, adding jumbo frames is something you have to do on the interface & routing table, and it doesn’t persist across reboots for me, though that may be due to a driver issue on the Broadcom card.
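For anyone fighting the same battle, the manual FreeBSD incantation looks roughly like this. A sketch: bce0 is assumed to be the Broadcom interface in this box, the IP is hypothetical, and note that NAS4Free’s own config system can overwrite /etc/rc.conf, which may be exactly why the setting doesn’t stick here.

ifconfig bce0 mtu 9000                     # takes effect immediately, lost at reboot
echo 'ifconfig_bce0="inet 172.16.11.10 netmask 255.255.255.0 mtu 9000"' >> /etc/rc.conf   # stock-FreeBSD persistence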

The Western Digital 2.5″ drives aren’t stellar performers and they aren’t cheap, but boy are they quiet, well-built, and run cool, asking politely for only 1 watt under load. I’ve returned the hot, loud & failure prone HGST 3.5″ 2 TB drives I borrowed from work; it’s too hard to put them in a chassis that’s short-depth.

Lastly, ZFS’ adaptive replacement cache, which I’ve enthused over a lot in recent weeks, is quite the value & performance multiplier. I’ve tested Windows Server 2012 R2 Storage Spaces’ tiered storage model, and while I was impressed with its responsiveness, ReFS, and ability to pool storage in interesting ways, nothing can compete with ZFS’ ARC model. It’s simply awesome; deceptively simple, but awesome.

The lesson is that if you’re going to lose an entire box to storage in your lab, your chosen storage system had better use every last ounce of that box, including its RAM, to serve storage up to you. 2012 R2 doesn’t, but I’m hopeful that it soon may (Update 1, perhaps?).

Here’s a cool screenshot from Nexenta, my last build before I re-did everything, showing ARC hits following a cold boot of the array (top), and a few days later, when things are really cooking for my stored Hyper-V VMs, which are getting tagged with ZFS’ “Most Frequently Used” category and thus getting the benefit of fast RAM & L2ARC:

cache

Next Steps:

  • Find out why my writes suck so bad.
  • Test Nas4Free’s NFS performance
  • Test SMB 3.0 from a virtual machine inside the ZFS box
  • Sell some stuff so I can buy a proper SLC SSD drive for the ZIL
  • Re-build the rookie Standard Switch into a true Distributed Switch in ESXi

Links/Knowledge/Required Reading Used in this Post:

[table]
Resource, Author, Summary
Three Example Home Lab Storage Designs using SSDs and Spinning Disk, Chris Wahl, Good piece on different lab storage models
ZFS, Wikipedia, Great overview of ZFS history and features
Activity of the ZFS Arc, Brendan Gregg, Excellent overview of ZFS’ RAM-as-cache
Hybrid Storage Pool Performance, Brendan Gregg, Details ZFS performance
FreeBSD Jumbo Frames, NixCraft, Applying MTU correctly
Hyper-V vEthernet Jumbo Frames, Darryl Van der Peijl, Great little powershell script to keep you out of regedit
Nexenta Community Edition 3.1.5, NexentaStor, My personal preference for a Solaris-derived ZFS box
Nas4Free, Nas4Free.org, FreeBSD-based ZFS; works with more hardware
[/table]