I’ve been going on, insufferably at times, about my new Nimble storage array at work. Back in January, it passed my home-grown bakeoff with flying colors, in February I wrote about how it was inbound to my datacenter, in March I fretted over iSCSI traffic, .vhdx parades, and my 6509-E.
Well it’s been just about a month since it was racked up and jacked into my Hyper-V fabric and I thought maybe the storage nerds among my readers would like an update on how its performing.
Fast: It’s been strange getting compliments, kudos and thank yous rather than complaints and ALL CAPS emails punctuated by Exclamation Marks. I have a couple of very critical SQL databases, the performance of which can make or break my job, and after some deliberation, we took the risk and moved the biggest of them to the Nimble about three weeks ago.
Here’s a slightly edited email from one power user 72 hours later:
Did I say THANK YOU for the extra zip yet?
STILL LOVING IT!!
I’m taken aback by all the affection coming my way…no longer under user-siege, I feel like maybe I should dress better at work, shave every day, turn on some lights in the office perhaps. Even the dev team was shocked, with one of them invoking Spaceballs and saying his storage-dependent process was moving at “Ludicrous speed.”
It’s Easy: I can’t underscore this enough. If you’re a mid-sized enterprise with vanilla/commodity workloads and you can tolerate an array that’s just iSCSI (you can still use NFS or SMB 3, just from inside clustered VMs!), Nimble’s a good fit, especially if your staff is more generalist in nature. or you don’t have time to engineer a new SAN from scratch.
This was a Do It Yourself storage project for me; I didn’t have the luxury or time to hire storage engineers or VARs to come in and engineer it for me. Nimble will try to sell you on professional services, but you can decline and hook it up yourself, as I did. There are best practice guides a-plenty, and if you understand your stack & workload, your switching & compute, you’ll do fine.
Buying it was easy: Nimble’s lineup is simple and from a customer standpoint, it was a radically different experience to buy a Nimble than a traditional SAN.
Purchasing a big SAN is like trying to decide what to eat at French restaurant in Chinatown…you recognize the letters and & the pictures on the menu look familiar, but you don’t know what that SKU is exactly or how you’ll feel in the morning after buying & eating it. And while the restaurant has provided a helpful & knowledgeable garçon to explain French cuisine & etiquette to you, you know the garçon & his assistants moonlight at the Italian, German and Sushi place down the road, where they are equally knowledgeable & enthusiastic about those cuisines. But they can’t talk about the Italian place because they have something called agency with the french restaurant; so with you, they are only French cuisine experts, and their professional opinion is that Italian, German and Sushi are horrible food choices. Also, your spend with the restaurant is too small to get the chef’s attention..you have to go through this obnoxious garçon system.
Buying from Nimble, meanwhile, is like picking a burger at In ‘n Out. You have three options, all of them containing meat, and from left to right, the choices are simply Good, Better, Best. You can stack shelves onto controller-shelves, just like a Double-Double, and you know what you’ll get in the end. Oh sure, there’s probably an Animal Style option somewhere, but you don’t need Animal Style to enjoy In ‘n Out, do you?
Lesson is this: Maybe your organization needs a real full-featured SAN & VAR-expertise. But maybe you just need fast, reliable iSCSI that you can hook up yourself.
It’s nice that we in customer-land have that option now.
ASUP & Community: The Autosupport from Nimble has left nothing to be desired, in fact, I think they nag too much. But I’ll take that over a downed array.
I’ve grown to enjoy Connect.Nimble.com, the company’s forum where guys like me can compare notes. Shout out to one awesome Nimble SE named Adam Herbert who built a perfect signed MPIO Powershell script that maps your initiators to your targets in no time at all.
And then you get to sit back and watch as MPIO does its thang across all your iSCSI HBAs, producing symmetrical & balanced utilization charts which, in turn, release pleasing little bursts of storage-dopamine in your brain.
It works fine with Hyper-V, CSVs, and Converged Fabric vEthernets: What a mouthful, but it’s true. Zero issues fitting this array into System Center Virtual Machine Manager storage (though it doesn’t have SMI-S support a “standard” which few seem to have adopted), failing CSVs from one Hyper-V node to another, and resizing CSVs or RDMs live.
And for the convergence fans: I pretty much lost my fear of using vEthernet adapters for iSCSI traffic during the bakeoff and in the Daisetta Lab at home, but in case you needed further convincing that Hyper-V’s converged fabric architecture kicks ass, here it is: Each Hyper-V node in my datacenter has 12 gigabit NICs. Eight of them per host are teamed (that is to say they get the Microsoft Multiplexor driver treatment, LACP-flavor) and then a Converged Virtual switch is built atop the multiplexor driver. From that converged v-switch, I’m dangling six virtual Ethernet adapters per host, two of which, are tagged for the Nimble VLAN I built in the 6509.
That’s a really long and complicated way of saying that in a modest-sized production environment, I’m using LACP teaming on the hosts, up to 4x1GbE vNics on the VIP guests, and MPIO to the storage, which conventional storage networking wisdom says is a bit like kissing your sister and bragging about it. Maybe it’s harmless (even enjoyable?) once or twice, but sooner or later, you’ll live to regret it. And hey the Department of Redundancy Department called, they want one of their protocols back.
I’ve read a lot of thoughtful pieces from VMware engineers & colleagues about doing this, but from a Hyper-V perspective, this is supported, and from a Nimble array perspective, I’m sure they’d point the finger at this if something went wrong, but it hasn’t, and from my perspective : one converged virtual switch = easy to deploy/templatize, easy to manage & monitor. Case closed.
LACP + MPIO in Hyper-V works so well that in three weeks of recording iSCSI stats, I’ve yet to record a single TCP error/re-transmit or anything that would make me think the old model was better. And I haven’t even applied bandwidth policies on the converged switches yet; that tool is still in my box and right now iSCSI is getting the Hyper-V equivalent of best effort.
It’s getting faster: Caching is legit. All my monitors and measurements prove it out. Implement your Nimble correctly, and you may see it get faster as time goes on.
And by that I mean don’t tick the “caching” box for every volume. Conserve your resources, develop a strategy and watch it bloom and grow as your iSCSI packets find their way home faster and faster.
The DBA is noticing it too in his latency timers & long running query measurements, but this graph suffices to show caching in action over three weeks in a more exciting way than a select * from slow-ass-tables query:
Least Frequently Used, Most Recently Used, Most Frequently Used….who frequently/recently cares what caching algorithm the CASL architecture is using? A thousand whiteboard sessions conducted by the world’s greatest SE with the world’s greatest schwag gifts couldn’t sell me on this the way my own charts and my precious perfmons do.
My cached Nimble volumes are getting faster baby.
Compression wise, I’m seeing some things I didn’t expect. Some volumes are compressing up to 40x. Others are barely hitting 1.2x. The performance impact of this is hard to quantify, but from a conservation standpoint, I’m not having to grow volumes very often. It’s a wash with the old dedupe model, save for one thing: I don’t have to schedule compression. That’s the CPUs job, and for all I know, the Nehalems inside my CS260 are, or should be, redlining as they lz4 my furious iSCSI traffic.
Busy Box & CLI: The Nimble command line in version 1.4x felt familiar to me the first time I used it. I recognized the command structure, the help files and more, and thought it looked like Busy Box.
What’s Busy Box? How to put this without making enemies of The Guys Who Say Vi…Busy Box is a collection of packages, tools, servers and scripts for the unix world developed about 25 years ago by an amazing Unix engineer. It’s very popular, it’s everywhere, and it’s reliable and I have no complaints about it other than the fact that it’s disconcerting that my Nimble has the same package of tools I once installed on my Android handset.
But that’s just the Windows guy talking, a Windows guy who was really fond of his WAFL and misses it but will adapt and holds out hope that OneGet & PowerShell, one day, will emerge victorious over all.
The SSL cert situation is embarrassing and I’m glad my former boss hasn’t seen it. Namely that situation is this: you can’t replace the stock SSL cert, which, frankly looks like something I would do while tooling around with OpenSSL in the lab.
I understand this is fixed in the new 2.x OS version but holy shit what a fail.
Other than that, I’m very pleased -and the organization is very pleased***- with our Nimble array.
It feels like at last, I’m enjoying the fruits of my labor, I’m riding a high-performance storage array that was cost-effective, easy to install, and is performing at/above expectations. I’m like Major Kong, my array is literally the bomb, man and his machine are in harmony and there’s some joy & euphoria up in the datacenter as my task is complete.
*Remember this lesson #StorageGlory seekers: no one knows your workload like you. The above screenshot of cache hits is of a 400GB SQL transaction log volume of a larger SQL DB that’s in use 24/6. Your mileage may vary.
*** I do not speak for the organization even though I just did.