Friday, 9 August 2013

Virtualized Databases: How to Strike the Right Balance Between Solid State Technologies and Spinning Disks


Originally published in Database Trends and Applications magazine, written by Augie Gonzalez
http://www.dbta.com/Articles/ReadArticle.aspx?ArticleID=90460

If money were not an issue, we wouldn’t be having this conversation. But money is front and center in every major database roll-out and optimization project, and even more so in the age of server virtualization and consolidation. It often forces us to settle for good enough when we first aspired to swift and non-stop.

The financial tradeoffs have never been more apparent than with the arrival of lightning-fast solid state technologies. Whether solid state disks (SSDs) or flash memories, we lust for more of them in the quest for speed, only to be moderated by silly constraints like shrinking budgets.

You know too well the pressure to please those pushy people behind the spreadsheets. The ones keeping track of what we spent in the past, and eager to trim more expenses in the future. They drive us to squeeze more from what we have before spending a nickel on new stuff. But if we step out of our technical roles and take the broader business view, their requests are really not that unreasonable. To that end, let’s see how we can strike a balance between flashy new hardware and the proven gear already on the data center floor. By that, I mean arriving at a good mix between solid state technologies and the conventional spinning disks that have served us well in years gone by.

On the face of it, the problem can be rather intractable. Even after tedious hours of fine-tuning, you’d never really be able to manually craft the ideal conditions where I/O-intensive code sections are matched to flash while hard disk drives (HDDs) serve the less demanding segments. Well, let me take that back - you could when databases ran on their own private servers. The difficulty arises when the company opts to consolidate several database instances on the same physical server using server virtualization, and then wants the flexibility to move these virtualized databases between servers to load balance and circumvent outages.

Removing the Guesswork
When it was a single database instance on a dedicated machine, life was predictable. Guidelines for beefing up the spindle count and channels to handle additional transactions or users were well-documented. Not so when multiple instances collide in incalculable ways on the same server, made worse when multiple virtualized servers share the same storage resources. Under those circumstances you need little elves running alongside to figure out what’s best. And the elves have to know a lot about the behavioral and economic differences between SSDs and HDDs to do what’s right.

Turns out you can hire elves to help you do just that. They come shrink-wrapped in storage virtualization software packages. Look for the ones that can do automated storage tiering objectively - meaning, they don’t care who makes the hardware or where it resides.

On a more serious note, this new category of software really takes much of the guesswork, and the costs, out of the equation. Given a few hints on what should take priority, it makes all the right decisions in real time, keeping in mind all the competing I/O requests coming across the virtual wire. The software directs the most time-sensitive workloads to solid state devices and the least important ones to conventional drives or disk arrays. You can even override the algorithms to specifically pin some volumes on a preferred class of storage, say end-of-quarter jobs that must take precedence.
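
To make the idea concrete, here is a minimal sketch in Python of the kind of tiering heuristic such software might apply. The tier names, temperature thresholds and the pinning override are invented for illustration; no vendor’s actual algorithm is this simple.

# Hypothetical sketch of an automated storage tiering decision.
# Tier names, thresholds and the "pinned" override are illustrative
# assumptions, not any particular product's algorithm.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Extent:
    lba: int                          # starting logical block address
    io_per_min: float = 0.0           # recent access "temperature"
    pinned_tier: Optional[str] = None # manual override, e.g. "flash"

def choose_tier(extent: Extent) -> str:
    """Pick a tier for one extent based on recent I/O temperature."""
    if extent.pinned_tier:            # e.g. end-of-quarter volumes
        return extent.pinned_tier
    if extent.io_per_min > 500:
        return "flash"
    if extent.io_per_min > 50:
        return "ssd"
    return "hdd"

# Example: a hot index extent, a warm table extent, a cold archive extent.
for e in [Extent(0, 1200), Extent(4096, 80), Extent(8192, 2)]:
    print(e.lba, "->", choose_tier(e))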

Better Storage Virtualization Products
The better storage virtualization products go a step further. They provide additional turbocharging of disk requests by caching them in DRAM - not just reads, but writes as well. Aside from the faster response, write caching reduces the duty cycle on the solid state memories to prolong their lives. Think how happy that makes the accountants. The storage assets are also thin-provisioned to avoid wasteful over-allocation of premium-priced hardware.
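
The wear-saving effect of write caching is easy to see in miniature. The sketch below, again with invented names and purely illustrative behavior, coalesces repeated overwrites of the same block in DRAM so that only one physical write ever reaches the solid state device.

# Hypothetical sketch of why DRAM write-back caching reduces wear on
# solid state media: repeated writes to the same block are coalesced
# in memory, so only the final version reaches the SSD.
class WriteBackCache:
    def __init__(self):
        self.dirty = {}            # block number -> latest data
        self.backend_writes = 0    # writes that actually hit the SSD

    def write(self, block, data):
        self.dirty[block] = data   # acknowledged from DRAM, not the SSD

    def flush(self):
        # one physical write per dirty block, no matter how many times
        # each block was overwritten while it sat in DRAM
        self.backend_writes += len(self.dirty)
        self.dirty.clear()

cache = WriteBackCache()
for _ in range(100):               # the database rewrites block 7 a hundred times
    cache.write(7, b"latest row image")
cache.flush()
print(cache.backend_writes)        # 1, not 100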

This brings us to the question of uptime. How do we maintain database access when some of this superfast equipment has to be taken out of service? Again, device-independent storage virtualization software has much to offer here. Particularly those products which can keep redundant copies of the databases and their associated files on separate storage devices, despite model and brand differences. What’s written to a pool of flash memory and HDDs in one room is automatically copied to another pool of flash/HDDs. The copies can be in an adjacent room or 100 kilometers away. The software effectively provides continuous availability using the secondary copy while the other piece of hardware is down for upgrades, expansion or replacement. Same goes if the room where the storage is housed loses air conditioning, suffers a plumbing accident, or is temporarily out of commission during construction/remodeling.

The products use a combination of synchronous mirroring between like or unlike devices, along with standard multi-path I/O drivers on the hosts, to transparently maintain the mirror images. They automatically fail over and fail back without manual intervention. Speaking of money, no special database replication licenses are required either. The same mechanisms protecting the databases also protect other virtualized and physical workloads, helping to converge and standardize business continuity practices.
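
As a rough illustration of that write path, here is a simplified sketch with made-up pool names: a synchronous mirror acknowledges a write only once both copies have landed, and keeps running in degraded mode when one pool is taken out of service.

# Hypothetical sketch of a synchronous mirror write path. Names and
# behavior are illustrative, not any vendor's implementation.
class Pool:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.blocks = {}

    def write(self, block, data):
        if not self.online:
            return False
        self.blocks[block] = data
        return True

def mirrored_write(pool_a, pool_b, block, data):
    # Acknowledge to the host only after both pools commit the write.
    ok_a = pool_a.write(block, data)
    ok_b = pool_b.write(block, data)
    if ok_a and ok_b:
        return "ack: both copies committed"
    if ok_a or ok_b:
        # One pool is down for maintenance; keep running on the survivor
        # and remember to resynchronize the stale side later.
        return "ack: degraded, resync pending"
    return "error: no copy written"

room_a = Pool("flash+HDD pool, room A")
room_b = Pool("flash+HDD pool, room B")
print(mirrored_write(room_a, room_b, 42, b"committed transaction"))
room_b.online = False                 # pool B taken down for an upgrade
print(mirrored_write(room_a, room_b, 43, b"next transaction"))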

And for the especially paranoid, you can keep distant replicas at disaster recovery (DR) sites as well. For this, asynchronous replication occurs over standard IP WANs.

If you follow the research from industry analysts, you’ve already been alerted to the difficulties of introducing flash memories/SSDs into an existing database environment with an active disk farm. Storage virtualization software can overcome many of these complications and dramatically shorten the transition time. For example, the richer implementations allow solid state devices to be inserted non-disruptively into the virtualized storage pools alongside the spinning disks. In the process, you simply classify them as your fastest tier, and designate the other storage devices as slower tiers. The software then transparently migrates disk blocks from the slower drives to the speedy new cards without disturbing users. You can also decommission older spinning storage with equal ease or move it to a DR site for the added safeguard.
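
For a feel of what that transparent migration involves, consider the toy sketch below. The mapping structure and tier names are hypothetical; the point is simply that the virtualization layer copies a block to the newly added fast tier and then repoints its map, so the host never sees the move.

# Hypothetical sketch of non-disruptive tier migration. The mapping
# and tier names are invented for illustration.
class VirtualDisk:
    def __init__(self):
        self.map = {}   # logical block -> (tier, data)

    def write(self, block, data, tier="hdd"):
        self.map[block] = (tier, data)

    def read(self, block):
        return self.map[block][1]

    def migrate(self, block, new_tier):
        tier, data = self.map[block]
        # Copy the data to the new tier, then repoint the map; the host
        # keeps reading through the map and never notices the move.
        self.map[block] = (new_tier, data)

vd = VirtualDisk()
vd.write(10, b"hot index page", tier="hdd")
vd.migrate(10, "flash")                # flash cards classified as the fastest tier
print(vd.map[10][0], vd.read(10))      # now served from flash, data unchanged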

Need for Insight
Of course, you’d like to keep an eye on what’s going on behind the scenes. Built-in instrumentation in the more comprehensive packages provides that precious insight. Real-time charts reveal fine-grained metrics on I/O response and relative capacity consumption. They also provide historical perspectives to help you understand how the system as a whole responds when additional demands are placed on it, and to anticipate when peak periods of activity are most likely to occur. Heat maps display the relative distribution of blocks between flash, SSDs and other storage media, including cloud-based archives.
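
A heat map of that sort boils down to a tally over the same block-to-tier mapping. The fragment below is a made-up example of such a tally, with invented tier names and counts.

# Hypothetical sketch of the tally behind a tier heat map: count how
# many blocks currently live on each class of storage. Data is made up.
from collections import Counter

# block -> tier, as the virtualization layer might track it internally
block_map = {0: "flash", 1: "flash", 2: "ssd", 3: "hdd", 4: "hdd", 5: "cloud"}

distribution = Counter(block_map.values())
total = sum(distribution.values())
for tier, count in distribution.most_common():
    print(f"{tier:>5}: {count} blocks ({100 * count / total:.0f}%)")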

What can you take away from this? For one, solid state technologies offer an attractive way to accelerate your critical database workloads. No surprise there. Used in moderation to complement the fast spinning disks and high-capacity bulk storage already in place, SSDs help you strike a nice balance between excellent response time and responsible spending. To establish and maintain that equilibrium in virtualized scenarios, you should accompany the new hardware with storage virtualization software – the device-independent type. This gives you the optimal means to assimilate flash/SSDs into a high-performance, well-tuned, continuously available environment. In this way, you can please the financial overseers as well as the database subscribers, not to mention everyone responsible for its care and feeding - you included.

About the author:
Augie Gonzalez is director of product marketing for DataCore Software and has more than 25 years of experience developing, marketing and managing advanced IT products.  Before joining DataCore, he led the Citrix team that introduced simple, secure, remote access solutions for SMBs. Prior to Citrix, Gonzalez headed Sun Microsystems Storage Division’s Disaster Recovery Group. He’s held marketing / product planning roles at Encore Computers and Gould Computer Systems, specializing in high-end platforms for vehicle simulation and data acquisition.
