Wednesday, June 25, 2008

Network storage is key to virtualization

Virtualizing? Going Green? Then why do your servers have disks in them?

Virtualization is driving a need for shared storage. Whether you are deploying VMware virtual machines, virtualized physical machine images with Racemi's DynaCenter, or both, virtualization relies on networked storage. Networked storage is the foundation of the mobility and recovery capability inherent in virtualization. Whatever your virtualization goal (consolidation, improved reliability, higher availability, automated disaster recovery, dynamic resource allocation, or utility computing), you can't take full advantage of your deployment without networked storage.

Networked storage is rapidly gaining market share; iSCSI storage revenue was up nearly 75% in the fourth quarter of 2007 over the same quarter of 2006 (IDC). But most servers still ship with internal storage, used primarily to hold boot volumes and applications. Networked storage beats internal storage on power consumption, cost per gigabyte, and many other operational measures. But the largest (and most overlooked) advantage networked storage offers over internal storage is the elimination of stranded storage.

Stranded storage was one of the key arguments for storage area network (SAN) adoption in the late '90s, and I am surprised the storage vendors aren't exploiting this today. Prior to SANs, large storage arrays would frequently have large percentages of their capacity stranded: all of the available Fibre Channel ports were in use, but the systems connected to the array did not require its entire capacity. With external storage being quite expensive at the time, stranded disk was a compelling reason to adopt SAN technology, which was also quite expensive then.

Today, due largely to the ever-increasing capacity of disk, most servers use a very small percentage of their internal disk. A customer I was consulting for on a data center consolidation project had completed a physical inventory of their infrastructure (servers, network, storage, etc.). They documented 44TB of storage in use and 21TB available for growth (10TB of which was located at the new data center and not yet in use). But they had not inventoried their internal storage. In an exercise to determine which systems might be consolidated, I had them compile the same information for internal storage. They were amazed to find (I wasn't) that they were using a small fraction of their internal storage (~7%).

But what surprised them more was that the total capacity of their internal storage (177TB) was greater than that of their external storage. When the original purchase cost of the internal storage was totaled, it was nearly double what they had spent on the SAN and storage arrays. Even if you eliminate the large database servers from the equation, total used capacity was only 7.8%. Ironically, this case is not the exception; it is the norm.
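To make the arithmetic concrete, here is a minimal Python sketch using the inventory figures above (the TB totals and the ~7% utilization come from the case; everything else is simple arithmetic):

    # Back-of-the-envelope utilization math from the inventory above (all figures in TB).
    external_used = 44.0       # SAN storage in use
    external_free = 21.0       # available for growth (incl. 10 TB at the new site)
    external_total = external_used + external_free

    internal_total = 177.0     # total capacity of internal server disk
    internal_util = 0.07       # ~7% of internal capacity actually used
    internal_used = internal_total * internal_util

    print(f"External utilization: {external_used / external_total:.0%}")   # ~68%
    print(f"Internal utilization: {internal_util:.0%} (~{internal_used:.0f} of {internal_total:.0f} TB)")
    print(f"Stranded internal capacity: ~{internal_total - internal_used:.0f} TB")

By their own numbers, roughly 165TB of paid-for internal disk was sitting idle.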

The average power consumption of an embedded 2.5" / 10,000 RPM drive is 11.2 watts (IBM). So the additional, unnecessary power consumed by a server with two barely used drives seems small, at roughly 22 watts. But consider an example: a Dell 860 server consumes 110 watts. If you boot from network storage and eliminate the two internal disks, power consumption drops by 22 watts, or 20%. And that does not factor in the additional power you save by not having to cool the heat those drives generate, or the cost of the drives themselves. Multiply that by all your servers and it is significant.
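Here is the same math as a quick Python sketch; the per-drive and per-server wattages come from the figures above, while the fleet size and electricity rate are hypothetical round numbers chosen only to show the scale:

    # Power saved by removing two internal drives per server.
    drive_watts = 11.2                 # avg 2.5" 10K RPM drive (IBM figure above)
    server_watts = 110.0               # e.g., a Dell 860
    saved_watts = 2 * drive_watts      # ~22 W per server

    # Hypothetical fleet: 500 servers running 24x7 at $0.10/kWh.
    servers = 500
    rate_per_kwh = 0.10
    annual_kwh = saved_watts * servers * 24 * 365 / 1000.0

    print(f"Per server: {saved_watts:.1f} W (~{saved_watts / server_watts:.0%} of its draw)")
    print(f"Fleet: {annual_kwh:,.0f} kWh/yr, ~${annual_kwh * rate_per_kwh:,.0f}/yr before cooling costs")

And since every watt dissipated in the rack must also be removed by the cooling plant, the real savings run higher still.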

Now, if you add to that cost the additional power consumed and heat generated by thousands of drives, plus the fact that all of the systems were already attached to the SAN (no additional cost for FC HBAs), you have a hard time justifying storing anything directly on a server. Additionally, mean time between failures increases dramatically when servers are configured without internal drives. And in most cases, companies that have a SAN deployed already keep enough storage headroom to accommodate the boot and application volumes without purchasing additional disk; I have yet to encounter one that doesn't. This is primarily due to the relatively small disk requirements of system and application volumes compared to data and databases (with the possible exception of ERP systems like SAP).

Most data center operations managers are completely unaware of the power consumption costs of the environment they manage. Facilities, the group normally responsible for power management, has little or no insight into, or influence over, the decisions made on the data center floor that consume that power. As powering the data center becomes a more significant percentage of the expense of running it, this will have to change. More corporations will make data center operations managers accountable for the power their environments consume. Then there will be an incentive to eliminate stranded storage.

Last year U.S. data centers consumed more than 60 billion kilowatt-hours of electricity at a cost of about $4.5 billion, according to the Environmental Protection Agency (EPA). A good chunk of this power—up to 60% in some cases—is needed to cool servers. Data centers accounted for almost 2% of this country’s total energy consumption. These numbers have risen quickly, nearly 40% between 1999 and 2005, according to a survey by the Uptime Institute. And they may double in the next five years to more than 100 billion kilowatt-hours, according to the EPA.

If the increasing cost of the energy doesn't scare you, the availability might. According to AFCOM, an association of data center professionals, power failures and limits on power availability will halt data center operations at more than 90% of all companies over the next five years. Gartner predicts that 50% of IT managers will not have enough power to run their data centers by the end of 2008. Expect a rise in outages, along with a pressing need to add more space and power to meet computing demands.

So at this point, you have to ask: why does anyone purchase servers with internal storage? With the proliferation of very high quality, low cost network storage technology, it amazes me that any data center would not be booting from networked storage (SAN, NAS, or iSCSI). There are so many inherent benefits to using only network storage with diskless servers that you would expect 100% adoption in at least the Fortune 5000. Moving all of one's storage to the network simply makes too much sense not to do it. So why aren't more companies buying diskless servers and booting from SAN?

Perhaps the reluctance to move to diskless servers and networked storage lies in the motivations of the companies producing the technology. The server vendors are not motivated to push the benefits of network storage and diskless servers; they perceive that it would lower their revenue (although only slightly). The storage vendors don't see additional revenue unless they are selling to a customer that doesn't currently have networked storage; most customers who have networked storage in place already have enough free capacity to move their internal volumes to the SAN without buying additional disks. And if a customer (or potential customer) does not have a SAN, the perceived added complexity of booting over the network most likely looks like a sales impediment.

In my years working as an operations management consultant, there were many occasions where the best approach my client could take to reduce operational and capital expenses was to move to networked storage and diskless servers. But I frequently encountered heavy resistance. The following list contains some of the most common (unreasonable) reasons I was given. Bear in mind that most of my clients have been fairly large companies that had some or all of their servers attached to a SAN, yet they were still against diskless servers. These objections to booting from SAN are ranked in order of frequency, to the best of my recollection:

1. We don't boot from SAN (and when asked why, they simply repeat it).
2. SAN storage is too expensive. (One company told me this even after it was demonstrated that the true cost of their internal storage, even without factoring in power and heat, was roughly 8 times more expensive per GB because they were using such a small percentage of it; the per-GB math is sketched after this list.)
3. Security won’t permit it (as if one server could get to another via the storage)
4. Boot from SAN is too slow (it is actually much faster).
5. It's too complicated (I hear this one most from companies that don't refresh their technology regularly, i.e., "we do it this way because we always have"; see #1).
6. It would create I/O bottlenecks on the SAN when a server is booting (it doesn't).
7. It introduces additional risks (though I could never get anyone to name them).
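
On objection #2, the math is easy to check. Here is a minimal Python sketch; the raw per-GB prices are hypothetical round numbers (chosen to land near the ~8x figure from that engagement), and only the utilization rates come from the inventory discussed earlier:

    # Effective cost per *used* GB: low utilization multiplies the real price.
    internal_price_per_gb = 4.00   # $/GB raw for internal disk (assumed)
    san_price_per_gb = 5.00        # $/GB raw for SAN storage (assumed)

    internal_util = 0.07           # ~7% utilization, from the inventory above
    san_util = 0.68                # ~68% utilization on the SAN

    internal_effective = internal_price_per_gb / internal_util   # ~$57 per used GB
    san_effective = san_price_per_gb / san_util                  # ~$7 per used GB

    print(f"Internal: ${internal_effective:.2f} per used GB")
    print(f"SAN:      ${san_effective:.2f} per used GB")
    print(f"Internal disk costs ~{internal_effective / san_effective:.0f}x more per GB actually used")

The raw price of internal disk can be far lower than SAN disk and still lose badly once utilization is factored in.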

This list is by no means intended to be comprehensive. If you have tried to show the benefits of diskless servers to someone, you have probably heard many other excuses. But it really boils down to change and influence. It may also be that server engineers and systems administrators perceive removing disks from servers as a loss of control. After all, today the server is the "center of the universe" as far as most data centers go. If servers became simply processing resources, they would most likely become commodities.

But the benefits of diskless servers and booting from the network are too great to ignore. Aside from reducing the cost of running the data center, there are many operational advantages that make a strong argument for this approach. You can find some of the benefits listed in any storage vendor's material; I pulled this list from a Dell whitepaper:

Boot-from-SAN benefits include:

1. Improved disaster tolerance
2. Centralized administration
3. Reduced total cost of ownership (TCO) through diskless servers
4. High-availability storage
5. Enhanced business continuance
6. Rapid server repurposing
7. Consolidation of image management

Obviously, if you have not deployed a SAN, the cost appears prohibitive, especially for Fibre Channel. But with the advent of lower cost SAN technology (<$25k) and the maturation of iSCSI and FC over IP, combined with very affordable network storage from companies like Agami, Compellent, EqualLogic (now Dell), Pillar, and many others, that barrier is really one of perception. It can actually be less expensive to order servers without disks and use an affordable network storage device instead. Additional benefits of diskless servers are:

1. Less power consumed per server
2. Less heat generated
3. Higher disk utilization (eliminates stranded disk)
4. Increased server reliability
5. Reduced cost of the server

With the widespread adoption of virtualization technology, the benefits of network storage multiply. Whether you are deploying VMware virtual machines, virtualized physical machine images with Racemi's DynaCenter, or both, networked storage becomes an integral part of the solution. These virtualization technologies, combined with replication, clones, multiple mirrors, and snapshots, enable new capabilities such as:

1. Guaranteed, automated DR capability (versus merely a plan)
2. Efficient data center consolidation
3. Centralized lab management
4. N-to-1 recovery (protecting hundreds of different servers with one standby, vs. traditional H/A pairs)

So don’t wait for your server or storage vendor to give you a sales pitch on the benefits of networked storage. If you’re deploying virtualization technology, you’re missing capability and wasting money if you’re not using networked storage. With today’s affordable and reliable network storage technologies, going diskless and going green makes more sense than ever.
