Monday, 3 November 2014

DataCore’s Answer to Random Write Workloads: Sequential Storage

by Jeffrey Slapp, Technical Product Specialist / Systems Engineer

Introduction
DataCore Software has developed another exciting new feature, extending the arsenal of enterprise features already present within SANsymphony-V. This new feature enhances the performance of random write workloads, which are among the most costly operations that can be performed against a storage system. The new Sequential Storage feature will be available in SANsymphony™-V10 PSP1, scheduled for release this month.
Internal testing with the Sequential Storage feature and 100% random write workloads yielded significant performance improvements for spinning disks (>30x improvement) and even noteworthy improvements for SSDs (>3x improvement) under these conditions. The specific performance numbers will be covered later in this article.
The actual performance benefits will vary greatly depending on the percentage of random writes that make up the application’s I/O profile and the types of storage devices participating in the storage pool. Additionally, the feature is enabled on a per-virtual-disk basis, allowing you to be very selective about where to apply the optimization.
Basis For Development
As applications drive storage system I/O, DataCore’s high-speed caching engine improves virtual disk read performance. The cache also improves write performance, but its flexibility is limited by the need to destage data to persistent storage. In many environments, the need to synchronize write I/O with back-end storage becomes the limiting factor for the performance that can be realized at the application level. That bottleneck is what this development targets.
With certain types of storage devices, there are significant performance limitations associated with non-sequential writes compared with sequential writes. These limitations occur due to:
  • Physical head movement across the surface of the rotating disk
  • RAID-5 reads of existing data and parity, needed to recalculate parity on each small write
  • Write amplification inherent to Flash and SSD devices
DataCore SANsymphony-V software presents an abstraction to the application: a virtual SCSI disk. How SANsymphony-V stores the data behind these virtual disks is an implementation detail hidden from the application. Data may be placed transparently across storage devices in different tiers to take advantage of their distinct price/performance/capacity characteristics. The data may also be mirrored between devices in separate locations to safeguard against equipment and site failures. Because the abstraction stays the same, SANsymphony-V is free to change how it stores application data to mitigate the limitations listed above, without changing what the applications see.
Functional Details
Sequential Storage changes the way SANsymphony-V stores data written to the virtual disks by:
  • Storing all writes sequentially
  • Coalescing writes to reduce the number of I/Os to back-end storage
  • Indexing the sequential structure to identify the latest data for any given logical block address
  • Directing reads to the latest data for a block using this index
  • Compacting data by copying the blocks that are still current and discarding blocks that have since been rewritten
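The five steps above can be illustrated with a toy model. The following is only a minimal Python sketch of the general log-structured technique, not DataCore's implementation: the class name SequentialStore, its methods, and the in-memory list standing in for back-end storage are all hypothetical and chosen purely for illustration.

```python
class SequentialStore:
    """Toy log-structured store (illustrative only, not DataCore's code)."""

    def __init__(self):
        self.log = []    # append-only list of (lba, data) records
        self.index = {}  # lba -> position in self.log of the latest data

    def write_batch(self, writes):
        """Coalesce a batch of (lba, data) writes, then append sequentially."""
        coalesced = {}
        for lba, data in writes:
            coalesced[lba] = data  # a later write to the same LBA wins
        for lba, data in coalesced.items():
            self.index[lba] = len(self.log)  # point the index at the new record
            self.log.append((lba, data))     # always written at the tail
        return len(coalesced)  # number of records actually appended

    def read(self, lba):
        """Return the latest data for an LBA, or None if never written."""
        pos = self.index.get(lba)
        return None if pos is None else self.log[pos][1]

    def compact(self):
        """Copy live records to a fresh log, dropping rewritten blocks."""
        new_log, new_index = [], {}
        for pos, (lba, data) in enumerate(self.log):
            if self.index.get(lba) == pos:  # still the latest copy
                new_index[lba] = len(new_log)
                new_log.append((lba, data))
        reclaimed = len(self.log) - len(new_log)
        self.log, self.index = new_log, new_index
        return reclaimed  # number of stale records removed


store = SequentialStore()
print(store.write_batch([(907, b"a"), (12, b"b"), (907, b"c")]))  # 2
store.write_batch([(12, b"d")])  # rewrites LBA 12, leaving a stale record
print(store.read(907))           # b'c'
print(store.compact())           # 1
print(store.read(12))            # b'd'
```

Random logical block addresses (907, 12, 907) reach the log as a short sequential run, and the rewrite of LBA 12 leaves a stale record behind that only compaction reclaims. That is the trade-off the performance states in the next section reflect.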
Performance Details
Now the part everyone is waiting for – the performance numbers. There are three main states to consider from a performance perspective:
  • Base – the underlying level of performance that can be achieved with a 100% random write workload, without Sequential Storage enabled.
  • Maximum – the performance that can be achieved with a 100% random write workload, with Sequential Storage enabled but without compaction active.
  • Sustained – the performance that can be sustained with a 100% random write workload, with Sequential Storage enabled and with compaction active.
The greatest performance is achieved during the Maximum state. When the virtual disk is idle, a background level of compaction will occur to prepare the system to absorb another burst of random write activity. That is, the background compaction will prepare the virtual disks to deliver performance associated with the Maximum state.
The following performance has been observed using IOmeter running a 100% write, 100% random workload with a 4K block size and 64 outstanding I/Os:
| Configuration | Base IOPS | Maximum IOPS | Sustained IOPS |
|---|---|---|---|
| Linear 20 GB volume, SATA WDC 1 TB drive | 327 | 19,500 | 11,000 |
| Linear 20 GB volume, SSD 840 EVO 250 GB Pool | 10,000 | 62,000 | 36,000 |
| Mirrored 100 GB volume, PERC H-800 RAID-5 Pool | 860 | 67,000 | 40,000 |


Interesting Observations
The above results highlight three key observations, each comparing the Base figure with the Sustained figure:
  • Significant acceleration (>30x improvement) of low-cost SATA disks for random write loads is possible. In fact, in this particular test, the SATA disk's sustained 11,000 IOPS with Sequential Storage exceeded the 10,000 IOPS that the Solid State Disk delivered without it.
  • The Solid State Disk also displayed improved performance, going from 10,000 IOPS to 36,000 IOPS (>3x improvement).
  • Write-intensive RAID-5 workloads displayed the greatest improvement, from 860 IOPS to 40,000 IOPS (>45x improvement).
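The improvement factors quoted above follow directly from the measured IOPS. As a quick check, this short Python sketch recomputes them from the figures in the results table. The names RESULTS and speedups are just illustrative.

```python
# (base, maximum, sustained) IOPS, copied from the results table
RESULTS = {
    "SATA WDC 1 TB drive": (327, 19_500, 11_000),
    "SSD 840 EVO 250 GB pool": (10_000, 62_000, 36_000),
    "PERC H-800 RAID-5 pool": (860, 67_000, 40_000),
}


def speedups(base, maximum, sustained):
    """Return (maximum/base, sustained/base) improvement factors."""
    return maximum / base, sustained / base


for name, iops in RESULTS.items():
    peak, steady = speedups(*iops)
    print(f"{name}: maximum {peak:.1f}x, sustained {steady:.1f}x")
```

The sustained factors work out to roughly 33.6x, 3.6x, and 46.5x, consistent with the >30x, >3x, and >45x figures cited. The Maximum-state factors are higher still, at roughly 59.6x, 6.2x, and 77.9x.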
Conclusion
DataCore’s Sequential Storage capability aims to address a limitation every storage system experiences to some extent. Random writes not only severely impact application performance on mechanical devices such as magnetic disks; they can also drastically reduce the performance and shorten the lifespan of SSD/flash-based devices because of the write amplification that the random write pattern produces (see this publication for more detail). You can expect this feature, along with many others, in SANsymphony™-V10 PSP1, due out later this month.
