Specialized Section
Disk array – performance?
01.03.2012, 12:08
A frequently asked question when choosing a disk array is a request for a performance comparison.
Unfortunately, this shifts attention away from the question 'what are the parameters of the whole project?' towards a comparison of marketing statements. The performance parameters of a disk array are influenced by many factors, and even then they do not necessarily translate directly into application performance. What is the answer, then?
Performance? Under what conditions?
- Under the RAID principle, data are striped across all disks participating in the RAID group. The first factor, then, is that performance is closely related to the number of disks in the RaidSet.
- The statement 'the more RaidSet disks, the better' is not entirely true. With few disks, the small disk count itself is the bottleneck; with many disks, the computing overhead of the controllers grows.
- And then there are the various RAID levels, each with different characteristics and different computing overhead.
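The striping described above can be sketched in a few lines. This is a minimal illustration of RAID-0-style round-robin placement, not the layout of any particular controller; the function name and block numbering are my own.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Return (disk_index, stripe_index) for a logical block under
    simple round-robin striping across the RaidSet members."""
    return logical_block % num_disks, logical_block // num_disks

# With 4 disks, 8 consecutive logical blocks cover all 4 spindles twice,
# so a large sequential transfer is serviced by every disk in parallel.
placements = [locate_block(b, 4) for b in range(8)]
print(placements)
# [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]
```

This is also why the number of RaidSet disks matters: each additional spindle adds another parallel lane for the striped data.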
Stripe size? Block size? Cache segmentation?
- To add to the complexity, the address space of each disk is divided into blocks, and the file system on the OS side works with its own segmentation too. This deserves careful consideration at the moment of disk array implementation: by setting these parameters we predetermine whether the space will be optimized for applications of a database nature or, at the other end of the application spectrum, of a streaming nature.
- Besides segmentation at the disk level, the disk array also works with segmentation of its buffer store, the cache. Because most vendors fix this value, no one really talks about it, yet cache segmentation determines, in the context of the application, how efficiently the cache is used! Ideally, the cache segmentation is configurable.
Explanation:
In principle, every disk array works with internal cache segmentation. The optimal segment size varies according to what the volume is used for, so the ability to define segmentation per volume is a means of providing QoS (Quality of Service) for critical applications.
If, for example, the cache segment is set to 64 kB, the volume will have optimal performance parameters for files, streams etc. But if Oracle, which works with 4 kB blocks, runs on a cache segmented this way, then only 6.25 % of the cache capacity is actually used! Defining segmentation per LUN according to its usage solves this problem.
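The 6.25 % figure above follows directly from the mismatch of sizes. A short sketch, assuming (as the article does) that every small I/O occupies one whole cache segment:

```python
def cache_utilization(io_size_kb: float, segment_kb: float) -> float:
    """Fraction of each cache segment holding useful data when every
    I/O request occupies one whole segment."""
    return min(io_size_kb / segment_kb, 1.0)

# 4 kB database I/O against a cache segmented at 64 kB:
print(f"{cache_utilization(4, 64):.2%}")   # 6.25% of the cache does useful work
```

With matched segmentation (4 kB segments for 4 kB I/O) the same cache serves sixteen times as many cached blocks.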
Using cache partitioning to tune disk system performance
MB/s or IOPS?
- For a single data stream, SATA and SAS (FC) disk arrays work at relatively similar transfer speeds. The situation changes diametrically under random operations, where fast SAS and FC disks completely outclass SATA.
- It is worth considering what a stream-based workload really is. In today's environments you can hardly find a system that generates a single simple stream towards a RaidSet on the disk array, and several (even individually simple) data flows towards the array end up looking like random I/O.
- And then there is the phenomenon of virtualization: in the case of VMware, the hypervisor is performance-optimized and can genuinely exploit the full potential of the disk array.
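The second bullet above can be made concrete with a toy model. Three clients each read their own region sequentially, but their requests interleave at the array; the block numbers and round-robin interleaving are illustrative assumptions of mine, not measured behaviour.

```python
# Three clients each read 6 consecutive blocks from their own region.
streams = [range(0, 6), range(100, 106), range(200, 206)]

# Round-robin interleaving, roughly what the disk array sees arriving:
arrivals = [block for trio in zip(*streams) for block in trio]

# Count how many consecutive arrivals are actually sequential on disk:
sequential_pairs = sum(1 for a, b in zip(arrivals, arrivals[1:]) if b == a + 1)
print(arrivals[:6])  # [0, 100, 200, 1, 101, 201]
print(f"{sequential_pairs} of {len(arrivals) - 1} transitions are sequential")
```

Even though every client is purely sequential, not a single transition in the merged arrival order is, which is why random-I/O performance dominates in shared environments.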
Summarized:
The MB/s figure is losing its significance: every current disk array can transfer hundreds of MB/s on a single stream. The essential parameter is the number of input/output operations per second (IOPS) that the disk array can sustain, because that is what determines the final performance of a complex environment.
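A back-of-envelope model shows why IOPS, unlike MB/s, differs so sharply between disk classes. The seek times and spindle speeds below are typical published figures for the era, used only as an illustration:

```python
def disk_iops(avg_seek_ms: float, rpm: int) -> float:
    """Approximate random IOPS of one spindle:
    1 / (average seek time + half a rotation)."""
    half_rotation_ms = 60_000 / rpm / 2   # ms per half revolution
    return 1000 / (avg_seek_ms + half_rotation_ms)

print(f"7.2k RPM SATA : {disk_iops(8.5, 7_200):5.0f} IOPS")
print(f"15k RPM SAS/FC: {disk_iops(3.5, 15_000):5.0f} IOPS")
```

A single stream hides this gap (both disk types read sequentially at comparable MB/s), but under random load the 15k spindle delivers roughly twice the operations per second, and the whole array scales from there.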
Unbiased measurement
We can get a certain picture of how individual disk arrays perform from the independent SPC benchmarks.
These measurements need not match the measurements of individual vendors. Everything depends on how the tests are defined, and each vendor picks test conditions so that its product comes out in a good light. The substantial fact, nevertheless, is that all systems have gone through the benchmark under the same conditions defined by SPC-1, so the relative comparison (not the absolute values) of individual systems is a tangible argument.
Notes:
- yellow value: performance measured according to SPC-1, in IOPS
- green value: a more practical index; only I/O operations completed by the disk array in under 5 ms are counted.
3S.cz is a partner of Hitachi Data Systems. The Hitachi disk arrays AMS2100, AMS2300 and AMS2500 belong to the technological and performance top of their categories. This is closely connected with the fact that the line is relatively new, brought to market at the turn of 2008/2009, so these are products in which state-of-the-art technologies are implemented.
The HITACHI USP systems are also worth mentioning. Not only are their performance parameters beyond compare among enterprise disk systems, they also managed to complete 100 % of I/O operations below the 5 ms limit.
It is common practice to compare the price of stored data in terms of price per gigabyte. Nevertheless, with the phenomenon of virtualization, the price per unit of performance is gaining significance! In simplified terms this value represents (and the more complex the environment, the more significant it becomes) the investment that must be made to support the required volume of business demands.
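The price-per-GB versus price-per-performance distinction can be put in numbers. All prices, capacities and IOPS figures below are invented purely for illustration; the point is only that the ranking flips depending on the metric:

```python
# Hypothetical arrays: the one that is cheaper per gigabyte
# can be the more expensive one per IOPS.
arrays = {
    "SATA array": {"price": 20_000, "capacity_gb": 24_000, "iops": 5_000},
    "SAS array":  {"price": 45_000, "capacity_gb": 12_000, "iops": 30_000},
}

for name, a in arrays.items():
    per_gb = a["price"] / a["capacity_gb"]
    per_iops = a["price"] / a["iops"]
    print(f"{name}: {per_gb:.2f} per GB, {per_iops:.2f} per IOPS")
```

In a performance-bound virtualized environment, the second metric is the one that tells you what the required business throughput will actually cost.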