Data Storage - Data Striping (I/O parallelism)

About

In computer data storage, data striping is the technique of segmenting logically sequential data, such as a file, so that consecutive segments are stored on, and accessed from, different physical storage devices.

Because the segments are spread over multiple devices (such as hard disks), several segments can be accessed concurrently (in parallel) and transparently. This increases data access throughput and keeps the processor from waiting idly for data.
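
The mapping from a logical position to a device can be sketched as follows. This is a minimal illustration that assumes simple round-robin placement with a fixed segment size; the names `locate`, `split_into_stripes`, `stripe_size` and `num_devices` are hypothetical and do not come from any particular RAID or volume manager implementation.

```python
def locate(offset: int, stripe_size: int, num_devices: int) -> tuple[int, int]:
    """Map a logical byte offset to (device index, byte offset on that device).

    Assumes round-robin placement of fixed-size segments (an illustrative
    layout, not the layout of any specific product).
    """
    segment = offset // stripe_size      # index of the segment holding the byte
    device = segment % num_devices       # segments rotate over the devices
    row = segment // num_devices         # full rows of segments written before it
    return device, row * stripe_size + offset % stripe_size


def split_into_stripes(data: bytes, stripe_size: int, num_devices: int) -> list[list[bytes]]:
    """Distribute consecutive segments of `data` over the devices in turn."""
    devices: list[list[bytes]] = [[] for _ in range(num_devices)]
    for start in range(0, len(data), stripe_size):
        segment = start // stripe_size
        devices[segment % num_devices].append(data[start:start + stripe_size])
    return devices


layout = split_into_stripes(b"AAAABBBBCCCCDDDD", stripe_size=4, num_devices=3)
# Device 0 holds segments 0 and 3, device 1 holds segment 1, device 2 holds segment 2.
```

Because consecutive segments land on different devices, a large sequential read touches all devices at once instead of queuing on a single one.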

For large I/Os, striping can yield transfer rates several times higher than that of the fastest single disk drive in the array.

An early and well-known application of this technique is RAID (redundant array of independent disks) storage.

For many years Oracle has recommended that its users follow the Stripe And Mirror Everything (S.A.M.E.) methodology with a stripe size of 1 MB. Such a configuration is relatively simple to set up and provides good performance for most workloads (OLTP, reporting, data warehouse).

Advantages and disadvantages

Advantage: higher throughput

The main advantage of striping is better performance through higher throughput.

Interleaving sequential data accesses across devices allows the lower data access throughput of each storage device to be multiplied by the number of storage devices employed.
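
As a rough worked example, assume N identical devices, each delivering throughput T_device, with requests spread perfectly evenly and no controller or bus bottleneck. The figures of 4 devices and 150 MB/s are hypothetical, and the result is an ideal upper bound rather than a measured value:

```latex
T_{\text{array}} \approx N \cdot T_{\text{device}},
\qquad \text{e.g. } N = 4,\; T_{\text{device}} = 150\ \text{MB/s}
\;\Rightarrow\; T_{\text{array}} \approx 600\ \text{MB/s}
```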

Disadvantage: higher failure rate

Because different segments of data are kept on different storage devices, the failure of one device causes the corruption of the full data sequence.

Assuming the devices fail independently, the failure rate of the array is equal to the sum of the failure rates of its storage devices.
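
Written out, and assuming each device i has a constant failure rate λ_i and that failures are independent (modelling assumptions, not stated in the source), the relationship and its effect on the mean time to failure (MTTF) are:

```latex
\lambda_{\text{array}} = \sum_{i=1}^{N} \lambda_i,
\qquad
\text{MTTF}_{\text{array}} = \frac{1}{\lambda_{\text{array}}}
= \frac{\text{MTTF}_{\text{device}}}{N}
\quad (\text{identical devices, } \lambda_i = \lambda)
```

Under these assumptions, a stripe over four identical devices is expected to fail four times as often as a single device.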

This disadvantage of striping can be overcome by storing redundant information, such as parity, for the purpose of error correction. In such a system, the disadvantage is overcome at the cost of extra storage.
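
A minimal sketch of parity-based recovery, assuming a single XOR parity segment per row of equally sized data segments (the general idea behind parity RAID levels, not the on-disk layout of any specific one). The helper name `xor_blocks` is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One row of a stripe: three data segments plus one parity segment.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the device that held data[1]: XOR-ing the surviving
# segments with the parity segment reconstructs the missing one.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

The extra storage cost mentioned above is visible here: three data segments require a fourth segment for parity.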

Striping Levels

Striping is used on two levels:

hardware
software

Hardware

across disk drives in RAID storage,
network interfaces in Grid-oriented Storage,
and RAM in some systems.

Software

some modern databases, such as Sybase
File systems of clusters. Oracle Automatic Storage Management allows ASM files to be either coarse or fine striped.
Logical Volume Management (LVM). LVM tools will allow implementation of data striping in conjunction with mirroring. This can be achieved with LVM2 using LVM2 format metadata.

Hardware or software striping?

With hardware striping, the calculations are carried out by the controller; with software striping, they take place on the server’s CPU.

So, if the calculations are fairly simple (as with RAID 1 or RAID 10, which involve no parity computation) and the server is fairly powerful, using software striping shouldn’t be much of a problem.

But with more complex calculations (such as RAID 5EE or RAID 6), hardware striping can be beneficial because performance is not compromised by the server’s workload, nor are applications on the server compromised by the RAID workload.

With hardware striping, the striping functionality is also independent of the OS, and the simple HBA drivers required for a hardware controller are usually available as part of the OS distribution. Also, if the controller has a battery-backed cache, it can run in write-back mode while protecting cached writes against power loss.

Documentation / Reference

wiki/Data striping