Data Storage - Data Striping (I/O parallelism)


About

In computer data storage, data striping is the technique of segmenting logically sequential data, such as a file, so that consecutive segments are stored on, and accessed from, different physical storage devices.
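As an illustration of this segmentation, here is a minimal sketch that maps a logical byte offset to a device and an offset on that device. It assumes a plain round-robin layout with a fixed stripe size; the function name `locate` and the four-device setup are hypothetical, and real volume managers and RAID controllers may lay data out differently.

```python
from __future__ import annotations


def locate(offset: int, stripe_size: int, num_devices: int) -> tuple[int, int]:
    """Map a logical byte offset to (device index, byte offset on that device)."""
    stripe_index = offset // stripe_size   # which segment of the logical data
    device = stripe_index % num_devices    # segments rotate across the devices
    row = stripe_index // num_devices      # full rows of segments that precede it
    device_offset = row * stripe_size + offset % stripe_size
    return device, device_offset


MB = 1024 * 1024
# Assumed setup: four devices and a 1 MB stripe size.
# Consecutive 1 MB segments land on devices 0, 1, 2, 3, 0, 1, ...
placement = [locate(i * MB, MB, 4) for i in range(6)]
```

A sequential read of six segments therefore touches every device at least once, which is what makes parallel access possible.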

By spreading segment accesses over multiple devices (such as hard disks), multiple segments can be accessed concurrently (in parallel) and transparently. This increases data access throughput and keeps the processor from sitting idle while it waits for data.

Striping can yield transfer rates for large I/Os that are many times higher than the rate of the fastest single disk drive.
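The concurrent access described above can be sketched with in-memory byte buffers standing in for storage devices. This is only a toy model under that assumption: the helper names `stripe` and `read_striped` are hypothetical, and slicing memory will not show a real speedup. It only demonstrates that segments can be fetched by several workers at once and reassembled in logical order.

```python
from concurrent.futures import ThreadPoolExecutor


def stripe(data: bytes, stripe_size: int, num_devices: int) -> list:
    """Distribute consecutive segments of data round-robin over the devices."""
    devices = [bytearray() for _ in range(num_devices)]
    for i in range(0, len(data), stripe_size):
        devices[(i // stripe_size) % num_devices] += data[i:i + stripe_size]
    return [bytes(d) for d in devices]


def read_striped(devices: list, stripe_size: int, total_len: int) -> bytes:
    """Fetch all segments with a pool of workers, then reassemble them in order."""
    n = len(devices)
    num_segments = -(-total_len // stripe_size)  # ceiling division

    def fetch(seg: int) -> bytes:
        start = (seg // n) * stripe_size         # offset of the segment on its device
        return devices[seg % n][start:start + stripe_size]

    # As many worker threads as devices; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=n) as pool:
        return b"".join(pool.map(fetch, range(num_segments)))


payload = bytes(range(256)) * 10
disks = stripe(payload, stripe_size=64, num_devices=4)
restored = read_striped(disks, stripe_size=64, total_len=len(payload))
```

With real devices, each `fetch` would be a blocking read, so the workers would overlap their waiting time instead of queuing behind a single drive.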

The first application of this technology can be found in a Redundant array of independent disks (RAID) storage.

For many years, Oracle has recommended that its users follow the Stripe And Mirror Everything (S.A.M.E.) methodology with a stripe size of 1 MB. Such a configuration is relatively simple to set up and provides good performance for pretty much any workload (OLTP, reporting, data warehouse).

Advantages and disadvantages

Advantage: higher throughput

Striping improves performance through higher throughput.

Sequential time interleaving of data accesses allows the lesser data access throughput of each storage device to be cumulatively multiplied by the number of storage devices employed.
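The multiplication can be made concrete with a small calculation. This is a sketch of the ideal upper bound only: it assumes identical devices, perfectly balanced segments, and no controller or bus bottleneck. The figures (150 MB/s per device, four devices, a 1200 MB transfer) are hypothetical.

```python
def aggregate_throughput(per_device_mb_s: float, num_devices: int) -> float:
    """Ideal throughput when every device streams its segments in parallel."""
    return per_device_mb_s * num_devices


def transfer_time(size_mb: float, per_device_mb_s: float, num_devices: int) -> float:
    """Seconds needed to move size_mb at the ideal aggregate throughput."""
    return size_mb / aggregate_throughput(per_device_mb_s, num_devices)


# Hypothetical numbers: 150 MB/s per device, 1200 MB to transfer.
single_disk_seconds = transfer_time(1200, 150, 1)    # one device, no striping
striped_seconds = transfer_time(1200, 150, 4)        # striped over four devices
```

In practice the measured gain is usually lower than this bound, because some other component on the data path becomes the limit.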

Disadvantage: higher failure rate

Because different segments of data are kept on different storage devices, the failure of one device causes the corruption of the full data sequence.

The failure rate of the array of storage devices is equal to the sum of the failure rates of the individual storage devices.
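Expressed as a probability over a fixed period, the same idea can be sketched as follows. The sketch assumes that devices fail independently and that losing any one device loses the array (no redundancy). The 2% per-device figure is hypothetical. For small probabilities, the simple sum is a close approximation of the exact value.

```python
def array_failure_probability(device_failure_probs) -> float:
    """Probability that at least one device fails, assuming independent failures."""
    all_survive = 1.0
    for p in device_failure_probs:
        all_survive *= 1.0 - p      # the array survives only if every device does
    return 1.0 - all_survive


# Hypothetical: four devices, each with a 2% chance of failing in the period.
probs = [0.02] * 4
exact = array_failure_probability(probs)   # 1 - 0.98**4
approx = sum(probs)                        # sum of the individual probabilities
```

Either way, a four-device stripe set is roughly four times as likely to lose data as a single device.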

This disadvantage of striping can be overcome by storing redundant information, such as parity, for the purpose of error correction. In such a system, the disadvantage is overcome at the cost of extra storage.
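A minimal sketch of parity-based recovery, assuming a single XOR parity block per row of equally sized segments (the scheme commonly associated with RAID 5). The helper name `xor_blocks` and the sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data segments, each stored on its own device, plus one parity block.
segments = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(segments)

# Simulate losing the device that held segments[1]:
# XOR-ing the surviving segments with the parity block rebuilds the lost one.
survivors = [segments[0], segments[2], parity]
rebuilt = xor_blocks(survivors)
```

The extra storage cost mentioned above is visible here: three segments of data require a fourth block of the same size for parity.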

Striping Levels

Striping is used on two levels:

  • hardware
  • software

Hardware

  • across disk drives in RAID storage,
  • network interfaces in Grid-oriented Storage,
  • and RAM in some systems.

Software

Hardware or software striping?

With hardware striping, the calculations are carried out by the controller; with software striping, they take place on the server’s CPU.

So, if the striping calculations are fairly simple (as with RAID 1 or RAID 10) and the server is fairly powerful, using software striping shouldn’t be much of a problem.

But with more complex striping calculations (such as RAID 5EE or RAID 6), hardware striping can be beneficial because its performance is not compromised by the server’s workload, nor are applications on the server compromised by the RAID workload.

With hardware striping, the striping functionality is also independent of the OS, and the simple HBA drivers required for a hardware striping controller are usually available as part of the OS distribution. Also, if the controller has a battery-backed cache, it can run in write-back mode, with the battery protecting cached writes that have not yet reached the disks if power is lost.
