Data Storage - Data Path / Balanced System
1 - About
It is important to understand the different transfer rates of each component of the server's disk subsystem and of the network. This information helps you to identify potential bottlenecks that can throttle your overall performance.
In the figure, data travels:
- from the actual disk drive
- to the embedded disk controller located on the disk drive unit (<10 Mbytes/sec),
- up the Ultra Fast/Wide SCSI channel 2 at 40 Mbytes/sec,
- through PCI Slot 1 on PCI Bus 1 at 133 Mbytes/sec,
- to the Memory subsystem (533 Mbytes/sec),
- and then transferred to the CPU at a P6 system bus speed of 533 Mbytes/sec.
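The path above can be sketched as a list of stages, each with its peak rate; the slowest stage bounds the end-to-end throughput. This is only an illustrative sketch: the stage names and the `bottleneck` helper are hypothetical, and the disk figure uses 10 Mbytes/sec as an upper bound for the "<10" value quoted above.

```python
# Peak transfer rate of each stage in the data path, in Mbytes/sec.
# Rates are taken from the list above; 10 is an assumed upper bound for "<10".
DATA_PATH = [
    ("disk drive / embedded controller", 10),
    ("Ultra Fast/Wide SCSI channel", 40),
    ("PCI bus", 133),
    ("memory subsystem", 533),
    ("P6 system bus", 533),
]


def bottleneck(path):
    """Return the (name, rate) pair of the slowest stage in the path."""
    return min(path, key=lambda stage: stage[1])


name, rate = bottleneck(DATA_PATH)
print(f"Slowest stage: {name} at {rate} Mbytes/sec")
```

For a single disk, the drive itself is the limiting stage; the faster stages only become a concern once many disks share them.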
If one component tries to send more data than the next component can handle, there is a bottleneck. Plumbing offers an analogy: if the main pipe carrying water away from your basement is five inches in diameter and five two-inch pipes feed into it, the main pipe may be unable to carry the combined flow, and water will spill out.
With a little arithmetic, you can spot bottlenecks before they occur.
For example, placing two 3-channel Ultra Fast and Wide SCSI cards (theoretical aggregate maximum throughput of 2 × [3 × 40 Mbytes/sec] = 240 Mbytes/sec) on a single PCI bus can overwhelm that bus if all of the SCSI channels are active.
A single PCI bus supports a theoretical maximum of only 133 Mbytes/sec, so pushing 240 Mbytes/sec into it creates a bottleneck from the start. Placing each of the 3-channel Ultra Fast and Wide SCSI cards on its own PCI bus spreads the disk I/O across 266 Mbytes/sec of aggregate PCI bus throughput (2 × 133 Mbytes/sec), so that each card's 120 Mbytes/sec fits within its bus.
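The same check can be written as a few lines of arithmetic. This is a minimal sketch using the theoretical rates quoted above; the function names are illustrative and not part of any tool.

```python
SCSI_CHANNEL_MBPS = 40  # one Ultra Fast/Wide SCSI channel, Mbytes/sec
PCI_BUS_MBPS = 133      # theoretical maximum of one PCI bus, Mbytes/sec


def card_demand(channels, channel_rate=SCSI_CHANNEL_MBPS):
    """Peak throughput one SCSI card can generate with all channels active."""
    return channels * channel_rate


def is_oversubscribed(cards_on_bus, channels_per_card, bus_rate=PCI_BUS_MBPS):
    """True if the cards sharing one bus can demand more than the bus carries."""
    return cards_on_bus * card_demand(channels_per_card) > bus_rate


# Two 3-channel cards on one bus: 240 Mbytes/sec against 133 Mbytes/sec.
print(is_oversubscribed(cards_on_bus=2, channels_per_card=3))
# One 3-channel card per bus: 120 Mbytes/sec against 133 Mbytes/sec.
print(is_oversubscribed(cards_on_bus=1, channels_per_card=3))
```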
The disk drive itself is the slowest link of the data path.
2 - Articles Related
3 - Example
For example, if each CPU core needs about 200 MB/s to stay busy and you want to keep 4 CPU cores busy, the entire I/O subsystem should sustain 800 MB/s (4 × 200 MB/s) for optimum performance.
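As a rough sketch, the sustained throughput target scales linearly with the number of cores. The figure of 200 MB/s per core is the rule of thumb given in the striping section of this article; the function name is illustrative.

```python
MB_S_PER_CORE = 200  # rule-of-thumb throughput needed to keep one core busy


def required_io_throughput(cores, per_core=MB_S_PER_CORE):
    """Sustained I/O throughput (MB/s) needed to keep `cores` CPU cores busy."""
    return cores * per_core


for cores in (1, 4, 8):
    print(f"{cores} cores -> {required_io_throughput(cores)} MB/s sustained")
```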
The I/O throughput requirement has to be guaranteed throughout the whole hardware system (the whole data path):
- the Host Bus Adapters (HBAs) in the compute nodes,
- any switches you use,
The weakest link is going to limit the performance and scalability of operations in this configuration.
If you rely on storage shared with other applications, the throughput available to your application is not guaranteed, and you will likely see inconsistent response times for your operations.
Parallel execution is also a heavy consumer of memory. Per CPU core you should have at least 4 GB of RAM.
If you use inter-node parallel operations that span multiple nodes, you have to size the interconnect appropriately: it is as crucial as the overall I/O capabilities. For good scalability, the throughput of the interconnect should be at least equal to the throughput going to disk. (Use one or more 10 GigE or InfiniBand interconnects.)
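The memory and interconnect guidelines above can be turned into a quick sizing check. This is a sketch under stated assumptions: a 10 GigE link is taken at its raw line rate of 10 Gbit/s, which is 1250 MB/s, with no allowance for protocol overhead, so real links deliver less. The function names are illustrative.

```python
import math

GB_RAM_PER_CORE = 4        # minimum RAM per CPU core, from the text
TEN_GIGE_MB_S = 10_000 / 8  # 10 Gbit/s raw line rate = 1250 MB/s (assumption: no overhead)


def min_ram_gb(cores):
    """Minimum RAM (GB) for the given number of CPU cores."""
    return cores * GB_RAM_PER_CORE


def links_needed(disk_throughput_mb_s, link_mb_s=TEN_GIGE_MB_S):
    """Number of links whose combined rate at least matches the disk throughput."""
    return math.ceil(disk_throughput_mb_s / link_mb_s)


print(min_ram_gb(4))       # RAM for 4 cores
print(links_needed(800))   # links to match 800 MB/s of disk throughput
print(links_needed(3200))  # links to match 3200 MB/s of disk throughput
```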
4 - Disk (HDD) - Striping
A single physical disk may sustain only about 20-30 MB/s for large random reads. Since you need about 200 MB/s to keep a single CPU core busy (i.e. 8-10 physical disks per core), you need a lot of physical spindles to get good performance for database operations running in parallel.
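The spindle count follows from dividing the target throughput by the per-disk rate and rounding up. This is a minimal sketch with an illustrative function name; the per-disk rates are the rough figures quoted above. At 20 to 25 MB/s per disk it gives the 8 to 10 disks mentioned in the text, while the optimistic 30 MB/s figure gives slightly fewer.

```python
import math


def disks_needed(target_mb_s, per_disk_mb_s):
    """Number of disks to stripe across to reach the target throughput."""
    return math.ceil(target_mb_s / per_disk_mb_s)


PER_CORE_MB_S = 200  # throughput needed to keep one CPU core busy

for per_disk in (20, 25, 30):
    one_core = disks_needed(PER_CORE_MB_S, per_disk)
    four_cores = disks_needed(4 * PER_CORE_MB_S, per_disk)
    print(f"{per_disk} MB/s per disk: {one_core} disks per core, "
          f"{four_cores} disks for 4 cores")
```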
Do not use a single 1 TB disk for your 800 GB database, because you will not get good performance running operations in parallel.