In this article James Richmond explains RAID, a way of getting more speed, better data security, or both from your hard drives.
Computers are a centrepiece of most recording studios, and the data storage that we rely on is increasingly based on solid state drives, be they 2.5” or NVMe based. Previously, as you probably know, we relied upon platter-based HDDs, which were much slower and more prone to failure. As such, it was not uncommon to use multiple drives in a RAID (‘Redundant Array of Inexpensive Disks’) configuration to improve data speeds and minimise data loss.
What Is RAID?
RAID is a term coined by Patterson, Gibson and Katz at the University of California, Berkeley, USA, in 1987/1988, but they did not invent the concept of using multiple drives and spreading data across them. These approaches date back to the 1970s and the earlier 1980s, when they were used by various companies such as IBM, DEC and the wonderfully named ‘Tandem NonStop Systems’, amongst others.
Regardless, what became known as RAID has a number of different levels, some of which I briefly explain below. It is worth noting that no RAID level is perfect for all situations; all of them have their benefits and caveats. When deploying RAID it is critical to understand what you are trying to achieve: is it speed, is it protection, or is it both? It also, as always, comes down to cost.
It should be stated also that RAID is not an alternative to backup. A robust backup policy is an essential component of a modern studio and please remember the maxim, ‘data doesn’t exist until it exists in three places’.
JBOD
JBOD stands for ‘Just a Bunch Of Disks’. This is probably pretty familiar to most users, especially those who have added multiple drives to a computer. Each drive works independently with no mirroring, striping or parity.
RAID 0: Striping
RAID 0 is a striped format where data is shared between multiple drives with no mirroring or parity. Data striping essentially segments logically sequential data across two or more drives so that the segments are spread between all the devices. RAID 0 can be very fast but it is vulnerable, because a single disk failure will result in complete data loss. With RAID 0 you do not lose any disk space: if you use four 2TB drives in RAID 0 then you have 8TB of usable space. Apple decided to implement RAID 0 in the 2019 Mac Pro: the internal storage is spread between two SSDs. My operating system sees the proprietary 2x 2TB SSDs as a single 4TB volume.
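To make the striping idea concrete, here is a minimal Python sketch of how logical blocks might be dealt out across the drives in turn. The function names are hypothetical and chosen for illustration only; real controllers stripe in larger chunks (the stripe size) rather than single blocks, so treat this as a simplified model rather than how any particular product works.

```python
def stripe_location(block_index, num_drives):
    """Map a logical block number to (drive number, block offset on that drive).

    Simplified round-robin model: block 0 goes to drive 0, block 1 to drive 1,
    and so on, wrapping back to drive 0 once every drive has received a block.
    """
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least two drives")
    return block_index % num_drives, block_index // num_drives


def raid0_usable_tb(num_drives, drive_tb):
    """RAID 0 keeps no copies or parity, so all of the capacity is usable."""
    return num_drives * drive_tb


# The first eight logical blocks on a four-drive stripe.
for block in range(8):
    drive, offset = stripe_location(block, 4)
    print(f"logical block {block} -> drive {drive}, offset {offset}")

print(raid0_usable_tb(4, 2), "TB usable from four 2TB drives")
```

Because consecutive blocks land on different drives, a long sequential read or write keeps every drive busy at once, which is where the speed comes from. It also shows why one failed drive ruins everything: every file of any size has pieces on all of the drives.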
RAID 1: Mirroring
If RAID 0 is for speed then RAID 1 is for redundancy. All data in RAID 1 is written to two or more drives, providing copies of all the data in both (or all) places. This means that you only need one of the drives to be functional to have access to all your data. It is common to mirror with two drives in RAID 1, where a single drive failing results in no data loss. The downsides are speed and disk space. RAID 1 can be slower than a single drive, especially on writes, because every write has to be completed in two places at once. Also, if you have two 2TB drives in RAID 1 your usable space will only be 2TB.
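The mirroring behaviour can be sketched as a toy model in Python. The `Mirror` class below is purely illustrative (it is not the API of any real RAID tool): each "drive" is just a dictionary, every write goes to all healthy drives, and a read is served by the first drive that is still working.

```python
class Mirror:
    """Toy RAID 1: every write goes to all drives, reads use any healthy drive."""

    def __init__(self, num_drives=2):
        self.drives = [dict() for _ in range(num_drives)]
        self.failed = set()

    def write(self, block, data):
        # The same data is written to every drive that is still working.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data

    def fail(self, drive_index):
        # Simulate a dead drive: mark it failed and discard its contents.
        self.failed.add(drive_index)
        self.drives[drive_index].clear()

    def read(self, block):
        # Any surviving drive holds a complete copy.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block]
        raise IOError("all drives in the mirror have failed")


mirror = Mirror(num_drives=2)
mirror.write(0, b"vocal take 3")
mirror.fail(0)
print(mirror.read(0))  # still readable from the second drive
```

The write loop is also the reason for the capacity cost: two drives hold exactly the same blocks, so only one drive's worth of space is usable.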
RAID 5: Distributed Parity
I’ve skipped over RAID 2/3/4 as they are less common formats. RAID 5, though, is very common, especially at the enterprise level. RAID 5 provides block-level striping with distributed parity. Parity is error-correcting information calculated from the data blocks (typically by XORing them together) and spread across all the drives, so that the contents of any one drive can be rebuilt from the others. This means that a single drive can fail in a RAID 5 configuration with no data loss. RAID 5 requires a minimum of three drives to work. (RAID 4 is similar, but a dedicated drive is used for parity.)
RAID 5 is often seen as the best compromise between speed and redundancy. A RAID 5 configuration using four 2TB drives will have 6TB of usable space.
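As a rough illustration of how parity allows a lost drive to be rebuilt, the Python sketch below uses XOR, the operation commonly used for single parity. The helper name `xor_blocks` and the sample data are made up for the example, and the rotation of parity between drives that gives RAID 5 its "distributed" label is left out to keep it short.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a four-drive array: three data blocks plus one parity block.
data = [b"kick", b"bass", b"keys"]
parity = xor_blocks(data)

# Pretend the drive holding b"bass" has died. XORing everything that is left
# (the surviving data blocks and the parity block) recovers the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'bass'

# One drive's worth of space goes to parity: (4 - 1) * 2TB = 6TB usable.
print((4 - 1) * 2, "TB usable from four 2TB drives")
```

This works because XORing a value with itself cancels it out, so the surviving blocks cancel against the parity and only the missing block remains. With a single parity block per stripe, a second failure leaves too little information to do this.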
RAID 6: Dual Parity
RAID 6 is similar to RAID 5 but the distributed parity is doubled, meaning that two independent parity blocks are written to different drives for each stripe of data. Two drives can therefore fail before data loss occurs. RAID 6 requires a minimum of four drives to work. Four 2TB drives in RAID 6 will have 4TB of usable space.
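The usable-space figures quoted for RAID 0, 1, 5 and 6 all follow from simple arithmetic, sketched below in Python. The function name `usable_tb` is hypothetical, and the sketch assumes all drives in the array are the same size.

```python
def usable_tb(level, num_drives, drive_tb):
    """Usable capacity in TB for an array of identical drives.

    Supports the levels described so far: "0", "1", "5" and "6".
    """
    minimum_drives = {"0": 2, "1": 2, "5": 3, "6": 4}
    if level not in minimum_drives:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < minimum_drives[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_drives[level]} drives")

    if level == "0":
        return num_drives * drive_tb        # no redundancy, nothing lost
    if level == "1":
        return drive_tb                     # every drive holds the same data
    if level == "5":
        return (num_drives - 1) * drive_tb  # one drive's worth of parity
    return (num_drives - 2) * drive_tb      # RAID 6: two drives' worth of parity


print(usable_tb("0", 4, 2))  # 8
print(usable_tb("1", 2, 2))  # 2
print(usable_tb("5", 4, 2))  # 6
print(usable_tb("6", 4, 2))  # 4
```

The results match the examples given for each level above. Note how the cost of parity shrinks as a proportion when more drives are added: eight 2TB drives in RAID 6 would give 12TB usable out of 16TB.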
Hybrid RAID: RAID 1+0
There are other RAID types aside from those above, the most common being RAID 1+0, sometimes called RAID 10.
RAID 1+0 is a stripe of mirrors: in a four-drive configuration, the drives are arranged as two mirrored pairs and data is striped across those pairs.
RAID 1+0 is my preferred RAID type in the studio when it comes to performance and redundancy.
It has almost the speed of RAID 0 with the redundancy advantage of RAID 1 (hence RAID 1+0).
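One way to see what the redundancy of a four-drive RAID 1+0 array does and does not cover is to check every possible combination of failed drives. The Python sketch below assumes drives 0 and 1 form one mirrored pair and drives 2 and 3 the other; the numbering and the `survives` helper are illustrative only.

```python
from itertools import combinations

# Assumed layout: two mirrored pairs, with data striped across the pairs.
MIRROR_PAIRS = [(0, 1), (2, 3)]


def survives(failed_drives):
    """The array survives as long as every mirrored pair has one working drive."""
    return all(
        any(drive not in failed_drives for drive in pair)
        for pair in MIRROR_PAIRS
    )


# Any single failure is survivable.
print(all(survives({drive}) for drive in range(4)))  # True

# Two failures are survivable only if they hit different pairs.
for failed in combinations(range(4), 2):
    print(failed, "survives" if survives(set(failed)) else "data lost")
```

Of the six possible two-drive failures, four are survivable and two (both drives of the same pair) are not. Usable space is one drive per pair, so four 2TB drives give 4TB.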
Hardware vs Software RAID
RAID can be done at both a software and hardware level.
For instance, Disk Utility in macOS will allow you to create RAID 0 or RAID 1 configurations. Third-party applications such as OWC’s SoftRAID provide more flexibility, allowing for RAID 0, RAID 1, RAID 4, RAID 5 and RAID 1+0 configurations. I’ve used SoftRAID for a number of years and it has worked remarkably well. Given that SoftRAID is software-based, it does use system resources, which will affect overall performance, but it is an effective solution for many.
Hardware RAID is an approach where, instead of using software to manage the RAID, the work is done on the motherboard or a separate RAID card. I currently use a Highpoint SSD7505 in my 2019 Mac Pro. It provides four NVMe slots, which house four 2TB Samsung 980 Pro NVMe drives in RAID 1+0 format, giving me 4TB of usable space. It also gives blistering performance.
By way of comparison, the two figures below show the read and write speeds of the same four 2TB Samsung 980 Pro drives in RAID 1+0 using SoftRAID (in an OWC Accelsior 4M2 PCIe card) compared to the Highpoint SSD7505.
The SoftRAID configuration allowed for around 5300 MB/s read and 3000 MB/s write speeds.
The Highpoint SSD7505 hardware RAID configuration afforded around 7300 MB/s read speeds and 4300 MB/s write speeds. Whilst these speeds are impressive, I am fairly sure that if I were using a PC with PCIe 4.0 I’d see even faster speeds. Apple, in their wisdom, saw fit not to use PCIe 4.0 in the 2019 Mac Pro, nor any other computer so far. Oh well, I’ll just have to make do, I guess.
Do you need RAID?
It is worth noting that, given the speeds of modern NVMe drives, you may not have a use case for RAID at all. Even using a single NVMe drive, you’d need to be recording or playing back many thousands of tracks at once to approach the limits of the drive.
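A back-of-envelope calculation shows where a figure like "many thousands of tracks" comes from. The Python sketch below assumes uncompressed mono audio at 24-bit/96kHz and a single drive sustaining 3000 MB/s; both are illustrative assumptions rather than measurements, and the function names are hypothetical.

```python
def track_bytes_per_second(sample_rate_hz, bit_depth, channels=1):
    """Data rate of one uncompressed PCM audio track."""
    return sample_rate_hz * (bit_depth // 8) * channels


def tracks_at_throughput(drive_mb_per_s, sample_rate_hz=96_000, bit_depth=24):
    """How many mono tracks fit into a given sustained drive throughput."""
    per_track = track_bytes_per_second(sample_rate_hz, bit_depth)
    return int(drive_mb_per_s * 1_000_000 // per_track)


print(track_bytes_per_second(96_000, 24))  # 288000 bytes per second per track
print(tracks_at_throughput(3000))          # 10416 tracks at an assumed 3000 MB/s
```

This is a ceiling rather than a prediction: a real session reads many files at once instead of one long sequential stream, so the practical limit will be lower. Even so, it sits far above any realistic track count.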
So, Why Do It?
First and foremost, for me it is about redundancy. Whilst I do have a robust backup policy, I also have the extra security of having my main audio drive in RAID 1+0, so that if a drive does fail I do not lose anything, even if it fails before the latest work has been backed up.
Also, as I sometimes work with video editing software the extra speed is very welcome.