Thursday, July 04, 2013

Server 2012 Storage Spaces: is it software RAID?

Storage Spaces is not exactly software RAID. RAID stands for redundant array of independent disks. When an admin thinks about RAID, they think in terms of RAID levels, each of which has a well-defined meaning for how every disk is involved. The primary goals are to let you lose a disk with minimal impact and to provide better performance. Storage Spaces has the same goals but achieves them very differently.

When thinking about RAID, it's usually in terms of whole disks. When thinking about Storage Spaces, it's in terms of 256KB chunks of data spread across a pool of disks. With Storage Spaces, you don't use hot spares; instead you leave free space in the pool so it can restore redundancy after a failure. Rebuilds are significantly faster because the work is shared by all remaining disks rather than limited to a single disk.

Here are some examples of how each one handles the data differently.

1) 3x500G 15k disks. We will compare RAID 1+HS (mirroring with 1 hot spare) with Storage Spaces mirroring.

In our RAID 1, disks 1 and 2 form a mirror pair (500G usable). Every byte is written to both disks in the same sectors. Reads can be served from either disk, which can give up to 2x read performance for some workloads. The configuration can survive the failure of one disk. If disk 1 fails, the data from disk 2 is copied to the hot spare (disk 3). Disk 1 can then be replaced and becomes the new hot spare.
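The RAID 1 behavior described above can be sketched as a toy Python model. This is only an illustration of the idea, not how any real controller is implemented; the class name Raid1WithHotSpare and its methods are hypothetical.

```python
# Toy model of RAID 1 with a hot spare (hypothetical names, not a real controller).
class Raid1WithHotSpare:
    def __init__(self, sectors):
        # Disks 1 and 2 form the mirror pair; disk 3 sits idle as the hot spare.
        self.disks = {n: [None] * sectors for n in (1, 2, 3)}
        self.mirror = [1, 2]
        self.spare = 3

    def write(self, sector, data):
        # Every write lands on both mirror members, at the same sector.
        for disk in self.mirror:
            self.disks[disk][sector] = data

    def read(self, sector, prefer=0):
        # Either mirror member can serve a read.
        return self.disks[self.mirror[prefer % 2]][sector]

    def fail(self, disk):
        # The surviving member copies every sector to the hot spare,
        # and the spare then takes the failed disk's place in the mirror.
        survivor = next(d for d in self.mirror if d != disk)
        self.disks[self.spare] = list(self.disks[survivor])
        copied = len(self.disks[survivor])
        self.disks[disk] = None
        self.mirror = [survivor, self.spare]
        self.spare = None
        return copied  # sectors copied by the single surviving disk


raid = Raid1WithHotSpare(sectors=4)
raid.write(0, "a")
print(raid.fail(1), raid.mirror)
```

Note that the whole rebuild is one disk reading and one disk writing, which is the bottleneck the rest of this post keeps coming back to.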

In our Storage Spaces mirror, all 3 disks are added to the pool. While the pool could hold a larger mirror (1.5TB of raw capacity gives roughly 750G of two-way mirrored space), I want to keep the recovery scenario close to the one above. We will allocate 500G to the mirror volume, which consumes 1TB of raw capacity and leaves 500G of the pool free (instead of a hot spare). Every 256KB chunk of data is written to 2 of the 3 disks, so data will exist on all 3 disks. Reads can happen from any of the 3 disks. I won't compare the read performance of this example, but it evens out to RAID 10's 2x read when more disks are involved. The configuration can survive the failure of one disk. If disk 1 fails, disks 2 and 3 copy data between themselves to restore the mirror (this is why we left the space free).
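To make the chunk placement concrete, here is a minimal Python sketch. The placement policy (pick the two least-used disks for each chunk) and the function name place_mirrored_chunks are assumptions made for illustration; the real Storage Spaces allocator may work differently.

```python
from collections import Counter


def place_mirrored_chunks(num_chunks, disks, copies=2):
    """Assign each chunk to `copies` distinct disks, least-used disks first.

    Assumed policy for illustration only, not the documented allocator.
    """
    used = Counter({d: 0 for d in disks})
    placement = {}
    for chunk in range(num_chunks):
        # Sort by current usage, then by disk id for a deterministic tie-break.
        targets = sorted(disks, key=lambda d: (used[d], d))[:copies]
        placement[chunk] = tuple(targets)
        for d in targets:
            used[d] += 1
    return placement, used


placement, used = place_mirrored_chunks(6, disks=[1, 2, 3])
for chunk, targets in placement.items():
    print(chunk, targets)
print(dict(used))
```

With 3 disks, no disk is a copy of any other disk, yet every chunk still exists in two places and all three disks end up holding data.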

2) 15x500G 15k disks. Now we compare RAID 10+1HS (striped mirrors with 1 hot spare) with Storage Spaces mirroring.

In our RAID 10, it's common to pair up the disks and then stripe the data across those pairs. When data is written, both disks in a pair write the same data. When a disk fails, the other disk in the pair copies all of its data to the hot spare, which becomes its new partner. You can then replace the failed disk and have the hot spare hand that data back to the replacement, but that results in a second full copy of the data.
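The RAID 10 layout for this example can be sketched in a few lines of Python. The function names are hypothetical, and the round-robin stripe mapping is a simplification of what a real controller does.

```python
def raid10_layout(num_disks, hot_spares=1):
    """Split disks into mirror pairs plus hot spares (disk ids start at 1)."""
    data_disks = num_disks - hot_spares
    if data_disks < 2 or data_disks % 2:
        raise ValueError("RAID 10 needs an even number of data disks")
    pairs = [(i, i + 1) for i in range(1, data_disks + 1, 2)]
    spares = list(range(data_disks + 1, num_disks + 1))
    return pairs, spares


def pair_for_stripe(stripe, pairs):
    # Simplified round-robin striping: stripe N goes to pair N mod (number of pairs).
    return pairs[stripe % len(pairs)]


def rebuild_copy_gb(disk_gb, copy_back=True):
    # One full-disk copy to the hot spare, plus a second one if the data
    # is handed back to the replacement disk afterwards.
    return disk_gb * (2 if copy_back else 1)


pairs, spares = raid10_layout(15)
print(len(pairs), "pairs, spares:", spares)
print("usable:", len(pairs) * 500, "G")
print("stripe 8 lives on disks", pair_for_stripe(8, pairs))
print("data copied for one failure with copy-back:", rebuild_copy_gb(500), "G")
```

The point to notice is that each full-disk copy is done by exactly two disks (one reading, one writing), no matter how many disks are in the array.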

Storage Spaces would write your data in pairs to any two disks. None of the disks is a mirror of any other disk. It just makes sure that every 256KB chunk of data exists in 2 locations. This process can be enclosure aware, so that the two copies are placed in different enclosures. When a disk fails, the mirror of its data already exists in 256KB chunks spread across the other 14 disks. Those 14 disks copy the affected 256KB chunks to different disks to rebuild the mirror. (This can happen very fast because all remaining disks work to copy the data instead of just one.) When you replace the failed disk, nothing happens. No rebuilds and no recopies. Data is only added back onto that disk when new data is written to the volume. There are no disk pairs to micromanage.
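A small simulation shows why this kind of rebuild spreads the work out. It is only a sketch under stated assumptions: chunks are placed on two randomly chosen disks, and each lost copy is re-created on the least-written surviving disk. Neither policy is taken from Microsoft documentation, and the function name simulate_rebuild is hypothetical.

```python
import random
from collections import Counter


def simulate_rebuild(num_disks=15, num_chunks=7000, failed=1, seed=0):
    rng = random.Random(seed)
    disks = list(range(1, num_disks + 1))
    # Assumption: each chunk lands on two distinct, randomly chosen disks.
    placement = [rng.sample(disks, 2) for _ in range(num_chunks)]

    survivors = [d for d in disks if d != failed]
    reads, writes = Counter(), Counter()
    for pair in placement:
        if failed not in pair:
            continue  # this chunk still has both copies
        source = pair[0] if pair[1] == failed else pair[1]
        # Assumption: the new copy goes to the least-written survivor
        # other than the disk that already holds the remaining copy.
        target = min(
            (d for d in survivors if d != source),
            key=lambda d: (writes[d], d),
        )
        reads[source] += 1
        writes[target] += 1
        pair[pair.index(failed)] = target
    return placement, reads, writes


placement, reads, writes = simulate_rebuild()
lost = sum(reads.values())
print("chunks to re-mirror:", lost)
print("most chunks read by one disk:", max(reads.values()))
print("most chunks written by one disk:", max(writes.values()))
```

In the hot spare case one disk reads every lost chunk and one disk writes every lost chunk. Here the busiest disk handles only a fraction of them, which is where the faster rebuild comes from.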

* There is not a lot of information on exactly how Storage Spaces works. This is how I understand it from the information that I have found. If you have a better understanding of Storage Spaces, I would appreciate any feedback.
