# SnapRAID, other? Want to mix drive sizes.



## markmarz (Feb 3, 2002)

I'm busy buying hardware & getting educated to build a TiVo hd server (x2: 1 primary, 1 backup). I now get why RAID is good in addition to essential backup. But I really, really hate standard RAID's limitation to same-size drives in the array (or effectively same size, since a mixed-drive array only makes use of the size of the smallest drive). One of the reasons I hate this limitation is the long-term expense: larger drives only become fully usable after either a long period of migration or a single costly replacement all at once.

So I'm looking at some sort of standard RAID alternative and SnapRAID is bubbling to the top. http://snapraid.sourceforge.net/compare.html

Looking okay to me because it's free, can mix drive sizes, uses a snapshot model, can recover data in non-failed disks, doesn't make my server into just a NAS like unRAID, etc. But I can be swayed on any of these reasons; I'm just feeling my way around and these are inchoate impressions.

Any thoughts?


----------



## lrhorer (Aug 31, 2003)

I heartily recommend against it. Although there are various ways it can be made to work, all of them involve significant hits in terms of performance and complexity.

It's true the drive for increasing spindle size tends to foster a certain churn in drive sizes, but usually the older drives can be re-purposed. Not only that, but there are some simple ways around the issue. One rather straightforward one is to employ RAID0 on a temporary basis to grow the spindle size. Thus, if 3T drives are inexpensive right now, but 4T drives get more attractive in the middle future, you can expand the entire array by buying some 1T drives and combining each existing 3T spindle, except for one, with a 1T spindle. Along with the 1T drives, buy one 4T drive and replace the remaining 3T drive with it. This will increase your top member size to 4T and increase the size of your array by 33% for a fairly economical outlay of cash, giving you an unused 3T drive to be able to back up the additional space on your array.

As you run out of space, buy a pair of 4T drives, rather than one, and use one to replace a 3T + 1T RAID0 element and the other to add a 4T element, again adding the 3T and 1T drives to your backup library. If we assume drive prices in about a year are about $50 for a 1T drive and $200 for a 4T drive, then this approach would allow you to expand a 9T RAID5 array assembled from 3T drives to 12T for about $350, with the re-purposed 3T drive being able to exactly back up the additional 3T. This strategy should allow you to expand to perhaps 20T or better with a minimal amount of money "wasted" on drive upgrades. By that time, they probably will have 8T drives, and you can start collapsing the number of spindles, again using the decommissioned hard drives for backup purposes.
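As a rough check on the arithmetic above, here is a minimal Python sketch. The helper name `raid5_usable_tb` is made up for illustration, the prices are the post's guesses for a year out, and the model assumes RAID5 truncates every member to the smallest one and spends one member's worth of space on parity.

```python
def raid5_usable_tb(member_sizes_tb):
    """Usable RAID5 capacity: every member is truncated to the smallest
    member, and one member's worth of space holds parity."""
    if len(member_sizes_tb) < 3:
        raise ValueError("this sketch assumes at least three members")
    return (len(member_sizes_tb) - 1) * min(member_sizes_tb)


# Prices are the post's estimates, not real quotes.
PRICE_1T = 50
PRICE_4T = 200

before = [3, 3, 3, 3]             # four 3T drives
# Three of the 3T drives each get a 1T RAID0 partner (a 4T member);
# the fourth 3T drive is replaced outright by a 4T drive.
after = [3 + 1, 3 + 1, 3 + 1, 4]
cost = 3 * PRICE_1T + 1 * PRICE_4T

usable_before = raid5_usable_tb(before)
usable_after = raid5_usable_tb(after)
growth_pct = 100 * (usable_after - usable_before) / usable_before

print(usable_before, usable_after, cost, round(growth_pct))  # 9 12 350 33
```

The freed 3T drive matches the 3T of added capacity, which is why it can back up the new space exactly.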

I'm using four 1T drives from my original main array, combined into two 2T pairs, on my backup array right now, and have been for over a year.


----------



## markmarz (Feb 3, 2002)

Thanks again for your usual comprehensive & helpful reply!

I'm still mulling this over; I appreciate the detailed example you give of how to minimize drive swaps and incrementally increase. I've been playing around with this on a spreadsheet to see how it plays out. Seems complicated to me.



lrhorer said:


> Although there are various ways it can be made to work, all of them involve significant hits in terms of performance and complexity.


It would help if you could elaborate on the hits in performance and complexity by going the SnapRAID route. To me it seems a lot less complicated than RAID5|6. Performance, unless it's a huge hit, doesn't matter to me considering this is only a TiVo media server.

In an earlier RAID thread I mentioned that the biggest value RAID has to me is greater storage capacity. But I'm seeing now other solutions which offer this same advantage, such as SnapRAID possibly in combination with greyhole or mhddfs. I'm looking at all this from the perspective of a simple _good enough_ Tivo server, not trying to achieve an enterprise level of robustness. Also of course I will have backup; I don't see SnapRAID or greyhole filling that need.

To tell you the truth, and I'm loath to admit it, I'm entertaining the possibility of once again abandoning even SnapRAID, let alone standard RAID. If I can see all my capacity as a single drive, can back it all up easily, can never impact more drives than the failing drives, and can easily access the files on any single drive outside of the RAID system and server itself .. well, I'm thinking that's enough.


----------



## lrhorer (Aug 31, 2003)

markmarz said:


> Thanks again for your usual comprehensive & helpful reply!
> 
> I'm still mulling this over; I appreciate the detailed example you give of how to minimize drive swaps and incrementally increase. I've been playing around with this on a spreadsheet to see how it plays out. Seems complicated to me.


Not really. The bottom line is one can always take two (or more) relatively smaller devices and combine them into a single, larger logical RAID0 array. (LVM can also be used for this.) This larger array can then be used as a member of a larger array of a different type.

Thus, for example, one can take nine 1T drives and use them to create 3 arrays of 3 drives each, with each array capable of storing 3T. Creating a RAID5 array with three 3T members, each member being one of the RAID0 arrays, creates 6T of storage. Any time one likes, one may replace one of the 3T arrays with a single 3T drive merely by failing the 3T array, removing it from the large array, and adding a 3T drive to the large array. Rebuilding the array takes a while, but the effort required of the sysadmin is literally only a few moments.
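The nesting described above can be sketched in a few lines of Python. The function names are hypothetical, and the model assumes a RAID0 group exposes the full sum of its members while RAID5 gives up one member's worth of space to parity.

```python
def raid0_tb(sizes_tb):
    # Assumption: the stripe set exposes the combined space of its members.
    return sum(sizes_tb)


def raid5_tb(sizes_tb):
    # One member's worth of space is parity; members truncate to the smallest.
    return (len(sizes_tb) - 1) * min(sizes_tb)


drives = [1] * 9                                    # nine 1T drives
groups = [drives[i:i + 3] for i in range(0, 9, 3)]  # three groups of three
members = [raid0_tb(g) for g in groups]             # three 3T RAID0 members
total = raid5_tb(members)

# Later: swap the first RAID0 group for a single 3T drive.
members_after = [3] + members[1:]
total_after = raid5_tb(members_after)

print(members, total, total_after)  # [3, 3, 3] 6 6
```

From the RAID5 layer's point of view the replacement is just one 3T member swapped for another, which is why capacity is unchanged.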



markmarz said:


> It would help if you could elaborate on the hits in performance and complexity by going the SnapRAID route. To me it seems a lot less complicated than RAID5|6.


Well, I'm not intimately familiar with the protocol, so I can't make specific comparisons, but consider what is being attempted. RAID1 through RAID5 are designed with the operational intent that any one member of the array can fail for whatever reason without taking the array offline or losing any data. Consider what that implies. It implies that the data must be in part duplicated in such a way that the loss of one member does not irretrievably corrupt any data. RAID1 accomplishes this in a very straightforward and robust way by simply making multiple complete copies of the data. There are N complete copies of the data on unrelated systems so that the failure of as many as N-1 of the copies will result in no loss of data.

Now, mirroring is not the only way to duplicate data so that a loss of part of the data pool will not result in a loss of data. Indeed, mirroring is very inefficient, albeit highly effective, and consequently rather expensive. It also confers no increase in performance. Writes cannot be any faster than the slowest member, and reads will at best be no faster than the fastest member. Enter: parity. Parity provides a means of duplicating and verifying data that does not require completely copying the entire data set. Rather, one takes the sum of all the data bits and stores the LSB (or its complement). When reading back the data, one once again sums all the data bits and compares the result with the stored parity. If they do not match, then some odd number of bits of the data are in error. (Hopefully, and most likely, only one.) If one of the bits is missing, then summing the remaining bits and performing an XOR with the stored parity bit produces the value of the missing bit as a result.
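A small Python sketch of that idea follows. Taking the LSB of a sum of bits is the same as XOR-ing them, so the sketch XORs whole bytes at once. The function name `parity_block` and the sample data are made up for illustration; this is not how any particular RAID implementation lays out its stripes.

```python
from functools import reduce


def parity_block(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"TiVo", b"RAID", b"disk"]   # one chunk from each of three data members
parity = parity_block(data)          # what the parity member would store

# Verification: data plus parity XORs to all zeros when nothing is corrupt.
check = parity_block(data + [parity])
print(check == bytes(4))             # True

# The member holding b"RAID" fails; rebuild it from the survivors plus parity.
survivors = [data[0], data[2], parity]
rebuilt = parity_block(survivors)
print(rebuilt)                       # b'RAID'
```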

Now consider for a moment a 2 member RAID4 or RAID5 array. In fact, both are identical to a 2 member RAID1 array, since with only one data member, the parity information is exactly equal to the data information. If one member is allowed to be larger than the other, then there are two choices.

One is that the extent of the parity and the data are both limited to the size of the smaller member. In this case, regardless of the RAID organization, the loss of either member will not result in data loss. The downside, of course, is the additional space on the larger member is not used. The big upsides are with RAID4 and above, additional members of the array increase the efficiency of the array, so that instead of doubling the cost of the media, the additional fault tolerance may increase the cost of the media by a mere 12%, or even perhaps as little as 10% or even 8%. What's even better, both the read and write speeds can be increased by as much as twelve-fold, or at least easily five or six-fold.

If, on the other hand, we allow the additional space to be used in some fashion, then there is a potential for data loss. This potential becomes an actuality any time the entire larger data member fails. With multiple members and distributed parity (RAID5 or RAID6), one can hedge one's bets a bit and produce a custom distribution scheme that will allow for some enhanced failure tolerance, but no matter what, if that "extra space" on the larger member(s) is utilized, there is a significant probability of data loss with the failure of one of the large members. What's more, depending on the exact nature of the distribution scheme, the performance will drop precipitously, perhaps well below even that of a single drive. An increase in drive "thrashing", as you put it, is virtually inevitable.
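To put rough numbers on the first choice, here is a hedged Python sketch. The member counts are assumptions picked to land near the 12%, 10% and 8% figures quoted above, and both function names are hypothetical.

```python
def parity_overhead_pct(n_members, n_parity=1):
    """Extra media bought for parity, as a percentage of the data capacity."""
    n_data = n_members - n_parity
    if n_data < 1:
        raise ValueError("need at least one data member")
    return 100 * n_parity / n_data


def unused_tb(member_sizes_tb):
    """Space left idle when every member is truncated to the smallest one."""
    smallest = min(member_sizes_tb)
    return sum(size - smallest for size in member_sizes_tb)


for n in (2, 9, 11, 13):
    print(n, round(parity_overhead_pct(n), 1))
# 2 100.0   (same cost as a two-way mirror)
# 9 12.5
# 11 10.0
# 13 8.3

print(unused_tb([3, 3, 4, 4]))  # 2
```

The last line shows the price of the "limit everything to the smallest member" choice: mixing two 4T drives into an array of 3T drives leaves 2T idle.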

Because RAID design (other than RAID0) holds as its top priority data integrity followed by (other than RAID1) top-notch performance, all standard RAID implementations forgo the rather minor convenience of supporting asymmetrical member sizes. Now, Logical Volume Management does not completely eschew this capability, but I can tell you from personal experience that taking advantage of this capability with LVM can result in truly dismal performance, and by that I mean even compared with a single, slow drive.



markmarz said:


> Performance, unless it's a huge hit, doesn't matter to me considering this is only a Tivo media server.


It can be pretty huge. Not only that, but as the FAQ for SnapRAID itself points out, variable drive sizes result in an increased chance of data loss, and I think you will find you really do not want your back-ups and restores to take potentially weeks, and I mean that literally.

Of course, it is your time, your money, your data, and entirely your decision. Bear in mind, however, that increasing the data size also increases the backup size, and since drives retired from the main array make a perfectly dandy addition to the backup media pool, it is my opinion that worrying beyond a modest level about an escalating cost for drive replacements is not effort well spent.



markmarz said:


> In an earlier RAID thread I mentioned that the biggest value RAID has to me is greater storage capacity. But I'm seeing now other solutions which offer this same advantage, such as SnapRAID possibly in combination with greyhole or mhddfs. I'm looking at all this from the perspective of a simple _good enough_ Tivo server, not trying to achieve an enterprise level of robustness. Also of course I will have backup; I don't see SnapRAID or greyhole filling that need.


I certainly am not going to stand here and pretend it cannot be done, and I surely have no right to ultimately tell you what to do. It is just my opinion, based on a rather significant amount of experience, that you are likely to regret it over the long haul more than the short term regret of parting with what is without question valuable cash. I suggest you also consider for a moment that mdadm has been developed over time with significant input by sysadmins from around the world so that it provides a very concise and simple means of dealing with very complex RAID requirements. These people for the most part do not mind having to deal with a mid-level userspace CL utility, but they demand that it be able to manage the arrays expeditiously, and that definitely includes making it easy to recover from even catastrophic array failures.



markmarz said:


> To tell you the truth, and I'm loath to admit it, I'm entertaining the possibility of once again abandoning even SnapRAID, let alone standard RAID. If I can see all my capacity as a single drive, can back it all up easily, can never impact more drives than the failing drives, and can easily access the files on any single drive outside of the RAID system and server itself .. well, I'm thinking that's enough.


I think you lost me a bit, there. There is no way a video library of any significant size is going to fit on a single drive, not even a 4T drive. It may suffice on day 1, but sooner or later you will need some form of logical space management. The front runners are RAID and LVM, and for your purposes, I think RAID is a distinct winner. Certainly you are free to check out LVM and other options before making your decision, and even then the decision is not irrevocable.

I suggest you subscribe to the mdadm mailing list over on vger.kernel.org. To subscribe, send a plain text (not html or rich text) e-mail message to [email protected] with a single line as the body of the e-mail:


```
subscribe linux-raid
```
Make sure the From:, Sender:, and Reply-to: headers all have precisely the same e-mail address: the one where you want the mailing list messages delivered. I heartily suggest you read through some of the threads there and make your concerns and questions known to the list. Neil Brown, the principal developer of mdadm, is very active in the list, as are a number of contributing developers and various top experts in data storage from around the world. Some are developers and IT experts for some of the largest software or hardware companies on the planet. There is a good chance some of them have tried SnapRAID, and to be sure some of them have tried many, many other solutions.


----------



## markmarz (Feb 3, 2002)

As usual, your arguments are clear, forceful and obviously backed by experience. I appreciate the effort, as I'm sure do others who, I hope, are also making use of this thread.



lrhorer said:


> I think you lost me a bit, there. There is no way a video library of any significant size is going to fit on a single drive, not even a 4T drive.


Sorry, what I said was (without bold):


> If I can *see* all my capacity as a single drive


meaning the capacity appears to be a single drive through drive pooling. It's my understanding that both greyhole & mhddfs offer this, as well as isolating the contents of each physical drive from the other, so that failure on one drive has no impact on any other, unlike RAID5|6 which spread parity bits across all the drives.


----------



## lrhorer (Aug 31, 2003)

markmarz said:


> Sorry, what I said was (without bold): "If I can *see* all my capacity as a single drive", meaning the capacity appears to be a single drive through drive pooling.


Yes, there are a number of utilities that support JBOD by just stitching together members end-to-end. IIRC, even LVM offers this as an option.



markmarz said:


> It's my understanding that both greyhole & mhddfs offer this, as well as isolating the contents of each physical drive from the other, so that failure on one drive has no impact on any other, unlike RAID5|6 which spread parity bits across all the drives.


It is not just parity bits that are spread across the drives. For RAID0 and RAID4 and above, the data is, as well. The data is sliced into "stripes" and spread evenly across all the data drives. It is this that allows both the fault tolerance of RAID > 3, with a drive failure not losing any data, and the vastly increased performance of all but RAID1. With JBOD, data is written to each member sequentially as the previous member fills. This means the data goes on and off the member at the native data rate for the member being read or written. Neither seek performance nor read / write speeds are enhanced. With RAID, if there are N data members, then each chunk of data is divided up into N pieces and each piece plus parity (if implemented) is read or written simultaneously to all the members of the array. The effective seek rate drops by a factor of N or more, and the read and write rates increase by a factor of N. It won't make a difference to a single TiVo, or even several TiVos, but I assure you it will make a very significant difference to applications like VideoRedo, and a vast difference to applications like rsync.
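The difference between end-to-end concatenation and striping can be illustrated with a toy Python mapping from a logical chunk number to a (member, offset) pair. Both function names are hypothetical, parity is left out entirely, and the model is block-level only; it says nothing about how file-level poolers such as greyhole or mhddfs place files.

```python
def jbod_locate(chunk, member_chunks):
    """Concatenation: fill member 0, then member 1, and so on."""
    for member, capacity in enumerate(member_chunks):
        if chunk < capacity:
            return member, chunk
        chunk -= capacity
    raise IndexError("chunk is beyond the end of the pool")


def stripe_locate(chunk, n_members):
    """RAID0-style striping: consecutive chunks rotate across the members."""
    return chunk % n_members, chunk // n_members


member_chunks = [4, 4, 4]   # three members, four chunks each
file_chunks = range(3)      # a file occupying logical chunks 0..2

jbod = [jbod_locate(c, member_chunks) for c in file_chunks]
striped = [stripe_locate(c, len(member_chunks)) for c in file_chunks]

print(jbod)     # [(0, 0), (0, 1), (0, 2)]
print(striped)  # [(0, 0), (1, 0), (2, 0)]
```

In the concatenated layout all three chunks come off member 0 one after another, so the transfer runs at that one drive's speed. In the striped layout each chunk sits on a different member, so the three reads can proceed at the same time.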


----------

