#1
10 drive SCSI RAID array
We're in the process of building a new server which is to be ultra-redundant, and there's been speculation on SCSI version, suitable flavors of RAID, replacement processes, and such. I'm hoping that someone here can answer a few quick questions.

1. This system is going to have 10 drives, all of which are going to be Seagate ST318436LW. The LVD drives have an Ultra3-SCSI Wide interface. I can never remember the relationship between the true names and marketing names. Is Ultra3-SCSI Wide the same as Ultra160? If so, does that mean that the interface (as opposed to the drive) can move roughly 160 MB/sec, right?

2. As far as RAID levels go, we're looking at either 0+1 or 0+5, I think. Are there any suggestions? For instance, does 10 drives warrant using 0+5, or are things just going to get bogged down? From what I understand, 0+1 is faster at writing than 0+5. If no one wants to explain their reasoning, pointing me to resources on this would be super too.

3. On failure... (Important Note: The Seagate drives I mentioned above don't seem to have any hot-swappable capability.) I'm REALLY having trouble finding out what happens when a drive fails in a RAID array. In RAID 0, sans backup, I think you're outta luck. However, in RAID 1 or 5, you should be fine...well, at least as far as I can tell. I'm wondering specifically about the time between drive failure and drive replacement. Are the RAID controllers generally pretty smart? If a drive fails in a RAID 1 array, does the controller just ignore that drive so you can keep going and wait to replace it until a convenient time? In a RAID 5 array, are you screwed too, or does the controller "interpolate" the missing data from parity bits until you can replace the bad drive? In both cases, when you DO replace the bad drive, does the rebuilding generally happen automatically, or are there utilities that have to be run without a load on the system, resulting in downtime?

OK...the questions were longer than I'd thought. I appreciate any answers or redirection you can give me. If another newsgroup is more appropriate, please let me know.

Thanks,
Rick Kunkel
#2
On Tue, 30 Sep 2003 11:34:28 -0700, Rick Kunkel wrote:

> 1. This system is going to have 10 drives, all of which are going to
> be Seagate ST318436LW. The LVD drives have an Ultra3-SCSI Wide
> interface. I can never remember the relationship between the true
> names and marketing names. Is Ultra3-SCSI Wide the same as Ultra160?
> If so, does that mean that the interface (as opposed to the drive)
> can move roughly 160 MB/sec, right?

Errr... you're kinda mixing various labels in the interface names. Seagate uses Ultra, Ultra2, and Ultra160, but not Ultra3. The less (IMO) confusing labels are Fast-20, Fast-40, and Fast-80, but those aren't sexy enough, so you'll rarely see them!

But to answer your question, if you go to www.seagate.com you'll see that the ST318436LW does indeed have an LVD interface capable of moving data at 160MB/sec, less any command overhead etc.

[ You'll also find that each drive is capable of between 24.7 and 39.4MB/sec ]

> 2. As far as RAID levels go, we're looking at either 0+1 or 0+5, I
> think. Are there any suggestions? For instance, does 10 drives
> warrant using 0+5, or are things just going to get bogged down?

What does your IO load look like?

And that "0+1"/"0+5" thing is a total red herring from an architectural standpoint, so you can forget the "0+" thing and just consider RAID 1 vs RAID 5.

In the case of a large number of small I/Os, if you have "n" drives, RAID 1 lets you do "n/2" simultaneous writes or "n" reads. Your usable capacity will be "n/2" times that of a single disk. RAID 5 lets you do n reads or (n-1)/2 writes, and gives you n-1 times the capacity of a single disk.

> 3. On failure... (Important Note: The Seagate drives I mentioned
> above don't seem to have any hot-swappable capability.)

Well, if you used the ST318436LC you'd get hot plug...

> I'm REALLY having trouble finding out what happens when a drive
> fails in a RAID array.

What failure mode are you considering? It's kinda important!

> In RAID 0, sans backup, I think you're outta luck.

Obviously.

> However, in RAID 1 or 5, you should be fine...well, at least as far
> as I can tell.

That *is* why we bother with it!

> I'm wondering specifically about the time between drive failure and
> drive replacement. Are the RAID controllers generally pretty smart?

Some are, some are dumb as fence posts, and in some cases there *is* no RAID controller (it's all done through the host).

> If a drive fails in a RAID 1 array, does the controller just ignore
> that drive so you can keep going and wait to replace it until a
> convenient time?

If you have a reasonable controller *or* software, sure.

> In a RAID 5 array, are you screwed too, or does the controller
> "interpolate" the missing data from parity bits until you can
> replace the bad drive?

Why do you think it bothers with all that parity calculation?

> In both cases, when you DO replace the bad drive, does the
> rebuilding generally happen automatically, or are there utilities
> that have to be run without a load on the system, resulting in
> downtime?

It seems odd that you've specified a disk but haven't mentioned the controller! The answer (predictably) depends on the controller. The worst case, given your selection of non-pluggable disks, would probably be a powerdown to replace the disk, followed by some steps to introduce the disk into the array. The remirroring/rebuilding operation typically happens on-line (and, in the case of software solutions, *has* to happen on-line!).

Malc.
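Malcolm's small-I/O rules of thumb can be sketched in a few lines of Python. This is illustrative arithmetic only, in units of "one drive's worth"; real controllers, caches, and workloads will vary:

```python
# Small-I/O throughput and capacity rules of thumb for RAID 1 vs
# RAID 5, mirroring the figures quoted above. Not a model of any
# particular controller.

def raid1(n):
    # Every write lands on two disks; reads can be spread over all n.
    return {"writes": n / 2, "reads": n, "capacity": n / 2}

def raid5(n):
    # Classic small-write penalty: each update touches two disks
    # (data + parity), and one disk's worth of space goes to parity.
    return {"writes": (n - 1) / 2, "reads": n, "capacity": n - 1}

print(raid1(10))  # {'writes': 5.0, 'reads': 10, 'capacity': 5.0}
print(raid5(10))  # {'writes': 4.5, 'reads': 10, 'capacity': 9}
```

So for the 10-drive box in question, RAID 5 costs surprisingly little small-write concurrency versus RAID 1 while nearly doubling usable capacity.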
#3
Malcolm Weir wrote:

> The answer (predictably) depends on the controller. The worst case,
> given your selection of non-pluggable disks, would probably be a
> powerdown to replace the disk, followed by some steps to introduce
> the disk into the array. The remirroring/rebuilding operation
> typically happens on-line (and, in the case of software solutions,
> *has* to happen on-line!).

I'd recommend two hot spares in this case, no matter whether the RAID level is 1 or 5. It gives a lot of peace of mind to see redundancy restored automatically after a drive failure. Especially important if the drives are not hot-swappable.

Thomas
#4
Thanks for your reply... some additional info inline here...

On Tue, 30 Sep 2003 13:15:57 -0700, Malcolm Weir wrote:

> What does your IO load look like?

This is going to be an IMAP server. Probably mostly reads, but the read/write difference may be fairly negligible.

> And that "0+1"/"0+5" thing is a total red herring from an
> architectural standpoint, so you can forget the "0+" thing and just
> consider RAID 1 vs RAID 5.

Even nuttier, I was just informed that we're looking into "1+5". Apparently, these guys want SUPER redundancy, and are shying away from backups as well. Argh. The description I have of this is that it "is formed by creating a striped set with parity using multiple mirrored pairs as components; it is similar in concept to RAID 10 except that the striping is done with parity."

But from what I can tell, you're mainly opposed to referring to these multi-level RAID things as one entity, and prefer to see each level independently. For instance, we have an array comprised of two drives using RAID 1 called "BLAH". Then we stripe across 5 "BLAH"s using RAID 5. Is that a good interpretation?

> In the case of a large number of small I/Os, if you have "n" drives,
> RAID 1 lets you do "n/2" simultaneous writes or "n" reads. Your
> usable capacity will be "n/2" times that of a single disk. RAID 5
> lets you do n reads or (n-1)/2 writes, and gives you n-1 times the
> capacity of a single disk.

Oh! Not as much of a writing speed sacrifice as I thought. If I have 10 disks using this newfangled RAID 1+5, how do I figure the "n" formulas above? If you have that, I'd love it...

> What failure mode are you considering? It's kinda important!

I'm going to assume failure of the drive in some such way that there is no outcome other than drive replacement. Let's just say that the platters instantly vanish. Unrealistic, but that's the failure.

> Why do you think it bothers with all that parity calculation?

Well, I mean, will it interpolate that information on the fly? I had just assumed that if you lost a disk in a RAID 5 array, you had to rebuild the lost drive using the parity info before doing anything useful. However, something I read recently suggested that the controller could replace the missing data on the fly using that parity info. In short, I knew parity would be used to rebuild data...I just didn't know that it could do it in real time...

> It seems odd that you've specified a disk but haven't mentioned the
> controller!

Well, we don't know that yet. Probably a decent U160 controller from Adaptec.

> The remirroring/rebuilding operation typically happens on-line (and,
> in the case of software solutions, *has* to happen on-line!).

Yeah...in the other response to my original post, hot spares were suggested. Upon reading more, these seem to be a good idea. Are these, in essence, drives in limbo, waiting to take over when a drive fails? Do you define this in the controller? What makes a controller decide that a drive has failed? (Probably the topic of a different post, eh? ha ha! Yeah....I'll do some reading...)

Thanks much!

Rick
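Rick's reading of "1+5" (a RAID 5 stripe built across RAID 1 mirrored pairs) can be sized with simple arithmetic. The figures below are purely illustrative, using the ten 18GB drives discussed in the thread:

```python
# "RAID 1+5" as described above: pair the disks into RAID 1 mirrors
# (the "BLAH"s), then build a RAID 5 stripe across the pairs.
# Illustrative capacity math, not output from any vendor tool.

disks = 10
disk_gb = 18

pairs = disks // 2                 # 5 mirrored pairs ("BLAH"s)
usable_gb = (pairs - 1) * disk_gb  # RAID 5 loses one pair to parity

print(pairs, usable_gb)  # 5 72
```

So ten 18GB drives arranged this way yield only 72GB of usable space; the extra redundancy is paid for heavily in capacity.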
#5
"Rick Kunkel" wrote in message ...

> these guys want SUPER redundancy, and are shying away from backups
> as well. Argh.

... and what happens when there's a fire in the server room or you get broken into? Having no backup is madness (and may invalidate any insurance your company has).

Clive.
#7
On Tue, 30 Sep 2003 15:50:30 -0700, Rick Kunkel wrote:

> This is going to be an IMAP server. Probably mostly reads, but the
> read/write difference may be fairly negligible.

Read mostly, then, but you'll want good write performance for IMAP updates and message ingestion.

> Even nuttier, I was just informed that we're looking into "1+5".
> [...] For instance, we have an array comprised of two drives using
> RAID 1 called "BLAH". Then we stripe across 5 "BLAH"s using RAID 5.
> Is that a good interpretation?

I've installed systems that mirrored R5 sets, simply because the customer looked at the (performance/reliability) issues and decided they wanted mirrored stripes, with the stripes on separate controllers. We then pointed out that their mirroring was at the filesystem level, so remirroring would require replicating the filesystem onto the second set. So they then asked us to add an additional disk to each stripe, making an R5 set on each controller, with the two sets mirrored. This protected them against a single drive failure at the R5 level, and against controller issues at the filesystem level. It sounds slightly nuts unless you understand that the R5 was an afterthought added to a (well considered) solution...

> Oh! Not as much of a writing speed sacrifice as I thought. If I have
> 10 disks using this newfangled RAID 1+5, how do I figure the "n"
> formulas above? If you have that, I'd love it...

"Newfangled" is, I presume, a euphemism for "unusual", right?

By treating each disk entity as a mirrored pair (and, incidentally, IBM once made a 5.25in disk that *consisted* of a mirrored pair of 3.5in drives), you lose a lot of the virtues of having independent disks. In effect, you get a RANQSID (Redundant Array of Not Quite So Independent Disks)!

Assuming the controller is smart enough to figure out what's going on (big assumption): each "blah" can handle one write or two reads concurrently. RAID 5 can dispatch reads to "n" independent "blahs", so you can achieve "2n" reads. But the classic R5 write behavior doesn't change: a "blah" has exactly the same write characteristics as a disk (actually, it is marginally slower, since both subwrites have to complete), and an R5 update involves two disks *or* two "blahs".

> I'm going to assume failure of the drive in some such way that there
> is no outcome other than drive replacement. Let's just say that the
> platters instantly vanish. Unrealistic, but that's the failure.

Actually, not totally unrealistic. Modern disks tend to have four main failure modes:

* Alive and responding to requests, but reports e.g. a media error rather than data.
* Alive and responding to housekeeping requests ("Are you there?") but that's it.
* Dead (nobody home).
* Insane ("Oh, look, here are some wires. I wonder what happens if I drive them all to logic one?"), in which case everything else on the same bus vanishes, too.

The first three your RAID controller will handle (assuming it's not a broken design, which can be a big assumption). The last requires a "one bus per drive" philosophy, which is not as bad as it may seem: if you have a four-channel RAID controller, you build several RAID sets so that there is only one disk in each set on each channel, although each channel has several disks (each in a different set). Losing a channel then means you lose several disks, but not two disks from any one set.

> Well, I mean, will it interpolate that information on the fly? I had
> just assumed that if you lost a disk in a RAID 5 array, you had to
> rebuild the lost drive using the parity info before doing anything
> useful.

Ah. No, while I understand where you're coming from, that would be disastrous, since the OS would effectively see a disk failure, which would result in customers seeking RAID architects' heads, preferably painfully detached from their bodies!

> However, something I read recently suggested that the controller
> could replace the missing data on the fly using that parity info.

It must. Of course, that doesn't mean that some controllers aren't broken in their implementations!

> Well, we don't know that yet. Probably a decent U160 controller from
> Adaptec.

OK: here's a recommendation: Go with TWO Adaptec 3210S or LSI MegaRAID Elite 1600 or 1650 (the latter would be my personal preference, FWIW). Distribute the disks so that one of the two channels on each controller has three disks, the other two. Build TWO RAID-5 arrays so that one disk from each of the four channels is in each array. [ I know you can do this with the LSI controllers; I'm presuming you can do it with the Adaptecs ]. Assign the remaining disks as hot spares.

You now have two R5 arrays, each with a capacity of 3 x 18GB, or 54GB. Now, at the host OS level, mirror those two R5 arrays, for a net usable capacity of 54GB.

You'd then be protected from just about anything except host and software failures, which is why you need some off-system backup. If a drive fails "well", its role will be assumed by one of the hot spares. If a channel fails, that's equivalent to the simultaneous failure of all disks on the channel, but you won't be impacted. If a controller fails, the mirror will catch you.

This is, of course, *RIDICULOUS* overkill. But if it makes someone happy...

[ The sensible solution is a software cluster of two reasonably resilient machines, each capable of taking over from the other. This solution treats the hosts and the disks separately, and requires special software to synchronize the hosts. ]

[ Snip ]

> Yeah...in the other response to my original post, hot spares were
> suggested. Upon reading more, these seem to be a good idea. Are
> these, in essence, drives in limbo, waiting to take over when a
> drive fails?

Yep, hot spares just assume the role of the failed disk. A key question is whether they can be easily persuaded to relinquish that role when the failed drive is replaced (i.e. transfer their data back to the replaced disk, thus keeping your "live" disks where you wanted them).

Malc.
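The "interpolate on the fly" behavior discussed above comes down to XOR parity. A minimal single-stripe sketch (no real controller involved) of how a degraded RAID 5 set can serve a read from a dead drive:

```python
# RAID 5 degraded-mode read, sketched: the parity block is the XOR
# of the data blocks in a stripe, so any one missing block is simply
# the XOR of all the surviving blocks (parity included).

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# A 4-disk stripe: three data blocks plus their parity.
data = [b"aaaa", b"bbbb", b"cccc"]
parity = xor_blocks(data)

# The disk holding data[1] "vanishes"; rebuild its block on the fly.
survivors = [data[0], data[2], parity]
assert xor_blocks(survivors) == data[1]
```

The same computation, run for every stripe in the background, is what rebuilding onto a hot spare or a replacement disk amounts to.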
#8
Chiming in late, but I personally would set up 2 5-disk RAID 5 sets on separate controllers, then mirror through volume management software. I am assuming you were planning on using one RAID controller. You're far more likely to lose a controller or have parity corruption than concurrent drive failures. That is, unless there are other environmental variables like heat involved.

Note: I've lost a total of 23 drives since Feb 2001 (out of, I've lost count, around 790), and I've lost 8 volumes due to parity corruption and controller errors.

LSI and Adaptec RAID controllers offer many different solutions as far as rebuilds. A typical 18GB drive rebuild will take (with the volume under minimum load) 2-3 hours, depending on how many resources you apply to the rebuild process.

Robert

Rick Kunkel wrote in message ...

> On Wed, 1 Oct 2003 10:23:37 +0100, Clive wrote:
>
>> ... and what happens when there's a fire in the server room or you
>> get broken into? Having no backup is madness (and may invalidate
>> any insurance your company has).
>
> Yes....I agree... It seems inconsistent to build a RAID 1+5 array,
> and then have no other redundancy in other areas at all... Thanks,
> people, for all your input...
>
> Rick Kunkel
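Robert's 2-3 hour figure for an 18GB rebuild implies a fairly modest background rate, which is worth keeping in mind when sizing larger drives. A quick back-of-the-envelope check (assumes an idle volume; decimal/binary GB rounding ignored):

```python
# Rough rebuild rate implied by "18GB in 2-3 hours under minimum
# load". Purely arithmetic; actual rates depend on the controller
# and its rebuild-priority setting.

size_mb = 18 * 1024  # treat 18GB as 18 * 1024 MB

for hours in (2, 3):
    rate_mb_s = size_mb / (hours * 3600)
    print(f"{hours}h rebuild -> {rate_mb_s:.2f} MB/s")
# 2h rebuild -> 2.56 MB/s
# 3h rebuild -> 1.71 MB/s
```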
#9
Rick, sorry to hijack your thread, but Robert, can I ask how often 'parity corruption' occurs? I take it that it's very rare?

Thanks,
Clive.

"Robert" wrote in message ...

> Note: I've lost a total of 23 drives since Feb 2001 (out of, I've
> lost count, around 790), and I've lost 8 volumes due to parity
> corruption and controller errors.

[ Snip ]
#10
NP. It's rare, but there are a lot of factors that can affect frequency. Look at the number of components between the volume manager and the disk itself. For instance, if your disks reside in a separate drive enclosure, you have to factor in that the enclosure controllers (they go by many names; these are just DUMB cards that regulate the flow of power and data across the SCSI bus) can affect the system. Because you added components in the controller's path to the disk, you have added another potential point of failure in the disk sub-system.

I have mainly encountered parity corruption on clustered systems (shared disks); however, it's not that uncommon for it to occur on standard, run-of-the-mill RAID sets. It is more common to see parity corruption in RAID 5 sets, most likely because RAID 5 has the most complicated parity configuration of the popular RAID types.

Parity errors and other controller issues frequently occur due to factors out of the controller's hands, like power failures/interruptions, bad drives, system crashes, drive enclosure issues, and so forth. Unfortunately, techs usually don't know when the parity errors started, and by the time you start seeing odd behavior with the system it's beyond repair. 9 out of 10 parity/controller issues are foreseeable. Make sure you use all the available tools at your disposal to check the health of your disks and the disk sub-system. This is your best line of defense and prevention. If you're religious about checking the health of your disk sub-system, you will find it rare to ever experience an issue like volume failure. But you never know; that's why we back up and replicate data.

Wow, sorry to ramble on.

Robert

Clive wrote in message ...

> Rick, sorry to hijack your thread, but Robert, can I ask how often
> 'parity corruption' occurs? I take it that it's very rare?

[ Snip ]