#1
10 drive SCSI RAID array
We're in the process of building a new server which is to be ultra-redundant, and there's been speculation on SCSI version, suitable flavors of RAID, replacement processes, and such. I'm hoping that someone here can answer a few quick questions.

1. This system is going to have 10 drives, all of which are going to be Seagate ST318436LW. The LVD drives have an Ultra3-SCSI Wide interface. I can never remember the relationship between the true names and marketing names. Is Ultra3-SCSI Wide the same as Ultra160? If so, does that mean that the interface (as opposed to the drive) can move roughly 160 MB/sec, right?

2. As far as RAID levels go, we're looking at either 0+1 or 0+5, I think. Are there any suggestions? For instance, does 10 drives warrant using 0+5, or are things just going to get bogged down? From what I understand, 0+1 is faster at writing than 0+5. If no one wants to explain their reasoning, pointing me to resources on this would be super too.

3. On failure... (Important Note: The Seagate drives I mentioned above don't seem to have any hot-swappable capability.) I'm REALLY having trouble finding out what happens when a drive fails in a RAID array. In RAID 0, sans backup, I think you're outta luck. However, in RAID 1 or 5, you should be fine...well, at least as far as I can tell. I'm wondering specifically about the time between drive failure and drive replacement. Are the RAID controllers generally pretty smart? If a drive fails in a RAID 1 array, does the controller just ignore that drive so you can keep going and wait to replace it until a convenient time? In a RAID 5 array, are you screwed too, or does the controller "interpolate" the missing data from parity bits until you can replace the bad drive? In both cases, when you DO replace the bad drive, does the rebuilding generally happen automatically, or are there utilities that have to be run without a load on the system, resulting in downtime?

OK...the questions were longer than I'd thought. I appreciate any answers or redirection you can give me. If another newsgroup is more appropriate, please let me know.

Thanks,
Rick Kunkel
#2
On Tue, 30 Sep 2003 11:34:28 -0700, Rick Kunkel wrote:

> 1. This system is going to have 10 drives, all of which are going to
> be Seagate ST318436LW. The LVD drives have an Ultra3-SCSI Wide
> interface. I can never remember the relationship between the true
> names and marketing names. Is Ultra3-SCSI Wide the same as Ultra160?
> If so, does that mean that the interface (as opposed to the drive)
> can move roughly 160 MB/sec, right?

Errr... you're kinda mixing various labels in the interface names. Seagate uses Ultra, Ultra2, and Ultra160, but not Ultra3. The less (IMO) confusing labels are Fast-20, Fast-40, and Fast-80, but those aren't sexy enough, so you'll rarely see them!

But to answer your question, if you go to www.seagate.com you'll see that the ST318436LW does indeed have an LVD interface capable of moving data at 160MB/sec, less any command overhead etc.

[ You'll also find that each drive is capable of between 24.7 and 39.4MB/sec ]

> 2. As far as RAID levels go, we're looking at either 0+1 or 0+5, I
> think. Are there any suggestions? For instance, does 10 drives
> warrant using 0+5, or are things just going to get bogged down?

What does your IO load look like?

And that "0+1"/"0+5" thing is a total red herring from an architectural standpoint, so you can forget the "0+" thing and just consider RAID 1 vs RAID 5.

In the case of a large number of small I/Os, if you have "n" drives, RAID 1 lets you do "n/2" simultaneous writes or "n" reads. Your usable capacity will be "n/2" times that of a single disk. RAID 5 lets you do n reads or (n-1)/2 writes, and gives you n-1 times the capacity of a single disk.

> 3. On failure... (Important Note: The Seagate drives I mentioned
> above don't seem to have any hot-swappable capability.)

Well, if you used the ST318436LC you'd get hot plug...

> I'm REALLY having trouble finding out what happens when a drive
> fails in a RAID array.

What failure mode are you considering? It's kinda important!

> In RAID 0, sans backup, I think you're outta luck.

Obviously.

> However, in RAID 1 or 5, you should be fine...well, at least as far
> as I can tell.

That *is* why we bother with it!

> I'm wondering specifically about the time between drive failure and
> drive replacement. Are the RAID controllers generally pretty smart?

Some are, some are dumb as fence posts, and in some cases there *is* no RAID controller (it's all done through the host).

> If a drive fails in a RAID 1 array, does the controller just ignore
> that drive so you can keep going and wait to replace it until a
> convenient time?

If you have a reasonable controller *or* software, sure.

> In a RAID 5 array, are you screwed too, or does the controller
> "interpolate" the missing data from parity bits until you can
> replace the bad drive?

Why do you think it bothers with all that parity calculation?

> In both cases, when you DO replace the bad drive, does the
> rebuilding generally happen automatically, or are there utilities
> that have to be run without a load on the system, resulting in
> downtime?

It seems odd that you've specified a disk but haven't mentioned the controller! The answer (predictably) depends on the controller. The worst case, given your selection of non-pluggable disks, would probably be a powerdown to replace the disk, followed by some steps to introduce the disk into the array. The remirroring/rebuilding operation typically happens on-line (and, in the case of software solutions, *has* to happen on-line!).

Malc.
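Malcolm's small-I/O rules of thumb can be sketched in a few lines of Python. This is illustrative arithmetic only, in units of "one drive's worth"; real controllers, caches, and workloads will vary:

```python
# Small-I/O throughput and capacity rules of thumb for RAID 1 vs
# RAID 5, mirroring the figures quoted above. Not a model of any
# particular controller.

def raid1(n):
    # Every write lands on two disks; reads can be spread over all n.
    return {"writes": n / 2, "reads": n, "capacity": n / 2}

def raid5(n):
    # Classic small-write penalty: each update touches two disks
    # (data + parity), and one disk's worth of space goes to parity.
    return {"writes": (n - 1) / 2, "reads": n, "capacity": n - 1}

print(raid1(10))  # {'writes': 5.0, 'reads': 10, 'capacity': 5.0}
print(raid5(10))  # {'writes': 4.5, 'reads': 10, 'capacity': 9}
```

So for the 10-drive box in question, RAID 5 costs surprisingly little small-write concurrency versus RAID 1 while nearly doubling usable capacity.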
#3
Malcolm Weir wrote:

> The answer (predictably) depends on the controller. The worst case,
> given your selection of non-pluggable disks, would probably be a
> powerdown to replace the disk, followed by some steps to introduce
> the disk into the array. The remirroring/rebuilding operation
> typically happens on-line (and, in the case of software solutions,
> *has* to happen on-line!).

I'd recommend two hot spares in this case, no matter whether the RAID level is 1 or 5. It gives a lot of peace of mind to see redundancy restored automatically after a drive failure. Especially important if the drives are not hot-swappable.

Thomas
#4
Thanks for your reply... some additional info inline here...

On Tue, 30 Sep 2003 13:15:57 -0700, Malcolm Weir wrote:

> What does your IO load look like?

This is going to be an IMAP server. Probably mostly reads, but the read/write difference may be fairly negligible.

> And that "0+1"/"0+5" thing is a total red herring from an
> architectural standpoint, so you can forget the "0+" thing and just
> consider RAID 1 vs RAID 5.

Even nuttier, I was just informed that we're looking into "1+5". Apparently, these guys want SUPER redundancy, and are shying away from backups as well. Argh. The description I have of this is that it "is formed by creating a striped set with parity using multiple mirrored pairs as components; it is similar in concept to RAID 10 except that the striping is done with parity."

But from what I can tell, you're mainly opposed to referring to these multi-level RAID things as one entity, and prefer to see each level independently. For instance, we have an array comprised of two drives using RAID 1 called "BLAH". Then we stripe across 5 "BLAH"s using RAID 5. Is that a good interpretation?

> In the case of a large number of small I/Os, if you have "n" drives,
> RAID 1 lets you do "n/2" simultaneous writes or "n" reads. Your
> usable capacity will be "n/2" times that of a single disk. RAID 5
> lets you do n reads or (n-1)/2 writes, and gives you n-1 times the
> capacity of a single disk.

Oh! Not as much of a writing speed sacrifice as I thought. If I have 10 disks using this newfangled RAID 1+5, how do I figure the "n" formulas above? If you have that, I'd love it...

> What failure mode are you considering? It's kinda important!

I'm going to assume failure of the drive in some such way that there is no outcome other than drive replacement. Let's just say that the platters instantly vanish. Unrealistic, but that's the failure.

> Why do you think it bothers with all that parity calculation?

Well, I mean, will it interpolate that information on the fly? I had just assumed that if you lost a disk in a RAID 5 array, you had to rebuild the lost drive using the parity info before doing anything useful. However, something I read recently suggested that the controller could replace the missing data on the fly using that parity info. In short, I knew parity would be used to rebuild data...I just didn't know that it could do it in real time...

> It seems odd that you've specified a disk but haven't mentioned the
> controller!

Well, we don't know that yet. Probably a decent U160 controller from Adaptec.

> The remirroring/rebuilding operation typically happens on-line (and,
> in the case of software solutions, *has* to happen on-line!).

Yeah...in the other response to my original post, hot spares were suggested. Upon reading more, these seem to be a good idea. Are these, in essence, drives in limbo, waiting to take over when a drive fails? Do you define this in the controller? What makes a controller decide that a drive has failed? (Probably the topic of a different post, eh? ha ha! Yeah....I'll do some reading...)

Thanks much!

Rick
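Rick's reading of "1+5" (a RAID 5 stripe built across RAID 1 mirrored pairs) can be sized with simple arithmetic. The figures below are purely illustrative, using the ten 18GB drives discussed in the thread:

```python
# "RAID 1+5" as described above: pair the disks into RAID 1 mirrors
# (the "BLAH"s), then build a RAID 5 stripe across the pairs.
# Illustrative capacity math, not output from any vendor tool.

disks = 10
disk_gb = 18

pairs = disks // 2                 # 5 mirrored pairs ("BLAH"s)
usable_gb = (pairs - 1) * disk_gb  # RAID 5 loses one pair to parity

print(pairs, usable_gb)  # 5 72
```

So ten 18GB drives arranged this way yield only 72GB of usable space; the extra redundancy is paid for heavily in capacity.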
#5
"Rick Kunkel" wrote in message ...

> these guys want SUPER redundancy, and are shying away from backups
> as well. Argh.

... and what happens when there's a fire in the server room or you get broken into? Having no backup is madness (and may invalidate any insurance your company has).

Clive.
#7
On Tue, 30 Sep 2003 15:50:30 -0700, Rick Kunkel wrote:

> This is going to be an IMAP server. Probably mostly reads, but the
> read/write difference may be fairly negligible.

Read mostly, then, but you'll want good write performance for IMAP updates and message ingestion.

> Even nuttier, I was just informed that we're looking into "1+5".
> [...] For instance, we have an array comprised of two drives using
> RAID 1 called "BLAH". Then we stripe across 5 "BLAH"s using RAID 5.
> Is that a good interpretation?

I've installed systems that mirrored R5 sets, simply because the customer looked at the (performance/reliability) issues and decided they wanted mirrored stripes, with the stripes on separate controllers. We then pointed out that their mirroring was at the filesystem level, so remirroring would require replicating the filesystem onto the second set. So they then asked us to add an additional disk to each stripe, making an R5 set on each controller, with the two sets mirrored. This protected them against a single drive failure at the R5 level, and against controller issues at the filesystem level. It sounds slightly nuts unless you understand that the R5 was an afterthought added to a (well considered) solution...

> Oh! Not as much of a writing speed sacrifice as I thought. If I have
> 10 disks using this newfangled RAID 1+5, how do I figure the "n"
> formulas above? If you have that, I'd love it...

"Newfangled" is, I presume, a euphemism for "unusual", right?

By treating each disk entity as a mirrored pair (and, incidentally, IBM once made a 5.25in disk that *consisted* of a mirrored pair of 3.5in drives), you lose a lot of the virtues of having independent disks. In effect, you get a RANQSID (Redundant Array of Not Quite So Independent Disks)!

Assuming the controller is smart enough to figure out what's going on (big assumption): each "blah" can handle one write or two reads concurrently. RAID 5 can dispatch reads to "n" independent "blahs", so you can achieve "2n" reads. But the classic R5 write behavior doesn't change: a "blah" has exactly the same write characteristics as a disk (actually, it is marginally slower, since both subwrites have to complete), and an R5 update involves two disks *or* two "blahs".

> I'm going to assume failure of the drive in some such way that there
> is no outcome other than drive replacement. Let's just say that the
> platters instantly vanish. Unrealistic, but that's the failure.

Actually, not totally unrealistic. Modern disks tend to have four main failure modes:

* Alive and responding to requests, but reports e.g. a media error rather than data.
* Alive and responding to housekeeping requests ("Are you there?") but that's it.
* Dead (nobody home).
* Insane ("Oh, look, here are some wires. I wonder what happens if I drive them all to logic one?"), in which case everything else on the same bus vanishes, too.

The first three your RAID controller will handle (assuming it's not a broken design, which can be a big assumption). The last requires a "one bus per drive" philosophy, which is not as bad as it may seem: if you have a four-channel RAID controller, you build several RAID sets so that there is only one disk in each set on each channel, although each channel has several disks (each in a different set). Losing a channel then means you lose several disks, but not two disks from any one set.

> Well, I mean, will it interpolate that information on the fly? I had
> just assumed that if you lost a disk in a RAID 5 array, you had to
> rebuild the lost drive using the parity info before doing anything
> useful.

Ah. No, while I understand where you're coming from, that would be disastrous, since the OS would effectively see a disk failure, which would result in customers seeking RAID architects' heads, preferably painfully detached from their bodies!

> However, something I read recently suggested that the controller
> could replace the missing data on the fly using that parity info.

It must. Of course, that doesn't mean that some controllers aren't broken in their implementations!

> Well, we don't know that yet. Probably a decent U160 controller from
> Adaptec.

OK: here's a recommendation: Go with TWO Adaptec 3210S or LSI MegaRAID Elite 1600 or 1650 (the latter would be my personal preference, FWIW). Distribute the disks so that one of the two channels on each controller has three disks, the other two. Build TWO RAID-5 arrays so that one disk from each of the four channels is in each array. [ I know you can do this with the LSI controllers; I'm presuming you can do it with the Adaptecs ]. Assign the remaining disks as hot spares.

You now have two R5 arrays, each with a capacity of 3 x 18GB, or 54GB. Now, at the host OS level, mirror those two R5 arrays, for a net usable capacity of 54GB.

You'd then be protected from just about anything except host and software failures, which is why you need some off-system backup. If a drive fails "well", its role will be assumed by one of the hot spares. If a channel fails, that's equivalent to the simultaneous failure of all disks on the channel, but you won't be impacted. If a controller fails, the mirror will catch you.

This is, of course, *RIDICULOUS* overkill. But if it makes someone happy...

[ The sensible solution is a software cluster of two reasonably resilient machines, each capable of taking over from the other. This solution treats the hosts and the disks separately, and requires special software to synchronize the hosts. ]

[ Snip ]

> Yeah...in the other response to my original post, hot spares were
> suggested. Upon reading more, these seem to be a good idea. Are
> these, in essence, drives in limbo, waiting to take over when a
> drive fails?

Yep, hot spares just assume the role of the failed disk. A key question is whether they can be easily persuaded to relinquish that role when the failed drive is replaced (i.e. transfer their data back to the replaced disk, thus keeping your "live" disks where you wanted them).

Malc.
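The "interpolate on the fly" behavior discussed above comes down to XOR parity. A minimal single-stripe sketch (no real controller involved) of how a degraded RAID 5 set can serve a read from a dead drive:

```python
# RAID 5 degraded-mode read, sketched: the parity block is the XOR
# of the data blocks in a stripe, so any one missing block is simply
# the XOR of all the surviving blocks (parity included).

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# A 4-disk stripe: three data blocks plus their parity.
data = [b"aaaa", b"bbbb", b"cccc"]
parity = xor_blocks(data)

# The disk holding data[1] "vanishes"; rebuild its block on the fly.
survivors = [data[0], data[2], parity]
assert xor_blocks(survivors) == data[1]
```

The same computation, run for every stripe in the background, is what rebuilding onto a hot spare or a replacement disk amounts to.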
#8
Chiming in late, but I personally would set up 2 5-disk RAID 5 sets on separate controllers, then mirror through volume management software. I am assuming you were planning on using one RAID controller. You're far more likely to lose a controller or have parity corruption than concurrent drive failures. That is, unless there are other environmental variables like heat involved.

Note: I've lost a total of 23 drives since Feb 2001 (out of, I've lost count, around 790), and I've lost 8 volumes due to parity corruption and controller errors.

LSI and Adaptec RAID controllers offer many different solutions as far as rebuilds. A typical 18GB drive rebuild will take (with the volume under minimum load) 2-3 hours, depending on how many resources you apply to the rebuild process.

Robert

Rick Kunkel wrote in message ...

> On Wed, 1 Oct 2003 10:23:37 +0100, Clive wrote:
>
>> ... and what happens when there's a fire in the server room or you
>> get broken into? Having no backup is madness (and may invalidate
>> any insurance your company has).
>
> Yes....I agree... It seems inconsistent to build a RAID 1+5 array,
> and then have no other redundancy in other areas at all... Thanks,
> people, for all your input...
>
> Rick Kunkel
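Robert's 2-3 hour figure for an 18GB rebuild implies a fairly modest background rate, which is worth keeping in mind when sizing larger drives. A quick back-of-the-envelope check (assumes an idle volume; decimal/binary GB rounding ignored):

```python
# Rough rebuild rate implied by "18GB in 2-3 hours under minimum
# load". Purely arithmetic; actual rates depend on the controller
# and its rebuild-priority setting.

size_mb = 18 * 1024  # treat 18GB as 18 * 1024 MB

for hours in (2, 3):
    rate_mb_s = size_mb / (hours * 3600)
    print(f"{hours}h rebuild -> {rate_mb_s:.2f} MB/s")
# 2h rebuild -> 2.56 MB/s
# 3h rebuild -> 1.71 MB/s
```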
#9
Rick, sorry to hijack your thread, but Robert, can I ask how often 'parity corruption' occurs? I take it that it's very rare?

Thanks,
Clive.

"Robert" wrote in message ...

> Note: I've lost a total of 23 drives since Feb 2001 (out of, I've
> lost count, around 790), and I've lost 8 volumes due to parity
> corruption and controller errors.

[ Snip ]
#10
NP. It's rare, but there are a lot of factors that can affect frequency. Look at the number of components between the volume manager and the disk itself. For instance, if your disks reside in a separate drive enclosure, you have to factor in that the enclosure controllers (they go by many names; these are just DUMB cards that regulate the flow of power and data across the SCSI bus) can affect the system. Because you added components in the controller's path to the disk, you have added another potential point of failure in the disk sub-system.

I have mainly encountered parity corruption on clustered systems (shared disks); however, it's not that uncommon for it to occur on standard, run-of-the-mill RAID sets. It is more common to see parity corruption in RAID 5 sets, most likely because RAID 5 has the most complicated parity configuration of the popular RAID types.

Parity errors and other controller issues frequently occur due to factors out of the controller's hands, like power failures/interruptions, bad drives, system crashes, drive enclosure issues, and so forth. Unfortunately, techs usually don't know when the parity errors started, and by the time you start seeing odd behavior with the system it's beyond repair. 9 out of 10 parity/controller issues are foreseeable. Make sure you use all the available tools at your disposal to check the health of your disks and the disk sub-system. This is your best line of defense and prevention. If you're religious about checking the health of your disk sub-system, you will find it rare to ever experience an issue like volume failure. But you never know; that's why we back up and replicate data.

Wow, sorry to ramble on.

Robert

Clive wrote in message ...

> Rick, sorry to hijack your thread, but Robert, can I ask how often
> 'parity corruption' occurs? I take it that it's very rare?

[ Snip ]