A computer components & hardware forum. HardwareBanter


SCSI RAID problems



 
 
#1
September 21st 04, 12:23 AM
Andrew Wasielewski

Hello SCSI friends,

I am having some strange/worrying problems with my SCSI RAID setup. Perhaps someone can help?

I have got a Gigabyte GA-KNXP Ultra m/b with embedded Adaptec AIC-7902 U320 SCSI controller. I have got a RAID 10 array made up of 4 x Maxtor Atlas IV 10k 36GB disks (SCSI IDs 3, 4, 5 & 6) on Channel A. The 72GB useable space is partitioned into a 4GB user partition + 4 others for paging file, apps & data (+ some free space left over). I run WinXP Pro SP1.

A little while ago I started to get signs of hardware problems:
a. Errors in the Windows Event Log (ID 9) saying "The device, \Device\Scsi\a320raid1, did not respond within the timeout period." A few times the system crashed at about the same time;
b. Nasty "scrunching" sounds coming from the disks;
c. In the SCSI BIOS the array is tagged "Degraded". On drilling down, the disk on ID 4 is tagged "Degraded"; all the others are "Optimal".

After a short while the errors & noise go away, so I assume the disk has finally died & prepare to RMA it. In the interim I get a different error in the event log (ID 7) saying "The device, \Device\Harddisk0\D, has a bad block." whenever (and only when) I try to back up drive C: to the Tandberg SLR7 tape drive using the WinXP Backup utility. (I get error ID 14 saying "The shadow copy of volume C: was aborted because of an IO failure." at the same time.) I can't find out much about these errors, other than that the first one usually means disk failure is imminent; however, as the O/S can only see the logical disk, not the individual physical devices, I put this down to a glitch from the disk failure. Apart from this, all apps appear to work normally as far as I can tell.

When I get the replacement disk I replace the "Degraded" one using the same SCSI ID & attempt a rebuild. However this fails with "Due to read error on device ID 3". When I run Verify Media on disk ID 3 it finds 3 bad blocks; it says it has remapped these, but on re-running, both the array rebuild & Verify Media come up with the same errors. I put the "old" disk on Channel B, expecting it to be dead; however it appears in the BIOS, and when I run Verify Media it comes up with 1 bad block. Again, remapping fails to fix it, but after a low-level format Verify Media finds no errors.

So the disk I originally thought was dead now appears to be as good as new (so too, presumably, is the replacement). However, the *real* problem seems to lie with disk ID 3, even though the BIOS says it is optimal - and I can't rebuild the array. As the array is now running on only 3 disks out of 4 I am reluctant to do anything with ID 3 that might corrupt the array. Do I have any alternative other than to rebuild the array & reinstall everything? Fortunately I can still make backups of all partitions *except* C:, and I guess I can save anything I need from there on CD-R, but it is still a real pain...

Anyone have any ideas what is really going on, & how I can recover it? And how many, if any, dud disks do I have? The noises when the problem originally manifested definitely sounded very physical!

Thanks in advance for all help & advice.

Andrew
#2
September 22nd 04, 10:47 AM
Anthony Preston

Are the disks hot?

I was getting SCSI disk errors because the 4 disks I had configured were getting too hot and started to play up.

I have since added additional fans and now the disks are easy to touch and handle, but before I added the fans I could hardly touch them.
#3
September 22nd 04, 02:10 PM
Mal

When you say you have a RAID 10 array set up ... do you mean RAID 1/0? ...
The only RAID setups I've come across are RAID 0, RAID 1, RAID 1/0 or
RAID 5.

It sounds like you have a RAID 0/1 array, given the drives you have and the
space available (RAID 5 would give you 108GB -- 3x36GB + 4th drive online
spare) ... how is the mirroring set up? ... If the drives on SCSI IDs 5 & 6
are a mirror of drives 3 & 4 then you should be able to restore from them. If
that's the case you could try rebuilding the array using your original drive
4 and the replacement while keeping drive 3 intact.

let us know how you get on

Mal
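
For reference, the usable-capacity arithmetic behind those figures can be sketched as below. This is only an illustration, not anything taken from the controller's own tools: the function name `usable_gb` and the level labels are made up for the example, and it assumes identical disks with no hot spare.

```python
def usable_gb(level: str, disks: int, size_gb: int) -> int:
    """Usable capacity for a few common RAID levels (identical disks, no hot spare)."""
    if level == "raid0":
        return disks * size_gb            # pure striping, no redundancy
    if level == "raid1":
        return size_gb                    # every disk holds the same data
    if level in ("raid10", "raid01"):
        if disks % 2:
            raise ValueError("striping + mirroring needs an even number of disks")
        return disks // 2 * size_gb       # half the disks hold mirror copies
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (disks - 1) * size_gb      # one disk's worth of space goes to parity
    raise ValueError(f"unknown level: {level}")


# Four 36GB disks, as in the original post.
for level in ("raid0", "raid1", "raid10", "raid5"):
    print(level, usable_gb(level, 4, 36), "GB")
```

With four 36GB disks, striping + mirroring comes out at 72GB, which matches the usable space reported in the original post. RAID 5 across all four disks would come out at 108GB, with the parity spread over the disks rather than held on one dedicated drive.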


#4
September 23rd 04, 12:09 AM
Andrew Wasielewski

Disks don't feel noticeably warm, let alone hot. Case is a Lian-Li PC-71 with plenty of fans, so I don't think that can be it...
#5
September 23rd 04, 12:21 AM
Andrew Wasielewski

RAID 0+1 is another name for my setup i.e. striping + mirroring. Since the
4 disks are identical I presume there is a 1-to-1 correspondence between the
blocks on one side of the mirror and the other, as the mirroring logic
doesn't care whether & how they are striped. However in that case I don't
know how the disks are paired off. There wasn't anywhere to specify it in
the array setup in the SCSI BIOS, & I can't see anything that displays it.
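
How the disks are paired off decides which further failure the array could survive, so the difference is worth spelling out. The sketch below is a toy model only: the grouping of IDs 3+4 and 5+6 is purely hypothetical (the thread never establishes how the controller groups them), the function names are invented for the example, and it says nothing about how the Adaptec firmware actually behaves.

```python
def raid10_alive(mirror_pairs, failed):
    """RAID 1+0 (stripe of mirrors): alive while every mirror pair keeps one member."""
    return all(any(d not in failed for d in pair) for pair in mirror_pairs)


def raid01_alive(stripe_sets, failed):
    """RAID 0+1 (mirror of stripes): alive while at least one stripe set is fully intact."""
    return any(all(d not in failed for d in s) for s in stripe_sets)


# HYPOTHETICAL grouping of the SCSI IDs, for illustration only.
groups = [(3, 4), (5, 6)]
already_failed = {4}  # the disk the BIOS tagged "Degraded"

for next_loss in (3, 5, 6):
    failed = already_failed | {next_loss}
    print(f"lose ID {next_loss}: "
          f"1+0 {'survives' if raid10_alive(groups, failed) else 'lost'}, "
          f"0+1 {'survives' if raid01_alive(groups, failed) else 'lost'}")
```

Under this assumed grouping, a stripe of mirrors would depend entirely on the partner of the failed disk, while a mirror of stripes would depend on every disk in the other stripe set. In either layout, blocks that cannot be read from a disk with no surviving copy cannot be reconstructed, which would be consistent with a rebuild stopping on read errors.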

I am wary about finding out the hard way by disconnecting disk ID 3, as if
that turns out to be the currently non-redundant disk I don't want to risk
corrupting the array irretrievably. Or will it simply fail to recognise the
array at all in that case, until I put back the missing disk?
#6
September 23rd 04, 01:31 AM
Tim

Hi,

I can't see the original post for this due to ISP clobbering the
newsgroup....

What type of RAID controller is it?

- Tim
#7
September 23rd 04, 01:51 AM
Tim Kelley

In article , Andrew Wasielewski wrote:
I am having some strange/worrying problems with my SCSI RAID setup. Perhaps someone can help?

I have got a Gigabyte GA-KNXP Ultra m/b with embedded Adaptec AIC-7902 U320 SCSI controller. I have got a RAID 10 array made up of 4 x Maxtor Atlas IV 10k 36GB disks (SCSI IDs 3, 4, 5 & 6) on Channel A. The 72GB useable space is partitioned into a 4GB user partition + 4 others for paging file, apps & data (+ some free space left over). I run WinXP Pro SP1.


When you're having mystifying problems, try looking at heat and power ...

That's a lot of drives and if you have a lot of other stuff, perhaps
your power supply can't handle it? That can cause all manner of
weirdness.

Are they too hot? What does the temp sensor on the drives say?

#8
September 23rd 04, 10:48 PM
Folkert Rienstra


"Tim" wrote in message
I can't see the original post for this due to ISP clobbering the newsgroup....

Some ISPs filter HTML posts.

You can find the message-ID in the header.
Cut and paste it to your address bar and type "news:" in front of it.
Or go to the topmost message and click the attribution line.
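
The cut-and-paste step can be sketched as a small helper. The name `news_url` and the sample Message-ID are made up for illustration; the one assumption is that the header value may arrive wrapped in angle brackets, which are dropped before "news:" is put in front.

```python
def news_url(message_id: str) -> str:
    """Build a news: URL from a Message-ID header value."""
    bare = message_id.strip()
    if bare.startswith("<") and bare.endswith(">"):
        bare = bare[1:-1]                 # drop the angle brackets used in the header
    if "@" not in bare:
        raise ValueError("a Message-ID normally looks like local-part@host")
    return "news:" + bare


# Hypothetical Message-ID, for illustration only.
print(news_url("<abc123@example.invalid>"))
```

Whether the resulting address opens anything depends on a newsreader being registered to handle "news:" links and on the server still carrying the article.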


 




All times are GMT +1.