A computer components & hardware forum. HardwareBanter


difficult hardware diagnosis



 
 
  #1  
May 3rd 04, 12:35 PM
Frank

Windows XP Home, fully updated except for the buggy KB835732.
Gigabyte mobo GA-8S650GXM (socket 478).
Celeron 2400
DDR 2100 - 128 MB.

Experiencing frequent (several times a day) hardware crashes and automatic
reboots. I have turned off the Control Panel option (System > Advanced) to
automatically reboot after a crash, but it still does so anyway.

Machine_Check_Exception.
There is a blue screen error called Machine_Check_Exception, just prior to
the crash & reboot, but it's impossible to read it as it's only on for an
instant. Definitely not a Windows screen. I believe it's an Intel message
from the cpu diagnosing a hardware error (perhaps cpu or mobo).

There is no pattern to the crashes and no relation to what programs may be
running - sometimes it happens when the machine is standing idle.

On reboot the following is sometimes displayed (not every time) - "you have
recovered from a serious error":
BC Code:9c BCP1:00000000 BCP2: 8005366FO BCP3:CC0000FF BCP4:20040189
OSVER:5_1_2600 SP:1_0 Product 768_1
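
For reference, the STOP 0x9C parameters can be unpacked mechanically. The sketch below assumes the layout Microsoft documents for processors with the Machine Check Architecture (parameters 3 and 4 are the high and low halves of the reporting bank's MCi_STATUS register) and the flag positions given in Intel's architecture manuals. The function name `decode_mci_status` is made up for illustration. It only splits the bits apart; it does not say which part is at fault.

```python
# Sketch only: assumes BCP3/BCP4 are the high/low 32 bits of MCi_STATUS,
# as documented for bug check 0x9C on CPUs with Machine Check Architecture.

# Flag bit positions in the 64-bit MCi_STATUS register.
FLAGS = {
    "VAL": 63,    # register contains valid error information
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # error reporting was enabled
    "MISCV": 59,  # MCi_MISC register holds extra data
    "ADDRV": 58,  # MCi_ADDR register holds an address
    "PCC": 57,    # processor context may be corrupt
}


def decode_mci_status(high32: int, low32: int) -> dict:
    """Split the two bug check parameters into MCi_STATUS fields."""
    status = (high32 << 32) | low32
    decoded = {name: bool((status >> bit) & 1) for name, bit in FLAGS.items()}
    decoded["mca_error_code"] = status & 0xFFFF                # bits 15:0
    decoded["model_specific_code"] = (status >> 16) & 0xFFFF   # bits 31:16
    return decoded


if __name__ == "__main__":
    # Values reported above: BCP3=CC0000FF, BCP4=20040189.
    for name, value in decode_mci_status(0xCC0000FF, 0x20040189).items():
        if isinstance(value, bool):
            print(f"{name:20} {value}")
        else:
            print(f"{name:20} 0x{value:04x}")
```

The two error-code fields are the ones to look up in the processor documentation; the flags only describe how the error was recorded.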


Memtest86.
I ran Memtest86 as I suspected a faulty RAM module (128 MB & 256 MB installed). I
replaced the 256 MB module, which was probably faulty, but the crashes continued.
So I removed the new module and am now running only the original 128 MB module,
which tests good. Crashes continue.

PSU.
I suspected the PSU and verified the voltages: I checked the 3.3, 5 & 12v rails
on the main mobo connector with an autosensing digital multimeter, and all seem
ok (within 5%).

Fans.
All fans are working (cpu, case and psu).

MBM5 (MotherboardMonitor).
The MBM temperature results are different from the Bios readings.
case 141F/61C
cpu 30F/-1C
sensor3 32F/0C
core0 1.6v
core1 .00v
+3.3 3.39v
+5 5.00v
-12v -12.27v
-5 -4.89v
fan1 5625rpm
fan2 33750rpm
fan3 16875rpm
cpu 2424mhz
cpu0 0%

Bios:
system temp=32C/89F
cpu temp=fluctuates from 39C/100F to 41C/105F
cpu fan=3125rpm
system fan=2766rpm
vcore=1.58v
+3.3=3.39v
+5=5.02v
+12=11.97v
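
The "within 5%" check can be made explicit. A small sketch using the BIOS and MBM readings listed above and treating 5% as the tolerance for every rail (the figure quoted in the post, not a spec lookup); `READINGS`, `deviation_percent` and `out_of_tolerance` are names made up for this example.

```python
# Sketch: percentage deviation of each reported rail from its nominal value.
# The 5% limit is the figure from the post and is applied to every rail here.

TOLERANCE_PERCENT = 5.0

# (nominal volts, measured volts) taken from the readings in the post.
READINGS = {
    "+3.3 (BIOS)": (3.3, 3.39),
    "+5 (BIOS)": (5.0, 5.02),
    "+12 (BIOS)": (12.0, 11.97),
    "-12 (MBM)": (-12.0, -12.27),
    "-5 (MBM)": (-5.0, -4.89),
}


def deviation_percent(nominal: float, measured: float) -> float:
    """Return the signed deviation from nominal as a percentage."""
    return (measured - nominal) / abs(nominal) * 100.0


def out_of_tolerance(readings: dict, limit: float = TOLERANCE_PERCENT) -> list:
    """Return the names of rails whose deviation exceeds the limit."""
    return [
        name
        for name, (nominal, measured) in readings.items()
        if abs(deviation_percent(nominal, measured)) > limit
    ]


if __name__ == "__main__":
    for name, (nominal, measured) in READINGS.items():
        print(f"{name:12} {deviation_percent(nominal, measured):+.2f}%")
    print("Out of tolerance:", out_of_tolerance(READINGS) or "none")
```

All five readings fall inside the limit, which matches the conclusion in the post. These are idle readings, so they say nothing about how the rails behave under load.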

Event Viewer.
I looked at the Event Viewer errors and am continually getting STOP
0x0000009c errors, which point to hardware problems.

Dumpchk.
I used Dumpchk.exe to analyse Minidump files (created by XP in the
windows\minidump folder) and have copy/pasted one below as an example, and
to see if it offers any clues:

C:\WINDOWS>dumpchk minidump\mini042704-02.dmp
Loading dump file minidump\mini042704-02.dmp
----- 32 bit Kernel Mini Dump Analysis

DUMP_HEADER32:
MajorVersion 0000000f
MinorVersion 00000a28
DirectoryTableBase 00039000
PfnDataBase 81053000
PsLoadedModuleList 8054be30
PsActiveProcessHead 8054de78
MachineImageType 0000014c
NumberProcessors 00000001
BugCheckCode 0000009c
BugCheckParameter1 00000000
BugCheckParameter2 8053f0f0
BugCheckParameter3 cc0000ff
BugCheckParameter4 20040189
PaeEnabled 00000000
KdDebuggerDataBlock 8053dde0
MiniDumpFields 00000dff

TRIAGE_DUMP32:
ServicePackBuild 00000100
SizeOfDump 00010000
ValidOffset 0000fffc
ContextOffset 00000320
ExceptionOffset 000007d0
MmOffset 00001068
UnloadedDriversOffset 000010a0
PrcbOffset 00001878
ProcessOffset 000024c8
ThreadOffset 00002720
CallStackOffset 00002978
SizeOfCallStack 00003000
DriverListOffset 00005c08
DriverCount 000000a3
StringPoolOffset 00008c70
StringPoolSize 00001680
BrokenDriverOffset 00000000
TriageOptions 00000041
TopOfStack f2ebd000
DebuggerDataOffset 00005978
DebuggerDataSize 00000290
DataBlocksOffset 0000a2f0
DataBlocksCount 00000003


Windows XP Kernel Version 2600 (Service Pack 1) UP Free x86 compatible
Kernel base = 0x804d4000 PsLoadedModuleList = 0x8054be30
Debug session time: Tue Apr 27 12:57:06 2004
System Uptime: 0 days 0:05:41
start end module name
804d4000 806c6980 nt Checksum: 0020230B Timestamp: Thu Aug 29 05:03:24 2002 (3D6DE35C)

Unloaded modules:
f309f000 f30af000 NAVENG.Sys Timestamp: unavailable (00000000)
f2dff000 f2e90000 NavEx15.Sys Timestamp: unavailable (00000000)
f2ea0000 f2eb0000 NAVENG.Sys Timestamp: unavailable (00000000)
f2dff000 f2e90000 NavEx15.Sys Timestamp: unavailable (00000000)
f78c8000 f78d8000 NAVENG.Sys Timestamp: unavailable (00000000)
f4238000 f42c9000 NavEx15.Sys Timestamp: unavailable (00000000)
f2f30000 f2f57000 kmixer.sys Timestamp: unavailable (00000000)
f38c1000 f38e8000 kmixer.sys Timestamp: unavailable (00000000)
f7de7000 f7de8000 drmkaud.sys Timestamp: unavailable (00000000)
f3abe000 f3acb000 DMusic.sys Timestamp: unavailable (00000000)
f7cb2000 f7cb4000 splitter.sys Timestamp: unavailable (00000000)
f3ace000 f3adc000 swmidi.sys Timestamp: unavailable (00000000)
f3923000 f3946000 aec.sys Timestamp: unavailable (00000000)
f7b80000 f7b85000 Cdaudio.SYS Timestamp: unavailable (00000000)
f757c000 f757f000 Sfloppy.SYS Timestamp: unavailable (00000000)
end.
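
The hex value in parentheses after the nt module's timestamp is the image's build time stored as seconds since the Unix epoch, so it can be checked independently of dumpchk. A short sketch (the function name is illustrative); it prints UTC, so the hour differs from the local time shown in the dump.

```python
from datetime import datetime, timezone


def pe_timestamp_to_utc(hex_value: str) -> datetime:
    """Convert a hex seconds-since-1970 timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(int(hex_value, 16), tz=timezone.utc)


if __name__ == "__main__":
    # 3D6DE35C is the value printed for the nt kernel image in the dump.
    print(pe_timestamp_to_utc("3D6DE35C").isoformat())
    # prints 2002-08-29T09:03:24+00:00
```

The date, minutes and seconds agree with the dumpchk line, which confirms the kernel image dates from August 2002.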

Prime95
I ran Prime95 to put load on the system and it failed the torture test, but
with no clues as to why - not necessarily the cpu?
Readout - Beginning a continuous self test to check computer.
Test1, 4000 Lucas-Lehmer iterations of M19922945 using 1024k FFT length.
FATAL ERROR:Writing to temp file.
Error opening results file to output this message:
Unable to open log file.
Torture Test ran 0 minutes_1 error.0 warnings.
Execution halted.

CPU Stability Test ver.6
I ran the Normal test mode, and it lasted about 9 minutes before crashing.
No telling if it was because of strain on the cpu as the machine crashes
like that anyway, even when not under a load.

So, definitely a hardware problem :-) But how can I be specific and sure?
Would a POST diagnostic card tell me if it's the cpu or mobo or something else?
I don't have a spare cpu or mobo to swap in as known-good parts.


  #2  
May 3rd 04, 01:37 PM
Dave C.


"Frank" wrote in message ...
> Winxp Home. Updated except for the buggy KB835732.
> Gigabyte mobo GA-8S650GXM (socket 478).
> Celeron 2400
> DDR 2100 - 128mb.
>
> Experiencing frequent (several times a day) hardware crashes and automatic
> reboots. I have turned off the control panel option (System > Advanced) to
> automatically reboot after a crash, but it still does so anyway.
>
> Machine_Check_Exception.
> There is a blue screen error called Machine_Check_Exception, just prior to
> the crash & reboot, but it's impossible to read it as it's only on for an
> instant. Definitely not a Windows screen. I believe it's an Intel message
> from the cpu diagnosing a hardware error (perhaps cpu or mobo).
>
> There is no pattern to the crashes and no relation to what programs may be
> running - sometimes it happens when the machine is standing idle.


Your symptoms point to an iffy power supply* or a failing motherboard
component. I'm wondering if you got one of those motherboards with the bad
batch of caps on it? Open the case and use a flashlight to carefully
examine all of the components on the motherboard, most especially the
capacitors. In case you don't know, those are the small coke can shaped
components standing upright and soldered directly to the motherboard. You
should probably see many of them on your motherboard, with likely a cluster
of several of them near your CPU socket. Examine all sides of them that you
can see, look CAREFULLY for bulges or deformities. Also look for capacitors
that are leaning to one side or another, with no reasonable explanation,
such as being crowded by another component. Also look for capacitors that
are discolored with a brownish discharge. If you notice any of these
symptoms, that points to a capacitor that has failed, and a bad cap could
EASILY explain the problems you are having.

Unless you see something obvious like a bad cap, this one is going to be
tough to trace. I'm afraid you will just have to replace components until
you have a stable system. Start with the power supply, replace it with
something by Seasonic in the ~400W range. If that doesn't help, consider
picking up a different motherboard, possibly off ebay. (it's always good to
save some money, where possible, and you won't want to spend too much money
on a Celeron motherboard) -Dave

* While a multimeter can be useful in diagnosing SEVERE problems with a
power supply, it doesn't take severe power supply problems to cause a system
to be unstable.


  #3  
May 3rd 04, 05:43 PM
Frank

> I'm wondering if you got one of those motherboards with the bad
> batch of caps on it?

Is this a known issue with this brand & model of mobo?

> * While a multimeter can be useful in diagnosing SEVERE problems with a
> power supply, it doesn't take severe power supply problems to cause a
> system to be unstable.

But running Prime95's "torture test" is supposed to simulate a max load on
the psu?


  #4  
May 3rd 04, 06:06 PM
JAD

Norton SystemWorks? Norton AntiVirus? Office Find Fast? Media sniffer?






  #5  
May 3rd 04, 08:52 PM
Dave C.


"Frank" wrote in message ...
>> I'm wondering if you got one of those motherboards with the bad
>> batch of caps on it?
>
> Is this a known issue with this brand & model of mobo ?
>
>> * While a multimeter can be useful in diagnosing SEVERE problems with a
>> power supply, it doesn't take severe power supply problems to cause a
>> system to be unstable.
>
> But running Prime95 's "torture test" is supposed to simulate a max load on
> the psu ?


The bad caps are a known issue with all brands and models of motherboards.
I believe it was at its worst with boards built a couple of years ago. Max
load or not, an intermittent problem with a PSU could cause really bizarre
symptoms like you were describing earlier. If you've got time to kill, try
running the torture test while watching the PSU with a multimeter. If the
12V rail drops below 11.5V or fluctuates more than .5V, I'd replace that
puppy whether it's the cause of your current problems or not. -Dave
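
The rule of thumb above is easy to apply to a list of multimeter readings taken during the torture test. A minimal sketch, assuming the readings are noted down by hand; the sample values are invented for illustration, and only the 11.5V floor and the .5V swing come from the post.

```python
# Sketch of the check described above: treat the PSU as suspect if the 12V
# rail drops below 11.5V or swings by more than 0.5V under load.

MIN_12V = 11.5    # floor from the post
MAX_SWING = 0.5   # allowed fluctuation from the post


def rail_is_suspect(samples: list) -> bool:
    """Return True if the 12V samples break either threshold."""
    if not samples:
        raise ValueError("need at least one voltage sample")
    lowest, highest = min(samples), max(samples)
    return lowest < MIN_12V or (highest - lowest) > MAX_SWING


if __name__ == "__main__":
    steady = [11.97, 11.95, 11.93, 11.96]     # invented example readings
    sagging = [11.97, 11.80, 11.42, 11.90]    # invented example readings
    print(rail_is_suspect(steady))    # False
    print(rail_is_suspect(sagging))   # True
```

A handheld meter only samples slowly, so a pass here does not rule out brief dips.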


  #6  
May 4th 04, 01:43 AM
JB

"Frank" wrote in message ...
> Winxp Home. Updated except for the buggy KB835732.
> Gigabyte mobo GA-8S650GXM (socket 478).
> Celeron 2400
> DDR 2100 - 128mb.
>
> Experiencing frequent (several times a day) hardware crashes and automatic
> reboots. I have turned off the control panel option (System > Advanced) to
> automatically reboot after a crash, but it still does so anyway.
>
> Machine_Check_Exception.


I'm assuming you've read this:

http://support.microsoft.com/default...b;en-us;329284
  #7  
May 5th 04, 01:03 AM
Matt

Dave C. wrote:

> Unless you see something obvious like a bad cap, this one is going to be
> tough to trace. I'm afraid you will just have to replace components until
> you have a stable system. Start with the power supply, replace it with
> something by Seasonic in the ~400W range. If that doesn't help, consider
> picking up a different motherboard, possibly off ebay. (it's always good to
> save some money, where possible, and you won't want to spend too much money
> on a Celeron motherboard) -Dave


I didn't read the original post in detail, but I didn't see that he
needs a 400W supply.

  #8  
May 5th 04, 03:39 AM
~A_Sammy

I had the same problem. It was the cpu overheating.
Check the fan and thermal paste.


  #9  
May 5th 04, 12:08 PM
Frank

Was your fan working? Mine is.

> I had the same problem. It was the cpu overheating.
> check the fan and thermal paste.




 






All times are GMT +1.


Copyright ©2004-2024 HardwareBanter.
The comments are property of their posters.