#1
A7N8X Motherboard Low Temperature Sensitivity, CMOS Checksum Error
Followups-To alt.comp.hardware
SHORT VERSION:

If room ambient temp drops below 25C, the system is so unstable it can't even complete a POST. If the system is running when room temp drops, it starts to act erratically (odd Explorer pauses and Prime95 errors); powering off and then back on a dozen seconds later fails, resulting only in "CMOS Checksum Error" and an automatic boot to floppy, running awdflash. Once ambient temp rises to around 32C, the system always POSTs and runs fine. In between these temps, failures to get beyond "CMOS Checksum Error" and errors in Windows go up in frequency as temp drops. Multiple troubleshooting attempts have been made (clear CMOS, BIOS flash, swap hardware, remount board in case, strip down system, etc.); the problem appears isolated to the motherboard itself. Different BIOS versions, BIOS defaults, etc. have been tried. The system is not overclocked. What are the potential cause(s) and the best methods to check these?

LONG VERSION:

A7N8X Deluxe Motherboard rev 2.00, Socket A, nForce2-400
Current BIOS 1008
Athlon XP2400
512MB Kingston PC3200
Fortron 400W Power Supply
ATI AIW 128 Pro
1 HDD, CDROM, Floppy (typical non-power-user PC)

Approximately 1 year old. The system remained unchanged for that period and had (guessing) about 800 hours of on-time; it was not used very much and, AFAIK, for nothing demanding. It has had an easy life so far. All settings were conservative default values, with no overclocking and minimal BIOS changes.

The board appears to be very sensitive to temperature: not too hot, but rather too cold. If the case thermometer (not the motherboard's integrated temp sensor but a separate digital probe) reads below approx. 27C, then powering on the system from the soft-off state results in the following message:

"CMOS Checksum Error"

The system then proceeds to do an Award BIOS recovery by booting to awdflash, if an appropriate floppy is already in the drive. There is no option to do anything else: the BIOS setup is not accessible and the attempt to boot from floppy is automatic, not a normal "boot to floppy" event as would occur with any normally working system that has a boot floppy in it. Clearing CMOS and/or loading setup defaults does not resolve this.

If the ambient temp is right at 25C or slightly higher, multiple attempts at powering off, then on, will result in the system POSTing with setup defaults for FSB & multiplier at a rate of roughly 1 in 5 tries; more often as room temp rises, less often as it falls. Even if the system is manually set to the same speeds or to a very low speed (6X CPU multiplier and 100MHz FSB and memory), after saving these changes (or not) the system will not proceed with POST again; it takes several tries to get it to POST, displaying "CMOS Checksum Error" each time. Unplugging the power supply from AC had no effect, nor did clearing CMOS.

I've not pinned down the EXACT temp, since it gets progressively worse and the range is fairly tight, but roughly 25C and 32C are the fail and pass thresholds; certainly within a 10C temp rise it goes from unable to POST to working fine. This has been deliberately reproduced (later) by cranking up an air conditioner. It is clearly low-temperature related, but first the steps prior to this conclusion...

It does sometimes POST after saving the changes, but if then powered off it may not POST the next time. It seems to still be marginal regardless of the BIOS settings, since even loading setup defaults and clearing CMOS didn't change the roughly 1-in-5 success rate. Every time it fails, the video does come up and it does attempt to boot to floppy; it never just acts dead.

Every common troubleshooting procedure I could think of was tried, to no avail. Certainly more was tried than is mentioned here, but due to the length of the post I'll list what seems most relevant.

Initially suspecting BIOS corruption, I flashed the board with the same BIOS version, which at the time seemed to work. Later I discovered that the real difference was that the ambient temperature was higher than before, because soon enough the temp dropped and the system again failed to do anything more than "CMOS Checksum Error". The next time I had a chance I flashed the newest BIOS version, with no change. I'd been suspecting the often-rumored "nvidia bios corruption" problem, which seems to occur from exiting the BIOS too quickly when saving settings, but this was not the case with this board.

Later I noted that doing NOTHING to the system other than leaving it sit until ambient temp rose would return it to a 100% stable state. Even overclocked quite a bit, it passed several different stress tests at 32C room temp, but once the temp falls again it is not stable even at 6 x 100. I could understand if these were arctic conditions, but such a drastic change within a span of 10C seems quite unusual. Indeed, several other systems in the same room work fine at the same temp.

Also notable: if the temp is barely high enough to get it to POST and boot, running a stress test like Prime95 results in errors within a minute or so, yet with ambient temp 10C higher the system not only passes the same Prime95 test for 24 hours, but can even pass it running at 50% higher FSB, memory clock, and CPU frequency.

Trying to isolate the problem, I changed power supplies 3 times (known/proven good 400W+ units), unplugged nonessential cards, swapped video and memory, ran in minimal configuration, and checked every mechanical connection as well as possible (including pulling/inspecting/reinstalling the EEPROM and jumpers), all with known good/working parts. It seems that the motherboard itself is simply intolerant of a quite mild temperature drop.

Normally I'd just replace it, but this is quite puzzling, unique for such a small temp span, and I'd like to get to the bottom of it. There are no visible problems with the board: capacitors look fine and there is no visible cracking or other physical abnormality, though I don't have the means to check this with a microscope, especially since a motherboard is a bit wider than most 'scopes' reach. Since this is a very popular motherboard and I've not heard of anyone else having this problem (or perhaps they just didn't isolate the cause as low temp?), I wonder if this is an isolated flaw. The closest examination I could make showed nothing unusual, and it does work fine, never showing this problem (or any other that I'm aware of), once room temp rises by 5-10C. The system case is very well ventilated; ambient room temp never causes interior air temp to go up much except immediately adjacent to the heatsink, as expected.

I thought about the battery, but its voltage reads OK and it shouldn't explain the instability after booting and running Windows. I don't recall if I ended up putting a different battery in it or not, but will do so just for the heck of it.

One possibility I'm wondering about is whether the ESR of one or more capacitors is rising as the temp falls, if one or more are marginal and this is the cause. It might be a bit difficult to check this easily, though. I'd thought about touching a small light bulb to each in turn, individually warming them to see if that made any difference, but that could take quite a long time, especially if multiple caps are involved, since it could be necessary to wait until each cooled before trying to isolate the next cap. It also seems difficult to determine their core temp without getting the outside can quite hot, as any non-destructive temp reading would be of the outer can.

It seems a rather crude way of warming them too, but I'm drawing a blank as to how to warm an individual capacitor without also warming the surrounding area, or at least while minimizing that as much as reasonably possible. I could instead touch the leads with a soldering iron, but would prefer to leave the solder alone if possible.

I suppose I could take the opposite approach: warm up the board, then use freeze spray on each cap. That doesn't seem a very good approach either, since it might easily (probably would) lower the cap temp too much, introducing further failures that aren't present at 27C and wouldn't appear until much colder. It again seems difficult to thoroughly chill the core of individual caps without changing the surrounding area's temp by this small 10C thermal margin.

Another possibility I'd considered is temporarily placing a tantalum cap in parallel with as many suspect caps as possible, since tantalums should be much more tolerant of low temp (IIRC), but this also seems to be a lengthy, tedious process that would best be avoided. Does anyone have a better idea?

Even if I don't solve this problem, I wanted to at least get this bit of info out there: at the very least, this one board is affected by a relatively small temp change. Due to the type of problem, I wonder if it's more frequent. A CMOS Checksum Error is not all that uncommon, and "some" occurrences of a Checksum Error might be misdiagnosed... and some boards not even getting far enough to post "CMOS Checksum Error" might do so if they were a little warmer.
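The roughly 1-in-5 figure and the 25C-32C window above are estimates. One way to firm them up would be to note the probe reading and the outcome of every power-on attempt, then tally the POST success rate per degree. The Python sketch below does that; the function name and the sample readings are made up for illustration and are not measurements from this board.

```python
from collections import defaultdict


def post_success_by_temp(attempts, bin_width_c=1.0):
    """Group (temp_c, posted) observations into temperature bins.

    Returns {bin_start_c: (successes, tries, success_rate)}, coldest bin first.
    """
    bins = defaultdict(lambda: [0, 0])  # bin start -> [successes, tries]
    for temp_c, posted in attempts:
        start = (temp_c // bin_width_c) * bin_width_c
        bins[start][1] += 1
        if posted:
            bins[start][0] += 1
    return {start: (ok, n, ok / n) for start, (ok, n) in sorted(bins.items())}


# Hypothetical log: (case probe reading in C, did the board complete POST?)
attempts = [
    (25.2, False), (25.4, False), (25.1, True), (25.8, False), (25.6, False),
    (28.3, True), (28.9, False),
    (32.0, True), (32.4, True),
]

for start, (ok, n, rate) in post_success_by_temp(attempts).items():
    print(f"{start:.0f}C bin: {ok}/{n} POSTs ({rate:.0%})")
```

With enough attempts per bin, the temperature at which the rate climbs from near zero to near 100% gives a tighter threshold than a guess, and it makes it easier to judge whether a later change (new battery, warmed component) really moved anything.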
#2
"kony" wrote in message ...
> Followups-To alt.comp.hardware
> A7N8X Motherboard Low Temperature Sensitivity, CMOS Checksum Error
> SHORT VERSION:

Replace the CMOS battery; that board is well known for having a marginally acceptable battery installed. This ties in with the temperature and CMOS checksum errors: a failing battery will not seem to work at lower temps but will be fine when the temp increases.

As for the board being sensitive to cold, I'm afraid that your theory is wrong; the colder you run most electronics the better. Overclockers have run this board submerged in a non-conductive fluid at -30C.

It may seem odd that a £1 battery could be causing all your probs, but I've seen it on two of those boards and read about plenty of other battery problems. It's an excellent board otherwise.

--
Apollo
#3
On Mon, 11 Oct 2004 08:52:36 +0100, "Apollo" wrote:

> "kony" wrote in message ...
>> Followups-To alt.comp.hardware
>> A7N8X Motherboard Low Temperature Sensitivity, CMOS Checksum Error
>> SHORT VERSION:
>
> Replace the cmos battery, that board is well known for having a marginally acceptable battery installed. This ties in with the temperature and cmos checksum errors, a failing battery will not seem to work at lower temps but be fine when the temp increases.

Battery voltage (removed from board) was 3.05V with a ~300 Ohm load, measured via DMM, but I replaced it anyway... didn't help.

> As for the board being sensitive to cold, I'm afraid that your theory is wrong, the colder you run most electronics the better, overclockers have run this board submerged in a non-conductive fluid at -30C.

Well, the theory is not that ALL boards won't work when cold. Rather, the fact of the matter is that cold DOES affect this one particular board's operation, so at this point the focus is on WHAT the cold is affecting, whether it be a broken trace or crack, marginal capacitors, bad contacts or ???

> It may seem odd that a £1 battery could be causing all your probs but I've seen it on two of those boards, and read about plenty other battery problems, It's an excellent board otherwise.

Yes, it's a nice board for its time, and I wish it were merely the battery. Last time I tried to power it up it wouldn't even complete POST 1 in 5 times as previously, so it may be getting progressively worse? Supposedly the system was doing this "rarely" in the past month but then more frequently, frequently enough to prompt the owner to bring it to me a few days ago.

Anyway, I still haven't found the time to individually test any (motherboard) components, but when it wouldn't POST after a dozen tries I held a hair-dryer up to the whole thing for a couple of minutes and once again it POSTed fine, booted to Windows, rebooted and flashed the BIOS again successfully. Then a few hours later I tried to turn it on and nothing again!

If it were mine (I also have an A7N8X which doesn't have this problem) I'd try replacing component(s) and keep an eye on it, stress testing and such, but it has to go back to its owner, so ultimately I'll probably just have them buy a new one soon. Yet I'm still curious to find out what's going on with it.
#4
On Mon, 11 Oct 2004 08:35:49 GMT, kony wrote:

> Anyway, still didn't find the time yet to individually test any (motherboard) components but when it wouldn't POST after a dozen times I held a hair-dryer up to the whole thing for a couple minutes and once again it posted fine, booted to windows, rebooted and flashed bios again successfully. Then a few hours later I tried to turn it on and nothing again!
> ...snip...

Use a piece of cardboard to direct the heat to one half of the motherboard at a time. After you determine which half is heat sensitive, determine which half of that half is the problem, and so on, until you pinpoint the component.
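The cardboard method amounts to a binary search over the board. A rough Python sketch follows, assuming there is a single cold-sensitive spot and that each warm-then-power-on test gives a dependable answer (on this board, each test would probably need several power-on attempts, given the 1-in-5 behaviour near the threshold). The region names and the `posts_when_warmed` callback are hypothetical stand-ins for the manual test.

```python
def find_cold_sensitive_region(regions, posts_when_warmed):
    """Halve the candidate regions until one remains.

    regions           -- ordered list of board areas that can be warmed separately
    posts_when_warmed -- callable taking a list of regions; returns True if the
                         board POSTs when only those regions are warmed
    Assumes exactly one region is at fault.
    """
    candidates = list(regions)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if posts_when_warmed(half):
            candidates = half                    # fault is in the warmed half
        else:
            candidates = candidates[len(half):]  # fault is in the other half
    return candidates[0]


# Hypothetical layout and fault location, for illustration only.
regions = ["CPU VRM", "CPU socket", "northbridge", "DIMM slots",
           "AGP slot", "southbridge", "BIOS EEPROM", "PCI slots"]
faulty = "southbridge"

found = find_cold_sensitive_region(regions, lambda warmed: faulty in warmed)
print(found)
```

Eight regions take three tests instead of up to eight, which matters when the board has to cool back down between tests. If more than one component is marginal, the single-fault assumption breaks and the search can point at the wrong half.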
#5
"kony" wrote in message ...
> Followups-To alt.comp.hardware
> A7N8X Motherboard Low Temperature Sensitivity, CMOS Checksum Error
> SHORT VERSION: If room ambient temp drops below 25C, system is so unstable it can't even complete a POST.
> ...snip...
> What are the potential cause(s) and the best methods to check these?
> snip

Kony

If you've narrowed the problem to the motherboard by substituting all other components, then the 'freeze spray' technique would appear to be about the only way of finding which component it is.

I had a kitset computer many years ago that did exactly what you describe (erratic within a temperature range). The can of 'Freeze' found the culprit in less than 10 minutes (the Z80 CPU, as it happens).

The only 'concern' with using freeze spray is when the humidity is high enough to cause condensation, shorting out high impedance lines. (High impedance = low power CMOS circuitry these days.)

Just do it methodically; the board's U/S anyway, so you've nothing to lose.

Cheers
Paul.
#6
"PC" wrote in message ...
> "kony" wrote in message ...
>> Followups-To alt.comp.hardware
>> A7N8X Motherboard Low Temperature Sensitivity, CMOS Checksum Error
>> SHORT VERSION: If room ambient temp drops below 25C, system is so unstable it can't even complete a POST. If system is running when room temp drops, it then starts to act erratic
>> .....snip....
>
> Kony
> If you've narrowed the problem to the motherboard by substituting all other components, then the 'freeze spray' technique would appear to be about the only way of finding which component it is.
> .....snip.....

Agreed. I haven't used the DX version of the A7N8X board, but I've built a couple of systems with the -X version. They've been in continuous use by avid gamers for several months in temps ranging from an estimated 12C to 35C without any problem.

It's obvious this mobo has an unusual temperature sensitivity. Unusual because, as you will no doubt be aware, faults usually show up with a rise in temperature. But with years of servicing electronic products under my belt, I've seen all kinds of weird things happen. E.g., 99% of defective resistors increase in resistance or go open, but I've seen a couple of cases where their resistance had lowered drastically.

In this case, material contraction due to lowered temp must be causing a bad joint to lose electrical contact (I know this isn't much help). The bad contact could be at a poor solder point, inside a component (ICs, capacitors...), at a wire crimp, or at a microscopic crack in the PCB copper tracks.
#7
kony wrote in
:
> Followups-To alt.comp.hardware
> A7N8X Motherboard Low Temperature Sensitivity, CMOS Checksum Error
> ...snip...

Dry solder joint? Board mount shrinks infinitesimally at lower temp?

David in Norfolk UK
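For a sense of scale on the contraction idea raised in the replies, linear thermal expansion is delta_L = alpha * L * delta_T. The numbers in this Python sketch are assumptions, not measurements: a coefficient of about 17 ppm per degree C (a commonly quoted approximate figure for copper), a 305 mm span (the long edge of a full-size ATX board), and the 32C to 25C drop described in the first post.

```python
def thermal_length_change_mm(length_mm, alpha_per_c, delta_t_c):
    """Linear expansion: delta_L = alpha * L * delta_T (negative means shrinkage)."""
    return alpha_per_c * length_mm * delta_t_c


ALPHA_COPPER_PER_C = 17e-6   # assumed, approximate coefficient for copper
BOARD_LENGTH_MM = 305.0      # assumed: long edge of a full-size ATX board
DELTA_T_C = 25.0 - 32.0      # cooling from 32C to 25C

shrink_mm = thermal_length_change_mm(BOARD_LENGTH_MM, ALPHA_COPPER_PER_C, DELTA_T_C)
print(f"Length change over {BOARD_LENGTH_MM:.0f} mm: {shrink_mm * 1000:.1f} micrometres")
```

Under these assumptions the change is a few hundredths of a millimetre across the whole board, and far less across any single joint. That is too little to disturb a sound connection, which fits the suggestion that whatever opens up when cold was already cracked or marginal.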