March 13th, 2010, 04:32 PM, posted to alt.comp.hardware.amd.x86-64,alt.comp.hardwre.amd.x86-64,alt.comp.periphs.mainboard.msi-microstar
Ant[_3_]
"TLB parity error in virtual array; TLB error 'instruction"?

On 3/13/2010 1:21 AM PT, Paul typed:

The author of cpuburn told me to try seven and 37 "nice -19 ./burnMMX
P &" separately. I ran them for many hours with no problems. I am
starting to notice that the errors and kernel panics seem to occur
only when my system is idle (again, not using Cool'n'Quiet).


The TLB is part of your processor. It converts virtual addresses into
physical addresses. And it involves a small memory to store the
entries.

To test it, you'd need test software specifically designed to verify
that it can hold entries, and that the entries point to the right
physical locations. I haven't read of any programs that do that
specifically.

The processor memory BIST function is an example of a "structural" test,
which is used to verify that a chunk of hardware works. When we talk
about running test programs on the computer, those are "functional" tests.
It can be much harder to get good test coverage using nothing
but functional tests.

If you want a test case that can make the situation worse, you'd need
a test program with a random access characteristic, something which uses
so many TLB entries that there are lots of page table walks,
evictions of least recently used entries from the TLB, and so on.
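
As a rough illustration of that kind of random-access load, here is a
minimal Python sketch. It is not a real TLB tester: the function name,
page count and pass count are made up for this example, and interpreter
overhead means it causes far fewer TLB misses per second than a compiled
program would. It maps an anonymous region, touches one byte per page in
shuffled order, and then checks that every page holds the expected count.

```python
import mmap
import random


def touch_pages_randomly(num_pages=4096, passes=2, seed=0):
    """Touch one byte in every page of an anonymous mapping, in random order.

    Hypothetical helper for illustration only. Returns (touches, ok), where
    ok is True if every page's first byte equals the number of passes.
    """
    page = mmap.PAGESIZE
    buf = mmap.mmap(-1, num_pages * page)  # anonymous, zero-filled mapping
    rng = random.Random(seed)
    order = list(range(num_pages))
    touches = 0
    try:
        for _ in range(passes):
            rng.shuffle(order)  # fresh random page order for each pass
            for idx in order:
                offset = idx * page
                # read-modify-write a single byte so each page is accessed
                buf[offset] = (buf[offset] + 1) & 0xFF
                touches += 1
        expected = passes & 0xFF
        ok = all(buf[i * page] == expected for i in range(num_pages))
    finally:
        buf.close()
    return touches, ok


if __name__ == "__main__":
    print(touch_pages_randomly())
```

A mismatch in the final check would only hint that something went wrong
somewhere between the program and memory; it cannot say that the TLB
itself was at fault.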

You can see someone characterizing a TLB here. I think the program
they're using runs under Windows.

http://ixbtlabs.com/articles2/rmma/rmma-dothan.html

RMMA

http://cpu.rightmark.org/download.shtml

I have no idea how a TLB ECC error would show up under Windows.
Machine check exception? Or something in the Event Viewer?


Thanks. I wonder if I can run it off WinPE CDs.

FYI, I still got errors overnight (no crashes), so I guess
uninstalling the two packages and rebooting didn't help,
according to dmesg:

[ 3299.988650] Machine check events logged
[ 8549.989026] Machine check events logged
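
To see how far apart such events are, the uptime stamps can be pulled
out of the dmesg text. The following is only a sketch: the function
names are invented for this example, and the sample text is just the two
lines quoted above.

```python
import re

# Matches lines such as "[ 3299.988650] Machine check events logged"
MCE_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+Machine check events logged")


def mce_timestamps(dmesg_text):
    """Return uptime stamps (seconds) of 'Machine check events logged' lines."""
    stamps = []
    for line in dmesg_text.splitlines():
        match = MCE_LINE.match(line.strip())
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def gaps(stamps):
    """Return the time differences between consecutive stamps."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


SAMPLE = """\
[ 3299.988650] Machine check events logged
[ 8549.989026] Machine check events logged
"""

if __name__ == "__main__":
    stamps = mce_timestamps(SAMPLE)
    print(stamps, gaps(stamps))
```

For the two lines above, the gap works out to about 5250 seconds, or
roughly 87.5 minutes.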


HOWEVER, I did notice this:
[ 19.939430] powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 4600+ processors (2 cpu cores) (version 2.20.00)
[ 19.939453] [Firmware Bug]: powernow-k8: No compatible ACPI _PSS objects found.
[ 19.939454] [Firmware Bug]: powernow-k8: Try again with latest BIOS.

I think that error shows up only because I have Cool'n'Quiet disabled in
the BIOS. I wonder if enabling and using Cool'n'Quiet will fix the problems?

In the past I used these lines in my /etc/rc.local file (been using it
since my single core Athlon64 754 CPU system too):
modprobe powernow-k8
echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo 1 > /sys/devices/system/cpu/cpu0/cpufreq/ondemand/ignore_nice_load
I added "#" to the beginning of each line to avoid running them while
AMD's Cool'n'Quiet is disabled.
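
One way to check whether a governor is actually in effect is to read the
same sysfs file back. This is a small sketch, not part of my setup: the
function name and its parameters are made up for illustration, and it
returns None when the cpufreq file is missing, which is what I would
expect while the driver is not loaded.

```python
import os


def read_governor(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return the current cpufreq governor for a CPU, or None if unavailable.

    Hypothetical helper; sysfs_root is a parameter so the function can be
    pointed at a test directory.
    """
    path = os.path.join(sysfs_root, f"cpu{cpu}", "cpufreq", "scaling_governor")
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        # File missing or unreadable, e.g. no cpufreq driver is loaded
        return None


if __name__ == "__main__":
    print(read_governor(0))
```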
--
"Still we live meanly, like ants;... like pygmies we fight with
cranes;... Our life is frittered away by detail. Simplify...
simplify..." --Henry Thoreau
/\___/\
/ /\ /\ \ Phil./Ant @ http://antfarm.ma.cx (Personal Web Site)
| |o o| | Ant's Quality Foraged Links: http://aqfl.net
\ _ / Nuke ANT from e-mail address: NT
( ) or

Ant is currently not listening to any songs on his home computer.