November 30th 04, 05:52 PM
Johnny

Paul wrote:
I don't normally top post, but don't want to try to trim the
rest of this down.

Some random observations:

1) Could this be a Hyperthreading problem ? Is Hyperthreading
disabled in the BIOS ? I don't know my Hyperthreading policy
versus OS, but perhaps if you were quitting Passmark between
runs, maybe the program is running on a different virtual
processor each time, and one virtual processor has more load
than the other. If you disable Hyperthreading in the BIOS,
the perf difference might stop.

In any case, Hyperthreading is not all it is cracked up to
be. In some cases, it is a clear win, but in other cases it
can trash the performance of the memory subsystem, and actually
run slower than without it.

WOW!!! Before altering any voltages or settings, just running the standard
[auto] jumperless detection settings and simply setting CPU hyperthreading
[disabled] option, the results are now, well, somewhat different!!
How thorough or accurate passmark is I know not but for purposes of
comparison it's useful. It's difficult to present the results in here but
the scores for example of the CPU suite of tests are as follows in my
attempt at a table (hope it comes out ok).

cpu test           hyperthreading [enabled]   hyperthreading [disabled]

integer math       170/246 varies             257 solid
floating p math    230                        291
mmx                181                        278
sse                131                        164
compression        1319                       1868
encryption         6.8                        10.9
image rotation     113                        195.9
string sorting     665                        810

CPU passmark       322                        467

I haven't managed to get anything other than numbers very close to those
above with hyperthreading [disabled]; it is solid. Disabling hyperthreading
has also affected the memory test benchmark speeds, presumably due to the
increased CPU performance.

all this before altering any voltages or any other settings, blimey!
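
To put a number on the difference, here is a rough sketch that works out the percentage gain for each row of the table above. The scores are copied from the table; using the lower of the two integer math figures (170) for the hyperthreading [enabled] case is my assumption, since that score varied from run to run.

```python
# Passmark CPU scores from the table above: (HT enabled, HT disabled).
# Integer math with HT enabled varied between 170 and 246; the lower
# figure is used here, which is an assumption for illustration only.
scores = {
    "integer math": (170, 257),
    "floating point math": (230, 291),
    "mmx": (181, 278),
    "sse": (131, 164),
    "compression": (1319, 1868),
    "encryption": (6.8, 10.9),
    "image rotation": (113, 195.9),
    "string sorting": (665, 810),
    "CPU passmark": (322, 467),
}


def percent_gain(enabled, disabled):
    """Percentage improvement seen after disabling Hyperthreading."""
    return (disabled - enabled) / enabled * 100.0


gains = {name: percent_gain(*pair) for name, pair in scores.items()}

for name, gain in gains.items():
    print(f"{name:20s} {gain:6.1f}%")
```

The overall CPU passmark figure works out to roughly a 45% gain, with the individual tests spread either side of that.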

2) Increase Vdimm to the Corsair. DDR400 memory needs 2.6V to
start with, and you may find bumping the memory voltage up
a couple notches stops the errors. If the memory passes memtest86
in an overnight test without errors, use Prime95 torture test
in mixed mode, and see if it runs error free as well. I've had
memory pass memtest86 and fail Prime95.

3) Look up your Corsair memory here:

http://corsairmicro.com/corsair/xms.html

Click the link and download the datasheet. For example, 3200XL
is rated for 2.75V and you could try that. The datasheet for
3200XL claims the SPD is loaded with 2-2-2-5, so it shouldn't
start at 2.5-2-2 on its own. If this is some other memory,
you may need to post in this forum, and get some help with
your product - or search for someone having the same system
as you've got:

The product is CMX512-3200XLPT listed on their site under CMX512-3200XL and
it clearly states 2.75V. Changing the voltage to 2.75V has stopped the
blackouts.

For interest here are the passmark memory results before (but with
hyperthreading disabled) and after voltage change. The - configure DRAM
timing by speed option is [enabled] in bios

test                   [auto]    2.75V [auto]   [manual] 2.75V / 2.0-2-2-5

allocate small block   1162.8    1163           1164.8
read cached            1390      1389.7         1389.9
read uncached          1326.6    1328.3         1328.8
write                  809.4     809.7          809.4

Altering the DRAM burst timing between 4 and 8 clocks appeared to make no
difference in these tests. Having memory acceleration enabled gave
1165.4, 1389.3, 1340.2 and 810 for the four tests in the same order, so
only read uncached improved, slightly but consistently.

**** INTEL/AMD/VIA memory config info, c't/Andreas Stiller V2.7 June 03 ****
Kernel Driver: WinNT DIRECTNT.SYS V01.09
Pentium 4,(0F34-00)ca 3274 MHz (sleep) 2999 MHz (load)
Bus Speed: max=200MHz, ratio=15 = 200 MHz
Hostdevice: (2570) Springdale i865 MCH, Vendor: (8086) Intel, Rev:0002h
----------------------------------------------------------------
Intel Springdale i865 MCH Rev:02: Bus:0, Device-Nr:0, Function:0
System Frequency : FSB533/133 MHz
Memory Frequency : DDR266/133 MHz (1:1)
IOQ Depth : 12 deep
Top of usable Memory : 1024.0 MByte
Extended SMRAM (Tseg) : disabled
Overflowdevice : disabled and unlocked, ID= 2576h, Rev: 2
Memory Delays Base Address : FECF0000 not prefetchable
CPU Parking : disabled
Memory : row0: 512 MByte/16 KB Pages
: row1: 512 MByte/16 KB Pages
DRAM-Channels : Dual Channel Linear, DDR
ECC & Refresh : Non-ECC, Refresh=7.8 µs
PAT-mode : (1) fully enabled
Active to Precharge Delay : 5 clocks .. 70 µs
Tcl - Trcd -Trp : 2-2-2 T (DRAM Clocks)

Memory Read Bandwidth : ca. 5780.5 MBytes/s, Cacheline size= 64
go on with CR




http://www.houseofhelp.com/forums/fo...hp?forumid=128

4) CTIAW and memtest86 disagree on your PAT setting. I don't know
what to make of that.

5) There is a possible reason for CTIAW mis-reporting the bus
speed. An 865PE Northbridge is not supposed to have PAT, but
Asus and others use a trick to enable it. The processor has
two signals called BSEL, and they indicate the bus speed rating
of the processor (400, 533, 800 etc). The BSEL signals are
normally routed from the processor to the Northbridge and to
the clockgen. What Asus did, is they disconnected that link.
Asus sends a fake value of BSEL to the Northbridge - I think
if the FSB is set to 533, PAT is enabled, so by sending the
533 bit pattern to the Northbridge, but setting the clockgen
to 800, PAT is enabled, and the memory can run at DDR400, just
like on an 875P Northbridge. I think what CTIAW could be doing,
is reading the Northbridge register, instead of checking the
clockgen. This trick is great for fooling the hardware, but
software authors have to be aware of the trick too, to get
the info right.

6) I dug up some benchmarks you can try. Maybe these will be
reproducible from run to run.

http://www.super-computing.org/
ftp://pi.super-computing.org/windows/super_pi.zip

Super_pi computes PI, and you select the number of digits from
the menu. You double click the .exe, to run a Windows dialog.
Select the number of digits to calculate and then run it.
I just ran 1 million digits, and it takes 48 seconds
on my 2.8C with 2x512MB 2-2-2-6 memory. I did two test runs and
they had exactly the same test time. A file is created in the
install directory with the results of the calculation.
The test time and the amount of memory used increase
with the digits setting. Some people use the 32M setting
as a stability test for new motherboards.

44 seconds with hyper threading [disabled]
53 seconds with hyper threading [enabled]

as you say this test is consistent


Here is a second test:

This is some kind of finite element analysis. It was
posted by the author a while back. It uses a good chunk
of memory, and judging by the CPU heating, is not memory
bound, but does a fair amount of computing. To use it,
unzip the file, fire up a MSDOS window, cd to the unzipped
directory, then type "now" into the MSDOS window, to execute
now.bat . After it reaches "step 992", it will finish, and
print the number of "MUPs", which are millions of operations
per second. My computer takes 202 or 203 seconds to run the
benchmark, and achieves a rating of 12.27 MUPs (the number
is printed in scientific notation, so shift the decimal
point as appropriate).

with hyperthreading [enabled]

242 - 244 seconds 10.16 - 10.24 MUPs +/- 0.04% (I assume)

with hyperthreading [disabled]

203 seconds 12.21 MUPs +/- 0.06% consistently.
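
As a quick sanity check on both benchmarks, this sketch compares the run times quoted above. The timings and MUPs figures are the ones I posted; taking the midpoints of the hyperthreading [enabled] ranges (243 seconds, 10.20 MUPs) is my assumption. It also checks that seconds times MUPs comes out about the same in both cases, which it should if the program does a fixed amount of work.

```python
def slowdown_percent(t_disabled, t_enabled):
    """How much longer a run takes with Hyperthreading enabled, in percent."""
    return (t_enabled - t_disabled) / t_disabled * 100.0


# super_pi, 1 million digits: 44 s with HT disabled, 53 s with HT enabled.
super_pi_slowdown = slowdown_percent(44, 53)

# 3d0: 203 s with HT disabled, 242 to 244 s enabled (midpoint assumed).
three_d0_slowdown = slowdown_percent(203, (242 + 244) / 2)

# Fixed workload check: run time multiplied by rate should match.
work_disabled = 203 * 12.21
work_enabled = 243 * 10.20  # midpoints of the quoted ranges (assumed)

print(f"super_pi slowdown: {super_pi_slowdown:.1f}%")
print(f"3d0 slowdown:      {three_d0_slowdown:.1f}%")
print(f"3d0 work, HT off:  {work_disabled:.1f}")
print(f"3d0 work, HT on:   {work_enabled:.1f}")
```

Both tests come out roughly 20% slower with hyperthreading [enabled], so the two benchmarks agree with each other.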

http://users.viawest.net/~hwstock/bench/3d0/3d0.zip

Instructions and some background info are here:
http://www.abxzone.com/forums/showthread.php?t=70142

Those two tests are reproducible for me. Give them a
try, with and without Hyperthreading turned on in the
BIOS.

Note: The 3d0 program is a bit unhygienic, and leaves
a bunch of files in its directory. You may want to
dump all but the original files, when the directory
fills up.

Be interested to hear what you make of that lot. Obviously hyperthreading is
doing the bulk of the damage but the memory scores seem a little low also.
I'll run the memtest and mess with some other BIOS settings later but I have
to go make some money now.

many thanks,
J

HTH,
Paul

In article , "Johnny"
wrote:
Paul wrote:
In article , "Johnny"
wrote:

This is a repost as the other didn't appear so if it pops up
twice, sorry.

I posted a while ago the dismal performance I'm getting with
this board and a Prescott 3.0ghz cpu with 2 x 512MB crucial
2-2-2-5 ddr400 memory. I've noticed the passmark cpu tests give
significant differences but not entirely sure if that's not
unusual - is it possible the cpu or motherboard is faulty
even though the system works albeit relatively slowly. This
thing has me totally flummoxed and perplexed. I've swapped out
a power supply from another machine with no change (don't know
why but thought it might be a power issue). I haven't got access
to another 800FSB cpu to compare and not sure I'll get any sense
out of the tech support as it is actually working which is
frustrating in the extreme. If I select turbo mode the board dies -
it literally blacks out completely requiring a hard power off
to get bios back with the post message that overclocking failed???
I'm really getting ****ed off with this now - is it likely the
cpu or mainboard are faulty or just a combo of the two, who knows?

Are you talking about Passmark giving different results from
repeating the Passmark test ? Or are you talking about comparing
Passmark from your current machine, to a previous slower machine,
or comparing to the Futuremark database ?

I'm comparing the exact same machine with repeated runs and differing
results within a short time period.

With respect to your turbo mode setting, turbo requires the use
of CAS2 memory, which you've got. So, it should have worked.
A "black out" is what happens with CAS2.5 or CAS3 memory, when
turbo is selected.

The memory is 2-2-2-5 Corsair (not crucial).

How many ways are there to make a slow processor:

1) Internal CPU cache has ECC protection. If the internal cache
has bad bits in it, a single bit in error can be corrected
by the ECC checker, but at the price of extra cycles to
attempt to correct the data.

I don't know whether Memtest86 can detect this kind of
fault or not.

I have a sinking feeling I'm going to need a like for like swap out
comparison which I can't get yet, although I have a couple of PC's
to make so might well get the opportunity although I obviously want
to avoid replicating this problem in their machines so may well
choose a different manufacturers board.

2) Intel processors have thermal throttle. In the case of the
Prescott, the processor reduces the internal instruction rate
when the die temperature reaches 70C. If the CPU die is not
making good contact with the heat spreader on the top of the
chip, it might be possible for the die to be hot, yet the
heatsink won't be that hot. There is a thermal paste inside
the processor, between the top of the die and the heat
spreader, and if that paste was missing, your performance
could drop.

The CPU temperature hovers around 40C tonight it is 37-38. I'm always
careful to spread a thin but effective layer of thermal contact
grease on the cpu heat sink, I removed it to check the grease was
where it needed to be and doing its job, the coverage was complete
(at least visually) and the temperature monitors seem to support
that. I should say I have built a good number of PC's so i'm not a
total novice. I always check carefully the cpu is properly seated
and gripped before fitting the heatsink etc. I don't normally have
any issues at all that aren't inherent faults in the hardware.

3) ACPI has an option to reduce the processor clock rate. But
I doubt that is doing anything in this case. ACPI might
use this option, when the processor is idle, to reduce the
processor operating temperature. During benchmarks, the OS
would turn this off again.

I don't think, at this point, temperature is an issue in this
instance.

4) Many Northbridge chips have throttle capabilities for the
DIMMs. See section 5.5 (pg.140) of this document, for features
of the 875 Northbridge regarding protecting the DIMMs against
overheat. I doubt Asus bothered with thermal sensors next to
the DIMMs, but there is still the software method:

http://developer.intel.com/design/ch...s/25252502.pdf

"The number of hexwords transferred over the DRAM interface
are tracked per row. The tracking mechanism takes into account
that the DRAM devices consume different levels of power based
on cycle type (i.e., page hit/miss/empty). If the programmed
threshold is exceeded during a monitoring window, the activity
on the DRAM interface is reduced. This helps in lowering the
power and temperature."

5) You could be experiencing an "interrupt storm". There have been
motherboards in the past, where a particular PCI chip on the
motherboard keeps asserting its IRQ, causing the interrupt
handler to be invoked needlessly, and sucking performance from
the machine. Looking at performance counters might identify such
a problem. (There is one report in Google against the Promise
20378, so see if you can run with that chip disabled, then
run Passmark again.)

All integrated devices turned off for test purposes - promise
controller, firewire, audio, network. Left serials and parallel
enabled.

6) The PCI Latency Timer setting could influence performance.
A setting lower than 16, could make I/O slow, but the BIOS
on this machine doesn't allow such low settings. Lower settings
promote "fairness" between peripherals, so a sound card can
still get data while a disk drive is doing burst transfers.
A high setting might allow a better disk benchmark, at the
expense of general usability of the computer.

I haven't had too much luck using performance counters in Windows.
I've read that there are all sorts of fancy metrics in Windows, but
maybe you need a plugin/snapin to see them ? I still don't know
what the missing ingredient might be.

It may be easier to see some of these performance counters in
Linux.

The toughest part of your problem, will be finding baseline
numbers for exactly what your combo of hardware should be
doing. Does Futuremark collect enough data, to make sure
the BIOS settings that affect memory performance are the
same, when you compare to other hardware ? If Passmark is
not collecting info on whether PAT is enabled, for example,
that might make a difference to benchmarks.

I tried researching in two directions. I looked for benchmarks
that are a bit simpler than Passmark, and for the CPU, there
is the HINT benchmark. But all knowledge of it is gone from
the .gov site it was on, and even web.archive.org has no
copy of the site. I also tried to find info on performance
counters, and didn't have much luck there, either. Intel
has a $$$ program called Vtune, which is a profiler used by
software developers, but that isn't free.

I was hoping by using free tools, we could compare machines,
and see if you really are slower than other comparable machines,
and what part of the machine is slower. Some things you can
try:

1) memtest.org has version 1.4 of memtest86 available. It is
presumably the same as the other versions, when it comes to
measuring bandwidth. I get L1=8KB=22940MB/s, L2=512KB=19571MB/s,
and main memory is 2955MB/s. Memtest claims PAT is enabled
on my machine. I have a 2.8C Northwood, 2x512MB 2-2-2-6 RAM,
running at stock speed.
2) ftp://ftp.heise.de/pub/ct/ctsi/ctiaw.zip
This runs from a DOS window, and reports a few settings.
It is a way to verify that PAT is enabled. Mine says "fully
enabled".

http://abxzone.com/forums/showthread...ighlight=ctiaw

It also reports two values at the top of the screen, the
"sleep" speed and the "load" speed. On my 2.8C, the values
reported are both close to 2800MHz. It seems other processors
are using different frequencies for this, but I don't know
why. It could be the ACPI throttle feature, not sure.

As for performance counters, booting a copy of Knoppix or some
other Linux distro, might give access to more info than you can
get easily from Windows. If I do "vmstat 5" in a console
window, it says I get exactly 1000 interrupts per second.
(The number will be related to clock tick interrupts, as in
this scenario the system was idle, except for vmstat running.)
If I had a defective 20378, that number would undoubtedly climb.
I don't know how much work it is to get Windows to display
the same stat, whether it is total interrupts, or interrupts
per peripheral device.
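
For anyone who boots Linux to check this, here is a minimal sketch of the same measurement done by hand. It assumes a Linux system where /proc/stat contains an "intr" line whose first number is the total interrupt count since boot; the function names are made up for illustration.

```python
import time


def total_interrupts(proc_stat_text):
    """Return the total interrupt count from the contents of /proc/stat."""
    for line in proc_stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "intr":
            # First number after "intr" is the grand total since boot.
            return int(fields[1])
    raise ValueError("no 'intr' line found")


def interrupts_per_second(interval=5.0, path="/proc/stat"):
    """Sample the counter twice and return the average rate in between."""
    with open(path) as f:
        before = total_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = total_interrupts(f.read())
    return (after - before) / interval


# On a Linux machine (not run here):
# print(f"{interrupts_per_second():.0f} interrupts/s")
```

This only gives the total. For a per-device breakdown, /proc/interrupts on the same system lists a counter for each IRQ line, which is where a misbehaving controller would be expected to stand out.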

HTH,
Paul


Paul, first of all thanks for the help and taking the time to reply,
it's much appreciated. I'm using win2Kpro.

One example I noticed tonight is the CPU passmark suite of tests are
all roughly similar within fractions of a percent (i don't know how
this software rates in the scheme of things but it was readily
available so i used it) All except for the CPU integer math test
which scored 170 mops then a few seconds later scored 246.8 then a
few seconds later scored 168 but it doesn't repeat that pattern of
higher then lower. All three scores well below my other test machine
running a 533fsb 2.8 ghz northwood on a P4PE with ddr333 generic ram
which scores 261 consistently. During this time testing the prescott
3.0Ghz there is no change in monitored temperatures on the P4P800-E.
CPU = 37-38C Board = 29C.

this is what the ftp://ftp.heise.de/pub/ct/ctsi/ctiaw.zip software
reports

**** INTEL/AMD/VIA memory config info, c't/Andreas Stiller V2.7 June 03 ****
Kernel Driver: WinNT DIRECTNT.SYS V01.09
Pentium 4,(0F34-00)ca 3274 MHz (sleep) 2999 MHz (load)
Bus Speed: max=200MHz, ratio=15 = 200 MHz
Hostdevice: (2570) Springdale i865 MCH, Vendor: (8086) Intel, Rev:0002h
----------------------------------------------------------------
Intel Springdale i865 MCH Rev:02: Bus:0, Device-Nr:0, Function:0
System Frequency : FSB533/133 MHz
Memory Frequency : DDR266/133 MHz (1:1)
IOQ Depth : 12 deep
Top of usable Memory : 1024.0 MByte
Extended SMRAM (Tseg) : disabled
Overflowdevice : disabled and unlocked, ID= 2576h, Rev: 2
Memory Delays Base Address : FECF0000 not prefetchable
CPU Parking : disabled
Memory : row0: 512 MByte/16 KB Pages
: row1: 512 MByte/16 KB Pages
DRAM-Channels : Dual Channel Linear, DDR
ECC & Refresh : Non-ECC, Refresh=7.8 µs
PAT-mode : (1) fully enabled
Active to Precharge Delay : 5 clocks .. 70 µs
Tcl - Trcd -Trp : 2.5-2-2 T (DRAM Clocks)

Memory Read Bandwidth : ca. 5715.6 MBytes/s, Cacheline size= 64
go on with CR

so it looks like the system and memory frequencies are set or
operating incorrectly at 533FSB/266MHz although the board's BIOS is
autodetecting and displaying 800FSB/400MHz; this is surely the
software not reading the system settings correctly. Also the CAS 2.5
looks suspicious unless Corsair are a bunch of bandits. Selecting
2.0 in BIOS blacks out the board and gives the overclocking failed
message after hard reset.
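
A rough cross-check of that suspicion, using only the numbers in the ctiaw output above. The multipliers are standard figures rather than anything from this thread, so treat them as assumptions: the Pentium 4 front side bus moves 4 transfers per bus clock, DDR moves 2 transfers per memory clock, and each DDR channel is 8 bytes wide. Whether ctiaw's read bandwidth is a true main memory figure is also an assumption.

```python
# Assumed standard figures (not from the thread): quad-pumped P4 bus,
# double data rate memory, 64-bit (8 byte) wide DDR channels.
FSB_TRANSFERS_PER_CLOCK = 4
DDR_TRANSFERS_PER_CLOCK = 2
BYTES_PER_TRANSFER = 8


def fsb_rating(bus_clock_mhz):
    """Marketing FSB number for a given bus clock."""
    return bus_clock_mhz * FSB_TRANSFERS_PER_CLOCK


def ddr_rating(mem_clock_mhz):
    """DDR rating for a given memory clock."""
    return mem_clock_mhz * DDR_TRANSFERS_PER_CLOCK


def peak_bandwidth_mb_s(ddr, channels=2):
    """Theoretical peak bandwidth in MB/s for a DDR rating."""
    return ddr * BYTES_PER_TRANSFER * channels


# From the ctiaw header line: bus speed 200 MHz, ratio 15.
bus_clock_mhz = 200
multiplier = 15
core_mhz = bus_clock_mhz * multiplier
fsb = fsb_rating(bus_clock_mhz)
ddr = ddr_rating(bus_clock_mhz)

# Read bandwidth reported by ctiaw versus dual channel peaks.
measured_read_mb_s = 5715.6
peak_at_ddr266 = peak_bandwidth_mb_s(266)
peak_at_ddr400 = peak_bandwidth_mb_s(400)

print(f"core {core_mhz} MHz, FSB{fsb}, DDR{ddr}")
print(f"dual channel peak at DDR266: {peak_at_ddr266} MB/s")
print(f"dual channel peak at DDR400: {peak_at_ddr400} MB/s")
print(f"measured exceeds DDR266 peak: {measured_read_mb_s > peak_at_ddr266}")
```

The 200 MHz bus clock and ratio of 15 give the 3000 MHz the CPU is supposed to run at, which matches FSB800/DDR400. The reported read bandwidth is also higher than dual channel DDR266 could deliver even in theory, so the FSB533/DDR266 lines look like a reporting problem rather than the real operating speed.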

memtest 1.4 gives the following info

Pentium 4 (0.09) 2999Mhz
L1 cache 16K 20969MB/s
L2 cache 1024K 18396MB/s
Memory 1023M 2928MB/s
Chipset : i848/i865 ECC disabled FSB199Mhz PAT disabled
RAM 199Mhz (DDR 398) CAS 2.5-2-2-5 Dual channel (128bit)1

test #6 moving inversions, 32bit reports 3 counts of an error at
130.1 MB

whether that is significant in relation to speed issues I doubt,
although have to say I wouldn't have expected to see any errors on
new RAM.
I'm going to keep messing but I've had enough now tonight. Will give
it a go tomorrow evening, ******* computers.

Thanks,
J