October 25th 11, 02:21 AM, posted to alt.comp.periphs.videocards.nvidia,alt.os.linux.ubuntu
Paul
GeForce 9800 GX2 - Only one GPU reported under Linux

Craig Harrison wrote:
Hi folks,

My dual-GPU 9800 GX2 has developed a problem! Under Linux only 1 GPU is
ever reported. This did not use to be the case, and has only become
apparent since installing Kubuntu 11.10.

However, I have tested by re-installing Mint 11 (where this used to
correctly report both GPUs and the SLI configuration), and this too only
reports a single GPU now.

Is this a hardware issue, or has the latest Linux GeForce driver got an
issue? Below is the result of lspci and nvidia-smi.
Regards

Craig.

craig@AmsTech-Desktop:~$ lspci


02:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4...
03:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4...
03:02.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4...
05:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)


resend - problem with E-S server...

The person in the thread below has both a 9800GX2 and a 9400GT card. The first
line is the 9400GT, and the rest is the 9800GX2.

http://www.nvnews.net/vbulletin/showthread.php?t=154384

01:00.0 VGA compatible controller: nVidia Corporation G96 [GeForce 9400 GT] (rev a1)
07:00.0 PCI bridge: nVidia Corporation PCI express bridge for Quadro Plex S4...
08:00.0 PCI bridge: nVidia Corporation PCI express bridge for Quadro Plex S4...
08:02.0 PCI bridge: nVidia Corporation PCI express bridge for Quadro Plex S4...
09:00.0 3D controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)
0a:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)

In my non-expert opinion, yes, one of your GPUs is missing. It looks like the
PCI switch chip is still there, and one GPU.
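
The comparison of the two listings can be done mechanically. This is only a rough sketch in Python: it counts how many lspci lines name the card, using the two listings quoted above as test data. The function name gx2_functions and the regular expression are my own, not part of any NVIDIA or pciutils tool.

```python
import re

# Matches lines such as
# "05:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)"
LINE_RE = re.compile(r"^([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])\s+([^:]+):\s+(.*)$")

def gx2_functions(lspci_text, model="GeForce 9800 GX2"):
    """Return (address, device class) for every lspci line that names the model."""
    found = []
    for line in lspci_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and model in m.group(5):
            found.append((f"{m.group(1)}:{m.group(2)}.{m.group(3)}", m.group(4)))
    return found

# Listing from the original question (truncated names kept as posted).
REPORTED = """\
02:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4...
03:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4...
03:02.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4...
05:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)
"""

# Listing from the nvnews thread, where both GPUs show up.
REFERENCE = """\
01:00.0 VGA compatible controller: nVidia Corporation G96 [GeForce 9400 GT] (rev a1)
07:00.0 PCI bridge: nVidia Corporation PCI express bridge for Quadro Plex S4...
08:00.0 PCI bridge: nVidia Corporation PCI express bridge for Quadro Plex S4...
08:02.0 PCI bridge: nVidia Corporation PCI express bridge for Quadro Plex S4...
09:00.0 3D controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)
0a:00.0 VGA compatible controller: nVidia Corporation G92 [GeForce 9800 GX2] (rev a2)
"""

for name, text in (("reported", REPORTED), ("reference", REFERENCE)):
    print(name, gx2_functions(text))
```

On a live system the same function could be fed the text printed by lspci instead of the pasted listings.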

Have you ever taken your 9800GX2 apart? Maybe it's a bad connection between
the two PCBs.

The other thing that is puzzling is the identifier for the switch chip.

*******

If I look at this one, and selectively snip bits of it:

http://mushkingames.com/phpbb2/viewt...13090&start=15

[ AFAIK B=Bus D=Device F=Function ]

B01 D00 F00: nVIDIA nForce 200 (BR04) PCI Express 2.0 Switch
B02 D00 F00: nVIDIA nForce 200 (BR04) PCI Express 2.0 Switch
B02 D02 F00: nVIDIA nForce 200 (BR04) PCI Express 2.0 Switch
B03 D00 F00: nVIDIA GeForce 9800 GX2 Video Adapter
B04 D00 F00: nVIDIA GeForce 9800 GX2 Video Adapter
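
To line that listing up against lspci output, the B/D/F labels can be rewritten in the bus:device.function form that lspci prints. A small sketch in Python; the helper name bdf_to_lspci is made up, and it assumes the two-digit fields are hexadecimal, as they are in lspci (I have not checked what the tool in that thread prints for values above 09).

```python
import re

def bdf_to_lspci(label):
    """Convert a label like 'B03 D00 F00' to the lspci form '03:00.0'."""
    m = re.fullmatch(
        r"B([0-9A-Fa-f]{2}) D([0-9A-Fa-f]{2}) F([0-9A-Fa-f]{2})", label.strip()
    )
    if m is None:
        raise ValueError(f"not a B/D/F label: {label!r}")
    # Assumption: all three fields are hexadecimal.
    bus, dev, fn = (int(x, 16) for x in m.groups())
    return f"{bus:02x}:{dev:02x}.{fn:x}"

for label in ("B01 D00 F00", "B02 D00 F00", "B02 D02 F00", "B03 D00 F00", "B04 D00 F00"):
    print(label, "->", bdf_to_lspci(label))
```

Written that way, the layout is the same shape as the lspci listings: three switch entries followed by two GPU entries on their own buses.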

The BR04 may be

Vendor ID 0x10DE
Device ID 0x05BE

when I compare it to the list here (NF200 entries).

http://pciids.sourceforge.net/pci.ids

So my guess right now is that it isn't a problem with a mis-identified
NF200. More likely, it's something with the GPU, or the wiring between the
GPU and the switch chip.
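
The lookup against pci.ids can also be scripted. A minimal sketch in Python, assuming the usual pci.ids layout: a vendor line starts in column one with four hex digits, and its device lines follow, each indented by one tab. The function name lookup is my own, and the SAMPLE text is a hand-written stand-in with a placeholder device name, not a copy of the real file.

```python
import re

VENDOR_RE = re.compile(r"^([0-9a-f]{4})\s+(.+)$")
DEVICE_RE = re.compile(r"^\t([0-9a-f]{4})\s+(.+)$")

def lookup(pci_ids_text, vendor_id, device_id):
    """Return (vendor name, device name) for an ID pair, or None if not listed."""
    want_vendor = f"{vendor_id:04x}"
    want_device = f"{device_id:04x}"
    vendor_name = None  # set only while inside the wanted vendor's block
    for line in pci_ids_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        if not line.startswith("\t"):
            m = VENDOR_RE.match(line)
            vendor_name = m.group(2) if m and m.group(1) == want_vendor else None
        elif vendor_name is not None:
            # Subsystem lines start with two tabs and do not match DEVICE_RE.
            m = DEVICE_RE.match(line)
            if m and m.group(1) == want_device:
                return vendor_name, m.group(2)
    return None

# Hypothetical excerpt in pci.ids layout; the device name is a placeholder.
SAMPLE = (
    "# illustrative excerpt, not the real file\n"
    "10de  nVidia Corporation\n"
    "\t05be  placeholder name for the switch entry\n"
)

print(lookup(SAMPLE, 0x10DE, 0x05BE))
```

With a downloaded copy of the real file, the text would come from something like open("pci.ids", encoding="utf-8", errors="replace").read(). Running lspci -nn prints the numeric vendor:device pair next to each name, which gives the IDs to feed in.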

The entries in the listing should originate from simple bus probes to config
space. I don't know if something like missing VESA ROM info would cause that,
or whether what you're seeing is a failure to get an ACK of any
sort back from one of the GPUs. It could be as simple as one of the
surface-mount coupling caps having been snapped off the capacitively coupled
bus lanes. If you pick the right lane for that, you can prevent detection.
While PCI Express links can dynamically resize (thus avoiding some lane
failures), I don't think they can work around any arbitrary lane failing.
Some lanes are more important than others (like, say, lane zero).

I suppose it could also be something like a power converter failure
next to the GPU, denying power to the core of the GPU, and preventing
it from answering probes.
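
As for what a "simple bus probe" amounts to: the first four bytes of a function's config space hold the vendor ID and device ID, both little-endian 16-bit values, and a read that nobody answers comes back as all ones (vendor ID 0xFFFF). The Python sketch below only decodes such a header; the function name decode_probe is made up, and it does not touch hardware.

```python
import struct

def decode_probe(config_header):
    """Decode vendor and device ID from the first four bytes of PCI config space.

    Returns None when the vendor ID reads as 0xFFFF, which is what a config
    read returns when no device answers at that bus/device/function.
    """
    if len(config_header) < 4:
        raise ValueError("need at least 4 bytes of config space")
    vendor_id, device_id = struct.unpack_from("<HH", config_header, 0)
    if vendor_id == 0xFFFF:
        return None
    return vendor_id, device_id

# Bytes as they would appear for vendor 0x10DE, device 0x05BE.
print(decode_probe(bytes([0xDE, 0x10, 0xBE, 0x05])))
# All ones: nothing answered the probe.
print(decode_probe(b"\xff\xff\xff\xff"))
```

On Linux, a detected device exposes these bytes in its sysfs config file (for example /sys/bus/pci/devices/0000:05:00.0/config). A GPU that never answered the probe will have no such entry at all, which matches the single 9800 GX2 line in the listing.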

Just a non-expert guess,

Paul