#21
dvus wrote:
    Walter Mitty wrote:
        factory wrote:
            CPUs generally only have one pipeline, mainly because they generally operate in a serial manner. The stuff a GPU operates on is highly parallelised, so going with multiple pipelines makes sense.
        Rubbish.
    Now, there's a convincing argument...
It's not an argument. Anyone can find out for themselves.
#23
Memnoch wrote:
On 8 Dec 2004 21:28:10 -0800, wrote:
    http://www.xbitlabs.com/news/video/d...208014939.html
    "While NVIDIA remains extremely tight-lipped over its future products, it is known that the company is readying its code-named NV47 visual processing unit, a massively revamped GeForce 6 architecture with 24 pixel pipelines. The NV47 is expected to be released sometime in Spring, 2005, but it is unknown whether NVIDIA is ahead, or behind ATI's R520 product. The status of NVIDIA's future architecture code-named NV50 is also uncertain: some reported recently that the chip had been cancelled, but officials decline to confirm or deny the information."
No doubt they are blowing smoke up ATI's collective arse, but it's interesting reading nonetheless. I wonder if the Register and other reports are simply misconstruing the two chips, i.e., there is no NV50 yet, only a 47, or an early revision of one of the chips was scrapped (sort of like what happened with early R500 development, or what NVIDIA initially did with the NV40).
#24
    CPUs generally only have one pipeline, mainly because they generally operate in a serial manner. The stuff a GPU operates on is highly parallelised, so going with multiple pipelines makes sense.
        Rubbish.
    Now, there's a convincing argument...
        It's not an argument. Anyone can find out for themselves.
If you want to go on arguing, you'll have to pay for another five minutes. And this isn't the complaints department, it's 'Getting Hit on the Head Lessons' in here. Waa
#25
I'm happy with my 12 pipes.
#26
Yeah, and my Wesley Pipes!!!
    I'm happy with my 12 pipes.
#27
    The 486 was the last CPU to have one pipeline... The Athlon 64 has 9, the Pentium 4 has 7. Still, the more pipelines you have, the harder it is to push up the MHz. And if GPUs deal with more parallel data than CPUs...

The A64 has two pipelines, one for integer and one for floating point (cite: http://sandpile.org/impl/k8.htm). What you are referring to is pipeline stages, which is not the same thing as a pipeline.

    What is the difference?

Yadda yadda yadda... it doesn't MATTER. That's a red herring. The real difference is that a CPU executes a linear sequence of instructions. Yes, it is possible to re-order instructions and execute them out-of-order when there are no dependencies on other instructions and so on, but that is just grasping at straws.

Now look at a GPU: the instructions that are executed are the SAME for each pixel in a primitive that is being scan-converted. If there is a triangle with 100 pixels, every single one of those 100 pixels executes precisely the same shader instructions. The data that comes from the samplers varies, but the shader is the same. This means it is feasible to throw N shaders at the problem and get the job done N times faster (theoretically). In practice, getting values from samplers is I/O bound: there is a specific amount of memory bandwidth the system can sustain, and beyond that the system is bandwidth limited. To remedy this to a degree, the memory subsystem has been divided into multiple stages, where the stage closer to the shader unit is faster but smaller, and the slowest memory is in the DDR3 (for example) modules, which is the cheapest kind of memory, so there is the most of that type. Just an oversimplification of a typical contemporary GPU, but it should do the trick.

Now, this is a bit different from a pipelined CPU architecture, because the term is simply abused by the almost-know-what-they're-talking-about people. Generally, the people who are clued-in talk about shader arrays (the terminology depends on the corporate culture you come from) or similar. The GPU IMPLEMENTATION can be pipelined or not; more likely it is pipelined than not, because non-pipelined architectures are ****ing slow and inefficient. This applies to a single shader, not an array; an array of shaders is not normally referred to as "pipelined" when the array contains more than a single shader: that is a completely different issue.

And to go back to things that annoy in the CPU discussion above...

    The 486 was the last CPU to have one pipeline... The Athlon 64 has 9, the Pentium 4 has 7. Still, the more pipelines you have, the harder it is to push up the MHz. And if GPUs deal with more parallel data than CPUs...

The 486DX has the 487 (the FPU) built in; by the above definition that would count as a 'pipeline', but the correct terminology would be 'execution unit'. The real meat of this is that the 386 wasn't pipelined and the 486 was: it was the processor architecture from *Intel* that introduced pipelining to the mainstream x86 product line. It was the Pentium which introduced a multi-execution-unit ALU core to the x86 product line; those were literally called the U and V pipes.

The next core design was something completely different: it decoded the x86 instruction stream into micro-ops, which were executed on a number of (3, if I remember correctly!) execution units. Two of these were simpler and executed only the simplest instructions, and one executed more complex instructions such as division, multiplication and the like. This was the Pentium Pro architecture, which was used in the Pentium II and Pentium III as well, with the difference that MMX and SSE were added on the subsequent processors. The reason I am telling this is that the PPro architecture wasn't really 'multi-pipe' in the traditional sense; it was multiple execution units and out-of-order execution of a single instruction stream at the micro-op level.

The next design, the NetBurst architecture, went a step further: the decoded instruction streams were stored in a so-called trace cache, again with multiple execution units, and the pipeline length was more than doubled compared to the previous generations (I won't insult anyone by explaining what pipelining means in practice; the reader is assumed to be familiar with microprocessor design basics, for crying out loud). The pipeline was broken into more, simpler stages to reach higher operating frequencies: simpler stages complete in less time, so the frequency can be increased while still having a design that works reliably and predictably. This seems to have been a market-driven decision rather than a purely engineering one, but that can only be speculated about, so everyone can draw their own conclusions; that is just mine and not necessarily the Truth.

Anyway, the point I am driving at is that 'pipelining' in a CPU -or- a GPU is an implementation detail and not relevant to shader arrays per se. Merry Xmas!
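The "same shader, different data" point above can be sketched in a few lines. This is a toy model only, not any real GPU API; `sample` and `shade_pixel` are invented names standing in for a texture fetch and a pixel shader:

```python
# Toy illustration (no real GPU API): every pixel of a primitive runs
# the SAME shader program; only the per-pixel data varies.

def sample(x, y):
    """Stand-in for a texture fetch: the per-pixel data that varies."""
    return (x * 7 + y * 13) & 0xFF

def shade_pixel(x, y):
    """Identical instructions for every pixel of the primitive."""
    return sample(x, y) * 2 + 1

# A 10x10 "triangle" of 100 pixels: one shader program, 100 independent
# invocations. N shader units could each take 100/N pixels and finish
# roughly N times faster, until the sample() fetches saturate memory
# bandwidth, at which point the array is bandwidth limited.
pixels = [(x, y) for y in range(10) for x in range(10)]
results = [shade_pixel(x, y) for x, y in pixels]
print(sum(results))  # prints 18100
```

Because no invocation depends on another, splitting the pixel list across units needs no coordination at all, which is exactly why widening the shader array scales so well compared with adding execution units to a serial CPU.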
#28
Modern CPUs generally have more than one integer pipeline so they can decode/fetch both branches of a conditional statement in advance of the condition being evaluated. These integer pipelines are in addition to the FPU pipeline(s).

    CPUs generally only have one pipeline, mainly because they generally operate in a serial manner. The stuff a GPU operates on is highly parallelised, so going with multiple pipelines makes sense. - Factory

-Bill (remove "botizer" to reply via email)
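The reason branches get so much hardware attention in these pipeline arguments can be sketched with a back-of-the-envelope model. The numbers and the `avg_cpi` function below are invented for illustration and describe no particular CPU; the only assumption is that a mispredicted branch flushes the pipeline, costing roughly one cycle per stage:

```python
# Toy model (invented numbers, no real CPU): deeper pipelines allow
# higher clocks but pay a bigger penalty per mispredicted branch,
# because the whole pipeline must be flushed and refilled.

def avg_cpi(depth, mispredict_rate, branch_fraction=0.2):
    """Average cycles per instruction under a flush-on-mispredict model."""
    penalty = depth  # flushed stages refill at one per cycle
    return 1.0 + branch_fraction * mispredict_rate * penalty

print(avg_cpi(depth=10, mispredict_rate=0.05))  # short pipe: ~1.1 CPI
print(avg_cpi(depth=31, mispredict_rate=0.05))  # deep pipe:  ~1.31 CPI
```

The model makes the trade-off concrete: tripling the depth roughly triples the branch penalty, so a deep design only wins if the clock gain outruns the extra stalls.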
#29
assaarpa wrote:
    [snip: full quote of the post above]
What'd he say?
dvus
#30
"dvus" wrote in message...
    assaarpa wrote:
        [snip: full quote of the post above]
    What'd he say?
I dunno, Beavis, but one of them has "ass" in his name. HUHHHUHUHUHUH.