2005-05-21 02:05:21 UTC
http://www.majornelson.com/2005/05/20/xbox-360-vs-ps3-part-1-of-4/
Xbox 360 vs. PS3
Posted: May 20, 2005 @ 10:37 am (8 hours, 5 minutes ago) By: Major Nelson
One of the great things about working at Xbox is that we have some of the
smartest people in the world working on the Xbox 360. When Sony announced
the PS3, along with the product specs, some of our team started looking at
the numbers to see what they mean. Floating point, shaders, bandwidth...
what does it all mean? Clearly there are some numbers and stats that mean
more to gaming than others, so the team cranked out some facts for everyone
to absorb. Our world class technology team looked at the numbers and claims
and decided to do what everyone else does: compare them to the PS3. The
difference is that these guys are uniquely qualified to do so, and can cut
through the smoke and mirrors to see what the real deal is. To that end, I
present this summary, which I have broken up into four parts to make it
more RSS Reader friendly.
Warning: Some of this stuff may make your head hurt, but these are the facts
as they stand right now. Enjoy the read:
XBOX 360 / PLAYSTATION 3 PERFORMANCE COMPARISON
SUMMARY
Now that the Xbox 360 and PlayStation 3 specifications have been announced,
it is possible to do a real world performance comparison of the two systems.
There are three critical performance aspects of a console:
. Central Processing Unit (CPU) performance.
o The Xbox 360 CPU architecture has three times the general purpose
processing power of the Cell.
. Graphics Processing Unit (GPU) performance
o The Xbox 360 GPU design is more flexible and it has more processing power
than the PS3 GPU.
. Memory System Bandwidth
o The memory system bandwidth in Xbox 360 is more than five times the PS3's.
The Xbox 360's CPU has more general purpose processing power because it has
three general purpose cores, and Cell has just one.
Cell's claimed advantage is in streaming floating-point work, which is done
on its seven DSP processors.
The Xbox 360 GPU has more processing power than the PS3's. In addition, its
innovative features contribute to overall rendering performance.
Xbox 360 has 278.4 GB/s of memory system bandwidth. The PS3 has 48 GB/s of
total memory system bandwidth, less than one-fifth of Xbox 360's.
CONCLUSION
When you break down the numbers, Xbox 360 has provably more performance than
PS3. Keep in mind that Sony has a track record of overpromising and
underdelivering on technical performance. The truth is that both systems
pack a lot of power for high definition games and entertainment.
However, hardware performance, while important, is only a third of the
puzzle. Xbox 360 is a fusion of hardware, software, and services. Without
the software and services to power it, even the most powerful hardware
becomes inconsequential. Xbox 360 games, by leveraging cutting-edge
hardware, software, and services, will outperform the PlayStation 3.
DETAILED ANALYSIS OF PERFORMANCE SPECIFICATIONS
CPU
The Xbox 360 processor was designed to give game developers the power that
they actually need, in an easy to use form. The Cell processor has
impressive streaming floating-point power that is of limited use for games.
The majority of game code is a mixture of integer, floating-point, and
vector math, with lots of branches and random memory accesses. This code is
best handled by a general purpose CPU with a cache, branch predictor, and
vector unit.
The Cell's seven DSPs (what Sony calls SPEs) have no cache, no direct access
to memory, no branch predictor, and a different instruction set from the
PS3's main CPU. They are not designed for or efficient at general purpose
computing. DSPs are not appropriate for game programming.
Xbox 360 has three general purpose CPU cores. The Cell processor has only
one.
Each of the Xbox 360's CPU cores has its own vector processing power. Each
core has 128 vector registers per hardware thread and a dot product
instruction, and the cores share a 1-MB L2 cache. The Cell processor's
vector processing power is mostly on the seven DSPs.
Dot products are critical to games because they are used in 3D math to
calculate vector lengths, projections, transformations, and more. The Xbox
360 CPU has a dot product instruction, whereas other CPUs such as Cell must
emulate a dot product using multiple instructions.
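To make the "multiple instructions" point concrete, here is a minimal Python
sketch of a 4-component dot product written out as separate multiply and add
steps, plus two of the uses named above. It illustrates the arithmetic only;
the function names are made up for this example, and it does not model the
actual instruction sequence on either CPU.

```python
import math

def dot4(a, b):
    """Dot product of two 4-component vectors, written as separate steps."""
    # Step 1: one element-wise multiply.
    products = [x * y for x, y in zip(a, b)]
    # Steps 2 and 3: pairwise adds that reduce four products to one scalar.
    partial = (products[0] + products[1], products[2] + products[3])
    return partial[0] + partial[1]

def length(v):
    """Vector length: square root of the vector's dot product with itself."""
    return math.sqrt(dot4(v, v))

def project(a, b):
    """Scalar projection of a onto b."""
    return dot4(a, b) / length(b)

print(dot4((1, 2, 3, 4), (5, 6, 7, 8)))
print(length((3, 4, 0, 0)))
print(project((2, 3, 0, 0), (1, 0, 0, 0)))
```

A single dot product instruction would replace the three steps inside dot4
with one operation.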
Cell's streaming floating-point work is done on its seven DSP processors.
Since geometry processing is moved to the GPU, the need for streaming
floating-point work and other DSP style programming in games has dropped
dramatically.
Just like with the PS2's Emotion Engine, with its missing L2 cache, the Cell
is designed for a type of game programming that accounts for a minor
percentage of processing time.
Sony's CPU is ideal for an environment where 12.5% of the work is
general-purpose computing and 87.5% of the work is DSP calculations. That
sort of mix makes sense for video playback or networked waveform analysis,
but not for games. In fact, when analyzing real games one finds almost the
opposite distribution of general purpose computing and DSP calculation
requirements. A relatively small percentage of instructions are actually
floating point. Of those instructions which are floating-point, very few
involve processing continuous streams of numbers. Instead they are used in
tasks like AI and path-finding, which require random access to memory and
frequent branches, which the DSPs are ill-suited to.
Based on measurements of running next generation games, only ~10-30% of the
instructions executed are floating point. The remainder of the instructions
are load, store, integer, branch, etc. Even fewer of the instructions
executed are streaming floating point, probably ~5-10%. Cell is optimized
for streaming floating-point, with 87.5% of its cores good for streaming
floating-point and nothing else.
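The 12.5% and 87.5% figures follow from Cell's core count as described
above: one general purpose core and seven DSPs. A minimal Python sketch of
that arithmetic, set next to the upper end of the streaming floating-point
share quoted in the text (variable names are illustrative only):

```python
# Core counts as described in the text.
GENERAL_PURPOSE_CORES = 1
DSP_CORES = 7  # what Sony calls SPEs

total_cores = GENERAL_PURPOSE_CORES + DSP_CORES
general_share = 100.0 * GENERAL_PURPOSE_CORES / total_cores
dsp_share = 100.0 * DSP_CORES / total_cores

# Upper end of the streaming floating-point range quoted in the text (~5-10%).
STREAMING_FP_SHARE_MAX = 10.0

print(f"General purpose cores: {general_share}% of Cell's cores")
print(f"DSP cores: {dsp_share}% of Cell's cores")
print(f"Streaming floating-point share in games: at most ~{STREAMING_FP_SHARE_MAX}%")
```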
http://majornelson.com/wp/images/moreimages/index.7.gif
http://majornelson.com/wp/images/moreimages/index.8.gif
GPU
Even ignoring the bandwidth limitations, the PS3's GPU is not as powerful as
the Xbox 360's GPU.
Below are the specs from Sony's press release regarding the PS3's GPU.
RSX GPU
. 550 MHz
. Independent vertex/pixel shaders
. 51 billion dot products per second (total system performance)
. 300M transistors
. 136 "shader operations" per clock
The interesting ALU performance numbers are 51 billion dot products per
second (total system performance), 300M transistors, and more than twice as
powerful as the 6800 Ultra.
The 51 billion dot products per second were listed on a summary slide of
total graphics system performance and are assumed to include the Cell
processor. Sony's calculations seem to assume that the Cell can do a dot
product per cycle per DSP, despite not having a dot product instruction.
However, using Sony's claim, 7 dot products per cycle * 3.2 GHz = 22.4
billion dot products per second for the CPU. That leaves 51 - 22.4 = 28.6
billion dot products per second for the GPU, or 28.6 billion dot products
per second / 550 MHz = 52 GPU ALU ops per clock.
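The dot product budget above can be reproduced in a few lines of Python.
This is only a sketch of the arithmetic in the text; the constant names are
made up here, and the one-dot-product-per-cycle-per-DSP rate is the
assumption attributed to Sony above, not a measured figure.

```python
# Figures quoted in the text.
TOTAL_DOT_PRODUCTS_PER_S = 51e9  # Sony's "total system performance" number
CELL_DSPS = 7
CELL_CLOCK_HZ = 3.2e9
RSX_CLOCK_HZ = 550e6

# Assumption from the text: one dot product per cycle per DSP.
DOT_PRODUCTS_PER_CYCLE_PER_DSP = 1

cpu_dot_products = CELL_DSPS * DOT_PRODUCTS_PER_CYCLE_PER_DSP * CELL_CLOCK_HZ
gpu_dot_products = TOTAL_DOT_PRODUCTS_PER_S - cpu_dot_products
gpu_ops_per_clock = gpu_dot_products / RSX_CLOCK_HZ

print(f"CPU: {cpu_dot_products / 1e9:.1f} billion dot products per second")
print(f"GPU: {gpu_dot_products / 1e9:.1f} billion dot products per second")
print(f"GPU ALU ops per clock: {gpu_ops_per_clock:.0f}")
```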
It is important to note that if the RSX ALUs are similar to the GeForce 6800
ALUs then they work on vector4s, while the Xbox 360 GPU ALUs work on
vector5s. The total programmable GPU floating point performance for the PS3
would be 52 ALU ops * 4 floats per op * 2 (madd) * 550 MHz = 228.8 GFLOPS,
which is less than the Xbox 360's 48 ALU ops * 5 floats per op * 2 (madd) *
500 MHz = 240 GFLOPS.
With the number of transistors being slightly larger on the Xbox 360 GPU
(330M) it's not surprising that the total programmable GFLOPS number is very
close.
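The two GFLOPS figures use the same formula with different inputs. A short
Python sketch, using a helper name invented for this example and the ALU
counts estimated in the text:

```python
def programmable_gflops(alu_ops_per_clock, floats_per_op, clock_mhz):
    """Peak GFLOPS, counting a multiply-add (madd) as two floating-point ops."""
    flops_per_clock = alu_ops_per_clock * floats_per_op * 2
    return flops_per_clock * clock_mhz / 1000.0

# PS3: 52 ALU ops per clock is the estimate derived in the text, not a Sony spec.
ps3_gflops = programmable_gflops(alu_ops_per_clock=52, floats_per_op=4, clock_mhz=550)
xbox_gflops = programmable_gflops(alu_ops_per_clock=48, floats_per_op=5, clock_mhz=500)

print(f"PS3 GPU:      {ps3_gflops:.1f} GFLOPS")
print(f"Xbox 360 GPU: {xbox_gflops:.1f} GFLOPS")
print(f"Difference:   {xbox_gflops - ps3_gflops:.1f} GFLOPS")
```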
The PS3 does have the additional 7 DSPs on the Cell to add more floating
point ops for graphics rendering, but the Xbox 360's three general purpose
cores with custom D3D and dot product instructions are more customized for
true graphics related calculations.
The 6800 Ultra has 16 pixel pipes, 6 vertex pipes, and runs at 400 MHz.
Given the claim that the RSX is more than twice as powerful as a 6800 Ultra
and the higher frequency of the RSX, one can roughly estimate that it will
have 24 pixel shading pipes and 4 vertex shading pipes (fewer vertex shading
pipes since the Cell DSPs will do some vertex shading). If the PS3 GPU keeps
the 6800 pixel shader pipe co-issue architecture, which is hinted at in
Sony's press release, this again gives it 24 pixel pipes * 2 issued per
pipe + 4 vertex pipes = 52 dot products per clock in the GPU.
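One way to sanity-check the pipe estimate is to compare pixel pipes times
clock for the two chips, then add up the issue slots. The Python sketch
below does that. Treating "pipes * clock" as the measure of "twice as
powerful" is an assumption made for this example; the text does not say how
the 24-pipe figure was reached.

```python
# 6800 Ultra figures from the text.
ULTRA_PIXEL_PIPES = 16
ULTRA_CLOCK_MHZ = 400

# RSX figures: the clock is from Sony's specs, the pipe counts are estimates.
RSX_PIXEL_PIPES = 24
RSX_VERTEX_PIPES = 4
RSX_CLOCK_MHZ = 550
ISSUES_PER_PIXEL_PIPE = 2  # assumed 6800-style co-issue

# Assumed yardstick for "twice as powerful": pixel pipes * clock.
pixel_throughput_ratio = (RSX_PIXEL_PIPES * RSX_CLOCK_MHZ) / (
    ULTRA_PIXEL_PIPES * ULTRA_CLOCK_MHZ
)
dot_products_per_clock = RSX_PIXEL_PIPES * ISSUES_PER_PIXEL_PIPE + RSX_VERTEX_PIPES

print(f"RSX / 6800 Ultra pixel throughput: {pixel_throughput_ratio:.2f}x")
print(f"Estimated GPU dot products per clock: {dot_products_per_clock}")
```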
If the RSX follows the 6800 Ultra route, it will have 24 texture samplers,
but when in use they take up an ALU slot, making the PS3 GPU in practice
even less impressive. Even if it does manage to decouple texture fetching
from ALU co-issue, it won't have enough bandwidth to fetch the textures
anyway.
For shader operations per clock, Sony is most likely counting four ALU
operations (co-issued vector+scalar) and a texture operation per pixel pipe,
plus 4 scalar operations per vertex pipe, for a total of 24 * (4 + 1) +
(4 * 4) = 136 operations per cycle, or 136 * 550 MHz = 74.8 GOps per
second.
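The 136 figure can be rebuilt from the estimated pipe counts. A minimal
Python sketch, where the per-pipe operation counts are the guesses made in
the text about how Sony is counting, not published numbers:

```python
# Estimated pipe counts from the text.
PIXEL_PIPES = 24
VERTEX_PIPES = 4
RSX_CLOCK_MHZ = 550

# Guessed counting scheme from the text.
ALU_OPS_PER_PIXEL_PIPE = 4      # co-issued vector + scalar
TEXTURE_OPS_PER_PIXEL_PIPE = 1
SCALAR_OPS_PER_VERTEX_PIPE = 4

pixel_ops = PIXEL_PIPES * (ALU_OPS_PER_PIXEL_PIPE + TEXTURE_OPS_PER_PIXEL_PIPE)
vertex_ops = VERTEX_PIPES * SCALAR_OPS_PER_VERTEX_PIPE
rsx_ops_per_clock = pixel_ops + vertex_ops
rsx_gops_per_second = rsx_ops_per_clock * RSX_CLOCK_MHZ / 1000.0

print(f"Shader operations per clock: {rsx_ops_per_clock}")
print(f"GOps per second: {rsx_gops_per_second:.1f}")
```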
http://majornelson.com/wp/images/moreimages/index.3.gif
Given the Xbox 360 GPU's multithreading and balanced design, you really
can't compare the two systems in terms of shading operations per clock.
However, the Xbox 360's GPU can do 48 ALU operations (each can do a vector4
and scalar op per clock), 16 texture fetches, 32 control flow operations,
and 16 programmable vertex fetch operations with tessellation per clock, for
a total of 48 * 2 + 16 + 32 + 16 = 160 operations per cycle, or 160 * 500
MHz = 80 GOps per second.
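The same tally for the Xbox 360 GPU, as a short Python sketch using only the
per-clock counts listed in the paragraph above (the constant names are
illustrative):

```python
# Per-clock operation counts from the text.
ALU_UNITS = 48
OPS_PER_ALU = 2            # one vector4 op plus one scalar op
TEXTURE_FETCHES = 16
CONTROL_FLOW_OPS = 32
VERTEX_FETCH_OPS = 16
XBOX_GPU_CLOCK_MHZ = 500

xbox_ops_per_clock = (
    ALU_UNITS * OPS_PER_ALU + TEXTURE_FETCHES + CONTROL_FLOW_OPS + VERTEX_FETCH_OPS
)
xbox_gops_per_second = xbox_ops_per_clock * XBOX_GPU_CLOCK_MHZ / 1000.0

print(f"Operations per clock: {xbox_ops_per_clock}")
print(f"GOps per second: {xbox_gops_per_second:.1f}")
```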
Overall, the automatic shader load balancing, memory export features,
programmable vertex fetching, programmable triangle tessellator, full rate
texture fetching in the vertex shader, and other "well beyond shader model
3.0" features of the Xbox 360 GPU should also contribute to overall
rendering performance.
Bandwidth
The PS3 has 22.4 GB/s of GDDR3 bandwidth and 25.6 GB/s of RDRAM bandwidth
for a total system bandwidth of 48 GB/s.
The Xbox 360 has 22.4 GB/s of GDDR3 bandwidth and 256 GB/s of EDRAM
bandwidth for a total system bandwidth of 278.4 GB/s.
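The totals, and the "more than five times" ratio from the summary, can be
checked with a few lines of Python. The dictionary layout is just a
convenience for this sketch; the numbers are the ones quoted above.

```python
# Bandwidth figures in GB/s, as quoted in the text.
ps3_bandwidth = {"GDDR3": 22.4, "RDRAM": 25.6}
xbox_bandwidth = {"GDDR3": 22.4, "EDRAM": 256.0}

ps3_total = sum(ps3_bandwidth.values())
xbox_total = sum(xbox_bandwidth.values())
bandwidth_ratio = xbox_total / ps3_total

print(f"PS3 total:      {ps3_total:.1f} GB/s")
print(f"Xbox 360 total: {xbox_total:.1f} GB/s")
print(f"Ratio:          {bandwidth_ratio:.1f}x")
```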
http://majornelson.com/wp/images/moreimages/index.4.gif
Why does the Xbox 360 have such an extreme amount of bandwidth? Even the
simplest calculations show that a large amount of bandwidth is consumed by
the frame buffer. For example, with simple color rendering and Z testing at
550 MHz the frame buffer alone requires 52.8 GB/s at 8 pixels per clock. The
PS3's memory bandwidth is insufficient to maintain its GPU's peak rendering
speed, even without texture and vertex fetches.
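The 52.8 GB/s figure is consistent with 12 bytes of frame buffer traffic per
pixel. The Python sketch below assumes that breakdown is a 4-byte color
write, a 4-byte Z read, and a 4-byte Z write; the text gives only the total,
so the per-pixel split is an assumption made for this example.

```python
GPU_CLOCK_HZ = 550e6
PIXELS_PER_CLOCK = 8

# Assumed per-pixel traffic for simple color rendering with Z testing.
COLOR_WRITE_BYTES = 4
Z_READ_BYTES = 4
Z_WRITE_BYTES = 4
bytes_per_pixel = COLOR_WRITE_BYTES + Z_READ_BYTES + Z_WRITE_BYTES

frame_buffer_gb_per_s = GPU_CLOCK_HZ * PIXELS_PER_CLOCK * bytes_per_pixel / 1e9

# PS3 total system bandwidth quoted in the text.
PS3_TOTAL_GB_PER_S = 48.0
shortfall_gb_per_s = frame_buffer_gb_per_s - PS3_TOTAL_GB_PER_S

print(f"Frame buffer traffic at peak rate: {frame_buffer_gb_per_s:.1f} GB/s")
print(f"Shortfall against PS3 total: {shortfall_gb_per_s:.1f} GB/s")
```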
The PS3 uses Z and color compression to try to compensate for the lack of
memory bandwidth. The problem with Z and color compression is that the
compression breaks down quickly when rendering complex next-generation 3D
scenes.
HDR, alpha-blending, and anti-aliasing require even more memory bandwidth.
This is why Xbox 360 has 256 GB/s bandwidth reserved just for the frame
buffer. This allows the Xbox 360 GPU to do Z testing, HDR, and alpha blended
color rendering with 4X MSAA at full rate and still have the entire main bus
bandwidth of 22.4 GB/s left over for textures and vertices.
CONCLUSION
When you break down the numbers, Xbox 360 has provably more performance than
PS3. Keep in mind that Sony has a track record of overpromising and
underdelivering on technical performance. The truth is that both systems
pack a lot of power for high definition games and entertainment.
However, hardware performance, while important, is only a third of the
puzzle. Xbox 360 is a fusion of hardware, software, and services. Without
the software and services to power it, even the most powerful hardware
becomes inconsequential. Xbox 360 games, by leveraging cutting-edge
hardware, software, and services, will outperform the PlayStation 3.