Analysis of RDNA 3 vs RDNA 2 by a blogger:
https://hole-in-my-head.blogspot.com/2023/11/the-performance-uplift-of-rdna-3-over.html?m=1
Straight away, the improvement to the FP32 pipeline is obvious across almost all instruction types - though some gain more than others. What is interesting to me is the INT16 improvement in RDNA 3, which I have not seen mentioned anywhere. An additional curiosity is the lack of gains in FP64 (not that it's really that useful for gaming?) given that I've seen it said that the dual-issue FP32 pipeline can execute an FP64 instruction as long as the driver is able to identify the opportunity. So, maybe this is purely down to the way this program is written.
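To put the dual-issue point in perspective, here is a back-of-the-envelope sketch of theoretical FP32 throughput for the two cards I'm comparing. The CU counts and boost clocks are approximate public figures, and the simple peak-FLOPS formula is my own illustration, not a measurement:

```python
# Rough theoretical FP32 throughput, to show why measured ~20 % gains fall
# well short of the dual-issue ceiling. CU counts and clocks are approximate
# public figures (assumptions for illustration).

def fp32_tflops(cus, clock_ghz, lanes_per_cu=64, flops_per_lane=2, issue_width=1):
    """Peak FP32 TFLOPS = CUs * lanes * FMA (2 flops) * issue width * clock."""
    return cus * lanes_per_cu * flops_per_lane * issue_width * clock_ghz / 1000

rx_6800    = fp32_tflops(60, 2.1)                 # RDNA 2: single-issue FP32
rx_7800_xt = fp32_tflops(60, 2.4, issue_width=2)  # RDNA 3: dual-issue FP32

print(f"RX 6800:    {rx_6800:.1f} TFLOPS")          # ~16.1
print(f"RX 7800 XT: {rx_7800_xt:.1f} TFLOPS")       # ~36.9
print(f"Theoretical uplift: {rx_7800_xt / rx_6800:.2f}x")  # ~2.29x
```

On paper, dual-issue more than doubles the ceiling; the ~10-20 % uplifts seen in games are consistent with the compiler/driver only finding dual-issue opportunities some of the time.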
Cyberpunk, especially, seems to enjoy running on the RDNA 3 architecture, with around a 20 % performance increase over RDNA 2 - something not seen before when testing the RX 7600.
Starfield sees a similar 20 % increase*, likely due to some of the RDNA 2 bottlenecks outlined by Chips and Cheese.
*This was performed before the current beta update...
Alan Wake 2 also shows a good 15 % increase between the two architectures.
Finally, Metro Exodus sees an improvement of above 10 %, with the gap growing as the game becomes harder to run at higher settings. This potentially indicates that, under heavier workloads, the gap between the two architectures widens when they are given the same resources.
Speaking of operating frequencies, we see an interesting behaviour in the RDNA 3 part - at lower core (shader) clocks, the front-end clock is essentially equal to it. As the shader frequency increases, the front-end frequency moves further ahead, so that by the time the shader clock is around 2050 MHz, the front-end is at ~2300 MHz. Additionally, though I've not shown it below, at stock the front-end reaches ~2800 MHz when the shader clock is ~2400 MHz.
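The decoupled-clock behaviour described above can be sketched as a simple piecewise-linear model. The two upper points come from the measurements mentioned; the lower point, where the two clocks are still equal, and the linear ramp between points are my assumptions purely for illustration:

```python
# Toy model of RDNA 3's decoupled front-end clock. (2050, 2300) and
# (2400, 2800) are the reported operating points; (1800, 1800) - where the
# clocks are still equal - is an assumed starting point for illustration.
POINTS = [(1800, 1800), (2050, 2300), (2400, 2800)]

def front_end_mhz(shader_mhz):
    """Piecewise-linear interpolation through POINTS, clamped at both ends."""
    if shader_mhz <= POINTS[0][0]:
        return float(shader_mhz)  # below the ramp, the clocks track 1:1
    for (x0, y0), (x1, y1) in zip(POINTS, POINTS[1:]):
        if shader_mhz <= x1:
            return y0 + (y1 - y0) * (shader_mhz - x0) / (x1 - x0)
    return float(POINTS[-1][1])  # clamp at the top reported point

print(front_end_mhz(2050))  # 2300.0
print(front_end_mhz(2400))  # 2800.0
```

Note the ramp is steeper than 1:1 (roughly 1.4 MHz of front-end clock per MHz of shader clock between the reported points), which matches the front-end "moving further ahead" as load rises.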
This seems like a power-saving feature to my eyes - there's no benefit to raising the front-end clock when the workload is light or non-existent!
The core clocks for the RX 6800 and the core vs front-end frequencies for the RX 7800 XT in Metro Exodus...
What's interesting here is that Chips and Cheese documented that the N31-based cards also use this same trick, whereas the N33-based RX 7600 actually clocked the front-end consistently lower than the shader clock, whilst also having lower latency than the N22 (RDNA 2) cards it succeeded - implying that there's some architectural improvement in how the caches are linked.
Conclusion...
In this very empirical overview, it is clear that, ignoring the increase in core (shader) frequencies, the RX 7800 XT has an architectural performance advantage over the RX 6800. This also extends to the full N31 product (the 7900 XTX) as well. However, AMD's choice to reduce the L3 cache sizes for Navi 31 and Navi 32 appears to significantly hinder their overall performance. Additionally, the choice to move that L3 cache onto chiplets has resulted in a significant increase in energy use, and an over-dependence on the bandwidth to those chiplets. It also appears to be the case that there is an overhead to fully utilising the L3 cache, with performance dropping even before that limit is reached.

I didn't mention it in this blogpost, but the RX 7600 also doesn't have N31 and N32's increased vector register file size (192 KB vs 128 KB). However, since I don't have a way to measure the effect of this on performance, I have decided to gloss over it - especially since Chips and Cheese do not appear to be overly concerned about it affecting N33's performance, given its lower CU count and on-die L3 cache.

What does appear to affect performance negatively is the choice not to clock the front-end higher on N33, and this is likely the source of a good amount of the performance uplift observed between RDNA 2 and RDNA 3.
So, where does this leave us?
From my point of view, it appears that AMD made some smart choices in RDNA 3's architectural design which are then heavily negated by the inflexibility of the chiplet design and the need to bin/segregate dies in order to build a full product stack. Moving to chiplets has also had the knock-on effect of increasing power draw (and likely heat), which pushes each design away from its ideal operating frequencies and has hindered the cards' performance. Just looking back at Metro Exodus: raising the RX 7800 XT's power limit by 15 % over stock increases performance by 4 % (though this is only 3 fps!), showing that the card is still power-limited as released and may see a bigger benefit from reducing operating voltage than RDNA 2 cards did.

Additionally, the RX 7600 appears hamstrung by the lack of an increased front-end clock - perhaps due to power considerations? - and it is the choice to decouple the front-end and shader clocks that seems to me to be the biggest contributor to RDNA 3's architectural uplift, as it is this aspect which appears to allow the other architectural improvements to the low-level caches and FP32 throughput to really shine.
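The power-limit observation above is worth a quick bit of arithmetic: if +15 % power buys +4 % performance, and that 4 % equals 3 fps, the implied stock frame rate and the efficiency cost follow directly (the calculation is mine, using only the figures quoted above):

```python
# Quick arithmetic on the Metro Exodus observation: +15 % power limit
# yields +4 % performance (3 fps). Figures taken from the text above.

baseline_fps = 3 / 0.04            # 4 % gain == 3 fps -> implied stock fps
perf_per_watt_ratio = 1.04 / 1.15  # relative efficiency at the raised limit

print(f"Implied stock frame rate: {baseline_fps:.0f} fps")        # 75 fps
print(f"Perf/W at +15 % power: {perf_per_watt_ratio:.2f}x stock") # 0.90x
```

A ~10 % drop in performance-per-watt for a 4 % gain is the classic signature of a part already operating past its efficiency sweet spot, which supports the point about undervolting headroom.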