4GB on GK104 (GTX 670/GTX 680) Is Useless

fomoz

Limp Gawd
Joined
Jan 5, 2012
Messages
394
Hello everyone,

In case some of you missed my post in the GTX 670 4GB 4-Way SLI thread, I have shown that the 256-bit bus does not have enough bandwidth for 4GB on the GK104 cards. This conclusively proves that there is absolutely no reason to go with 4GB cards for GK104, unless you want to spend $2,000 on video cards to play games at 40 FPS or less.

If you're running 7680x1440 or 7680x1600 and need more than a 2GB frame buffer, you should wait for 4GB cards with a 512-bit bus to come out. Right now, it looks like 4GB on GK104 is a scam. I hope that I'm wrong about this.
 
Go bench Metro 2033 and get back to us. Might shed some light on the situation :)

It was the same deal with the 3gb gtx580's. Showed no difference in anything EXCEPT Metro 2033.
AKA: it makes a difference in game(s) that are developed for extreme graphics usage... lol metro 2033.
 
Go bench Metro 2033 and get back to us. Might shed some light on the situation :)

It was the same deal with the 3gb gtx580's. Showed no difference in anything EXCEPT Metro 2033.
AKA: it makes a difference in game(s) that are developed for extreme graphics usage... lol metro 2033.
Will do this tonight, thank you for the suggestion! :) Also, if anyone has any more benchmarks that you would like me to do, please post them here! It looks like I'm one of the only people on [H] who has this sort of setup and I am very interested in investigating this issue.

With 1440p IPS screen prices dropping, soon a lot more people will be running these kinds of setups. It would be good to have some benchmarks to know where we are in terms of hardware right now and I would like to help!

BTW, is Metro 2033 more demanding than Heaven 3.0?
 
This is bunk. You're making this conclusion with a benchmark? Seriously bro, go play some games.
 
This is bunk. You're making this conclusion with a benchmark? Seriously bro, go play some games.

I did. Same story in Crysis 2, Dragon Age II, and Heroes VI. I wouldn't be posting this otherwise. I just posted the benchmark to demonstrate my findings in a more conclusive way. There's nobody else here who wishes that this setup worked well more than me.

The only game that works well is Dirt 3, but it uses a lot less than 2GB VRAM at 8064x1440 maxed out.
 
I did. Same story in Crysis 2, Dragon Age II, and Heroes VI. I wouldn't be posting this otherwise. I just posted the benchmark to demonstrate my findings in a more conclusive way. There's nobody else here who wishes that this setup worked well more than me.

The only game that works well is Dirt 3, but it uses a lot less than 2GB VRAM at 8064x1440 maxed out.

Battlefield 3? SP & MP? Skyrim?

Crysis 2 is the only stressful game you tested, and we all know how well they code.
 
Skyrim with a ton of HD mods. Possibly even GTA IV with the ICEenhancer things.

As I understand it the main benefit in larger vram is being able to play smoothly with higher res textures/more detail provided by third parties. If you're into modded games, there's a benefit.

I haven't done much of this since FO3 so I'm not sure if it holds true as well now.

There used to be limits with the Fakefactory HL2 mods too... more vram helped.
 
Battlefield 3? SP & MP? Skyrim?

Crysis 2 is the only stressful game you tested, and we all know how well they code.
I used to think this way, but the reality is that I get around 32 FPS in Crysis 2, 42 FPS in Dragon Age II, and 29 FPS in Heroes VI (no improvement at all compared to 2-way SLI in the last one, maybe it needs an SLI profile).

How do I jump to a map in BF3 SP? Is there a map in particular that you would like me to test? I haven't played through the campaign yet, so I'm only on the train right now and it doesn't look like it's very GPU-intensive.

Same thing for Skyrim, please tell me if I should install any mods and which map to load for the test.
 
How about testing one display 2560x1440 with one or two cards only? Same result? With max CPU OC of course.
 
Skyrim with a ton of HD mods. Possibly even GTA IV with the ICEenhancer things.

As I understand it the main benefit in larger vram is being able to play smoothly with higher res textures/more detail provided by third parties. If you're into modded games, there's a benefit.

I haven't done much of this since FO3 so I'm not sure if it holds true as well now.

There used to be limits with the Fakefactory HL2 mods too... more vram helped.

This was my experience with the 580s going from 1.5 to 3GB. It's not that my average fps was any higher at all, it's just the fact that I no longer noticed stuttering when textures were being loaded.

I just got 3 680s delivered today and went for the 4gb version simply because at 8050x1600 I've seen MSI Afterburner report very near 2GB of vram usage on occasion. Although it may end up being a waste, I think of it more as insurance in case something that I do actually play wants to use that vram.

As a side note: If I do not turn off Desktop Composition, I'll use more than 2GB with my 580s.
 
How about testing one display 2560x1440 with one or two cards only? Same result? With max CPU OC of course.
Heaven at 2560x1440 from 2-way SLI to 4-way SLI scales well, but it only uses 1,400MB VRAM. 8064x1440 uses 3,500MB and that's where I think I'm running into a VRAM bandwidth bottleneck.
 
I don't get the OP's rather unscientific conclusion... The bus does not "limit" bandwidth, but rather just determines it. The calculation for determining memory bandwidth (for GDDR5) is: Effective Speed (MHz) x (Memory Bus Width (bits) / 8) = Bandwidth (MB/s)

Hence, the stock memory bandwidth of 192.2GB/s.

No matter how fast your 4gb GTX 680 memory is (within realistic terms), there is no way it will be limited to any unusable extent by the 256bit bus or PCI-E 3.0.
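
For anyone who wants to plug in the numbers, here's a quick sketch of that calculation (assuming the stock GTX 680 figures of 6008 MHz effective memory clock on a 256-bit bus):

```python
# Quick sanity check of the GDDR5 bandwidth math above.
# Assumes the stock GTX 680 figures: 6008 MHz effective memory clock, 256-bit bus.
def gddr5_bandwidth_gb_s(effective_mhz, bus_width_bits):
    """Peak bandwidth in GB/s: effective MHz x (bus width / 8) bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8    # 256-bit bus -> 32 bytes per transfer
    return effective_mhz * 1e6 * bytes_per_transfer / 1e9

print(gddr5_bandwidth_gb_s(6008, 256))         # ~192.3 GB/s, in line with the 192.2 GB/s above
```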
 
Heaven at 2560x1440 from 2-way SLI to 4-way SLI scales well, but it only uses 1,400MB VRAM. 8064x1440 uses 3,500MB and that's where I think I'm running into a VRAM bandwidth bottleneck.

Just based on this sentence, my guess is that (if Heaven were an actual game you were playing) you'd FEEL a large difference between the 2GB cards and the 4GB cards.
 
This was my experience with the 580s going from 1.5 to 3GB. It's not that my average fps was any higher at all, it's just the fact that I no longer noticed stuttering when textures were being loaded.

I just got 3 680s delivered today and went for the 4gb version simply because at 8050x1600 I've seen MSI Afterburner report very near 2GB of vram usage on occasion. Although it may end up being a waste, I think of it more as insurance in case something that I do actually play wants to use that vram.

As a side note: If I do not turn off Desktop Composition, I'll use more than 2GB with my 580s.

We would REALLY like to hear about your experience, and congratulations on your setup!
 
I don't get the OP's rather unscientific conclusion... The bus does not "limit" bandwidth, but rather just determines it. The calculation for determining memory bandwidth (for GDDR5) is: Effective Speed (MHz) x (Memory Bus Width (bits) / 8) = Bandwidth (MB/s)

Hence, the stock memory bandwidth of 192.2GB/s.

No matter how fast your 4gb GTX 680 memory is (within realistic terms), there is no way it will be limited to any unusable extent by the 256bit bus or PCI-E 3.0.

I skipped over commenting on that statement by the OP mostly because it made my brain hurt to read it, so I filtered it out, but it does point to some level of ignorance and is part of the reason I question his results.

By the way fomoz, I really don't mean to rag on you, and I'm happy that you're willing to share your experience with us! Your results are just not what I'm expecting :).
 
I don't get the OP's rather unscientific conclusion... The bus does not "limit" bandwidth, but rather just determines it. The calculation for determining memory bandwidth (for GDDR5) is: Effective Speed (MHz) x (Memory Bus Width (bits) / 8) = Bandwidth (MB/s)

Hence, the stock memory bandwidth of 192.2GB/s.

No matter how fast your 4gb GTX 680 memory is (within realistic terms), there is no way it will be limited to any unusable extent by the 256bit bus or PCI-E 3.0.
I don't know what it is exactly, but all I can say is that I think the VRAM is too slow.

Just based on this sentence, my guess is that (if Heaven were an actual game you were playing) you'd FEEL a large difference between the 2GB cards and the 4GB cards.
I'd say that 2GB or 4GB wouldn't matter at 2560x1440. At 8064x1440 it just wouldn't run at all with 2GB. With 4GB it runs; it just doesn't look like it's using the GPUs to their full potential, so in the end it doesn't run as well as it should.
 
I skipped over commenting on that statement by the OP mostly because it made my brain hurt to read it, so I filtered it out, but it does point to some level of ignorance and is part of the reason I question his results.

By the way fomoz, I really don't mean to rag on you, and I'm happy that you're willing to share your experience with us! Your results are just not what I'm expecting :).
My results are not what I was expecting either when I was spending over $2,000 on those video cards. As I said in the other thread, I'm not an engineer. I don't really know what I'm talking about. All that I can do is provide some benchmarks for you guys to judge and tell me what's happening.

If any of you can prove me wrong, then I will be grateful. If this is a driver bug, that would be truly amazing. I just want to play some games on my machine. When I was getting 30 FPS in Heroes VI with 2-way SLI, I would expect to get around 60 FPS in 4-way SLI, but I'm still getting 30 FPS. It doesn't look like a CPU bottleneck.
 
Wait, so you're saying GPU utilization is failing when vram usage is over 2gb?
 
Hello everyone,

In case some of you missed my post in the GTX 670 4GB 4-Way SLI thread, I have shown that the 256-bit bus does not have enough bandwidth for 4GB on the GK104 cards. This conclusively proves that there is absolutely no reason to go with 4GB cards for GK104, unless you want to spend $2,000 on video cards to play games at 40 FPS or less.

If you're running 7680x1440 or 7680x1600 and need more than a 2GB frame buffer, you should wait for 4GB cards with a 512-bit bus to come out. Right now, it looks like 4GB on GK104 is a scam. I hope that I'm wrong about this.


You won't notice a difference between 2GB of VRAM and 4GB of VRAM until you actually use more than 2GB of VRAM in an application.

If you have a card with 2GB and your game uses 2.5GB of VRAM, then basically the game will be unplayable.

The reason a 4GB card runs slowly at a very high resolution, using more than 2GB of VRAM, is the larger amount of rendering the card is trying to do, not a problem with the speed of the memory or the 256-bit memory bus.

I would argue that unless one plans on doing 2-way SLI or greater, a 4GB GTX 680 is kind of useless, because it alone is not powerful enough to render at such high resolutions where more than 2GB of VRAM is being used.

So the game might run with one 4GB card at 7680x1600; however, it would not be playable.
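
To put rough numbers on that rendering load, here's a simple pixel-count comparison (plain arithmetic; it ignores AA and everything else that also scales with resolution):

```python
# Rough pixel-count comparison, just to show how much more rendering work
# these surround resolutions demand (ignores AA, overdraw, etc.).
resolutions = {
    "1920x1080": 1920 * 1080,
    "2560x1600": 2560 * 1600,
    "7680x1600": 7680 * 1600,
}
baseline = resolutions["2560x1600"]
for name, pixels in resolutions.items():
    print(f"{name}: {pixels:>10,} pixels  ({pixels / baseline:.1f}x one 2560x1600 panel)")
# 7680x1600 pushes 3.0x the pixels of a single 2560x1600 screen and ~5.9x a 1080p screen.
```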
 
You won't notice a difference between 2GB of VRAM and 4GB of VRAM until you actually use more than 2GB of VRAM in an application.

If you have a card with 2GB and your game uses 2.5GB of VRAM, then basically the game will be unplayable.

The reason a 4GB card runs slowly at a very high resolution, using more than 2GB of VRAM, is the larger amount of rendering the card is trying to do, not a problem with the speed of the memory or the 256-bit memory bus.

I would argue that unless one plans on doing 2-way SLI or greater, a 4GB GTX 680 is kind of useless, because it alone is not powerful enough to render at such high resolutions where more than 2GB of VRAM is being used.

So the game might run with one 4GB card at 7680x1600; however, it would not be playable.
The whole point is that it's not running well at 3,500MB of VRAM usage in 4-way SLI, while the cards are only getting 50% GPU usage.
 
Low GPU usage with multiple cards at a high resolution is often an indication of running on PCIE 2.0 vs 3.0. It could also be driver issues.
 
Low GPU usage with multiple cards at a high resolution is often an indication of running on PCIE 2.0 vs 3.0. It could also be driver issues.
The thing is that it gets proper GPU usage and scaling at 2560x1440, just not at 8064x1440. Also, I confirmed that I'm running at 8x/8x/8x/8x PCI-E 3.0 in GPU-Z. I really, really hope that you're right and that it's a driver issue, though. nVidia still hasn't answered my support ticket.
 
The thing is that it gets proper GPU usage and scaling at 2560x1440, just not at 8064x1440. Also, I confirmed that I'm running at 8x/8x/8x/8x PCI-E 3.0 in GPU-Z. I really, really hope that you're right and that it's a driver issue, though. nVidia still hasn't answered my support ticket.


Your cards run at 8x/8x/8x/8x PCIe 3.0, but the PLX chip only has a single PCIe x16 connection to the CPU. That being said, it could be a PCIe bandwidth issue, but it could also be a driver issue.

Anyone with an X79 setup running quad 670s with 4GB of VRAM?
 
Wow, I didn't realize he was trying to do this on Z77. I can't say concretely that it would make a difference, but I can say that it is one variable that could have been eliminated before going down this road.
 
Your cards run at 8x/8x/8x/8x PCIe 3.0, but the PLX chip only has a single PCIe x16 connection to the CPU. That being said, it could be a PCIe bandwidth issue, but it could also be a driver issue.

I think that's probably the biggest factor - no matter how you slice it, the CPU is only giving you 16 lanes of PCIe 3.0. Vega showed that PCIe 3.0 can matter a lot with multiple cards, and you are essentially running at x4 PCIe 3.0 with the multiplexing going on. I'm not sure the hit is actually quite that severe (depending on how smart the NF200 is), but it certainly isn't going to be the same as having 32 actual lanes.
 
I think that's probably the biggest factor - no matter how you slice it, the CPU is only giving you 16 lanes of PCIe 3.0. Vega showed that PCIe 3.0 can matter a lot with multiple cards, and you are essentially running at x4 PCIe 3.0 with the multiplexing going on. I'm not sure the hit is actually quite that severe (depending on how smart the NF200 is), but it certainly isn't going to be the same as having 32 actual lanes.

I agree with all that, but the Z77 boards don't use the NF200 chip. They use a newer PLX chip that is PCIe 3.0.
 
Well how hot are they getting? Could this be the result of throttling (70c and again at 80c if I remember correctly)?
 
Well how hot are they getting? Could this be the result of throttling (70c and again at 80c if I remember correctly)?

I've seen this elsewhere too. I'm actually baffled that Nvidia made the air intakes for these mid-range silicon cards flush, so that they are stuck up against whatever is in the next slot; they accounted for this very well with Fermi.

It makes sense when most multi-GPU boards are putting an extra slot in between the two PCIe slots, but it's a weird design decision that impacts anyone using three or four cards.
 
I agree with all that, but the Z77 boards don't use the NF200 chip. They use a newer PLX chip that is PCIe 3.0.

Doesn't matter (and my use of NF200 was just because I don't know what the new chip is, I know it is 3.0 capable). It is still splitting 16 PCIe 3.0 lanes into 32 PCIe 3.0 lanes. So you aren't getting 8 lanes of PCIe 3.0 to each of the GPUs, you are getting some less-than-100% factor of 8 lanes.
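
A back-of-the-envelope sketch of what that sharing could mean (assuming roughly 0.985 GB/s of usable bandwidth per PCIe 3.0 lane, and the worst case where all four cards are transferring over the uplink at once):

```python
# Back-of-the-envelope PCIe sharing estimate for a PLX-equipped Z77 board.
# Assumes roughly 0.985 GB/s of usable bandwidth per PCIe 3.0 lane and the
# worst case where all four cards hit the single x16 uplink to the CPU at once.
GB_S_PER_LANE = 0.985

uplink_lanes = 16                 # what the CPU actually provides
cards = 4
lanes_per_card = 8                # what GPU-Z reports behind the PLX switch

dedicated = lanes_per_card * GB_S_PER_LANE                 # ~7.9 GB/s if each card had its own x8
shared_worst_case = uplink_lanes * GB_S_PER_LANE / cards   # ~3.9 GB/s when all four compete

print(f"dedicated x8 per card:      {dedicated:.1f} GB/s")
print(f"shared x16 uplink per card: {shared_worst_case:.1f} GB/s (worst case)")
```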
 
Wow, I didn't realize he was trying to do this on Z77. I can't say concretely that it would make a difference, but I can say that it is one variable that could have been eliminated before going down this road.
Agreed, please see my reply in the other thread.

Well how hot are they getting? Could this be the result of throttling (70c and again at 80c if I remember correctly)?
Hottest card is hitting 69C at 2560x1440 and 63C at 8064x1440. You can see it in the graphs in the other thread.
 
http://nl.hardware.info/reviews/2641/nvidia-geforce-gtx-680-quad-sli-review-english-version

Here is a quad 680 2GB review on an X79 system. Why don't you run some of these games at 5760x1080 to see how they compare to yours? If your numbers are lower, you will know if the Z77 chipset is limiting you.
Will do. So far, all I can say is that I scored X11646 in 3DMark11 (with overclocked cards) compared to their X10850. Downloading Metro 2033 right now (I have it on Steam). Will post results tomorrow.
 
Hello everyone,

In case some of you missed my post in the GTX 670 4GB 4-Way SLI thread, I have shown that the 256-bit bus does not have enough bandwidth for 4GB on the GK104 cards. This conclusively proves that there is absolutely no reason to go with 4GB cards for GK104, unless you want to spend $2,000 on video cards to play games at 40 FPS or less.

If you're running 7680x1440 or 7680x1600 and need more than a 2GB frame buffer, you should wait for 4GB cards with a 512-bit bus to come out. Right now, it looks like 4GB on GK104 is a scam. I hope that I'm wrong about this.

You say you feel the problem with the 4GB 670s/680s is that the 256-bit bus is slowing them down. This could be a stupid question, but have you compared your results with anyone using GTX 690 Quad SLI with the same drivers and a similar resolution?

Also, try benching The Witcher 2 and Total War: Shogun 2.
 
You say you feel the problem with the 4GB 670s/680s is that the 256-bit bus is slowing them down. This could be a stupid question, but have you compared your results with anyone using GTX 690 Quad SLI with the same drivers and a similar resolution?

Also, try benching The Witcher 2 and Total War: Shogun 2.

I find the idea that the memory bus could be slowing the cards down to be quite suspect, but fomoz did mention that increasing memory clocks by 13% increased overall performance by 10%, which supports the argument.

Driver issues are much more likely though, as is running the set off of a single 16x PCIe 3.0 connection. I've seen a thread he referenced where people (including Vega!) were trying to get PCIe 3.0 working properly with four cards on an X79 board, but they have not been very successful yet, which lends further support to the idea of a driver issue.
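
For what it's worth, here's how bandwidth-sensitive those numbers make the workload look (just dividing the two gains fomoz reported; a rough gauge, not a proper test):

```python
# Crude bandwidth-sensitivity gauge using the figures fomoz reported:
# a 13% memory overclock giving roughly a 10% overall performance gain.
mem_clock_gain = 0.13
perf_gain = 0.10

sensitivity = perf_gain / mem_clock_gain
print(f"~{sensitivity:.0%} of the memory overclock shows up as performance")
# Near 100% would point at an almost purely bandwidth-bound workload;
# near 0% would mean bandwidth barely matters. ~77% leans bandwidth-bound.
```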
 
I find the idea that the memory bus could be slowing the cards down to be quite suspect, but fomoz did mention that increasing memory clocks by 13% increased overall performance by 10%, which supports the argument.

Driver issues are much more likely though, as is running the set off of a single 16x PCIe 3.0 connection. I've seen a thread he referenced where people (including Vega!) were trying to get PCIe 3.0 working properly with four cards on an X79 board, but they have not been very successful yet, which lends further support to the idea of a driver issue.

I'm pretty much thinking the same, but having GTX 690 Quad SLI VRAM usage to compare against may be interesting given his issue here.
 
I'm pretty much thinking the same, but having GTX 690 Quad SLI VRAM usage to compare against may be interesting given his issue here.

Why would that matter? It's the same memory bandwidth, right? The GPUs are talking to each other through the PLX chip instead of the motherboard/SLI connector, but I wouldn't think that would make much difference. Or are you talking about a 2GB/4GB difference?
 