RTX 3000 series for distributed computing.

pututu

I just watched the "live" presentation by Jensen. Here is a quick summary of the new RTX 3000 series cards. Nothing on distributed computing. Not even a mention of Folding@home, for which Nvidia is one of the corporate sponsors/donors. I'm guessing the improvements will translate to better crunching, particularly in power efficiency.

Here is the pre-recorded official presentation by Nvidia and the webpage that summarizes the performance.

Card summary:
RTX 3080, Sept 17, $699
RTX 3070, sometime in Oct, $499. According to Jensen, this card is comparable in performance to the RTX 2080 Ti.
RTX 3090, Sept 24, $1499



I'm guessing that the 3070 and 3080 will be great cards for distributed computing. The 3070 with 5,888 CUDA cores seems very compelling!
 
Folding@home doesn't care about total VRAM overly much, and I don't think it uses Tensor cores at all. Overall, I think performance will come down to the number of CUDA cores, which means it's possible the 3080 takes the efficiency crown this time around?

I think I'm in for the 3090 anyway, but whatever I buy will end up with more time crunching for F@H than gaming on it.
 
Doubling of the CUDA cores is expected with the node shrink, plus some more RT and Tensor cores packed in. Games are quickly taking advantage of these new cores. For DC projects, it's kind of a waste that the software isn't optimized to use the RT and/or Tensor cores. So far, I've seen Amicable Numbers and maybe the GPUGrid project do very well on Turing versus Pascal cards with roughly the same CUDA count and clock speed. Not sure how easy it is to implement. I understand that DC is not these cards' primary target, but just saying.
 
Yeah, it's not totally clear what made the 2000 series so much faster than the 1000 series for BOINC. 2080s are close to twice as fast in a lot of projects, and they have nowhere near twice the CUDA cores of a 1080 Ti, for example.
I'm sure we'll see that carry over, if not improve, and we'll also see a big boost from the CUDA core count.
 
Folding@home doesn't care about total VRAM overly much, and I don't think it uses Tensor cores at all. Overall, I think performance will come down to the number of CUDA cores, which means it's possible the 3080 takes the efficiency crown this time around?

I think I'm in for the 3090 anyway, but whatever I buy will end up with more time crunching for F@H than gaming on it.

It looks like you'll get 17,408 CUDA cores for $100 less with 2 x 3080s tho ...
 
Yeah, it's not totally clear what made the 2000 series so much faster than the 1000 series for BOINC. 2080s are close to twice as fast in a lot of projects, and they have nowhere near twice the CUDA cores of a 1080 Ti, for example.
I'm sure we'll see that carry over, if not improve, and we'll also see a big boost from the CUDA core count.

I would guess it's architectural, for instance concurrent FP and integer ops, etc., and some projects benefit from that, as frequency didn't improve much if at all. My 1070 Ti & 1060 will do 2GHz+ easily, and the 1050 does 1.95GHz with little effort.
 
It looks like you'll get 17,408 CUDA cores for $100 less with 2 x 3080s tho ...
True, though my wife is much more concerned with points per watt than total points, as she knows how much trash hardware I could conjure up if given free rein
 
I would guess it's architectural, for instance concurrent FP and integer ops, etc., and some projects benefit from that, as frequency didn't improve much if at all. My 1070 Ti & 1060 will do 2GHz+ easily, and the 1050 does 1.95GHz with little effort.

I agree with the general idea, but the part I find interesting is that this didn't seem to translate to the same level of improvement on the gaming/FPS front. The 2080 Ti beat the 1080 Ti by only about 30%, and it had an almost 25% CUDA core increase. You would think these magic under-the-hood improvements, along with a core-count uplift, would have brought more to the table. Perhaps a lot of them were inherited from the cards intended for non-gaming tasks and were not really suited for gaming improvements *shrug*
 
Saw this German article posted by Stefan. Translated to English.

If the project uses FP32 calculations, you'll see almost double the computational speed, all else being equal (i.e., almost double the CUDA cores). However, there's no significant gain if the project uses integer operations. Anyone know which DC projects use only integer operations? Nothing about FP64, so some AMD cards (VII, 7970, 280X) are still the kings unless you get Nvidia non-consumer cards/setups.
 
It looks like you'll get 17,408 CUDA cores for $100 less with 2 x 3080s tho ...

A question that comes up with these cards' design: do you want multiple GPUs in the same case? That's one of the reasons I'm at least considering getting the 3090 for the Threadripper and just running that one GPU in it. Although I may just get a single 3080 and call it a day.
 
True, though my wife is much more concerned with points per watt than total points, as she knows how much trash hardware I could conjure up if given free rein

Just go stealth then. Get a case with a solid side panel instead of a window, replace any tool-less thumbscrews with real screws and lock your toolbox. It's just one more GPU, "harmless" you might say. What she doesn't know … ;)
 
Here is the closest thing to distributed computing (CUDA & OpenCL) benchmark performance for the RTX 3000 cards that I read this morning: https://videocardz.com/newz/nvidia-...080-performance-in-cuda-and-opencl-benchmarks.

The n-body simulation is widely used in astrophysics. The RTX 3080 performed up to 78% faster than the RTX 2080 Super in this test. On average, the RTX 3080 sees about a 68% computational increase over the RTX 2080 Super and 38-41% over the 2080 Ti. Nothing on power efficiency. We'll know more after Sep 14, but I'm guessing we're going to see a lot of gaming benchmarks and probably very few of these.



edit: Considering that the 3080 has twice the CUDA core count of the 2080 Ti, the overall average compute performance is not double but only 38-41% higher on average. It could be limited by memory bandwidth (760 vs 616 GB/s) or memory bus width (320- vs 352-bit), or maybe an un-optimized driver, or some combination of all of these. We'll know when 3090 compute performance is out.
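As a rough sanity check on the bandwidth theory, here's some back-of-envelope arithmetic (core counts are the published specs; this is just illustrative, not a claim about where the bottleneck actually sits):

```python
# Back-of-envelope: how much of the 3080 vs 2080 Ti compute gap could
# memory bandwidth alone explain? Specs from the post above.
bw_3080, bw_2080ti = 760, 616            # GB/s
cores_3080, cores_2080ti = 8704, 4352    # published CUDA core counts

print(f"bandwidth ratio:  {bw_3080 / bw_2080ti:.2f}x")        # ~1.23x
print(f"core-count ratio: {cores_3080 / cores_2080ti:.2f}x")  # 2.00x
# The observed ~1.38-1.41x average lands between these two bounds,
# consistent with a mix of bandwidth-bound and compute-bound kernels.
```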
 
A question that comes up with these cards' design: do you want multiple GPUs in the same case? That's one of the reasons I'm at least considering getting the 3090 for the Threadripper and just running that one GPU in it. Although I may just get a single 3080 and call it a day.

Yeah, I didn't even think about this! It's gonna dump straight up into the fan of the card above it. Yes, a lot of the heat does this with past cards, but I'm sure some of it escapes over the top and doesn't get sucked in.
 
Yeah, I didn't even think about this! It's gonna dump straight up into the fan of the card above it. Yes, a lot of the heat does this with past cards, but I'm sure some of it escapes over the top and doesn't get sucked in.

Could be solved by some form of ducting, perhaps, and/or some serious air movement in the case. But I don't think the past method(s) of just adding another GPU to increase output will be the best solution for the 30xx, at least initially.
 
Also, because of the overall size of the airflow pattern coming off the back, I expect the heat will not be as high as what we see coming off the top vent area of more traditional coolers. For example, I put my hand behind the tower cooler on my CPU and the air is warm, but not super hot like it is coming off the top of my graphics card. Warm air moving at a high rate will not be that much of a burden for the card above it. Additionally, if you increase the fan speed on the GPU, the air will be cooler.
 
Yeah, I didn't even think about this! It's gonna dump straight up into the fan of the card above it. Yes, a lot of the heat does this with past cards, but I'm sure some of it escapes over the top and doesn't get sucked in.
AIB cards will probably have different fan choices, more like what we're used to seeing, and then it's just the status quo. Good case/airflow choices should mitigate the issue.
 
edit: Considering that the 3080 has twice the CUDA core count of the 2080 Ti, the overall average compute performance is not double but only 38-41% higher on average. It could be limited by memory bandwidth (760 vs 616 GB/s) or memory bus width (320- vs 352-bit), or maybe an un-optimized driver, or some combination of all of these. We'll know when 3090 compute performance is out.

From my understanding of the details Nvidia has shared, the 3000 series doubles the FP32 units per SM, which is what Nvidia uses to count CUDA cores, while leaving the number of INT32 units unchanged, which must limit improvements for some workloads.
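A toy model of that SM change, purely as a sketch (the per-clock lane counts follow Nvidia's published Turing/Ampere descriptions; the proportional-sharing assumption is mine):

```python
# Toy per-SM throughput model. Turing: 64 FP32 lanes + 64 independent
# INT32 lanes. Ampere: 64 dedicated FP32 lanes + 64 shared lanes that
# issue either FP32 or INT32. Assume INT32 work occupies the shared
# path in proportion to its share of the instruction mix (a guess).
def ampere_fp32_per_clock(int_fraction: float) -> float:
    return 64 + 64 * (1 - int_fraction)

def turing_fp32_per_clock(int_fraction: float) -> float:
    return 64  # FP32 lanes are unaffected by INT32 work on Turing

for f in (0.0, 0.25, 0.5):
    ratio = ampere_fp32_per_clock(f) / turing_fp32_per_clock(f)
    print(f"INT32 share {f:.0%}: ~{ratio:.2f}x FP32 throughput per SM")
# Pure FP32 -> ~2x per SM; the more INT32 in the mix, the smaller the gain.
```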

EDIT:

Looks like Pututu already shared this info.
 
I will be getting a 3090 to replace my TITAN V, then next year I'll sell it and get a TITAN if they come out.
That's just me.
I am also building a new system with an 18-core processor to replace my 8-core 5960X from 2015, which I hope will be ready for the 20-year anniversary of F@H on Oct 1.
 
Awesome build. Please share results here when you get one after Sep 24th (y)
 
I will be getting a 3090 to replace my TITAN V, then next year I'll sell it and get a TITAN if they come out.
That's just me.
I am also building a new system with an 18-core processor to replace my 8-core 5960X from 2015, which I hope will be ready for the 20-year anniversary of F@H on Oct 1.

The 3090 is the Titan. They inadvertently killed (or it was intentional, who knows) the naming scheme with the RTX Titan, so there's nothing they can really call it except something stupid like the RTX Titan X. Also, because of that, AIBs are now allowed to produce the cards, which was never an option before; Titans were always an Nvidia-only product, other than the first Titan, where some AIBs could sell the reference model under their brand.
 
I was a few minutes late to the live presentation (y)
 
Videocardz has links to all the 3080 reviews. I didn't have time to go through those reviews for compute benchmarks.

I use the Geekbench website as a reference to estimate relative compute performance. Performance will also depend on how the DC application software is written. If someone gets the card on hand soon, please post results.

CUDA benchmark: about 28% faster than the 2080 Ti and 66% faster than the 2080S, according to Geekbench


OpenCL: 39% faster than the 2080 Ti, 65% faster than the 2080S, according to Geekbench

 
From TechPowerUp

First voltage-frequency graph that I've seen for the 3080. I think the sweet spot could be around 0.85V while still getting a decent core clock (~1700MHz); I think this is going to be a very efficient card if tuned properly. Not sure what power limit setting this corresponds to, but I sometimes use MSI AB to set a fixed voltage on the voltage-frequency curve.

Note that power goes up as the square of the voltage (or current), assuming the system's electrical resistance varies very little within a narrow operating-temperature range.
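A quick illustration of what that square law implies for undervolting (the ~1.00V stock point is my assumption for the sake of the example; 0.85V is the sweet spot from the curve):

```python
# P = V^2 / R with R roughly constant, so power scales with V squared.
v_stock, v_tuned = 1.00, 0.85   # stock voltage assumed, 0.85 V from the curve
relative_power = (v_tuned / v_stock) ** 2
print(f"estimated power at 0.85 V: {relative_power:.0%} of stock")  # ~72%
# i.e. roughly a quarter of the power saved, before any clock tradeoff.
```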

 
Some initial results for the 3080 in the folding forum. Need WUs with a large number of atoms to see significant performance gains in PPD.

FAH PPD/W can be found here. Only a guesstimate number.

The usual overclock.net GPU database does not have any data yet on the 3080 but should fill up soon, hopefully.
 
From the same folding forum link in my previous post.

With CUDA running, 5.6M PPD is possible.

Some quick numbers from Project 11765 in Linux:

TPF 73s - GTX 1080 Ti running OpenCL / 1.554 M PPD
TPF 57s - GTX 1080 Ti running CUDA / 2.253 M PPD
TPF 49s - RTX 2080 Ti running OpenCL / 2.826 M PPD
TPF 39s - RTX 2080 Ti running CUDA / 3.981 M PPD
TPF 36s - RTX 3080 running OpenCL / 4.489 M PPD
TPF 31s - RTX 3080 running CUDA / 5.618 M PPD

I do expect that the numbers might potentially be better once the drivers have matured a bit, generally in about 6 months. By that time, we might have a new version of FahCore_22 that can unlock more performance too!
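Those numbers line up almost exactly with F@H's quick-return bonus, under which credit grows roughly with the square root of how fast you return the WU, making PPD roughly proportional to TPF^-1.5. A minimal sketch of that scaling, anchored on the 1080 Ti CUDA row above (this is an approximation of the bonus behavior, not the exact server-side formula):

```python
# QRB approximation: PPD ~ TPF^-1.5. Reference: GTX 1080 Ti running CUDA.
ref_tpf, ref_ppd = 57, 2.253e6

def estimated_ppd(tpf_seconds: float) -> float:
    """Scale the reference PPD by (ref_tpf / tpf)^1.5."""
    return ref_ppd * (ref_tpf / tpf_seconds) ** 1.5

for card, tpf in [("RTX 2080 Ti CUDA", 39), ("RTX 3080 CUDA", 31)]:
    print(f"{card}: ~{estimated_ppd(tpf) / 1e6:.2f} M PPD")
# Prints ~3.98 and ~5.62 M PPD, matching the reported 3.981 and 5.618.
```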
 
Very nice, especially if you were one of the lucky few to get a 3080. I want to upgrade my launch-day 1080 Ti, but honestly I'm pretty happy with how it has held up. I completely struck out on the 3080 and don't expect any better with the 3090, so I guess I'll have to live with 2.2M PPD :D
 
3080 PrimeGrid performance is discussed in this post.

The AP result really stands out; hopefully it's not a typo. The 3080 is more than 2 times faster than the 2080 in AP (quick ratios worked out below the list). The rest looks like a decent gain for a task using INT32 operations rather than FP32.

AP27:
2080: 660-720
3080: 290

PPS Sieve: (2 tasks)
2080: 215
3080: 170
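For what it's worth, here are the speedups those times imply (treating them as run times, where lower is better):

```python
# Ratios implied by the task times above (lower time = faster card).
print(f"AP27:      {660 / 290:.2f}x - {720 / 290:.2f}x")  # ~2.3-2.5x
print(f"PPS Sieve: {215 / 170:.2f}x")                     # ~1.26x
```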
 
Just got my 3080 today, so gonna do some testing as time permits.

First test was PrimeGrid GFN 19
Running Windows 10 / 64-bit
GPU at stock clocks but 65% power target
Average time over 2 tasks was around 18 minutes 40 seconds

 
Folding@home: I am getting around 4.8M PPD at the same 65% power target, stock clocks.

This hasn't been running that long, so I don't know how accurate that estimate is yet, but it's probably pretty good. I will leave it overnight and see what it looks like tomorrow.

 
Toconator inspired me to see what this card could do when pushed a bit more.
This is 105% power target and +50 core / +750 memory. The card is probably pulling close to 400 watts at this setting, so there is no way I would run it like this unless it was a competition, since the difference between this and my previous numbers is not worth the huge difference in power draw. TPF was around 33 seconds at these settings, and this puts it pretty much right in line with pututu's estimated calculations.

 
Granted, these are not the same work units, so I can't compare them directly.
Edit: TPF has dropped a bit to 32 seconds and I am now getting around 5.6M PPD
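Putting rough numbers on the efficiency tradeoff (this assumes the card's 320 W power limit, so the 65% target works out to roughly 208 W; the ~400 W figure is the estimate above):

```python
# PPD-per-watt for the two settings reported in this thread.
runs = {
    "65% power target, stock clocks": (4.8e6, 0.65 * 320),  # ~208 W assumed
    "105% power target, +50/+750 OC": (5.6e6, 400),         # estimated draw
}
for name, (ppd, watts) in runs.items():
    print(f"{name}: {ppd / watts:,.0f} PPD/W")
# ~23,000 vs ~14,000 PPD/W: the last ~17% of PPD costs ~90% more power.
```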
 
Wow! Those high GFN## PrimeGrid tasks probably all get done within 1 day then. Crazy.
 
Run it under FAH on Linux and you should see at least a 10-15% or greater improvement in PPD.
 
motqalden, here is a FAH GPU site that compiles GPU PPD performance. It looks better than the overclock.net spreadsheet, and it's hosted by LTT.

Here is the link within this site to all the GPU comparisons. There is also a performance comparison by project. The 3090 seems to do well, above 6M PPD, but for the price of the card....

You can install the software to upload your stats to the GPU database.
 
Hmm, looks like the 14905 project I sampled was one of the lowest-scoring ones
 