Some games hard-crash my box but simultaneous Furmark & Prime95 torture tests don't... I'm going nuts

cheezchat (n00b, joined Oct 22, 2021, 7 messages)
I am not much of a gamer other than Factorio but since I built myself a fancy new computer I thought I’d try some games. I’ve tried Satisfactory and Rust: both exhibit this problem but Factorio plays fine for hours (and hours and hours and hours...)

The problem:

Playing either of the games mentioned above will make my computer shut itself off. No warnings, no messages, just OFF. Sometimes even _launching_ the game is enough to cause this to happen. Sometimes it happens the very first time I move the cursor to look around. Sometimes, after troubleshooting (see below), it will last a half hour or so and then BLAMMO. Gone.

The Steps Taken

* I verified that any overclocking stuff was disabled on the gpu and cpu. Re-test, same result.
* I set the power limit of the GPU to 50% in MSI Afterburner: this worked for at least 30 minutes during play testing. Games were playable but with vastly degraded performance
* I monitored power consumption via a Kill-a-Watt meter and saw that during gameplay (with the GPU power limit at 100%) the consumption never exceeded 350W
* I ran the torture test of Prime95 + FurMark simultaneously while logging in HWiNFO for up to 55 minutes without a crash. FurMark ran at 4096x2160 with an average frame rate of 109.
* Power consumption during this testing maxed out at 598W
* Peak CPU temp during this torture test was 90.8°C with an 81.5°C average. The CPU never thermal-throttled.
* Peak GPU temp during this torture test was 77.9°C with a 75.3°C average. The GPU was flagged as performance-limited by power, and it did trigger the utilization and reliability voltage limit flags at least once, although they are “no” most of the time.
* I have closed almost all open applications and tasks while attempting to play the games (antivirus, wallpaperengine, rainmeter, discord, &c)
* I am running the current GeForce Game Ready driver as of 11-1-2021
* BIOS is current, windows update is current
* I replaced the 3080Ti with my old 980Ti in the same slot and it is able to play and launch the games albeit at a reduced power draw and frame rate
* Reset BIOS to defaults (including disabling BAR and DOCP) - same result although much more stable
* Ran DDU to purge Nvidia and reinstalled
* Ran Windows Memory Diagnostic (no problems found)
* Reinstalled Steam
* Reinstalled games
* Ran Superposition stress test for 30 minutes: passed
* Moved GPU from pcie_1 to pcie_2

The Hardware

* Asus ProArt 570 motherboard
* Seasonic 700w fanless PSU TX-700
* Ryzen 9 5900x with Be Quiet Dark Rock 4 Pro Borg Cube Coolermajig
* Crucial 64GB Ballistix DDR4-3600 2x32GB (BL2K32G36C16U4B) (note: not on QVL [ I thought they were but I was wrong], confirmed installed in correct slots)
* Seagate Firecuda 4TB (Data 2)
* Seagate Firecuda 2TB (Data 1)
* Samsung 980 Pro 512GB (Boot)
* MSI GeForce RTX 3080 Ti GAMING X TRIO (who NAMES these things?!)
* Windows 10 Pro
* LG 4096x2160 monitor

I spoke with MSI on the phone and they felt fairly confident that the card was OK since it passed the stress tests. I spoke with Asus on the phone and they're going to send me a new motherboard but I'm not entirely convinced that's what's wrong. At the moment, my spidey-sense is looking at the RAM. The best success I've had is after resetting the BIOS thus disabling the DOCP (and BAR).

So... what have I missed? I'm going nuts.
 
* Seasonic 700w fanless PSU TX-700
Is that PSU from a year ago or more? Some of their older Focus models had an oversensitive OCP problem with 30 series cards, and some of their older Prime series are sensitive to noise on the 12V sense line also caused by 30 series cards. I experienced the latter with the exact same symptoms you describe.

Also, yeah, 700w is cutting it pretty tight for a 3080ti. Your Kill-a-Watt won't catch the brief but significant spikes these cards can draw, so you may truly be running into overcurrent territory.

Either way, I'll second the notion that you should suspect the PSU is probably the problem here.
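
To illustrate why a wall meter can miss these spikes, here is a minimal Python sketch. Every number in it is an assumption for illustration only: a 350 W baseline (the gaming figure reported above), hypothetical 20 ms excursions to 800 W, and a meter that averages over one second. The real averaging window of a Kill-a-Watt isn't stated anywhere in this thread.

```python
def meter_reading(trace_w, window):
    """Average a power trace over fixed windows, the way a slow wall meter might."""
    return [
        sum(trace_w[i:i + window]) / window
        for i in range(0, len(trace_w) - window + 1, window)
    ]


# Hypothetical 2-second trace sampled every 1 ms: a 350 W baseline with
# one 20 ms excursion to 800 W in each second (illustrative numbers only).
trace = [350.0] * 2000
for start in (400, 1400):
    for t in range(start, start + 20):
        trace[t] = 800.0

readings = meter_reading(trace, window=1000)  # assumed: one reading per second

print("true peak:", max(trace), "W")
print("meter readings:", readings, "W")
```

Under these assumptions the averaged readings land within a few watts of the baseline even though the trace briefly hits more than double that, which is the gap between what the meter shows and what the PSU's protection circuit sees.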
 
What is your SOC voltage in the BIOS? 64 gigs at CAS 16 might require some tinkering to get it right. A fanless PSU wasn't a good idea for that rig either. Gaming sessions likely build up some heat on those regulators.
 
PSU was purchased a month ago but I’ll need to check manufacturing date. I hadn’t planned on getting a 3080Ti but that’s what came up in the Newegg shuffle. I’ll get a new PSU and report back. Thanks!
 
Yeah the 3080ti is quite the gas guzzler of a GPU. Very very high power requirements. I'm not surprised that it's causing issues with demanding titles. Cards like my HD7750 are dramatically easier for a PSU to cope with.
 
I ordered a 1000w Titanium which I know is overkill but hey, I'd rather have extra headroom than the current problem I have. Thanks, all! I guess I'm surprised that this would happen despite the machine running these stress tests. Like you said though that doesn't account for sudden spikes. We'll find out tomorrow when it gets here!
 
What is your SOC voltage in the BIOS? 64 gigs at CAS 16 might require some tinkering to get it right. A fanless PSU wasn't a good idea for that rig either. Gaming sessions likely build up some heat on those regulators.
I know _nothing_ about RAM specs so I'm going to have to research that and get back to you. I appreciate the input, thanks!
 
I had the exact same problem with a 3090 and a 5950x on a 850W PSU. Games would shut off the computer, but benchmarks, etc. would run fine. It was definitely the PSU tripping the OCP protection with the performance spikes in gaming. A new PSU fixed the problem.
 
I'll second the RAM/voltage thing, I missed that part....
Up the voltage on the RAM to 1.4v and VSOC to 1.1-1.15v, but I'd still suspect the PSU.
 
my 5800x and 3080 will pull almost 500 watts from the wall when I game
 
It's not so much the power being drawn, it's the spikes that set off the protection. In my case 850W should be plenty even if the 5950X was drawing 250W and the GPU 400W.
I just mentioned that since the OP says his never exceeded 350 watts from the wall. I am pulling 360 from the wall right now as I type this, since I am mining with the 3080.
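
The headroom arithmetic behind the 850W example can be sketched in a few lines of Python. This is a simplification: it treats the label rating as the trip point, which is an assumption, since real OCP/OPP thresholds vary by unit and aren't given in this thread. The function name is made up for illustration.

```python
def max_gpu_spike_factor(psu_rated_w, cpu_w, gpu_sustained_w):
    """Largest GPU transient multiplier that keeps total load at or below the PSU rating.

    Assumes the CPU draw stays constant during the spike and that the
    rating itself is the limit (a simplifying assumption).
    """
    return (psu_rated_w - cpu_w) / gpu_sustained_w


# Figures from the post above: 850 W PSU, 250 W CPU, 400 W GPU.
psu_w, cpu_w, gpu_w = 850, 250, 400
sustained = cpu_w + gpu_w
factor = max_gpu_spike_factor(psu_w, cpu_w, gpu_w)

print(f"sustained load: {sustained} W, headroom: {psu_w - sustained} W")
print(f"GPU transients above {factor:.2f}x sustained draw would exceed the rating")
```

So a PSU that looks comfortable on sustained numbers can still be pushed past its rating by a GPU that briefly draws well above its sustained figure.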
 
I was actually mixed up. It's closer to 600w sustained while running the benchmarks and about 350w while gaming.
 
I'll second the RAM/voltage thing, I missed that part....
Up the voltage on the RAM to 1.4v and VSOC to 1.1-1.15v, but I'd still suspect the PSU.

The spec says they're 1.35v sticks. Is that safe to bump them up beyond that? Even if it's just .05v?
 
Totally, wouldn't recommend it otherwise. My RAM is at 1.4v and my VSOC at 1.1v. It's not uncommon to have to tweak it a bit on AMD systems either.
 
Woohoo!!! Thank you all SO much! I was able to play for two hours straight tonight without a hiccup. I put in a 1000w PSU instead of the 700w and that did the trick!

I'm running the RAM at 1.4v but I couldn't find "vsoc" in my Asus bios. Is it under a different name, maybe?
 
Nice. Look for something with "SOC" in the name that is at or around 1.1v, usually between the CPU and RAM voltage settings.
 
Welcome to the club. First I wanted a 3080. Then I wanted a 3080 or 6800XT. Then I wanted a 3080, 6800XT or 6900XT. I bought a 3090 last December (2020) because I managed to get it into my cart and check out. At any rate a 700W PSU is a bit light for a 3080Ti or 3090. Thankfully I picked up a 1kw PSU when I built my rig. At the time I felt a little buyer's remorse, but not anymore! Just glad to have a nice card after looking at current prices. The 3090 is my first top end stupid expensive card, but I figure if there was ever a time to go big on a rig it was the winter of Covid and ya gotta do it once, right?

You bought the fancy silent stuff instead of the high power stuff, but you'll be ok with a 1kw Seasonic. I have a Seasonic Prime TX 1000W and it's part time fanless in that it can shut the fan off under light load. So no fan browsing the web, etc. It kicks on when I'm gaming. That's pretty much how I build my rigs. A little noise is ok while gaming since I won't hear it over the game anyway, but I want them inaudible when I'm not gaming or otherwise doing anything CPU or GPU intensive.
 
Um, did you run MemTest? Could be memory issues...

I don't remember having a lot of hard crash memory issues. Usually I would get a blue screen or a soft reboot if the memory was bad. The fact that it just turns the computer OFF seems like it was a tripping OCP issue. Plus the fact that a new one fixed the issue seems to indicate that the PSU was the problem.

FWIW, I had the same issue with a 3090 and an 850W PSU.
 
Glad OP got it resolved!

When people ask me why I went with a 750W PSU for an OC'd 5800X and 3070Ti, I show them one of these.
Not OP's exact card but close...
power-spikes.png

Note the spike duration... they last just long enough to sometimes trip OCP, but short enough to be ignored by regular power meters. HWiNFO / GPUz / etc can sometimes barely detect them if their sampling rate is set extremely rapid but you'd really need an external 'scope.
Edit to add: Since this is with 20ms duration threshold and AFAIK, some PSUs have a threshold as low as 10ms, there's the potential for even larger, shorter spikes to trip protection. I believe Vega was notorious for this as well... I could be full of it but I'd like to say I've seen 800W numbers thrown around for 10-20ms spikes?
Anything that has a large die / high transistor count running at a high power & voltage level has the potential to do this.
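
The duration-threshold idea above can be sketched in Python: given a sampled power trace, report only the excursions that stay above some limit for at least a given time. The trace, the 700 W limit, and the function name are all hypothetical; only the 10 ms and 20 ms thresholds come from the post.

```python
def find_excursions(trace_w, limit_w, sample_ms, min_duration_ms):
    """Return (start_ms, duration_ms) for runs above limit_w lasting at least min_duration_ms."""
    excursions = []
    run_start = None
    # A trailing sentinel below any limit closes a run that reaches the end of the trace.
    for i, watts in enumerate(list(trace_w) + [float("-inf")]):
        if watts > limit_w:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            duration_ms = (i - run_start) * sample_ms
            if duration_ms >= min_duration_ms:
                excursions.append((run_start * sample_ms, duration_ms))
            run_start = None
    return excursions


# Hypothetical trace at 1 ms per sample: a 5 ms blip, a 12 ms spike, a 25 ms spike.
trace = [350] * 100
trace[10:15] = [900] * 5
trace[40:52] = [800] * 12
trace[70:95] = [750] * 25

at_20ms = find_excursions(trace, limit_w=700, sample_ms=1, min_duration_ms=20)
at_10ms = find_excursions(trace, limit_w=700, sample_ms=1, min_duration_ms=10)
print("20 ms threshold:", at_20ms)
print("10 ms threshold:", at_10ms)
```

With the shorter threshold, a spike that the 20 ms filter ignores gets counted, which is the sense in which a PSU with a tighter window is easier to trip.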
 
Really the simple test of your PSU is go into AB and limit power to 60%. If it stops crashing you know what the issue is most likely to be.
 