13900K - Out of Video Memory or BSOD during shader compilation in two UE5 games

Interesting, I've not had any crashing lately, but following the advice above, I fired up "The Last of Us: Part 1" and let it compile shaders (haven't played this game since it first came out on PC, lol). It ran fine for a few minutes, all good, and the CPU never went over 92°C.

Then it just stopped:

WER Fault TLOU.png


I then immediately set Prime95 going, and there have been no crashes or errors for a good 15 minutes now, despite it hitting 100°C, throttling, and drawing 260 W compared to 200 W for TLOU shader compiling.
 
Just run DX12 games that compile shaders before loading the game (or a saved game) rather than during gameplay. Good examples are The Last of Us and Horizon Forbidden West.

the thing is, you shouldn't have to do that with a $949 cpu (14900ks) or any cpu. it should just work, indefinitely, at stock settings or even with a mild overclock. people have chips from 40 years ago that still work. when you have computer problems it's almost never the cpu.

I am still glad I can stabilize my CPU after watching that video!

yeah but if you're having that problem, that means you've already suffered degradation, which could be caused by a defect in the chip. who's to say it doesn't come back in a few months or a year? if i was you i'd be looking to get a replacement while you still can. otherwise you might end up getting $15 in a class action lawsuit in a year or two and be stuck with a processor that randomly crashes and has to be run underclocked. i mean i have ocd, but just having to run underclocked alone would bug the crap out of me, especially when you paid whatever you paid to have one that's supposed to run at "whatever" frequency.
 
You're mostly right, but it is not the chip's fault if motherboard makers decide to abuse power limits and fry it, whether slowly or quickly, just to stay ahead of the competition. Whether at stock defaults or overclocked, there are safe ranges, and once you go past those you can expect your CPU to degrade quickly. Intel needs to come out with an official statement and at least either blame motherboard makers or clarify. I bet it knew about motherboard makers' practices and just didn't inform the public.

I still don't know why these instability reports began surfacing in such high volumes only in 2024 for those who purchased Raptor Lake recently and for those who purchased it a year ago. I keep thinking motherboard firmware microcode. What else can it be?
 
CPUs have all kinds of instructions and functions. Shader compilation is a specific process, and Prime95 probably doesn't stress the same areas of the CPU that shader compilation does. It is still not fully known why this particular process creates instability or how exactly Windows 8 compatibility mode affects it (for some games). That is also why I hope a microcode or firmware update with a fix or partial fix is a possibility. Vulkan shader compilation isn't affected by this issue. Why not? Many questions remain...
 
So if shader compilation is crashing on all CPUs, where is the fault? I bought a 14900K; tomorrow I will install some game with shader compilation and check whether it crashes.
On the 13900K at stock I had crashes or BSODs during the first shader compilation in Remnant 2 and Lords of the Fallen.
 
Go into your BIOS and turn off the board's "enhancements".
 
The settings applied likely weren't actually stock if the motherboard was the one making the decision. Motherboard makers load their own "optimized defaults", which aren't Intel defaults and aren't safe. It is hard to say whether the CPU is defective or not if motherboards abuse power settings to stay competitive. If I were you, I wouldn't put any new CPUs into your motherboard for now. I'd use the possibly defective CPU until there is more information and possibly a firmware/BIOS update to at least mitigate this issue a bit, and replace the CPU (if defective) afterwards. If you can't wait for whatever reason and have to install a new CPU now, then at least use safe Intel settings (without overclocking for now), which require manual adjustments rather than the motherboard's "Optimized Defaults":
- Power Limit 1 - do not exceed 253W until further notice from motherboard makers and Intel
- Power Limit 2 - do not exceed 253W until further notice from motherboard makers and Intel
- Power Current - do not exceed 188W until further notice from motherboard makers and Intel
- SVID / CPU Lite Load - do not use anything other than "Intel Default"
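
If you write down what your BIOS actually shows, a quick check against the ceilings in the list above could look something like this. It is only a rough sketch: the setting names are made up for illustration, the ceilings are simply the figures quoted in this post (not an official Intel spec), and the 4096 values stand in for the kind of "unlimited" number a board's optimized defaults might show.

```python
# Ceilings quoted in the post above. These are the poster's recommendation,
# not an official Intel specification.
POSTED_CEILINGS = {
    "power_limit_1": 253,
    "power_limit_2": 253,
    "power_current": 188,  # unit as written in the post
}


def find_violations(settings):
    """Return one message for each setting that is missing or above its ceiling."""
    problems = []
    for name, ceiling in POSTED_CEILINGS.items():
        value = settings.get(name)
        if value is None:
            problems.append(f"{name}: not recorded, check the BIOS manually")
        elif value > ceiling:
            problems.append(f"{name}: {value} exceeds {ceiling}")
    return problems


# Hypothetical values, as a board's "optimized defaults" might report them.
board_defaults = {"power_limit_1": 4096, "power_limit_2": 4096, "power_current": 188}

for message in find_violations(board_defaults):
    print(message)
```

An empty result only means the numbers you typed in are within the posted ceilings; it says nothing about whether the board actually enforces them.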

CPU Vcore voltages have not been shown to be the cause of shader compilation instability, but you should still use a manual override, because some motherboard enhancements can decide to pump 1.4 V into Raptor Lake CPUs. Make sure to monitor clocks and temperatures. It is normal for shader compilation to make the CPU temperature spike to the 100°C TjMax, but it shouldn't be crashing.
 
But I have a new mobo too: Z790 Aorus Elite X WiFi7.
When I plug in the new PC I will test it to see if it crashes on shaders the first time.
 
I would follow their advice. Performance degradation on CPUs is common due to electromigration (every overclocking site has had those discussions about OC and how long a CPU can last), which is what many suspect this is. You have a CPU that has been pushed hard by Intel, then pushed even further by MB makers. Those CPUs have increased boost clocks and also increased boost duration. This means that you can get very rapid aging of the CPU. If you run shader compilation, you will see in HWMonitor, HWiNFO or any other CPU monitoring software that you get those big bursts when it compiles shaders.

Better to set it to Intel's defaults while waiting for a BIOS fix. Better to have a long-lasting, stable system than a short-lived benchmark queen.
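
If you log package power from your monitoring tool, the bursts mentioned above are easy to pick out with a few lines of code. A minimal sketch, assuming you already have the readings as a plain list of watts; the numbers below are made up, and the window size and ratio are arbitrary choices, not anything the monitoring tools define.

```python
def find_bursts(samples_w, baseline_window=5, ratio=1.5):
    """Return indices where power exceeds `ratio` times the mean of the
    preceding `baseline_window` samples."""
    bursts = []
    for i in range(baseline_window, len(samples_w)):
        window = samples_w[i - baseline_window:i]
        baseline = sum(window) / baseline_window
        if samples_w[i] > ratio * baseline:
            bursts.append(i)
    return bursts


# Made-up package power readings in watts, one per polling interval.
readings = [60, 62, 58, 61, 59, 200, 210, 65, 60, 62, 61, 63, 205, 64]

print(find_bursts(readings))
```

Keep in mind that a polling interval of a second or so will miss bursts that only last milliseconds, so a clean log does not prove there were none.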
 
Researched this a bit more. It looks like shader compilation performs thorough integrity checks that aren't performed during other processes. CPUs/GPUs can "soft-fail" during gameplay, which sometimes results in a brief stutter but no errors, BSODs or exits to desktop if the CPU/GPU recovers. I agree with the idea of a soft-fail to keep the system running, but the user should be notified that the CPU is failing.
 
I don't believe this is the issue here. There is nothing to suggest that something is wrong with the design of those CPUs that makes them fail at shader compilation. People run them without any issues. Those who had issues had an increasing failure rate, and some who RMA'd their CPU saw those issues go away for a while after just changing the CPU. The problem is not that the 13900Ks and 14900Ks are not up to the task when it comes to running games. They are perfectly fine, but they act like they have a really bad OC and degrade over time like bad OCs do.

This is [H]ardOCP. It has been an overclocking site since the beginning. Many, especially the old-timers here, recognize the symptoms and the fixes already. Some of the universal truths in OC are:
- All chips degrade over time due to electromigration, but you most likely won't notice it during the lifetime of the chip.
- If you OC the chip, it might degrade faster, but within reason, you most likely won't notice it during the lifetime of the chip.
- A bad OC can degrade the chip so fast that you will notice it during the lifetime of the chip. Over time, you might then need to increase voltage to maintain clocks, decrease heat to maintain clocks, or clock the chip down to achieve stability at the same voltage.
- A really bad OC can cause shorts and make the chip fail during its lifetime.

If you want to read about what electromigration is and what it does to the chip, here is a good recent link:
https://semiengineering.com/electromigration-concerns-grow-in-advanced-packages/

As said, the 13900K and 14900K are perfectly capable CPUs. They don't have an inherent flaw that makes them incapable of shader compilation. Binned CPUs can maintain higher clocks at lower voltage, and though heat can increase electromigration, most people cool them properly. However, the silicon is already pushed a bit by Intel, and motherboard vendors pushed it even further. During shader compilation, the CPU does those short boosts with higher current and clocks. You can have limits put in place, but MB manufacturers do not seem to respect them. Here is a link about those limits:
https://edc.intel.com/content/www/c...heet-volume-1-of-2/004/package-power-control/

It looks like they also increased PL3 to reach 6.2 GHz for some milliseconds on the 14900KS.
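
To show roughly how the long-term and short-term limits interact, here is a toy model, not how the hardware is actually implemented. It assumes power may go up to PL2 as long as a running average stays below PL1, and is held to PL1 after that. The function name, the averaging scheme, the 1-second step and the time constant are all illustrative assumptions, and the limit values are just example inputs.

```python
def apply_power_limits(requested_w, pl1_w, pl2_w, tau_s, dt_s=1.0):
    """Toy model: allow up to PL2 while the running average of granted
    power is below PL1, otherwise cap at PL1."""
    alpha = dt_s / tau_s  # weight of the newest sample in the running average
    average = 0.0
    granted = []
    for want in requested_w:
        cap = pl2_w if average < pl1_w else pl1_w
        power = min(want, cap)
        average += alpha * (power - average)
        granted.append(power)
    return granted


# A workload that asks for 300 W on every step, with example limits.
print(apply_power_limits([300] * 10, pl1_w=125, pl2_w=253, tau_s=4))
```

In this model the chip gets a short window at PL2 before it is pulled back to PL1. If a board sets both limits to something huge, neither cap ever kicks in, which is the situation being complained about in this thread.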

So what people suspect, since the symptoms and fixes are the same, is that there is a bad default OC on the MB for the 14900 and 13900 chips, causing the CPUs to degrade rapidly (some experience issues after 1 month, some faster, some slower; I guess ASIC quality and cooling can vary). High current and high heat cause more electromigration, while high voltage can cause oxide breakdown. If MB manufacturers push both above the limit, it can shorten the lifespan of your CPU.
People who get the problems lower their clocks, and then things are stable again, just like what you need to do if an OC gets unstable. Or you can increase voltage if heat is under control. Still, you might have an unhealthy OC from your MB manufacturer and have only temporarily fixed the issue.
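
For reference, the textbook relation usually cited for the current and heat dependence of electromigration is Black's equation, which estimates the mean time to failure of an interconnect. It is a general reliability model and not something Intel or anyone in this thread has published for these chips:

```latex
\mathrm{MTTF} = \frac{A}{J^{n}} \exp\!\left(\frac{E_a}{k\,T}\right)
```

Here A is a constant that depends on the material and geometry, J is the current density, n is an empirical exponent, E_a is the activation energy, k is Boltzmann's constant and T is the absolute temperature. Raising J or T both shorten the expected lifetime, and the temperature term is exponential.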

You can put in Intel's power limits and run them safely, like you suggested yourself earlier in this thread, while waiting for the BIOS update some already have in beta. :)
 