13900K - Out of Video Memory or BSOD during shader compilation in two UE5 games

Interesting. I've not had any crashes lately, but following the advice above, I fired up "The Last of Us: Part 1" and let it do shader comp (haven't played this game since it first came out on PC, lol). Ran fine for a few minutes, all good, CPU never went over 92°C.

Then it just stopped:

WER Fault TLOU.png


I then immediately set Prime95 going and have had no crashes or errors for a good 15 minutes now, despite hitting 100°C, throttling, and drawing 260 W compared to 200 W for TLOU shader compiling.
 
Just run DX12 games that compile shaders before loading the game (or loading a saved game) rather than compiling them during gameplay. Good examples are The Last of Us and Horizon Forbidden West.

The thing is, you shouldn't have to do that with a $949 CPU (14900KS) or any CPU. It should just work, indefinitely, at stock settings or even with a mild overclock. People have chips from 40 years ago that still work. When you have computer problems, it's almost never the CPU.

I am still glad I can stabilize my CPU after watching that video!

Yeah, but if you're having that problem, that means you've already suffered degradation, which could be caused by a defect in the chip. Who's to say it doesn't come back in a few months or a year? If I were you I'd be looking to get a replacement while you still can. Otherwise you might end up getting $15 in a class action lawsuit in a year or two and be stuck with a processor that randomly crashes and has to be run underclocked. I mean, I have OCD, but just having to run underclocked alone would bug the crap out of me, especially when you paid whatever you paid to have one that's supposed to run at "whatever" frequency.
 
You're mostly right, but it is not the chip's fault if motherboard makers decide to abuse power limits and fry it, be it slowly or quickly, just to be ahead of the competition. Whether at stock defaults or overclocked, there are safe intervals, and once you go past those, you can expect your CPU to degrade quickly. Intel needs to come out with an official statement and at least blame motherboard makers or clarify. I bet it knew about motherboard makers' practices and just didn't inform the public.

I still don't know why these instability reports began surfacing in such high volumes only in 2024, both for those who purchased Raptor Lake recently and for those who purchased it a year ago. I keep thinking it's motherboard firmware or microcode. What else can it be?
 

CPUs have all kinds of instructions and functions. Shader compilation is a specific process, and Prime95 probably doesn't stress the same areas of the CPU that shader compilation does. It is still not fully known why it is this process that creates instability and how exactly Windows 8 compatibility mode affects it (for some games). That is also why I hope a microcode or firmware update with a fix or semi-fix is a possibility. Vulkan shader compilation isn't affected by this issue. Why not? Many questions remain...
 
So if shader compilation is crashing on all CPUs, where is the fault? I bought a 14900K; tomorrow I will install some game with shader compilation and check if it crashes.
On the 13900K at stock I had crashes or BSODs during the first shader compilation in Remnant 2 and Lords of the Fallen.
 
Go into your BIOS and turn off the board's "enhancements".
 

The settings applied likely weren't actually stock if the motherboard was the one making the decision. Motherboard makers load their own "optimized defaults", which aren't Intel defaults and aren't safe. It is hard to say whether the CPU is defective or not if motherboards abuse power settings to stay competitive. If I were you, I wouldn't put any new CPUs into your motherboard for now. I'd use the possibly defective CPU until there is more information and possibly a firmware/BIOS update to at least mitigate this issue a bit. Replace the CPU (if defective) afterwards. If you can't wait for whatever reason and have to install a new CPU now, then at least use safe Intel settings (without overclocking for now), which require manual adjustments rather than the motherboard's "Optimized Defaults":
- Power Limit 1 - do not exceed 253W until further notice from motherboard makers and Intel
- Power Limit 2 - do not exceed 253W until further notice from motherboard makers and Intel
- Power Current - do not exceed 188W until further notice from motherboard makers and Intel
- SVID / CPU Load Lite - do not use anything other than "Intel Default"

CPU Vcore voltages have not been shown to be the cause of shader compilation instability, but you should still use a manual override because some motherboard enhancements can decide to pump 1.4 V into Raptor Lake CPUs. Make sure to monitor clocks and temperatures. It is normal for shader compilation to make the CPU temperature spike to the 100°C TjMax, but it shouldn't be crashing.
 
But I have a new mobo too: Z790 Aorus Elite X WiFi7.
When I set up the new PC I will test whether it crashes on first-time shader compilation.
 
I would follow their advice. Performance degradation on CPUs due to electromigration is common (every overclocking site has had those discussions about OC and how long the CPU can last), and many suspect that is what this is. You have a CPU that has been pushed hard by Intel, then pushed even further by MB makers. Those CPUs have increased boost clocks and also increased boost duration. This means that you can get very rapid aging of the CPU. If you run shader compilation, you will see in HWMonitor, HWiNFO or any other CPU monitoring software that you get those big bursts when it compiles shaders.

Better to set it to Intel's defaults while waiting for a BIOS fix. Better to have a long-lasting, stable system than a short-lived benchmark queen.
 
Researched this a bit more. It looks like shader compilation performs thorough integrity checks that aren't performed during other processes. CPUs/GPUs can "soft-fail" during gameplay, which sometimes results in a brief stutter, but without errors, BSODs or exits to desktop if the CPU/GPU recovers. I agree with the idea of a soft-fail to keep the system running, but the user should be notified that the CPU is failing.
 
I don't believe this is the issue here. There is nothing to suggest that something is wrong with the design of those CPUs that makes them fail at shader compilation. People run them without any issues. Those who had issues had an increasing failure rate, and some who RMA'd their CPU saw those issues go away for a while after just changing the CPU. The problem is not that the 13900K and 14900Ks are not up to the task when it comes to running games. They are perfectly fine, but act like they have a really bad OC and degrade over time like bad OCs do.

This is [H]ardOCP. It has been an overclocking site since the beginning. Many, especially the old-timers here, recognize the symptoms and the fixes already. Some of the universal truths in OC are:
- All chips degrade over time due to electromigration, but you most likely won't notice it within the lifetime of the chip.
- If you OC the chip, it might degrade faster, but within reason, you most likely won't notice it within the lifetime of the chip.
- A bad OC can degrade the chip so fast that you will notice it within the lifetime of the chip. Over time, you might need to increase voltage to maintain clocks, decrease heat to maintain clocks, or clock the chip down to achieve stability at the same voltage.
- A really bad OC can cause shorts and make the chip fail within its lifetime.

If you want to read what electromigration is and what it does to the chip, here is a good recent link:
https://semiengineering.com/electromigration-concerns-grow-in-advanced-packages/

As said, the 13900K and 14900K are perfectly capable CPUs. They don't have an inherent flaw that makes them incapable of shader compilation. Binned CPUs can maintain higher clocks at lower voltage, and though heat can increase electromigration, most people cool them properly. However, the silicon is already pushed a bit by Intel, and motherboard vendors pushed it even further. During shader compilation, the CPU does those short boosts with higher current and clocks. Limits can be put in place, but MB manufacturers do not seem to respect them. Here is a link about those limits:
https://edc.intel.com/content/www/c...heet-volume-1-of-2/004/package-power-control/

Looks like they also increased PL3 to reach 6.2 GHz for some milliseconds on the 14900KS.

So what people suspect, since the symptoms and fixes are the same, is that there is a bad default OC on the MB for the 14900 and 13900 chips, causing the CPUs to degrade rapidly (some experience issues after 1 month, some faster, some slower; I guess ASIC quality and cooling can vary). High current and high heat cause more electromigration, while high voltage can cause oxide breakdown. If MB manufacturers push both above the limit, it can shorten the lifespan of your CPU.
People who get the problems lower their clocks, and then things are stable again. That's just what you need to do when an OC gets unstable. Or you can increase voltage if heat is under control. Still, you might have an unhealthy OC from your MB manufacturer and have only temporarily fixed the issue.

You can put in Intel's power limits and run them safely, like you suggested yourself earlier in this thread, while waiting for the BIOS update some already have in beta. :)
 
Intel issues preliminary statement (https://www.igorslab.de/en/intel-re...ion-k-sku-processor-instability-issue-update/):
Intel® has observed that this issue may be related to out of specification operating conditions resulting in sustained high voltage and frequency during periods of elevated heat.
Analysis of affected processors shows some parts experience shifts in minimum operating voltages which may be related to operation outside of Intel® specified operating conditions.

While the root cause has not yet been identified, Intel® has observed the majority of reports of this issue are from users with unlocked/overclock capable motherboards.
Intel® has observed 600/700 Series chipset boards often set BIOS defaults to disable thermal and power delivery safeguards designed to limit processor exposure to sustained periods of high voltage and frequency, for example:
– Disabling Current Excursion Protection (CEP)
– Enabling the IccMax Unlimited bit
– Disabling Thermal Velocity Boost (TVB) and/or Enhanced Thermal Velocity Boost (eTVB)
– Additional settings which may increase the risk of system instability:
  – Disabling C-states
  – Using Windows Ultimate Performance mode
  – Increasing PL1 and PL2 beyond Intel® recommended limits

Intel® requests system and motherboard manufacturers to provide end users with a default BIOS profile that matches Intel® recommended settings.

Intel® strongly recommends customer’s default BIOS settings should ensure operation within Intel’s recommended settings.
In addition, Intel® strongly recommends motherboard manufacturers to implement warnings for end users alerting them to any unlocked or overclocking feature usage.

Intel® is continuing to actively investigate this issue to determine the root cause and will provide additional updates as relevant information becomes available.

Intel® will be publishing a public statement regarding issue status and Intel® recommended BIOS setting recommendations targeted for May 2024.

Is there an official chart/list of Intel's "Power Delivery" profiles for each Raptor Lake CPU model for power current (ICCMax)? Wikipedia only publishes PL1 and PL2 limits, not power current (ICCMax) and Intel's official PDF is confusing - https://cdrdv2.intel.com/v1/dl/getContent/743844 .
 
A bit off-topic, but how is Turbo Boost different from manual all core OC with Turbo Boost disabled? In either case safe settings make CPU core clocks fluctuate.
 
Hi. Today I launched Lords of the Fallen and during shader compilation it crashed, with a WHEA-Logger entry in the event log. Should I worry? Otherwise everything is stable: Cinebench, games, etc. It only happens during first-time shader compilation. Screenshot: Internal Parity Error
sh.jpg


I have a 14900K at stock, 2x16GB DDR5 6800, RTX 4090, Seasonic PX 1600
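
For anyone who wants to check their own event log for the same WHEA-Logger entries without clicking through Event Viewer, here is a rough Python sketch. It assumes Windows with the built-in `wevtutil` tool, and it assumes the text output starts each event with an `Event[n]` header line. The function names are made up for illustration.

```python
import subprocess

# Provider name under which WHEA events show up in the Windows System log.
WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(max_events=20):
    """Build a wevtutil command that lists the newest WHEA-Logger events."""
    xpath = f"*[System[Provider[@Name='{WHEA_PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",
        "/f:text",            # plain-text output
        f"/c:{max_events}",   # limit the number of events returned
        "/rd:true",           # newest events first
    ]


def count_events(text_output):
    """Count events in wevtutil text output (assumes 'Event[n]' headers)."""
    return sum(
        1 for line in text_output.splitlines()
        if line.strip().startswith("Event[")
    )


def print_recent_whea_events(max_events=20):
    """Run the query (Windows only) and print the raw events plus a count."""
    result = subprocess.run(
        build_whea_query(max_events), capture_output=True, text=True
    )
    print(result.stdout)
    print("WHEA events found:", count_events(result.stdout))
```

Calling `print_recent_whea_events()` right after a shader compilation run should show whether new entries were logged during it.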
 
Today I downloaded Remnant 2, and I can't play. Before the main menu an OUT OF VIDEO MEMORY error pops up. On the next launch I started a new game and got the same crash again. When I finally got a new game started, it crashed to desktop after 5 minutes, and again after another 10 minutes.
It's the only game giving me these issues.

Screenshot, also with WHEA-Logger events:
sh.jpg

I haven't updated my BIOS. But in the latest BIOS I see this line:

https://wccftech.com/gigabyte-baseline-gaming-stability-bios-option-turns-intel-14th-13th-gen-core-i9-cpus-into-core-i7-multi-thread-gaming-performance-loss/


"Intel Baseline" BIOS option. How much performance (in %) could I lose if I use this? Will the loss be noticeable?

BaseLine-vs-Auto-CPU-Performance-In-Cyberpunk-2077.jpg

I have a 14900K at stock with an NZXT Kraken Elite 360, 2x16GB DDR5 6800, RTX 4090, Seasonic PX 1600, 2TB SSD, Aorus Z790 Elite X.

Sorry for my bad language, I'm so nervous.
 
I'm not sure about that game, but you can see the Gigabyte boards went ultra-conservative on the Intel Baseline specs, and the performance loss is measurable. The site you linked had data similar to other data I've seen posted.

The difference in other games you play will vary. You may not see any real difference in GPU-dependent games, at high resolutions, with variable sync, etc.

For the one you mentioned, going from nonplayable to playable is like a 100% improvement.
 
So I turned off the Steam FPS counter and ran the Remnant 2 x64 shipping exe in Binaries as admin. No WHEA-Logger events while loading levels and no crashes. Try it. Maybe I should test longer, but it hasn't crashed since. Still, I think it will crash again sooner or later. OK, I will try updating the BIOS first.

https://www.gigabyte.pl/products/page/mb/Z790-AORUS-ELITE-X-WIFI7-10/support#support-dl

[F6g] = newest

  1. Checksum : D7CB
  2. Optimize CEP and power settings
    https://www.gigabyte.com/Press/News/2156
  3. Processor support and optimization for i9-14900KS
  4. Update Intel APO (DTT) framework version to 9.0.11405.42569
  5. Add Turbo Power Limits : Intel BaseLine support for 13th/14th GEN K-SKU CPU
 
Guys, I don't know what's going on. Launching Hogwarts Legacy with MSI Afterburner running generates WHEA-Logger warnings in the event log during shader compilation. I am checking that.
When I turn off MSI Afterburner, the shader process completes without WHEAs. Any ideas why?

I'm still on the stock BIOS.
 
OK, it rebooted this time, lol. I guess it was a BSOD. It happened when I was joining a co-op match in this game. So it's time to update the BIOS.
 
Someone said this:

"I strongly recommend this guide:

https://wccftech.com/asus-intel-bas...te-14th-13th-gen-cpu-gaming-stability-issues/

This has solved my problems with Unreal Engine games!

BUT:
14900KS SP111 / Asus Z790 Apex (white) / BIOS 2002

CB23: 41,628 points
CB23: 31,462 points (according to Intel defaults)

About 10,000 points of performance lost!

But one option brought the performance back:

I have now only set the BIOS feature "IA CEP" to disabled, with all other settings as in the guide,
and thus reach 41,297 points!

Here is more information about this "IA CEP" feature:

https://rog-forum.asus.com/t5/techn...h-gen-non-k-processors-to-enhance/ba-p/996643

I would first recommend the settings as in the guide; if it runs stable, then you can deactivate IA CEP.

With this guide, games like Horizon Dawn run just as fast at Intel specifications as before "unlocked".

Edit:
I have to select "Intel's Fail Safe" instead of AUTO for SVID Behavior in the BIOS,
then RoboCop: Rogue City will also run! (With AUTO the game crashes.)"

So what now? Update the BIOS first, and then what? Update the BIOS, enable the Intel Baseline profile, and then disable CEP? I have a Gigabyte Z790 Aorus Elite X WiFi7 board.
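
To put the Cinebench numbers quoted above into percentages, here is a quick Python calculation. The three scores are the ones from the quoted post; the helper name is just for illustration.

```python
def percent_loss(baseline, measured):
    """Return how much lower `measured` is than `baseline`, in percent."""
    return (baseline - measured) / baseline * 100.0


unlocked = 41628         # CB23 with the board's unlocked settings
intel_defaults = 31462   # CB23 with Intel default settings
cep_disabled = 41297     # CB23 with the guide's settings and IA CEP disabled

print(f"Intel defaults:           {percent_loss(unlocked, intel_defaults):.1f}% lower")
print(f"Guide + IA CEP disabled:  {percent_loss(unlocked, cep_disabled):.1f}% lower")
```

That works out to roughly a 24% drop with the Intel defaults and under 1% with IA CEP disabled, for this one chip and this one benchmark.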
 
This lowers the max power usage to the actual power usage of a 13600K/14600K. Of course it's going to be a huge performance loss. This will only actually be fixed with a microcode update. Until then, all of these "fixes" are wild swings that go waaaay around the issue, maybe.

IIRC, a 13900K/14900K needs something like 210 watts to not really lose much performance in most games. But something like Starfield or Cyberpunk (which are CPU/thread heavy) might still need more.

More wild swinging. CEP really shouldn't ever be disabled, unless you are purposefully pushing a CPU really hard and either want an overclock record, or don't care about damaging the CPU.

It seems to me that in this situation you have created, turning off CEP is allowing the CPU to sidestep many of the limits you have otherwise set.
 
So thank you for the reply. Really grateful, bro. My issue in Remnant 2 happens during transitions, I mean when loading new levels or joining co-op: it throws an OUT OF VIDEO MEMORY error or, rarely, a BSOD, but in most cases it's a WHEA-Logger entry in the event log.
So I guess I need to update the BIOS first?
 
What fixes? You mean the PL1/PL2 power limits or that Intel Baseline profile (which degrades performance)?
I don't pretend to be an expert. But I have had a few Intel systems over the past 4 years.

If I were having problems right now, I would update to the newest BIOS and use the new "baseline profile" or whatever each brand is calling it. That should disable most of their auto-enhancement features, the auto-disabling of protections like CEP, etc.

Then, I would look at the Intel turbo/power spec for my cpu. Here is an example from the articles you linked:

  • Intel Standard Config (125W): 125W/253W/307A
  • Gigabyte BaseLine Profile: 125W/188W/249A
I think that's for a 14900k.

So the Intel spec is 125w for long duration and 253w for short duration.

Often, people will set the long duration to be 253w (253w/253w). This will improve your multicore performance and Cinebench score. And before these recent problems, it was totally safe. You just need decent cooling.

(There is also a setting for how long the "short" duration is. Intel spec is pretty low, like 45 seconds or something. It's short enough that you can watch the clock speed drop while running Cinebench.

Often people will max out the time limit to something like 400 seconds, which is effectively unlimited turbo time.)
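
To make the long/short duration idea a bit more concrete, here is a toy Python model. It is only a sketch under my own assumptions: the CPU is treated as tracking a moving average of package power, drawing up to the short-duration limit while that average is below the long-duration limit, and the time setting is treated as the averaging window. Real firmware is more involved, and the function name and parameters are made up.

```python
def turbo_seconds(pl1, pl2, tau, demand, dt=0.1, max_time=3600.0):
    """Seconds a sustained load can stay above pl1 in this simplified model.

    avg is an exponentially weighted moving average of package power with
    time constant tau (seconds). While avg is below pl1, the CPU may draw
    up to pl2. Returns max_time if the load is never cut back.
    """
    avg = 0.0   # start from an idle average
    t = 0.0
    while t < max_time:
        power = min(demand, pl2)
        if power <= pl1:
            return max_time   # load never exceeds pl1, so nothing to cut back
        avg += (power - avg) * dt / tau
        t += dt
        if avg >= pl1:
            return t
    return max_time


# Numbers from the posts above: 125w/253w with a ~45 s window vs a 400 s window.
print(turbo_seconds(pl1=125, pl2=253, tau=45, demand=253))
print(turbo_seconds(pl1=125, pl2=253, tau=400, demand=253))
# 253w/253w: the load never exceeds the long-duration limit.
print(turbo_seconds(pl1=253, pl2=253, tau=45, demand=253))
```

In this model the time above the long-duration limit comes out somewhat shorter than the window itself, and raising the window to 400 seconds stretches it to several minutes, which lines up with the clock speed dropping partway through a Cinebench run on the shorter setting.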

For gaming, 125w/253w and 253w/253w usually performs almost exactly the same. If your main focus is gaming, I would compare them.


For the current numbers (307A or 249A): in some workloads it can make a big difference; in some it will make no difference.

I would try the Intel spec first, and only try 249A, if I still experienced errors and crashing.
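
If it helps to keep the three numbers straight, here is a small Python sketch that compares a set of BIOS values against a reference profile. The two profiles use the figures quoted above (which I think are for a 14900K); the class, field and function names are made up for illustration, and you would have to read your own values out of the BIOS by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerProfile:
    name: str
    pl1_w: int      # long-duration power limit, watts
    pl2_w: int      # short-duration power limit, watts
    iccmax_a: int   # current limit, amps


# Figures as quoted in the thread.
INTEL_STANDARD = PowerProfile("Intel Standard Config (125W)", 125, 253, 307)
GIGABYTE_BASELINE = PowerProfile("Gigabyte BaseLine Profile", 125, 188, 249)


def exceeded_limits(settings, reference):
    """Return the field names where `settings` goes above `reference`."""
    return [
        field
        for field in ("pl1_w", "pl2_w", "iccmax_a")
        if getattr(settings, field) > getattr(reference, field)
    ]


# Example: the common 253w/253w setup checked against the Intel spec.
my_bios = PowerProfile("my BIOS (example values)", 253, 253, 307)
print(exceeded_limits(my_bios, INTEL_STANDARD))      # ['pl1_w']
print(exceeded_limits(my_bios, GIGABYTE_BASELINE))   # ['pl1_w', 'pl2_w', 'iccmax_a']
```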


_____________

For educational purposes:

What I used to do with the 10700, 11700, 12700K and 13600K is use Intel settings, then set unlimited turbo time, then set the power limit way higher than needed (400w/400w), then run Cinebench for 5 minutes and see what the actual power use is. (For a 13600K, it's somewhere around 180 watts.)

Then I would make that my max power limit.

The only game I ever saw 180 watts in was Starfield.
Running the 13600K with no hyperthreading and only 2 E-cores lost only 3-5 fps and used about 105 watts.
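
The "measure first, then set the limit" approach above could be sketched like this in Python. It assumes you have logged package power readings (in watts) during the Cinebench run with whatever monitoring tool you use; the sample values below are invented, and the warm-up skip and optional margin are my own additions.

```python
def suggested_power_limit(samples_w, margin_w=0.0):
    """Suggest a power limit from package-power samples logged under load.

    Skips the first 10% of samples as warm-up, then takes the highest
    remaining reading and adds an optional margin.
    """
    if not samples_w:
        raise ValueError("no samples")
    warmup = len(samples_w) // 10
    steady = samples_w[warmup:]
    return max(steady) + margin_w


# Invented readings, roughly in the range mentioned for a 13600K.
log = [100, 178, 180, 179, 181, 180, 179, 180, 180, 181]
print(suggested_power_limit(log))              # 181.0
print(suggested_power_limit(log, margin_w=5))  # 186
```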
 
So PL1 253W, PL2 253W, ICCMax 307A?


Ah, by the way, is CPU CURRENT LIMIT the same thing as ICCMax?

Also, I just checked my BIOS and it is F3, the oldest one. So I bet updating will help.
 