Intel Arrow Lake to remove Hyper-Threading?

Lakados

https://www.tweaktown.com/news/9597...threads-no-hyper-threading-support/index.html
https://www.hardwaretimes.com/intel...able-units-why-hyper-threading-is-going-away/

This is just cool, I have no idea how well it will or won't work but I'm interested.

Instead of jobs being sent into the CPU with a simple "OK, this is a big one, send it to the P core" or "this is a little one, off to the E core with it," each job is broken up into multiple smaller parts that are dispatched to whichever portion of the CPU is available and best equipped to deal with them.
Based on the patent diagrams in the Rentable Units articles, if it works as intended it is both more flexible than Hyper-Threading and does a better job of filling out the cores with incoming workloads, while also simplifying the actual CPU cores at the expense of a more complex scheduler. Removing Hyper-Threading will also address the numerous holes that security teams keep poking in it as they find new and more complicated ways of exploiting it on AMD and Intel alike. Obviously it will introduce some new ones, but hey, at least they will probably take a few years to be found.
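
A toy sketch of the idea described above, in Python. This is not how Rentable Units actually work (the articles and patent do not give that level of detail). It is just a greedy model in which a job is cut into chunks and each chunk goes to whichever unit would finish it first. The unit names, the speeds, and the `dispatch_chunks` function are all made up for illustration.

```python
def dispatch_chunks(total_work, chunk_size, unit_speeds):
    """Greedy toy scheduler (hypothetical, not Intel's design).

    Splits a job into chunks and gives each chunk to the unit that would
    finish it earliest. Returns the time at which the last chunk finishes.
    """
    # Split the job into chunks; the last chunk may be smaller.
    chunks = []
    remaining = total_work
    while remaining > 0:
        size = min(chunk_size, remaining)
        chunks.append(size)
        remaining -= size

    # Time at which each unit becomes free again.
    free_at = {name: 0.0 for name in unit_speeds}
    for size in chunks:
        # Pick the unit with the earliest completion time for this chunk.
        best = min(unit_speeds, key=lambda n: free_at[n] + size / unit_speeds[n])
        free_at[best] += size / unit_speeds[best]
    return max(free_at.values())


# Assumed speeds: one fast "P" unit and two slower "E" units.
units = {"P0": 2.0, "E0": 1.0, "E1": 1.0}
whole_job = dispatch_chunks(8, 8, units)  # one big chunk lands on the P unit
split_job = dispatch_chunks(8, 1, units)  # eight small chunks spread out
print(whole_job, split_job)
```

In this model the split job finishes sooner because the E units stop sitting idle. The model ignores the cost of splitting and scheduling, which is exactly the part that will decide whether the real thing pays off.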
 
I don't know enough to speculate how, but I have a feeling this will make the 15th gen launch particularly interesting....
 
With their ability to pack in E-Cores and the updates to the Windows scheduler to support this, I can see them totally getting away from HT. It was a cool technology, and very helpful in the early 2000s when it came out. But nowadays it probably gets in the way more than it helps (while designing chips). I support this.

I remember the first CPU I had with HT (I think 3.06GHz P4 maybe), when it was just 1 core 2 threads. It was a massive increase when doing things as simple as multiple windows open at the same time.
 
I mean, I suppose with so many cores being packed onto a die these days, there's less need for HT. It was super useful back in the day because it could compensate a bit for a lack of cores, and it COULD make a measurable impact (you could see it in the usable longevity of the i7 4770K vs the i5 4670K, for example, the latter of which is a CPU I still own). But in recent times we've seen it come at a cost, both to single-core performance for things like gaming and in the security holes that have cropped up, as you pointed out. I'll wait for the reviews and benchmarks and everything as always, but I agree, I'll be interested in seeing how this plays out.
 
The second article has a great example of what hyperthreading actually does, and it makes sense.

When we only had 1 or 2 cores in a CPU, hyperthreading was a game changer, as it really allowed us to parallelize at the CPU level in a way that was not possible before.

Now though, it's a hindrance to functionality. Better off using that die space to make more physical cores per package. RU will be interesting to see play out. If it can give huge gains in IPC and use less power, it will be a game changer.
 
I've toyed with just turning SMT off on my 7800X3D before I even read this article. Just run 8 raw cores instead.

I hope AMD is doing this in coming gens. I'm sure AMD knew about Intel doing this long before any of us did and in response are researching it now in their labs.
 
Considering Intel's struggle with properly allocating work across P and E cores, this may be a bit of a stretch as far as positive architecture change goes.

Many operations will abstract themselves far enough away from the CPU instructions that some degree of optimization/scheduling has to occur at the OS level.

That said, a shift to a more flexible x86 could be beneficial, and recently hyperthreading hasn't been terribly essential. I ran a 10-core Xeon with no HT and it was perfectly capable.
 
The description from the article seems to be let's just schedule smaller time slices to utilize all cores a bit more? (And hope the scheduler doesn't eat too many cycles)
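
To put rough numbers on the "hope the scheduler doesn't eat too many cycles" worry, here is a back-of-the-envelope sketch in Python. It assumes every slice of useful work costs a fixed number of scheduling cycles. The function name and the cycle counts are hypothetical and do not come from the article or the patent.

```python
def useful_fraction(slice_cycles, sched_overhead_cycles):
    """Share of cycles spent on real work if every slice pays a fixed
    scheduling cost (a simplifying assumption, not a measured figure)."""
    return slice_cycles / (slice_cycles + sched_overhead_cycles)


# Hypothetical fixed overhead of 100 cycles per slice.
for slice_cycles in (10_000, 1_000, 100):
    share = useful_fraction(slice_cycles, 100)
    print(f"{slice_cycles:>6} cycle slices -> {share:.1%} useful work")
```

Under that assumption, smaller slices spread work more evenly but hand a growing share of cycles to the scheduler, so whatever is gained from better core utilization has to exceed that loss.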

I think we're going to have to wait and see what it does when it comes out.

I wouldn't be surprised if hyperthreading is on the way out. The security implications are hard to mitigate without tanking performance, and the interactivity benefits we all saw when going from 1 core to 2 hyperthreads aren't necessarily there at 4 cores. Especially when you can take the die space for 1 P core and use it for multiple E or C cores.
 
Interesting.

So is this like one big shared decode unit, that decodes everything and sends the micro-ops to the individual cores?
 
That’s how my brain wrapped around it.
No clue if it’s a gross simplification of the tech but it’s what I’m going with.
It was that or something more akin to how a GPU scheduler works, and I wonder if this revelation came as a result of Intel's work with GPUs?
 
If this can advance their chips in other ways and make them more highly optimized, then by all means, trim the fat. Intel has been doing some good things lately, but the 14th gen stuff has stagnated.
 
I bet this could be really useful with chiplets.... :p
 
Most gamers turn it off to see if it will boost performance in games. So yeah not a new idea but new at a hardware level of just leaving it off.
Do you realize that they are not just turning off Hyperthreading but instead replacing it with an entirely different scheduler that will break the incoming instructions down into smaller parts and batch them across the registers in a much more efficient method?
 
I bet this could be really useful with chiplets.... :p
Given their Tile strategy and their OneAPI framework, yeah.
Who knows, it may be what makes the next Xbox a serious unit. Word on the street is Microsoft is seriously considering Intel for the next one.
 
With a big core and small core on 1 cpu package I am sure it was causing issues for Intel. But AMD may follow suit as well, as relying on Microsoft to use the cores properly is not exactly ideal.
 
Seems so. So it's not so much HT going away as a much more extreme version of it being implemented.

Could also be a gradual move towards a new RISC-based ISA.

Keep the master x86 decode unit separate, have RISC-like cores, and when ready, make the move: either make the decode unit optional, or just remove it :p
 
Do you realize that they are not just turning off Hyperthreading but instead replacing it with an entirely different scheduler that will break the incoming instructions down into smaller parts and batch them across the registers in a much more efficient method?
Makes you wonder whether a later step could be that the units offered for rent eventually include a GPU or another one of those tiles, with some instructions not minding running on those from time to time, or even preferring it.
 
Seems quite general:
, the instructions including parallel threads; and processor circuitry including one or more of: at least one of a central processor unit, a graphics processor unit, or a digital signal processor, ...
In some examples, the processor circuitry 912 of FIG. 9 may be in one or more packages. For example, the microprocessor 1000 of FIG. 10 and/or the FPGA circuitry 1100 of FIG. 11 may be in one or more packages. In some examples, an XPU may be implemented by the processor circuitry 912 of FIG. 9, which may be in one or more packages. For example, the XPU may include a CPU in one package, a DSP in another package, a GPU in yet another package, and an FPGA in still yet another package.



https://www.freepatentsonline.com/y2023/0168898.html
 
Do you realize that they are not just turning off Hyperthreading but instead replacing it with an entirely different scheduler that will break the incoming instructions down into smaller parts and batch them across the registers in a much more efficient method?

I will wait for reviews to see if that is actually how it works. If so, then we should see more than a 5% uplift in performance. I am all for a hardware-based scheduler if that's what they are truly doing, just skeptical of the performance uplift.
 
Did you see a difference in gaming with it on or off for the 7800X3D?
 
I have it disabled on my Threadripper 3960X.

Turning off SMT does reduce performance in highly threaded loads, but it also greatly improves performance in some titles and programs that get confused by very large numbers of threads (Starfield was an example of that).

Having SMT off also reduces temps a good deal, which in some circumstances may allow for higher turbo clocks.
 
AMD hasn't even gone big little core design like Intel, and still has superior power consumption. I wouldn't be shocked if they just continue SMT and still beat Intel in IPC.
 
AMD has laptop/APU chips with a mix of Zen4 and Zen4c, which is sort of BIG.little, although there's no difference in capabilities between the two core types, just clocks are limited in exchange for less die space. SMT for both though.
 
I turned SMT off. It's back on now. I can't tell a difference in games with it on. I'd rather have the full power of my CPU for now.
 
Only time I have noticed a real difference has been on my Threadripper.

Some titles (and especially 3DMark) seem to get really confused on large core count systems with SMT enabled.
 
Yeah, I used to have a Threadripper. Long since sold it. I used Process Lasso to make things easier on those chips.
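
Process Lasso is a Windows tool. For anyone who wants to try something similar on Linux without switching SMT off in the BIOS, the sketch below pins a process to one logical CPU per physical core. It assumes a Linux system that exposes `thread_siblings_list` under sysfs. The helper names are made up for illustration, and restricting affinity is only an approximation of disabling SMT, since other processes can still run on the sibling threads.

```python
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0,8' or '0-3' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def one_cpu_per_core(sibling_lists):
    """Keep the lowest-numbered logical CPU from each group of SMT siblings."""
    return {parse_cpu_list(s)[0] for s in sibling_lists if s.strip()}


def pin_to_physical_cores(pid=0):
    """Linux only: restrict pid (0 = this process) to one logical CPU per core."""
    allowed = os.sched_getaffinity(pid)
    lists = []
    for cpu in allowed:
        path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
        with open(path) as f:
            lists.append(f.read())
    # Stay within the CPUs we were already allowed to use.
    chosen = one_cpu_per_core(lists) & allowed
    os.sched_setaffinity(pid, chosen or allowed)
```

Calling `pin_to_physical_cores()` at the start of a script affects that process and any children it launches afterwards, which makes it easy to compare a game or benchmark with and without its sibling threads.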
 
Considering Intel's struggle with properly allocating work across P and E cores, this may be a bit of a stretch as far as positive architecture change goes.

Intel is not responsible for the Windows task scheduler. It’s a hundred percent on Microsoft.

AMD hasn't even gone big little core design like Intel, and still has superior power consumption. I wouldn't be shocked if they just continue SMT and still beat Intel in IPC.

AMD’s got multiple Zen5+Zen4c designs in the works. But both core designs will still have two-way hyperthreading, so an 8+8 chip will have 32 threads.

IIRC what will make Zen4c different from Zen4 will be a reduced instruction set, and maybe less cache?
 
AMD also has a solid two-generation lead on fabrication process.
A rose by any other name is still a rose, and 10nm by any other name is still 10nm. TSMC 5/4 is very far ahead of Intel 7 in performance and power.
 