New multi-threading technique promises to double processing speeds

kac77

'SHMT' also sliced power usage by 51% compared to existing techniques

By Zane Khan Today 11:12 AM

Researchers at the University of California, Riverside developed a technique called Simultaneous and Heterogeneous Multithreading (SHMT), which builds on contemporary simultaneous multithreading. Simultaneous multithreading splits a CPU core into numerous threads, but SHMT goes further by incorporating the graphics and AI processors.
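In other words, instead of picking one device for the whole job, a single computation gets carved up and run on the CPU, GPU, and NPU at the same time. A toy Python sketch of the idea (the "kernels" here are just stand-ins; this is not the paper's actual runtime, which also has to deal with precision differences between the units):

```python
from concurrent.futures import ThreadPoolExecutor

def cpu_kernel(chunk):
    # stand-in for a scalar loop on a CPU core
    return [x * 2 for x in chunk]

def gpu_kernel(chunk):
    # stand-in for a data-parallel GPU kernel
    return [x * 2 for x in chunk]

def npu_kernel(chunk):
    # stand-in for a low-precision NPU/TPU op
    return [x * 2 for x in chunk]

def shmt_style_map(data):
    # Split ONE workload three ways and run the pieces at the same
    # time, instead of picking a single device for the whole job.
    third = len(data) // 3
    parts = [data[:third], data[third:2 * third], data[2 * third:]]
    kernels = [cpu_kernel, gpu_kernel, npu_kernel]
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(k, p) for k, p in zip(kernels, parts)]
        return [x for f in futures for x in f.result()]

print(shmt_style_map(list(range(12))))  # [0, 2, 4, ..., 22]
```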
 
Sounds like what Intel is doing with their new schedulers, too.
 
Nobody else is getting Itanium flashbacks? Just like this, it required a lot of heavy lifting on the compiler's part to get results, and that was one of the reasons Itanium died.

Yeah, except that if Itanium had gotten the magic compiler they were betting on, it would have sped up more general code than just "AI".
 
[Image: micro23-41-fig3.jpg (figure from the SHMT paper)]


Good luck to the people trying to make this work well, must be quite the challenge
 
The reason SMT works so well is that it's completely invisible to the program. It's just another core. If this requires ANY and I mean ANY consideration from ANY developer aside from the OS, then it's enterprise-only.
 
then it's enterprise-only.
Game development is quite hardcore; they jumped on GPUs hard, AVX-512, etc... look at what they can run on a PS4. It's so competitive, and engines like Unreal, COD's, whatever they use for GTA, etc. get so much budget and some of the best people in the world on them. Needing consideration from their team is not what would ever stop them, I think; what matters is how much trouble it creates and what percentage of the target audience has hardware that supports it. If the PS6 has something like this that needs code written a certain way to take advantage of it, the Gran Turismo 9 team will do it.

CUDA and many other uses of the GPU are not software-invisible, and they're quite common in user-facing software.

Especially with how big the performance difference between the TPU/GPU and the CPU can be for different operations, if you make them easier to use than they are now, not harder, it will be inviting.

And in a way, programmers needed to change their code to take advantage of computers becoming multithreaded; otherwise it only helped in the running-many-apps-at-the-same-time type of scenario (which is common on PC, but you get what I mean).
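To make that concrete, a minimal before/after sketch in Python (toy workload, nothing to do with SHMT itself):

```python
from concurrent.futures import ProcessPoolExecutor

def work(n):
    # CPU-bound toy task
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    jobs = [200_000] * 8

    # Before: a plain loop uses one core no matter how many exist.
    serial = [work(n) for n in jobs]

    # After: the code had to be restructured to express parallelism.
    with ProcessPoolExecutor() as pool:
        parallel = list(pool.map(work, jobs))

    assert serial == parallel
```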
 
I think that there is potential here. One of the big problems with new features is adoption. If everything is done through a virtualized hardware interface, then the actual underlying hardware could potentially change without affecting compatibility. Specific hardware is used if it exists; otherwise the task is passed on to the next best processor, or eventually to the CPU if nothing else can do it.

We just need to avoid proprietary standards that purposefully exclude certain brands of hardware and/or require unrealistic time/effort on the part of software developers to integrate.
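For what it's worth, here's roughly what that fallback could look like; every device and task name below is made up:

```python
# Pretend probe result; add "gpu" or "npu" here to see the fallback.
AVAILABLE = {"cpu"}

# Preferred device order per task type; the CPU is the last resort.
PREFERENCE = {
    "matmul": ["npu", "gpu", "cpu"],
    "raster": ["gpu", "cpu"],
    "branchy": ["cpu"],
}

def pick_device(task_type):
    # Walk the preference list and take the first device we have.
    for dev in PREFERENCE.get(task_type, ["cpu"]):
        if dev in AVAILABLE:
            return dev
    return "cpu"

print(pick_device("matmul"))  # "cpu" here; "npu" if one were present
```

The point is that the program asks for a capability, not a specific chip, so new hardware just slots into the preference list.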
 
We just need to avoid proprietary standards that purposefully exclude certain brands of hardware and/or require unrealistic time/effort on the part of software developers to integrate.
There is work going on there: general ways of describing a problem set, the assets available, and the cost of computations on each asset, and away we go.

It is, of course, a really hard problem to solve without a priori knowledge of the problem set.
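Something like this, as a sketch of what "cost of computations on that asset" could mean (all numbers invented purely for illustration):

```python
# (device, task kind) -> estimated cost; lower is better.
COST = {
    ("npu", "conv"): 0.7,
    ("gpu", "conv"): 1.0,
    ("cpu", "conv"): 12.0,
    ("gpu", "sort"): 4.0,
    ("cpu", "sort"): 2.0,
}

def schedule(task_kind, available):
    # Pick the cheapest device that is both known and present.
    candidates = [(cost, dev) for (dev, kind), cost in COST.items()
                  if kind == task_kind and dev in available]
    if not candidates:
        return "cpu"  # no cost data: fall back to the CPU
    return min(candidates)[1]

print(schedule("conv", {"cpu", "gpu"}))  # gpu: cheapest one present
print(schedule("sort", {"cpu", "gpu"}))  # cpu
```

And the hard part is exactly that a priori problem: those cost estimates are only as good as what you know about the workload in advance.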
 
Sounds like what AMD was trying to do with HSA on APUs, just going by the summary in the OP. Wouldn't surprise me if someone picked up the pieces and finally found something that worked, but otoh I'm sure there are still tradeoffs.
AMD actually was able to put out a driver on Linux that worked with the Fusion processors and Sea Islands GCN cards (Bonaire, Curacao, etc.). However, when Lisa Su came, she focused work on putting together Zen and eventually ROCm.
 
This is why you get the maximum threads available on your CPU and don't settle for "6 or 8 cores is enough".
 
This is why you get the maximum threads available on your CPU and don't settle for "6 or 8 cores is enough".
In general, that sounds like terrible advice, and here, sending work the CPU would have done to the NPU, TPU, or GPU instead could make it easier to maximise CPU threads, but it could also go the other way and make CPU multithreading performance less important.
 
Ready your PhysX Cards boys, we ride again!

It's easy to dismiss new tech so quickly, but just keep in mind that there was a time when even the GPU was a new thing with an uncertain future. Early first-person shooters (Doom, etc.) were played with all graphics rendered on the CPU.
 
Phones, laptops, and consoles sold with an NPU/TPU already present (when it isn't attached to the CPU as one of the cores) are also an easier entry than consumers buying a discrete card like the physics-card attempts of the past; GPUs and sound cards worked well that way for a while.

We could be in a situation where every customer already has it, without the chicken-and-egg race of "we need games that use the PhysX card for people to buy them, but we need people to have PhysX cards for games to put effort into it."

About all Apple devices and AMD/Intel laptops will have them, I imagine soon Intel desktop CPUs will too, everyone has a GPU, and consoles could simply ship with them.
 
Looking forward to the Intel implementation....
Process 4444, core 22: save cache data
Process 4445, core 36: requests load of cache data
Hypervisor: check privilege level for process 4445... NAAA, just give 'em the data, checks take too long.
 