AMD Zen Rumours Point to Earlier Than Expected Release

The answer is in your own post. Most games today are GPU bound. Keeping that in mind, it means the CPU can take on more work and tasks, as I stated before. An infinitely fast GPU lets you see how much, or rather how far, the CPU can go in DX12. It's akin to running benchmarks at 800x600 to see what the CPU's max/average framerate is: not very practical in an age where 1080p is generally the bottom line for anyone playing games today.
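A rough way to picture it (all numbers below are made up, just to show the shape): the frame rate is set by whichever of the CPU or GPU takes longer per frame, so shrinking the GPU's share is what exposes the CPU's ceiling.

```python
# Toy frame-time model (illustrative numbers only, not measurements):
# the frame rate is set by whichever stage is slower.
def fps(cpu_ms, gpu_ms):
    return 1000.0 / max(cpu_ms, gpu_ms)

cpu_ms = 8.0  # assumed per-frame CPU cost (game logic + draw submission)
for res, gpu_ms in [("800x600", 3.0), ("1920x1080", 12.0), ("2560x1440", 20.0)]:
    bound = "CPU" if cpu_ms >= gpu_ms else "GPU"
    print(f"{res:>9}: {fps(cpu_ms, gpu_ms):5.1f} fps ({bound}-bound)")
# At 800x600 the GPU finishes first, so you see the CPU's ~125 fps ceiling;
# at 1440p the GPU dominates and the CPU headroom is invisible.
```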

Idk what you mean by 1080p; I'm assuming 16:9 1920x1080. I have a CRT monitor, so I use different resolutions for different games, because at lower resolutions the monitor can handle higher refresh rates. 2048x1536@85Hz is cool for Civ V and such; slower action games are better at 1600x1200@100Hz. Fast-paced games like CS or CoD are best at 1280x960@140Hz or 1152x864@160Hz.
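For what it's worth, the resolution/refresh trade-off on a CRT falls out of the monitor's fixed pixel-clock and horizontal-scan budgets. A quick sketch with assumed limits (the constants below are hypothetical, not the specs of any particular monitor):

```python
# Rough sketch of why a CRT trades resolution for refresh rate.
# Assumed monitor limits (hypothetical "high-end 21-inch CRT" territory):
PIXEL_CLOCK_HZ = 390e6           # max pixel clock
H_SCAN_HZ      = 140e3           # max horizontal scan rate
H_BLANK, V_BLANK = 1.32, 1.05    # typical blanking overhead factors

def max_refresh(w, h):
    total_w, total_h = w * H_BLANK, h * V_BLANK
    by_clock = PIXEL_CLOCK_HZ / (total_w * total_h)  # pixel-clock limit
    by_scan  = H_SCAN_HZ / total_h                   # line-rate limit
    return min(by_clock, by_scan)

for w, h in [(2048, 1536), (1600, 1200), (1280, 960), (1152, 864)]:
    print(f"{w}x{h}: ~{max_refresh(w, h):.0f} Hz max")
# Fewer lines and pixels per frame means the same scan budget buys a
# higher refresh rate.
```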

At lower resolutions, the CPU can bottleneck
 
At lower resolutions, the CPU can bottleneck

Sure, but for most CPUs, that typically doesn't happen. Hence why I've always held that lower tier CPUs, specifically the Core i3 and Pentium lineups, are going to see the biggest gains under DX12, since the GPU driver load is going to significantly reduce CPU overhead. For anything much above an i5, you aren't going to see much movement with DX12 (at least on NVIDIA; AMD has driver side issues which have been apparent since Mantle came out).
 
Sure, but for most CPUs, that typically doesn't happen. Hence why I've always held that lower tier CPUs, specifically the Core i3 and Pentium lineups, are going to see the biggest gains under DX12, since the GPU driver load is going to significantly reduce CPU overhead. For anything much above an i5, you aren't going to see much movement with DX12 (at least on NVIDIA; AMD has driver side issues which have been apparent since Mantle came out).

This statement is so wrong it is not even funny. An i3 will crap out on you if you move beyond 60K+ batches in a serious way; we saw proof of that with Oxide's Nitrous engine running Mantle. Sure, you can have engines that won't demonstrate the behaviour Nitrous does, but that wouldn't make much sense. The Frostbite 3 version of BF4 on PC did something like 30K+ batches, and if game development keeps moving along the lines of developing for coming architectures, that number will at least double by next year.

Then you can take your I3 and put it where it belongs .....
 
All the speculation is driving me CRAZY! :D I want real information AMD, where is it? Please? ;) :) Right now, changing from my FX 8350 at work would be a waste of money.
 
I expect that AMD does not yet know what the stock frequency will be in 6+ months when these are released.
 
This statement is so wrong it is not even funny. An i3 will crap out on you if you move beyond 60K+ batches in a serious way; we saw proof of that with Oxide's Nitrous engine running Mantle. Sure, you can have engines that won't demonstrate the behaviour Nitrous does, but that wouldn't make much sense. The Frostbite 3 version of BF4 on PC did something like 30K+ batches, and if game development keeps moving along the lines of developing for coming architectures, that number will at least double by next year.

Then you can take your I3 and put it where it belongs .....

Batches are relatively cheap though, and the performance loss of doing more will be offset by reduced driver overhead.
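A back-of-the-envelope way to see where that batch ceiling sits (the per-batch costs below are assumed, illustrative numbers, not measurements of any real driver):

```python
# Toy draw-call budget: how many batches fit in a 60 fps frame once game
# logic has taken its share of the CPU time. Costs are assumed, not measured.
FRAME_MS  = 1000 / 60      # 16.7 ms per frame at 60 fps
LOGIC_MS  = 8.0            # assumed game-logic cost per frame
SUBMIT_US = {"DX11-style": 0.40, "DX12-style": 0.05}  # assumed cost per batch

budget_us = (FRAME_MS - LOGIC_MS) * 1000
for api, cost in SUBMIT_US.items():
    print(f"{api}: ~{int(budget_us / cost):,} batches before the CPU is the wall")
# A lower per-batch cost buys several times as many batches from the same
# CPU headroom, which is why weaker CPUs gain the most from DX12.
```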
 
Dresdenboy has some more info about what is going on with Zen: The New Citavia Blog

There are strings for both a "uop cache" and a "uop buffer". So far I knew about this uop buffer patent filed by AMD in 2012, which describes different related techniques aimed at saving power, e.g. when executing loops or to keep the buffer physically small by leaving immediate and displacement data of decoded instructions in an instruction byte buffer ("Insn buffer") sitting between instruction fetch and decode. The "uop cache" clearly seems to be a separate unit. Even without knowing how many uops per cycle can be provided by that cache, it will help to save power and remove an occasional fetch/decode bottleneck when running two threads.


[Image: Zen-Architektur+Core+V0.3.2.png — speculative Zen core block diagram]
Notable changes are:
uOp Cache has been added based on the new patch
FMUL/FADD for FMAC pairing removed, based on some corrections of the znver1 pipeline description.
4x parallel Page Table Walkers added, based on US20150121046
128b FP datapaths (also to/from the L1 D$) based on "direct" decode for 128b wide SIMD and "double" decode for 256b AVX/AVX2 instructions
32kB L1 I$ has been mentioned in some patents. With enough ways, a fast L2$ and a uOp cache this should be enough, I think.
issue port descriptions and more data paths added
2R1W and 4 cycle load-to-use latency added for the L1 D$ based on info found on a LinkedIn profile and the given cycle differences in the znver1 pipeline description
Stack Cache speculatively added based on patents and some interesting papers. This doesn't help so much with performance, but a lot with power efficiency.

It seems that compiler-driven prefetching is not something to be used in GCC; it seems the hardware prefetcher does better.
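To put the uop cache / SMT point above in rough numbers, here's a toy front-end model (the widths and hit rates are assumed for illustration; nothing here comes from the patents):

```python
# Toy front-end model: two SMT threads sharing the decoders vs. being fed
# from a uop cache. All widths and hit rates are assumed for illustration.
DECODE_WIDTH    = 4    # uops/cycle from the legacy fetch/decode path
UOP_CACHE_WIDTH = 6    # uops/cycle when hitting the uop cache
THREADS         = 2

def per_thread_supply(hit_rate):
    # Blend the two paths by hit rate, then split between the threads.
    total = hit_rate * UOP_CACHE_WIDTH + (1 - hit_rate) * DECODE_WIDTH
    return total / THREADS

for hr in (0.0, 0.5, 0.8):
    print(f"uop-cache hit rate {hr:.0%}: ~{per_thread_supply(hr):.1f} uops/cycle per thread")
# Hot loops that hit the uop cache also skip fetch+decode entirely,
# which is where the power saving mentioned above comes from.
```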
 
I expect that AMD does not yet know what the stock frequency will be in 6+ months when these are released.

The process they are using is optimized for sub-3GHz clock speeds, so I don't expect clocks much higher than that, simply because power draw will start to rise too quickly to keep quality control in check. As a byproduct of this, I don't expect Zen to OC particularly well.
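The reason power "takes off" past the sweet spot is the usual CMOS relation P ≈ C·V²·f, with the voltage having to rise as frequency goes up. A rough sketch (the voltage/frequency points below are invented for illustration, not leaked Zen data):

```python
# Rough dynamic-power scaling: P ~ C * V^2 * f, and V must rise with f.
# The voltage/frequency points below are invented to illustrate the shape.
points = [  # (clock in GHz, assumed required core voltage)
    (2.8, 0.90),
    (3.0, 0.95),
    (3.2, 1.05),
    (3.4, 1.20),
]
base_f, base_v = points[0]
for f, v in points:
    rel_power = (v / base_v) ** 2 * (f / base_f)
    print(f"{f:.1f} GHz @ {v:.2f} V -> ~{rel_power:.2f}x the 2.8 GHz dynamic power")
# The last few hundred MHz cost disproportionately more power, which is
# what squeezes clock headroom on a process tuned for low-power parts.
```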
 
The process they are using is optimized for sub-3GHz clock speeds, so I don't expect clocks much higher than that, simply because power draw will start to rise too quickly to keep quality control in check. As a byproduct of this, I don't expect Zen to OC particularly well.

And do you have articles for this, where other chips that use Samsung's LPP didn't get anywhere over 3GHz by design, or simply didn't need to get there?
 
And do you have articles for this, where other chips that use Samsung's LPP didn't get anywhere over 3GHz by design, or simply didn't need to get there?

From a really old article:

Samsung, Nvidia may collaborate on 14nm GPUs — but on what type of silicon? | ExtremeTech

Samsung has two 14nm nodes: 14nm LPE and 14nm LPP. 14nm LPE stands for “Low Power Early,” and promises improved power consumption and performance, but the cream of the crop will arrive later, with 14nm LPP (“Low Power Plus”) and a further 15% improvement. Critically, however, both of these are low-power nodes with a specific focus on SoCs and IoT (Internet of Things) types of devices. To date, no manufacturer has announced that they intend to build a high-power product on a Samsung process node, and Samsung has no experience in building that kind of hardware.

14nm LPE/LPP simply aren't designed for anything larger than a mobile SoC, so power scaling with clockspeed is almost certain to be poor. This is also why I'm expecting much lower base clockspeeds compared to BD.
 
Yeah, and we know they aren't using custom libs on Zen at this point, coming from two different fabs. So it's reasonable to think clock rates aren't going to go too high.
 
I think the worry/pessimism about low clocks is a bit unfounded. AMD has spent the last few years learning quite a lot about high-clocked designs. Some of what they've learned has to apply to Zen.
ex. http://ewh.ieee.org/r5/denver/sscs/Presentations/2012_05_Sathe.pdf
Recall that AMD was able to add ~500MHz from Bulldozer to Piledriver without shrinking the node or raising the TDP.

I'd say speculation on clock speed either way is pretty unfounded at this point seeing as we really know nothing at all about the topic.
 
From a really old article:

Samsung, Nvidia may collaborate on 14nm GPUs — but on what type of silicon? | ExtremeTech



14nm LPE/LPP simply aren't designed for anything larger than a mobile SoC, so power scaling with clockspeed is almost certain to be poor. This is also why I'm expecting much lower base clockspeeds compared to BD.
I figure we are probably looking at 3.2 - 3.4GHz initially, which is actually how most AMD lines come out.

But wouldn't the architecture have a big impact on speed? I don't know myself, so I'm genuinely asking. Also, in this case Samsung only makes ARM chips, which are inherently low-power, low-clock processors. How much can we parallel their ARM frequencies to AMD's x86?
 
Ages ago on this very thread razor1 and I were talking about how process/nodes and architectures are inherently tied together.

You definitely adjust your transistor design for high performance versus low power. Yes, there will be differences in clockspeeds between an ARM SOC and a dedicated x86 processor, even on the same process. Need look no further than graphics chips, which, though on a HP process, aren't clocked anywhere near a desktop/server CPU.
 
Yeah, design does have a big impact on how high you can clock, but the other constraints are how big the chip is (total power usage, transistor density, gating, etc.). So as much as the process itself can push clocks, you still have physical limitations based on design and the node.
 
Come on guys; let's save the obvious fake benchmarks until at least a month before Zen releases, OK?
 
Well, I see time has thoroughly debunked this article now. There's no way Zen came out last month, now is there?

They'll be lucky to get it out before the end of the year.
 
Well, I see time has thoroughly debunked this article now. There's no way Zen came out last month, now is there?

They'll be lucky to get it out before the end of the year.

Until AMD says something differently, it is still due for shipments Q4 this year, as they've stated more than once. Nobody knows anything other than that. Everything else is speculation.
 
If the improved IPC is there with 32 full-function cores, I'm gonna be pretty happy.

I did read somewhere that the 32-core version will be two full 16-core CPUs on the same chip?

Wonder how that is going to work with the Win 10 CPU/core max by license?
 
It will still be seen as 1 physical processor with 32 logical cores. Same as the way the Pentium D was. The Windows licensing is based on physical sockets.
 
It will still be seen as 1 physical processor with 32 logical cores. Same as the way the Pentium D was. The Windows licensing is based on physical sockets.
I'm guessing that there is a difference between Windows for consumers and the server environment.
If the improved IPC is there with 32 full-function cores, I'm gonna be pretty happy.
I did read somewhere that the 32-core version will be two full 16-core CPUs on the same chip?
Wonder how that is going to work with the Win 10 CPU/core max by license?
Linux if I recall correctly allows 32 cores :) What you are describing is a chip for the server market and will not be "cheap". The consumer version supposedly will be 8 (SMT) cores and 95 Watt TDP.
Speculations about Zen, after our April's Fool


Until AMD says something differently, it is still due for shipments Q4 this year, as they've stated more than once. Nobody knows anything other than that. Everything else is speculation.

One of the rumours was that it should be released this October; sadly that rumour came from 4chan, so it's not likely, even though October has been echoing elsewhere. Not sure if that resonated from the same source...
The "article" was about the AM4 motherboards that were supposedly coming out in March, but it seems that whole thing has now shifted to Computex, with the arrival of the first CPU, which will have Excavator cores.
 
If the improved IPC is there with 32 full-function cores, I'm gonna be pretty happy.

I did read somewhere that the 32-core version will be two full 16-core CPUs on the same chip?

Wonder how that is going to work with the Win 10 CPU/core max by license?

I would expect to pay five thousand dollars for the 32-core chip on the 8-channel DDR4 server platform sometime in 2017 (if it is released by then and not sometime in 2018). That is, unless Zen is not competitive with Intel's 20+ core Broadwell and Skylake based E5 Xeons. If Zen isn't competitive, prices would have to be much lower, but I still can't see it being less than a two-thousand-dollar part.
 
Linux if I recall correctly allows 32 cores :) What you are describing is a chip for the server market and will not be "cheap". The consumer version supposedly will be 8 (SMT) cores and 95 Watt TDP.
Linux can handle far more than 32 cores.
 
Microsoft isn't solely about physical procs. If you're using SQL, you're limited to the number of vCPUs you can assign to a VM via the license. IIRC...
 
If the improved IPC is there with 32 full-function cores, I'm gonna be pretty happy.

I did read somewhere that the 32-core version will be two full 16-core CPUs on the same chip?

Wonder how that is going to work with the Win 10 CPU/core max by license?

1: The 32-core version is almost certainly an Opteron series server chip. It's not going to be available for typical users.

2: Windows licensing is per physical socket, though I note there has been some talk MSFT could change this going forward to match what other companies (IBM) do...
 
One of the rumours was that it should be released this October; sadly that rumour came from 4chan, so it's not likely, even though October has been echoing elsewhere. Not sure if that resonated from the same source...
The "article" was about the AM4 motherboards that were supposedly coming out in March, but it seems that whole thing has now shifted to Computex, with the arrival of the first CPU, which will have Excavator cores.

Limited sampling Q4, general availability Q1 2017. AMD has said nothing that changes this.
 
If they are sampling in Q4, then the Zen benchmark on the previous page is total BS, way too early. Benches arrive with the sampling.
 
1: The 32-core version is almost certainly an Opteron series server chip. It's not going to be available for typical users.

2: Windows licensing is per physical socket, though I note there has been some talk MSFT could change this going forward to match what other companies (IBM) do...

Already done.

Windows Server 2016 moving to per core, not per socket, licensing

The software has not been released yet, but the new license terms were discussed last year. Should be released by Q3, so in advance of this 32-core single-socket monstrosity :D
 
Already done.

Windows Server 2016 moving to per core, not per socket, licensing

The software has not been released yet, but the new license terms were discussed last year. Should be released by Q3, so in advance of this 32-core single-socket monstrosity :D

That's such a bullshit Intel/Microsoft anti-competitive move. There's only one reason that would make anyone more money: it's to punish AMD for daring to compete. I remember when they did that shit with SQL. It clearly hurts AMD more than anyone else.
 
That's such a bullshit Intel/Microsoft anti-competitive move. There's only one reason that would make anyone more money: it's to punish AMD for daring to compete. I remember when they did that shit with SQL. It clearly hurts AMD more than anyone else.

I don't see how this would punish AMD that much more than it does Intel. I mean, I expect AMD to price their 32-core processor similar to Intel's 22-core E5 Xeon v4 and the future 22+ core E5 Xeon v5 (which should be available or very close when the 32-core Zen is released). That is, unless AMD is not able to compete in performance.
 
It's not about the processor pricing. It's about the software pricing. It will cost roughly 45% more to run per-core licensed software on a 32-core machine versus a 22-core machine.
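As a quick sanity check on that arithmetic (prices are assumed; the pack/minimum rules follow the announced Windows Server 2016 model of 2-core packs with an 8-core-per-socket, 16-core-per-server floor):

```python
# Quick check of the per-core licensing math. Prices are assumed.
PACK_PRICE = 100  # assumed cost per 2-core license pack, arbitrary units

def license_cost(cores_per_socket, sockets=1):
    cores = max(cores_per_socket, 8) * sockets   # per-socket minimum
    cores = max(cores, 16)                       # per-server minimum
    packs = (cores + 1) // 2                     # round up to whole packs
    return packs * PACK_PRICE

c22, c32 = license_cost(22), license_cost(32)
print(f"22 cores: {c22}  32 cores: {c32}  premium: {100 * (c32 / c22 - 1):.0f}%")
# -> roughly a 45% higher license bill for the 32-core part.
```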
 