Roy Taylor is wrong again: Watch_Dogs 2 does not support DX12, not "optimized for AMD"

This is why I never take you seriously, a wolf in sheep's clothing if you will.

View attachment 11423

I am guessing you are using a 1070 review for those, since they have a number of them in the list. This is from a 1070 review from July 6, 2016. This is also why I am not a big pusher of benches for anything more than "BALLPARK" answers. Seriously, you are trying too hard to bash AMD. I can't say I have ever seen you show this kind of irreverence for Nvidia or Intel, though in your defense I haven't seen you post much on CPUs.
Sigh,
before going all righteous, why not look into how they do their testing....
I knew some would not read when I try to emphasise the point about benchmark vs. in-game measurements and, critically, using an independent tool such as PresentMon (I have gone on about this many times in the past with concerns about internal benchmarks and the importance of an independent performance tool)...
This is why I only use a few sites with regards to DX12 testing, those that get it.
Tom's Hardware did not use PresentMon, and critically THEY USED THE AOTS INTERNAL LOGGING TOOL THAT HAS BEEN SHOWN TO EXAGGERATE THE PERFORMANCE OF AMD - that is why you're seeing those results in the chart from Tom's Hardware.
The Ashes charts represent DirectX 12 performance using the game’s built-in benchmark/logging tool.
That AoTS internal tool is designed to give the best results for the internal rendering engine, 'synced' in a way that is perfect for AMD; it does not represent the performance the gamer sees and perceives, or in other words what is presented.

PresentMon is developed by Intel with the specific intention of capturing and monitoring DX12 GPU performance in a way that is representative of what the end user sees, not skewed like several internal benchmark functions in games that base it upon the internal engine.
To put it simply, it is closer to the independent approach of FRAPS and FCAT than what many internal benchmarks do, and especially AoTS (although this is not the only game to have such skewed measurements, because the numbers are internal to the game engine and not the actual presented frames).
Its weakness (still better than internal engine benchmark functions) would be its lowered accuracy when it comes to SLI/Crossfire, as this is also influenced at a much lower level by drivers. That is an area that will be a challenge moving forward for benchmarking accurately with DX12 tools, although in theory it should still be fine for DX12 mGPU, but that would need testing and validating.
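To make that concrete, here is a rough sketch of how I'd summarise a PresentMon capture instead of trusting an in-game counter. This is my own illustration, not PresentMon code; the CSV column name "MsBetweenPresents" and the file name are assumptions and may differ between PresentMon versions.

```python
# Rough sketch: summarising a PresentMon capture instead of trusting an
# in-game counter. The column name "MsBetweenPresents" and the file name
# are assumptions and may differ between PresentMon versions.
import csv

def summarize(csv_path):
    frame_ms = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                frame_ms.append(float(row["MsBetweenPresents"]))
            except (KeyError, ValueError):
                continue  # skip malformed/dropped rows
    if not frame_ms:
        raise ValueError("no frame data found")
    frame_ms.sort()
    n = len(frame_ms)
    avg_ms = sum(frame_ms) / n
    p99_ms = frame_ms[min(n - 1, int(n * 0.99))]  # 99th-percentile frame time
    return {
        "frames": n,
        "avg_fps": 1000.0 / avg_ms,
        "avg_frametime_ms": avg_ms,
        "p99_frametime_ms": p99_ms,   # the spikes an average hides
        "1pct_low_fps": 1000.0 / p99_ms,
    }

print(summarize("presentmon_capture.csv"))  # hypothetical capture file
```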

Think that is clear enough, but felt I had to be this blunt considering your response.
Anyway this highlights why one needs to be careful with DX12 and especially internal game function benchmarks.
 
Roy Taylor is wrong again: Watch_Dogs 2 does not support DX12, not "optimized for AMD"


More like Nvidia got afraid and started poaching every Gaming Evolved title they could...
 
Sigh,
before going all righteous, why not look into how they do their testing....
I knew some would not read when I try to emphasise point about benchmark/in-game/ and critically using independant tool such as PresentMon (and gone on about this many times in the past with concerns of internal benchmarks and importance of independant performance tool)...
This is why I only use a few sites with regards to DX12 testing, those that get it.
Tom's Hardware did not use PresentMon, and critically THEY USED THE AOTS INTERNAL LOGGING TOOL THAT HAS BEEN SHOWN TO EXAGGERATE THE PERFORMANCE OF AMD - that is why your seeing those results in the chart from Tom's Hardware.

That AoTS internal tool is designed to give the best results for internal rendering engine 'sync'd' in a way that is perfect for AMD, it does not represent what the performance the gamer sees and perceives or in other words is presented.

PresentMon is developed by Intel for the specific intention of capturing and monitoring DX12 GPU performance that would be representative to the end user, not skewed like several internal benchmark functions for games that base it upon the internal engine.
To put it simply it is closer to FRAPs and FCAT independant approach than what many internal benchmarks do and especially AoTS (albeit this is not the only game to have such skewed measurements due to numbers are internal to game engine and not the actually and real presented ones).
It's weakness (still better than internal engine benchmark functions) would be its lowered accuracy when it comes to SLI/Crossfire as this is also influenced at a much lower level by drivers, this is an area that will be a challenge moving forward to be able to benchmark accurately with DX12 tools, although in theory it should still be fine for mGPU DX12 function but would need testing and validating.

Think that is clear enough, but felt I had to be this blunt considering your response.
Anyway this highlights why one needs to be careful with DX12 and especially internal game function benchmarks.


Dude that is too much for him to understand lol, he doesn't know what frame time variance is, you need to start there and then explain to him how PresentMon works and why the internal benchmarks for so many DX12 games are not representative of what is actually happening lol.
 
Dude that is too much for him to understand lol, he doesn't know what frame variances are, you need to start there and then explain to him how present mon works lol.
One day I hope we meet so I can see if you will talk to me the same way. I understand more of this than either of you, apparently. He posted those with no context as far as PresentMon or the in-game benchmark. Besides, after looking around it seems only Hardware Canucks uses it on Ashes, although most all others use it on Doom so they have access. Besides, I know Skytml and don't trust his results ever, he is very anti-AMD. I would think if this was as huge as you allude it would be covered by more than just him, though I think [H] used it too.

Besides, it doesn't prove async superiority; rational, intelligent individuals know AMD still leads in that regard as a function of architecture, which thus far isn't really leveraged to any great degree in real-world usage. That is what he is alluding to and why I definitely don't take his posts seriously.
 
Then tell me what frame time variance is and why the internal benchmarks are flawed in so many DX12 games? Yes, they are flawed because of this.

Why does frame time variance occur?

Do you even know?

You don't have any understanding of how 3D renderers work, let alone inter-frame rendering, so I doubt you know. Very few people on this forum know it anyway, so this doesn't even pertain only to you.

Just looking at pretty numbers and not understanding where the numbers came from doesn't mean you know something lol.

Really going back to async for your last-stand type deal? Come on, feeling a little inadequate to talk about this, so you fall back on something that isn't even remotely contextual to the discussion, which we know full well you don't understand when it comes to architecture and async; I can link post after post about the BS you spewed about async and architecture.
 
Then tell me what frame time variances are and why the internal benchmarks are flawed in so many DX12 games? yes they are flawed because of this.

Why does frame time variances occur?

Do you even know?

You don't have any understanding how 3d renderers work let alone inter frame rendering, so I doubt you know. Very few people on this forum know it anyways, so this doesn't even pertain only to you.

Just looking at pretty numbers and not understanding where the numbers came from doesn't mean you know something lol.

Really going back to async for your last stand type deal? Common, feeling a little inadequate to talk about this so fall back on something that isn't even remotely contextual to the discussion which we full well know you don't know what you are talking about with architecture and async, can link post after post about the BS you spewed about aysnc and architecture.

Yada, yada, I am going to hide my Nvidia bias in a long, drawn out post that says nothing at all. Enjoy, I am sure you bash Nvidia just as often, right? Right? LOL! Personally, I trust my own eyes long before I trust what others say online that have obvious bent towards one tech over another. (Obvious is obvious.) You see, I have never hidden my preference for AMD hardware nor the reasons why.

LOL! Intel made a monitor that is guaranteed to be "unbiased"? Possible but, highly unlikely, all things considered. *Cough* AMD competitor *Cough*
 
Yada, yada, I am going to hide my Nvidia bias in a long, drawn out post that says nothing at all. Enjoy, I am sure you bash Nvidia just as often, right? Right? LOL! Personally, I trust my own eyes long before I trust what others say online that have obvious bent towards one tech over another. (Obvious is obvious.) You see, I have never hidden my preference for AMD hardware nor the reasons why.

LOL! Intel made a monitor that is guaranteed to be "unbiased"? Possible but, highly unlikely, all things considered. *Cough* AMD competitor *Cough*


There is no preference there, that is what you don't understand; it's about understanding how something works, and why old tools are no longer as functional. GPUs have evolved in different ways because of their programmability in pipeline stages, which causes the older tools to be inaccurate. Things aren't rendered from point A to Z in a straight line anymore; there are many things going on in different paths to get to Z that we don't see with the old tools. This is more evident in DX12 and LLAPIs than before because of their structure.

Unfortunately you guys like to talk about one side of things (multiple-core utilization in LLAPIs vs the serial path in DX11) but don't really understand what that entails when it comes to measuring those changes.

Intel was not the first company to notice this change either. nV noticed it years ago (well before any LLAPI, even the conception of them), along with techreport's owner who now works at AMD. Intel just made a tool that can see those variances.
 
Then tell me what frame time variances are and why the internal benchmarks are flawed in so many DX12 games? yes they are flawed because of this.

Why does frame time variances occur?

Do you even know?

You don't have any understanding how 3d renderers work let alone inter frame rendering, so I doubt you know. Very few people on this forum know it anyways, so this doesn't even pertain only to you.

Just looking at pretty numbers and not understanding where the numbers came from doesn't mean you know something lol.

Really going back to async for your last stand type deal? Common, feeling a little inadequate to talk about this so fall back on something that isn't even remotely contextual to the discussion which we full well know you don't know what you are talking about with architecture and async, can link post after post about the BS you spewed about aysnc and architecture.
I understand more than most, and it is why I said I take all reviews as a guideline more than definitive proof. It is also why I said I didn't like the shift away from these graphs to just looking at fps, because the "per second" part tends to hide inherent issues. Or the lack of interest in the max-to-min frame time, which is of greater import than fps average. Trust me, I probably have a far greater understanding of these concepts than you.

As far as Ashes and PresentMon, I gather it shows the end post of a frame whereas the in-game bench shows the engine sending the frame to post. The problem here is the discrepancy between the two, frame rendered to frame posted. What is happening here, assuming each monitor is accurate? Dropped frames made sense in DX11 and the existence of drivers. But with DX12 and the visibility of the hardware, I am not certain on the cause, gonna take a little more thought.
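To make the fps-average point concrete, here's a quick toy illustration (the frame times below are made up, not measurements): two runs with an identical average fps can have completely different max-to-min spreads.

```python
# Toy illustration (made-up frame times, not measurements): same average
# fps, very different frame-time spread.
smooth = [16.7] * 10                 # ~60 fps, even pacing
spiky  = [12.0] * 9 + [59.0]         # same ~60 fps average, one big hitch

for name, run in (("smooth", smooth), ("spiky", spiky)):
    avg_ms = sum(run) / len(run)
    print(f"{name:6s} avg fps = {1000 / avg_ms:5.1f}, "
          f"max-to-min spread = {max(run) - min(run):5.1f} ms, "
          f"worst frame = {max(run):5.1f} ms")
```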
 
I understand more than most and it is why I said I take all reviews as a guideline more than definitive proof. It is also why I said I didn't like the shift away from these graphs it just looking at fps because of the "per second" part as it tends to hide inherent issues. Or the lack of interest in the max-to-min time frame which is of greater import than fps avg. Trust I probably have a far greater understanding of these concepts than you.

as far as ashes and presentmon, I gather it shows the end post of a frame whereas the in game bench shows the engine sending the frame to post. Problem here is the discrepancy in between the 2, frame rendered to frame posted. What is happening here, assuming each monitor is accurate? Dropped frames made sense in dx11 and the existence of drivers. But with dx12 and the visibilty of the hardware, I am not certain on the cause, gonna take a little more thought.


No you don't. I suggest you look up the pixel shader stage after rasterization is done; this is why frame times vary. So unless you know how the pixel shader stage works in depth and what the implications are of what types of shaders are being used, you have no damn clue on frame times lol. All you are looking at is the output; the output is an end result, and you don't know why the output is the way it is.

The problem again, you assume you know, but in fact you don't. This is why you can have different frame times on different IHV and different generations of same IHV hardware.
 
For people interested in this, just to show the complexity of the problem and not to reduce it to 2 different items like JustReason did (frame rendering to frame posted), there are many things going on in the middle that affect frame time and the end result.

https://www.cg.tuwien.ac.at/research/vr/rendest/rendest_egsr2003.pdf

This is old but the same concepts are present, and more so, in today's GPUs and the latest APIs.
 
No you don't, I suggest you look up the pixel shader stage after the rasterization is done, this is why frame times vary. So unless you know how the pixel shader stage works in depth and what the implications are on what types of shaders are being used, you have no damn clue on frame times lol. All you are looking at is the output the output is and end result, you don't know why the output is the way it is.

The problem again, you assume you know, but in fact you don't. This is why you can have different frame times on different IHV and different generations of same IHV hardware.
First, back off the attitude and arrogance, I didn't respond to you that way. Second, no shit. Really, each IHV reaches the end frame differently, I'm shocked (astronomical sarcasm). I am talking about the final frame, not the path to it. This is what you do, obfuscate, blur the argument into something that has nothing to do with what I posted.
 
first back off the attitude and arrogance, I didn't respond to you that way. Second, no shit. Really each ihv reaches the end frame differently, I'm shocked (astronomical sarcasm). I am talking about the final frame not the path to it. This is what you do, obfuscate, blur the argument to something that has nothing to do with i posted.


Really doesn't have anything to do with what you posted?

This is what you posted
I understand more than most and it is why I said I take all reviews as a guideline more than definitive proof. It is also why I said I didn't like the shift away from these graphs it just looking at fps because of the "per second" part as it tends to hide inherent issues. Or the lack of interest in the max-to-min time frame which is of greater import than fps avg. Trust I probably have a far greater understanding of these concepts than you.

as far as ashes and presentmon, I gather it shows the end post of a frame whereas the in game bench shows the engine sending the frame to post. Problem here is the discrepancy in between the 2, frame rendered to frame posted. What is happening here, assuming each monitor is accurate? Dropped frames made sense in dx11 and the existence of drivers. But with dx12 and the visibilty of the hardware, I am not certain on the cause, gonna take a little more thought.

In red: you don't know what you just stated, because it's not as simple as min/max. Inter-frame rendering has a multitude of variables, as the link I just showed you explains, that have to be looked into when creating the FPS calculations.

https://www.cg.tuwien.ac.at/research/vr/rendest/rendest_egsr2003.pdf

Further complicating the matter, with what you wrote and I have highlighted in blue, you assumed what the FPS calculator in AoTS is doing, which is not what it is actually doing, because if it was, end frame rates SHOULD NOT change much based on the IHV of comparable performance cards. This is because once a frame is done rendering the next frame is started on; they don't go on simultaneously. Again, a lack of understanding of how the pixel pipeline works. This is why you can't look at latency within different parts of an engine (as drivers and architecture can greatly influence how latency is hidden) and figure out performance either, and that's why the PresentMon numbers, to really understand what is going on, are broken up into much more than end results.

Now, highlighted in yellow: dropped frames didn't make sense in DX11. If you look at what has been going on since the advent of programmable shaders (DX9 and onward), that is when the problem first started, not with LLAPIs now.

Again, assumptions or lack of knowledge, and then assuming you know what you are posting about, is just doing yourself a disservice.

Using sarcasm to cover up your inability to read and understand the paper I linked to doesn't work either; that paper is not hard to read. Yeah, some of the formulas are complex, but I'm not expecting you to know the math behind them. On top of that, they break down the rendering pipeline based on performance, so if you knew what each IHV's architecture truly is, like you stated here in red:

one day I hope we meet so I can see if you will talk to me the same way. I understand more of this than either of you apparently. He posted those with no context as far as present on or ingame benchmark. Besides after looking around seems only hardware Canucks uses it on ashes, although most all others use it on doom so they have access. Besides I know Skytml and don't trust his results ever, he is very anti-AMD. I would think if this was as huge as you allude it would be covered by more than just him, though I think [H] used it to.

besides it doesn't prove async superiority, rational intelligent individuals know AMD still leads in that regard, in function of architecture which thus far isnt really levied to any great degree in real world usage as of yet. That is what he is alluding to and why I definitely don't take his posts seriously.

You wouldn't have made those assumptions. Based on the breakdown you can easily see how the new LLAPI's would be affecting this too (general understanding)
 
I was talking to an NVIDIA employee about 6 months ago or so, and he made the comment, "Every time I talk to someone at AMD they always ask me, 'How did you guys get rid of Roy Taylor?'" LOL! True story.
 
I guess at least AMD knows he has zero credibility. They just can't afford the lawsuit after they terminate him I guess.
 
one day I hope we meet so I can see if you will talk to me the same way. I understand more of this than either of you apparently. He posted those with no context as far as present on or ingame benchmark. Besides after looking around seems only hardware Canucks uses it on ashes, although most all others use it on doom so they have access. Besides I know Skytml and don't trust his results ever, he is very anti-AMD. I would think if this was as huge as you allude it would be covered by more than just him, though I think [H] used it to.

besides it doesn't prove async superiority, rational intelligent individuals know AMD still leads in that regard, in function of architecture which thus far isnt really levied to any great degree in real world usage as of yet. That is what he is alluding to and why I definitely don't take his posts seriously.
Then you would accept why it was wrong to use Tom's Hardware results for AoTS (or any game that uses an internal tool, as it can easily be set with a focus that skews results compared to real-world gameplay, or measures in a way that does not represent the gamer's experience), as it does not capture the frame data in the same place as FRAPS/FCAT and uses metrics within the internal rendering engine, like I said.
BTW I have debated this with a couple of AAA developers in the past, so I also understand this subject, thanks, and the various PoVs on the topic.

You do realise that async compute for AMD is only providing a marginal benefit over Pascal, and the original point was that AMD/Roy rammed it down our eyes about AoTS's superior DX12+async compute performance against comparable Nvidia, but this game has been put to the side now that there is a decent way to independently benchmark it and also now that Nvidia has improved.
Pascal can be around 3% to 4% while AMD cards can be 7% to 10%, in games working well for both; in Doom the greatest benefit was not async compute but actually AMD's Vulkan shader extension.
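For anyone wondering where those percentages come from, it is just the uplift of async-on over async-off fps; a quick sketch with invented numbers in the same ballpark (not measurements):

```python
# Illustrative arithmetic only (the fps numbers are invented, just in the
# same ballpark): how an "async compute gain" percentage is derived.
def async_gain(fps_off, fps_on):
    return (fps_on - fps_off) / fps_off * 100.0

print(f"Pascal-like card: {async_gain(60.0, 62.2):.1f}% uplift")  # ~3-4% range
print(f"GCN-like card:    {async_gain(60.0, 65.5):.1f}% uplift")  # ~7-10% range
```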

Most have dropped AoTS because it is a royal pain in the arse to benchmark in a way that is satisfactory even with PresentMon and is time consuming.

If HardwareCanucks was as biased as you say, then they would not use Hitman DX12, and this game punishes Nvidia; they have also kept using Quantum Break, which initially punished Nvidia.
I would say HardOCP/HardwareCanucks/PCGamesHardware/PCPer/and a few others are all pretty good and take great interest both technically and from a methodology approach.
Tom's Hardware is one of the very best for power/efficiency/performance envelope analysis IMO.
Cheers
 
Then you would accept why it was wrong to use Tom's Hardware results for AoTS (or any game that use an internal tool as it can easily be set with focus that skews compared to real-world game results or measures in a way that does not represent the gamers' experience) as it does not capture the frame data in the same place as FRAPs/FCAT and uses metrics within the internal rendering engine like I said.
BTW I have debated this with a couple of AAA developers in the past so I also understand this subject thanks and the various PoVs on the topic.

You do realise that async compute for AMD is only providing a marginal benefit over Pascal, and the original point was AMD/Roy rammed it down our eyes about AoTS superior performance in DX12+async compute against comparable Nvidia but this game has been put to the side now there is a decent way to independently benchmark the game and also how Nvidia has improved.
Pascal can be around 3% to 4% while AMD cards can be be 7% to 10%, in games working well for both; in Doom the greatest benefit was not Async Compute but actually AMD's Vulkan shader extension.

Most have dropped AoTS because it is a royal pain in the arse to benchmark in a way that is satisfactory even with PresentMon and is time consuming.

If HardwareCanuck was as biased as you say then they would not use Hitman DX12 and this game punishes Nvidia, they also have kept using Quantum Break as well that initially punished Nvidia.
I would say HardOCP/HardwareCanucks/PCGamesHardware/PCPer/and a few others are all pretty good and take great interest both technically and from a methodology approach.
Tom's Hardware is one of the very best for power/efficiency/performance envelope analysis IMO.
Cheers
first, I have issues with the writer skytml as he has always been biased against AMD, I have had plenty of conversations with him. But that's neither here nor there, as I generally don't place too much emphasis on any site, again just as general information.

Second, you are trying too hard to dance around the facts about async. I stated long ago Maxwell didn't have it, and lo and behold it doesn't. After reading Pascal's release info I stated Pascal does, as far as the array, if you will, but I felt they still lack concurrent tasking as Nvidia intentionally cut the Gigathread portion off their die diagram. And sure enough I was correct about that as well.

As far as Ashes, I haven't kept up with the news on it, but I do recall the fact that they stated they would only ever have DX12, yet they have DX11 now and Nvidia is running without issue. Again, accepting some things at face value isn't always the best course.
 
first, I have issues with the writer skytml as he has always been biased against AMD, I have had plenty of conversations with him. But that's neither here nor there, as I generally don't place too much emphasis on any site, again just as general information.

second you are trying too hard to dance around the facts about async. I stated long ago Maxwell didn't have it, lo and behold it doesn't. After reading Pascals release info I stated Pascal does as far as the array, if you will, but I felt they still lack concurrent tasking as Nvidia intentionally cut the Gigathread portion off their die diagram. And sure enough I was correct about that as well.

As far as Ashes i haven't kept up with the news on it, but I do reserve the fact they stated they would only ever have DX12, yet they have dx11 now and Nvidia is running without issue. Again accepting some things at face value isn't always the best course.


And all those assumptions are wrong. Maxwell does have async compute, it just doesn't function well after the first partitioning of the SMX unit. Async compute has nothing to do with concurrent work either. That is also a fallacy, hence why you STILL don't get what is going on even after many threads and posts about the topic. Your inability to separate the two based on architecture is causing you confusion.

You have to separate the thought of async compute from the ability to share resources within a single SMX unit. This is where concurrency comes in within a single SMX unit. The way that you have posted sounds like you are thinking all SMX units are doing either graphics or compute; that is not the case in Maxwell either. That is why async compute works in Maxwell: you can have one SMX unit working on graphics while another unit works on compute. The problem with this, though, is that you will get under-utilization of the SMX units as a whole if you don't have a similar amount of graphics to compute workloads at an SMX level, and of course after the first partition within the SMX unit as well.

Let me see if I can clear some of the confusion up by giving you an example of branching.

Do you remember the granularity of dynamic branching of the g80 to r600 or the g70 to the r520 or r580? Think of it this way.

ATi had better handling of dynamic branching because they had smaller portions to work with, with the r520 and r580. That changed with the g80 to the r600......

It's akin to what Maxwell and GCN are like with concurrency within the SMX: once Maxwell partitions its SMX, its "granularity" is static; it cannot be changed until the SMX is flushed. But it can still do concurrency, because there are portions of the SMX that have been designated to do compute and portions of the SMX designated to graphics, as well as doing concurrency via using other SMX blocks. But if the next batch of instructions doesn't have the same ratio of compute to graphics, some ALUs within the SMX are underutilized, and hence the performance penalty to Maxwell.

Pascal gets around this problem with dynamic load balancing, so concurrency doesn't need to be stopped and the SMX flushed.
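If it helps, here is a toy model (my own simplification, not how the real SM hardware or drivers work) of why a static graphics/compute split leaves ALU slots idle when the ratio shifts between batches, while per-batch rebalancing does not:

```python
# Toy model only (not how real SM hardware or drivers work): a static
# graphics/compute split wastes ALU slots when the workload ratio shifts
# from batch to batch, while per-batch rebalancing does not.
def utilisation(slots, static_gfx_share, batches, dynamic):
    used = total = 0
    gfx_slots = int(slots * static_gfx_share)     # split chosen up front
    for gfx_work, comp_work in batches:           # work items per batch
        if dynamic:                               # re-split to match the batch
            gfx_slots = int(slots * gfx_work / (gfx_work + comp_work))
        used += min(gfx_work, gfx_slots) + min(comp_work, slots - gfx_slots)
        total += slots
    return used / total

batches = [(64, 64), (96, 32), (32, 96)]          # ratio shifts every batch
print("static 50/50 split :", utilisation(128, 0.5, batches, dynamic=False))
print("dynamic rebalancing:", utilisation(128, 0.5, batches, dynamic=True))
```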

AMD has something similar, but they don't explain it in their white papers, because hell, why should they when it works. nV only explained it because something was fubar with Maxwell, and that problem has been fixed with Pascal. The gigathreaded engine has nothing to do with this. Scheduling has nothing to do with this. The moment you start talking about the gigathreaded engine, you are talking about scheduling of instructions; this is not where the problem resided with Maxwell, nor does it have anything to do with concurrent work being done. It is something that comes after the capability to do concurrent work in the first place.

Is AMD's implementation better for concurrent workloads than Pascal's? Yes it is, but that doesn't mean much in the overall picture of what has happened in most games to date. And will this change in upcoming titles to a degree that affects Pascal adversely, gives AMD more of an advantage than what we have seen so far, or makes Pascal unusable in the time frame these graphics cards are being sold? No to all of these, and it doesn't even change Maxwell's position much either. So why are you harping about this over and over again?
 
first, I have issues with the writer skytml as he has always been biased against AMD, I have had plenty of conversations with him. But that's neither here nor there, as I generally don't place too much emphasis on any site, again just as general information.

second you are trying too hard to dance around the facts about async. I stated long ago Maxwell didn't have it, lo and behold it doesn't. After reading Pascals release info I stated Pascal does as far as the array, if you will, but I felt they still lack concurrent tasking as Nvidia intentionally cut the Gigathread portion off their die diagram. And sure enough I was correct about that as well.

As far as Ashes i haven't kept up with the news on it, but I do reserve the fact they stated they would only ever have DX12, yet they have dx11 now and Nvidia is running without issue. Again accepting some things at face value isn't always the best course.
You inferred async compute superiority, I just rightly pointed out that it is not as great as AMD and Roy made it out to be, especially as its superiority is doing nothing to help it beat Maxwell even in AoTS, the best implementation to date.
And we see that repeating in a few other games; in Gears of War 4 the 480 Strix gets a 7% increase while the Fury X gets 8%, good but not the game changer it was made out to be, and that comes back to the results of AoTS that were being rammed down everyone's eyes by AMD and Roy before it could be independently benchmarked, and at the frame data point where it is presented to gamers rather than the internal rendering engine, along with time for it to mature with Nvidia it seems (same thing with Quantum Break on DX12, which was appalling to begin with but is now pretty competitive for Nvidia).
The 390X is a stand-out product and one that does set itself apart from the 980, and at times competes strongly with Fury X.
People point to Doom (even you mention it) but that game had the most performance boost from the Shader Extension rather than async compute.

Seems to me you are the one dancing around the issues and have issues with results that show well for Nvidia, but happily ignore that HardwareCanucks also has results strong for AMD with Hitman, or the fact that 4K AoTS has AMD edging out Nvidia but doing so with lower minimum frames and consistency, or the fact that the best we have seen from async compute is around a 10% boost (very rare) and that is far from consistent or persistent, and again, as with my original point that you started arguing with, it was about AoTS (not going to re-iterate that a 3rd time).

BTW those were ideal settings for async compute in Gears of War 4; change to more demanding settings and things become worse:
[attached chart: Gears of War 4 async compute results at more demanding settings]


Point being, as I mentioned above, it is far from consistent or even persistent, and can be either good (but not a game changer) or meh; it is not a panacea for performance superiority over Nvidia.
 
No he is talking out of his ass and this isn't the first time this has happened, Roy Taylor has a habit of putting his foot in his mouth because he doesn't know what he is talking about. As a marketing person, he should never talk about something he is not involved in, nor would have any information on it.

Now you are trying to put this parallel to the CEO of nV? Guess what, nV is having record breaking quarters for years now...... At least the CEO of nV gets his job done lol. He is an illustrious leader, (not my leader) but he gets the job done and the numbers in almost all financial quarters show what he has done for nV and what nV has done under him has shown his capabilities.

Do you see Roy doing anything right, from VR, TAM, this, and a few other things in the past few months, and many more things for years now? How many mistakes does a person have to tweet, talk about, or present under a false pretense in an interview?

Even after making a mistake like saying the wrong words or terms, or just being plain wrong about something, he doesn't change his ways. Not only that, his type of thinking has spread to Huddy, to Hallock, and to some of Raja's statements (I don't think he thinks like Roy, but once it's out there, you kinda have to support what is being talked about).

Actions speak louder than words, and what Roy has been saying for years now is just noise and BS in the background of what is really going on. Everything Roy has talked about, there is no action to it, just a lot of hot air. Might as well just make ridiculous statements like Vega is going to be 2x faster than Titan Pascal and be done with it.
 
OK, here are just the first few articles on Google about Roy Taylor.

http://www.hardwarecanucks.com/news/amd-roy-taylor-directx12/

really end of DX?

http://hexus.net/tech/features/graphics/76353-roy-taylor-amd-radeon-gpus-remain-unsurpassed/

Lots of good stuff here, wow efficiency, perf/watt doesn't matter love those

http://www.gamersnexus.net/industry/1950-nvidia-disappointed-in-amd-false-allegations

Remember the whole Witcher 3 thing about AMD not being involved, or not having had time to be involved, when they had years of knowing what was going on prior to its release, and the allegations of Hairworks being put in at the 11th hour, when the developer had shown Hairworks working in engine a couple of years beforehand?

https://www.techpowerup.com/180345/...amd-radeon-graphics-teleconference-transcript

Roy: "Basically from what we've seen so far, we don't think there's any conc…[stops short of saying "concern"]…of course things could change, but early indications are, that this (GeForce Titan) is not something which was designed to be a graphics card."

This was a beauty. He has problems talking about his own products yet he is going to go out on a limb and talk about another company's product?

http://vr-zone.com/articles/nimble-like-a-starfish-amds-roy-taylor-sits-down-with-vr-zone/49320.html

Ah yes forecasting the future, hmm that seems like more hot air.

Let's not forget the way he handled the Nano incident (blacklisting of websites), which was a PR nightmare for AMD.
 
Now you are trying to put this parallel to the CEO of nV? Guess what, nV is having record breaking quarters for years now...... At least the CEO of nV gets his job done lol. He is an illustrious leader, (not my leader) but he gets the job done and the numbers in almost all financial quarters show what he has done for nV and what nV has done under him has shown his capabilities.

I won't pretend that Taylor hasn't lied, but, you're ok with lying as long as they're successful? That's the most shill thing I've read from you, and that means a lot.
 
Just to be clear I didn't call him a liar, I said he was wrong.
Although he did say the game would be "highly optimized for AMD" which might be interpreted as a lie. If he knew Ubisoft still had their partnership with Nvidia, and he assumed it would run better on AMD due to DX12 without any evidence, then he's a liar.

The difference between lying and being wrong is intent.

Since we like Roy around here, let's give him the benefit of the doubt and just assume he got this one wrong.
 
I won't pretend that Taylor hasn't lied, but, you're ok with lying as long as they're successful? That's the most shill thing I've read from you, and that means a lot.
The thing is, there was a lot of vocal criticism of the 'wood screws' incident and how it meant Nvidia was lying and did not have a product.
Yet they released the Drive PX2 on time (now backed up with other models as well), released Pascal in various models on time (the same people were suggesting Polaris would release to retail earlier and so gain market share, but in reality their gains were in the quarter before Polaris), and released the NVLink version of the P100 on time.
BTW I have been contacted about a full DGX-1 with delivery guaranteed one week later, and this was a little while ago; context being this was the NVLink model, as the PCIe one is the one that was launched later (and to date only to core high-value customers for multi-million dollar HPC projects).

So they used a 'mock-up' to some extent, but importantly the products did exist as suggested in his presentation.
Yeah historically Jen-Hsun Huang has said a few questionable things in the past, but then we can say the same about AMD executives just as much, Roy is on another level to these though :)
This is going to be controversial to some, but as one who has done private presentations to senior tech analysts, I can say there is nothing 'loose-cannon' about Jen-Hsun Huang's presentation approach and content; it is pretty much at the top of the game, and you can clearly tell he has had a lot of experience and training from very good professionals. Some may not like his style, but his presentations are very slick when it is just him on stage.
Not saying I actually like the guy (do not know what is a front or his actual personality) but I seriously respect his skills.
And sometimes one has to improvise when presenting very new tech as plans can go out of the window when trying to pull it all together in a tight schedule with a major press/public conference.

Just to say about Roy, I think his approach with some of those wild out-there statements is more of a strategy than a mistake; his intent is to make it resonate with consumers even if a lot of it is FUD, and for the most part it does. Although the other key player in that is Richard Huddy at AMD, and we should not forget about him and some of his accusations/claims.
I would like AMD more if these two backed off a bit, or just focused more on presenting the improvements of their own products (hardware and tech solutions) for each gen.
Cheers
 
And all those assumptions are wrong, Maxwell does have async compute, just doesn't function well after the first partitioning of the SMX unit. Async compute has nothing to do with concurrent work either. That is also a fallacy hence why you STILL don't get what is going on even after many threads and posts about the topic. Your inability to separate the two based on architecture is causing you confusion.

You have to separate the thought of async compute vs the ability to share resources within a single SMX units. This is where concurrency comes in within a single SMX unit. The way that you have posted sounds like you are thinking all SMX units are doing either graphics or compute that is not the case in Maxwell either. That is why async compute works in Maxwell because you can have one SMX unit working on graphics while another unit working on compute, the problem with this though is you will get into under utilization of the SMX units as a whole if you don't have a similar amount of graphics to compute work loads on a SMX level, and of course after the first partition within the SMX unit as well.

Let me see if I can clear some of the confusion up by giving you an example of branching.

Do you remember the granularity of dynamic branching of the g80 to r600 or the g70 to the r520 or r580? Think of it this way.

ATi had better handing of dynamic branching because they had smaller portions to work with, with the r520 and r580. That changed with the g80 to the r600......

Its akin to what Maxwell and GCN is like with concurrency within the SMX, once Maxwell partitions its SMX its "granularity" is static, it cannot be changed until the SMX is flushed. But it can still do concurrency because there are portions of the SMX that have been designated to do compute and portions of the smx designated to graphics as well. As well as doing concurrency via using other SMX blocks. But if the next batch of instructions doesn't have the same ratio of compute to graphics, some ALU's within the SMX are underutilized, and hence the performance penalty to Maxwell.

Pascal gets around this problem with dynamic load balancing, so concurrency doesn't need to be stopped and the smx being flushed.

AMD has something similar, but they don't explain it in their white papers, because hell why should they when it works. nV only explained it because something was fubar with Maxwell and that problem has been fixed with Pascal. The gigatheaded engine has nothing to do with this. Scheduling has nothing to do with this. The moment you start talking about the gigathreaded engine, you are talking about scheduling of instructions, this is not where the problem resided with Maxwell nor does this have anything to do with concurrent work being done, it is something that comes after the capability to do concurrent work in the first place.

Is AMD's implementation better for concurrent workloads over Pascal? Yes it is but that doesn't mean much in the over all picture of what has happened in most games to date. And will this change in upcoming titles to a degree that will affect Pascal adversely or give AMD more advantage then what we have seen so far or to the point where it makes Pascal unusable in the time frame these graphics cards are being sold. No it does not to any of these situations, it doesn't even change Maxwell position much either. So why are you harping about this over and over again?
Holy crap, you love talking through your arse. Maxwell cannot, and it is because the whole array is tied together. Pascal gets by this not by load balancing but because the array is split into 4 parts, each independent from the other. And yes, the Gigathread matters, as it is the contact point the API has when executing tasks. AMD's GCN ACEs are each visible to the API, hence why they have never had any issue with concurrent operations at any level, and yes, I like to split them as up-to-the-GPU and within, because the lot of you like to obfuscate the issues at hand. The SMX units are not visible at all to the API, and since they have only a single controller, the Gigathread, it is apparent they cannot accept concurrent tasks. But with Pascal, because they separated the SMX unit into quadrants, they now have the ability to at least run graphics and compute queues concurrently after dispatch. It isn't that hard to figure out, especially after reading Nvidia's release where that part was cut off because it would in fact show a weakness with concurrent tasks, or in basic terms having more than one CPU core speak to the GPU at a time. But unlike you, I didn't feel the need to condemn Nvidia for this lack of forethought; hell, they have enough money to throw at software to ensure that such weaknesses aren't too detrimental. And the likelihood of any game utilizing such coding would be quite slim early on. You concern yourself too much with coding and act like that is the determining factor, and limit yourself to that part in every discussion. I look at the architecture and see the possibilities, or the lack thereof in this case. The same could be said of AMD's choice of going with the FX while Intel was doing everything they could to ensure IPC and low core counts stayed king.
 
You inferred async compute superiority, I just rightly pointed out that it is not as great as AMD and Roy made it out to be, especially as its superiority is doing nothing to help it beat Maxwell even in AoTS, the best implementation to date.
And we see that repeating in a few other games; in Gears of War 4 the 480 Strix gets an 7% increase while the FuryX gets 8%, good but not the game changer it was made out to be and that comes back to the results of AoTS that was being rammed down everyone's eyes by AMD and Roy before it could be independently benchmarked and at the frame data point where it is presented to gamers rather than internal rendering engine, along with time for it to mature with Nvidia it seems (same thing with Quantum Break on DX12 that was appalling to begin with but now is pretty competitive for Nvidia).
The 390X is a stand-out product and one that does set itself apart from the 980, and at times competes strongly with Fury X.
People point to Doom (even you mention it) but that game had the most performance boost from the Shader Extension rather than async compute.

Seems to me you are the one dancing around the issues and have issues with results that show well for Nvidia but happily ignore that HardwareCanucks also has results strong for AMD with Hitman or the fact 4k AoTS has AMD edging out Nvidia but does so with lower minimum frames and consistency, or the fact the best we have seen from async compute is around 10% boost (very rare) and that is far from consistent or persistent, and again that as with my original point that you started arguing with was about AoTS (not going to re-iterate that a 3rd time).

BTW those were ideal settings for async compute in Gears of War 4, changing to more demanding setting and things become worse;
[attached chart: Gears of War 4 async compute results at more demanding settings]


Point being as I mentioned above it is far from consistent or even persistent, and can be either good (but not a game changer) or meh, it is not panacea for superiority performance over Nvidia.
I really DGAF if Nvidia is winning at some benchmark, never did. However, ignorant claims like you make over and over I do have issue with. You stated that because Nvidia had the lead in AoTS, which I am not fully certain of as one site does not proof make, it meant they were better at async, which in reality is purely false. What I find laughable is that you so-called self-proclaimed know-it-alls don't understand that coding makes the greater difference, and you don't even consider that the tables turning in AoTS might be because Nvidia paid for it, as is their right if they so wish. Look at the few titles lately where AMD bests Nvidia in DX11; does that mean AMD is better at DX11? Hell no. Given the architectures, Nvidia is far better. If you weren't trying so hard to pretend you are only the innocent messenger and just admitted to being the Nvidia-loving, AMD-bashing individual you are, then these things would go so much better.
 
Just to be clear I didn't call him a liar, I said he was wrong.
Although he did say the game would be "highly optimized for AMD" which might be interpreted as a lie. If he knew Ubisoft still had their partnership with Nvidia, and he assumed it would run better on AMD due to DX12 without any evidence, then he's a liar.

The difference between lying and being wrong is intent.

Since we like Roy around here, let's give him the benefit of the doubt and just assume he got this one wrong.

I can't tell what isn't sarcasm and what is, but moving on.

So I decided to try to figure out for myself a little bit more regarding Watch Dogs 2, DX12, AMD and Nvidia, attempting to figure out what exactly occurred between GDC and the release of the Nvidia Gameworks trailer for Watch Dogs 2 on Nov. 22.

First off, the main thing I found regarding AMD and Watch Dogs 2 seems to have originated at GDC, the AMD Capsaicin event. On March 27, 2016, an article by The Country Caller was released, with the summary being that AMD is helping Ubisoft develop Watch Dogs 2, indicated by the appearance of Ubi at AMD's press conference. Now, this article seems to have formed the basis for the other articles on AMD and Watch Dogs 2 during March, April and beyond, all the way to about November. But considering the limited information within the Country Caller article, I decided to hunt more. So, after some searching I found an article about the Capsaicin event where, according to the Country Caller, Ubisoft announced AMD and DX12 cooperation. This other article provides info on who was talking: Chris Early from Ubisoft.

Basically, according to Techfaqs, this "announcement" from Ubisoft was anything but. But maybe there's more information since GDC, so I decided to Google search up till November for info.

A look before March provides nothing on Watch Dogs 2. So now looking beyond March, the closest reference to Watch Dogs 2 and Nvidia comes from Gamescom 2016, August 19, a video interview between an Nvidia Gaming Network host and a Senior Producer at Ubisoft. Aside from that, I get no hints of a partnership between Ubisoft and Nvidia until October 18, when Nvidia posted the system requirements of Watch Dogs 2 and Ubisoft delayed WD2 for enhancements to the PC version. It can be assumed that at this point, if not before, Nvidia was conclusively partnering with Ubisoft. (Even before March? I dunno, I can't say.)

Now, concerning DX12, I can find absolutely no intent from Ubisoft to ever put DX12 into Watch Dogs 2. Absolutely no articles, no news I can find that does not point back to the Country Caller regarding DX12, AMD and Ubisoft. If AMD ever actually was supposed to help with DX12 and Watch Dogs 2, I want to say that aside from that line at GDC, and whatever AMD marketing has stated, there has been nothing from Ubisoft itself regarding a partnership. To be fair, beyond that tweet by Roy Taylor at the end of March, AMD has itself said nothing regarding Watch Dogs 2 since then.

(Do not regard my Google searches as conclusive, maybe I missed stuff, I dunno.)
 
this thread reminded me of why I do not come into the video card forum anymore....
 
this thread reminded me of why I do not come into the video card forum anymore....

Bingo... the OP knew exactly how this thread would turn out before he posted. If you want actual talk, go to the Nvidia or AMD sub-forums and post. The fanboys know this, so they post in the general video card forum to get a rise and not get banned outright. Quite obvious.
 
I really DGAF if Nvidia is winning at some benchmark, never did. However ignorant claims like you make over and over I do have issue with. You stated because Nvidia had the lead in AOTS, which I am not full certain of as one site does not proof make, it meant they were better at Async which in reality that is purely false. What I find laughable is that you so-called-self-proclaimed-know-it-alls don't understand that coding makes the greater difference and it so alludes that you don't even consider how the tables turning in AoTS might be because Nvidia paid for it, as is their right if they so wish. Look at the few titles lately where AMD bests Nvidia in DX11, does the mean AMD is better at DX11? Hell No. Given architectures Nvidia is far better. If you weren't trying so hard to pretend you are only the innocent messenger and just admit to being the Nvidia loving AMD-bashing individual you are then these things would go so much better.
Nothing like ignoring the facts and info posted and respond with personal attacks.
I am making ignorant claims, you say, and yet I use a valid independent benchmark tool comparable to FRAPS/FCAT rather than relying upon the results of the in-game tool like you did, which is known to take its data from the rendering engine and in a way that works perfectly with the AMD buffer tech solution, and which does not even correlate with the actual gamer's performance as it does not correlate to the presented frame data.

And you do realise PresentMon is becoming the standard for good sites to use including here at HardOCP, the tool is not just used by HardwareCanucks although they were one of the 1st along with PCPer.
I love how you now ignore the facts around async compute performance with additional info I provided for you and now resort to insults, if you have a problem with those results then you also need to take it up with Kyle and Brent as I ended it with their data....
You never considered that Quantum Break and AoTS were perfectly designed for AMD and it has then taken time for them to mature with Nvidia?
This is a historical trend that also occurs with games that initially benefit Nvidia and where AMD performance then improves, but AoTS and Quantum Break, to re-iterate again, were perfectly designed for AMD at launch; consider the Fallout games, where the engine is perfectly designed for Nvidia hardware.

Just to add, HardwareCanucks' measurements/trend match another good publication (in fact they match several, but why should I be the one to continually waste the most time putting info together in this argument) and that is http://www.pcgameshardware.de/AMD-Radeon-Grafikkarte-255597/Tests/RX-480-Test-1199839/2/
Will need to scroll down for the DX12 Hitman result at PCGamesHardware.
These need to be considered as relative values, more about position relative to other products and possible settings (hence the fps divergence).
Compare result for Canucks 480 Hitman: http://www.hardwarecanucks.com/foru...9-radeon-rx480-8gb-performance-review-20.html

[attached chart: HardwareCanucks RX 480 Hitman results]


Point is, you are being dismissive of HardwareCanucks for reasons that have little to do with its benchmark results, which use an independent tool in PresentMon, and this is a recognised tool that for now is the best way to verify and measure frame performance and behaviour in DX12.
Leaving it at that because you keep ignoring previous facts I provide (such as why Async Compute is good but not a game changer) while attacking the poster and not the facts in a balanced way.
 
I wouldn't play Hitman on DX12. Performance may be there, but it's still a stuttery mess. So you're both wrong, because none of those benchmarks matter to me. (y)
 
I will never buy another product from a company Roy Taylor works for. The guy is one of the biggest d-bags on the internet.
 
I won't pretend that Taylor hasn't lied, but, you're ok with lying as long as they're successful? That's the most shill thing I've read from you, and that means a lot.

It's one thing to make shit up and expect people to believe you, another thing to make shit up and then the product comes out and it really is what you said it is.....

Yeah see the difference? If you can't then I suggest you take your own advice ;)

Idiopathic of Rriemann
 
Holy crap you love talking thru your arse. Maxwell can not and it is because the whole array is tied together. Pascal gets by this not by load balancing but because the array is split into 4 parts, each independent from the other. And yes the gigathread matters as it is the contact with which the API has when executing tasks. AMDs GCN ACEs are each visible to the API hence why they have never had any issue with concurrent operations at any level, and yes I like to split them as upto the GPU and within, because the lot of you like to obfuscate the issues at hand. The SMX units are not visible at all to the API and being the have only a single controller, the Gigathread, it is apparent they can not accept concurrent tasks. But with Pascal because they separated the SMX unit into quadrants they now have the abitlity to at least run graphics and compute ques concurrently after dispatch. It isn't that hard to figure out especially with the reading of Nvidias release where that part was cut off because it would in fact show a weakness with concurrent task or in basic terms having more than one CPU core speak to the GPU at a time. But unlike you I didn't feel the need to condemn Nvidia on this lack of forethought, hell they have enough money to throw at software to ensure that such weaknesses aren't too detrimental. And the likelihood of any game utilizing such coding would be quite slim early on. You concern yourself too much with coding and act like that is the determining factor and limit yourself to that part for every discussion. I look at the architecture and see the possibilities or the lack there of in this case. Same could be said of AMDs choice for going with the FX as Intel was doing everything they could to ensure IPC and low core counts stayed king.


Holy crap, Batman! You don't know what the hell you are talking about lol.

I suggest you read more and stop posting so much it will help. Cause what you type is so much crap lol and the amount I have to type to fix your crap takes too much time lol. Yeah that is a problem, I'll do it tomorrow when I'm not too busy at work.

And please just wait before you post again, cause I've already given the shovel to you a couple of posts ago, and you are now digging your own grave, but be my guest, you have a habit of doing this lol.

In the meantime, I suggest you read the concurrency thread at B3D, cause no programmer has ever stated what you did lol. So the people that are doing the work don't even think what you have been posting is real.... yeah ..... ok....

And god, what the fuck is IPC and GPUs? Do you even understand those two things have nothing in common? Only a person that doesn't know what he is talking about will try to correlate CPU and GPU in the matter of IPC.
 
I wouldn't play Hitman on DX12. Performance may be there, but it's still a stuttery mess. So you're both wrong, because none of those benchmarks matter to me. (y)
Just using it as an example of how Canucks is not all about promoting Nvidia performance and correlates with another site using same tool, and yeah also not keen on latest Hitman myself :)
As I mentioned earlier unfortunately most no longer include AoTS as it is a pain in the backside and time consuming to do accurately (along with needing to be done with PresentMon).
I just lost enthusiasm to show correlation with other games benchmarked as spent a fair bit of time in this thread today.
Cheers
 
JustReason, I didn't know you were so well versed in the inner workings of Maxwell and Pascal (or GCN for that matter); I must have missed your insightful comments and replies to the thread I made about async compute a few months back.

Jokes aside, I'm a little surprised this is still going on. You'd think people would run out of BS after close to a year, but it appears that in the land of AMD marketing repeaters the tap is always flowing.

PS: looks like the dual Polaris card is launching soon, prepare for forum violence.
 