That's one area where the queue priorities would probably help. There is likely some control, albeit in drivers, over work distribution. Limit graphics to 70% occupancy, reserving 30% for compute, for example. This will probably get exposed in Vulkan; DX12 is another matter (one compute queue). Those features should make async somewhat self-tuning based on hardware. I know ACEs are programmable (drivers). It would make sense that the work distributor could be configured as well. Score shaders by tex/memory:math ratio and attempt to balance all the compute units.
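To make the last idea concrete, here is a rough sketch of scoring shaders by their tex/memory:math ratio and spreading them over compute units so each unit ends up with a mix of memory-heavy and math-heavy work. Everything in it is hypothetical: the `Shader` and `ComputeUnit` types, the instruction counts, and the greedy placement rule are illustrations only, not how any real driver or work distributor behaves.

```python
from dataclasses import dataclass, field


@dataclass
class Shader:
    name: str
    mem_ops: int   # texture/memory instructions (made-up counts)
    math_ops: int  # ALU instructions (made-up counts)

    @property
    def ratio(self) -> float:
        # tex/memory:math ratio; higher means more memory-bound
        return self.mem_ops / max(self.math_ops, 1)


@dataclass
class ComputeUnit:
    shaders: list = field(default_factory=list)
    mem_load: int = 0
    math_load: int = 0

    def imbalance_if_added(self, s: Shader) -> int:
        # How lopsided this unit would be after taking the shader
        return abs((self.mem_load + s.mem_ops) - (self.math_load + s.math_ops))

    def add(self, s: Shader) -> None:
        self.shaders.append(s)
        self.mem_load += s.mem_ops
        self.math_load += s.math_ops


def distribute(shaders, num_cus):
    """Greedy placement: most lopsided shaders first, each onto the unit
    it unbalances least (ties go to the less loaded unit)."""
    cus = [ComputeUnit() for _ in range(num_cus)]
    by_lopsidedness = sorted(
        shaders, key=lambda s: abs(s.mem_ops - s.math_ops), reverse=True
    )
    for s in by_lopsidedness:
        best = min(
            cus,
            key=lambda cu: (cu.imbalance_if_added(s), cu.mem_load + cu.math_load),
        )
        best.add(s)
    return cus


work = [
    Shader("blur", mem_ops=80, math_ops=20),
    Shader("lighting", mem_ops=20, math_ops=80),
    Shader("downsample", mem_ops=70, math_ops=30),
    Shader("physics", mem_ops=30, math_ops=70),
]
units = distribute(work, num_cus=2)
for i, cu in enumerate(units):
    print(i, [s.name for s in cu.shaders], cu.mem_load, cu.math_load)
```

With these made-up numbers each unit gets one memory-bound and one math-bound shader. Real hardware would have to weigh far more than two counters (register pressure, LDS usage, wave occupancy), so treat this purely as an illustration of the scoring idea.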
That sounds like downplaying.
Sorry, I misspoke: he stated space slicing. Damn hangover from last night.
He also stated current console code will not run well on future hardware, there will most likely be no performance gains from async on future hardware with older titles (current ones), and hopefully there won't be regression of performance. This is specific to AMD hardware.
Space slicing, if I understand correctly, should be partitioning the compute units, which is still different from what I had in mind. That'd be zero concurrency.
I'm not sure that was in regards to async specifically. My take was the color compression on Fiji/Tonga needing a unique path.
No, it's not. Even at GDC, AMD pretty much stated the same thing: different hardware of different generations needs different code to perform optimally.
I've been saying this for what, 8 months now, and you think this is downplaying. It's just the way things are.
Apologies for the slight necro. But the feature I described probably looks a lot like this: Hardware Managed Ordered Circuit, patented by AMD in 2013.
First response:
1.0 certainly isn't; the ACEs weren't programmable at all back then.
1.1 is programmable, but the space is limited, and the space is required for the queue decoding logic.
1.2 might be able to do such a thing. But I'm not entirely sure what the HWS / "new" ACE units on Tonga and Fiji are actually doing right now.
We have no idea what the HWS units do. What Dave Baumann (who works at AMD) told me was that each HWS unit works like 2 ACE units. So I'm going with queue decoding logic isn't there and they need a lot of space for that.
I did PM a couple of Intel engineers as well and am waiting on their responses, but I'm pretty sure I'll get a similar response.
An embodiment of the present invention provides an apparatus including a scoreboard structure configured to store information associated with a plurality of wavefronts. The apparatus further includes a controller, comprising a plurality of counters, configured to control an order of operations, such that a next one of the plurality of wavefronts to be processed is determined based on the stored information and an ordering scheme.
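For anyone trying to picture what that abstract describes, here is a toy reading of it: a scoreboard that stores per-wavefront state, plus a controller with two counters that enforces an ordering scheme. This is only a sketch of the wording quoted above. The class name, the two counters, and the choice of strict submission order as the "ordering scheme" are all assumptions, not details taken from the patent or from any AMD hardware.

```python
class OrderedScoreboard:
    """Toy model: wavefronts may finish in any order, but are handed
    onward strictly in the order they were registered."""

    def __init__(self):
        self.entries = {}       # sequence number -> {"wave_id", "ready"}
        self.alloc_counter = 0  # next sequence number to hand out
        self.next_counter = 0   # sequence number that must go next

    def register(self, wave_id):
        # Store information about a wavefront and assign its slot in the order
        seq = self.alloc_counter
        self.entries[seq] = {"wave_id": wave_id, "ready": False}
        self.alloc_counter += 1
        return seq

    def mark_ready(self, seq):
        # Record that this wavefront has finished its out-of-order work
        self.entries[seq]["ready"] = True

    def next_wavefront(self):
        # Ordering scheme (assumed): only the oldest outstanding wavefront
        # may proceed, and only once it is marked ready
        entry = self.entries.get(self.next_counter)
        if entry is None or not entry["ready"]:
            return None
        del self.entries[self.next_counter]
        self.next_counter += 1
        return entry["wave_id"]


sb = OrderedScoreboard()
a, b, c = sb.register("wave_a"), sb.register("wave_b"), sb.register("wave_c")

sb.mark_ready(c)
sb.mark_ready(b)
blocked = sb.next_wavefront()  # wave_a is not ready yet, so nothing proceeds

sb.mark_ready(a)
drained = [sb.next_wavefront() for _ in range(4)]
print(blocked, drained)
```

The point of the sketch is just that later wavefronts can complete early without being released early; whether the actual circuit orders by submission or by some other scheme is not something the abstract settles.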