NVIDIA’s Neural Texture Compression for material textures

erek

[H]F Junkie
GeForce NTC

The continuous advancement of photorealism in rendering is accompanied by a growth in texture data and, consequently, increasing storage and memory demands. To address this issue, we propose a novel neural compression technique specifically designed for material textures. We unlock two more levels of detail, i.e., 16× more texels, using low bitrate compression, with image quality that is better than advanced image compression techniques, such as AVIF and JPEG XL. At the same time, our method allows for on-demand, real-time decompression with random access similar to block texture compression on GPUs. This extends our compression benefits all the way from disk storage to memory. The key idea behind our approach is compressing multiple material textures and their mipmap chains together, and using a small neural network, that is optimized for each material, to decompress them. Finally, we use a custom training implementation to achieve practical compression speeds, whose performance surpasses that of general frameworks, like PyTorch, by an order of magnitude.
— Random-Access Neural Compression of Material Textures, NVIDIA
"Unlike common BCx algorithms, which require custom hardware, this algorithm utilizes the matrix multiplication methods, which are now accelerated by modern GPUs. According to the paper, this makes the NTC algorithm more practical and more capable due to lower disk and memory constraints."

[Image: NVIDIA-NTC-HERO-BANNER-1200x382.jpg]

Source: https://videocardz.com/newz/nvidia-...than-standard-compression-with-30-less-memory
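For a rough sense of what the abstract's "small neural network, that is optimized for each material" means in practice, here is a toy sketch of the idea: overfit a tiny MLP to one material's stacked texture channels, so the network weights become the compressed asset and any texel can be decoded on demand with a few small matrix multiplies. Everything below (network size, Fourier encoding, channel layout) is an illustrative guess, not NVIDIA's actual NTC implementation; as I understand it, the real codec feeds learned latent features into the network rather than a plain coordinate encoding. Also, the "16× more texels" in the abstract is simply two extra mip levels: each level of detail quadruples the texel count, so 4 × 4 = 16.

```python
# Toy sketch of per-material neural texture compression (not NVIDIA's actual NTC code):
# "compression" is overfitting a tiny MLP to the material; "decompression" is evaluating it.
import math
import torch
import torch.nn as nn

H, W, CHANNELS = 512, 512, 9            # e.g. diffuse RGB + normal XYZ + roughness/metalness/AO
material = torch.rand(H, W, CHANNELS)    # stand-in for a real stacked material texture set

def encode(uv, freqs=8):
    """Simple Fourier positional encoding of texture coordinates."""
    bands = 2.0 ** torch.arange(freqs) * math.pi
    x = uv[..., None] * bands                                 # (N, 2, freqs)
    return torch.cat([x.sin(), x.cos()], dim=-1).flatten(1)   # (N, 4*freqs)

decoder = nn.Sequential(                 # the "compressed" representation is just these weights
    nn.Linear(4 * 8, 64), nn.GELU(),
    nn.Linear(64, 64), nn.GELU(),
    nn.Linear(64, CHANNELS),
)

ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
uv = torch.stack([xs, ys], dim=-1).reshape(-1, 2).float() / torch.tensor([W, H])
target = material.reshape(-1, CHANNELS)

opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
for step in range(2000):                 # "compression" == training the per-material network
    idx = torch.randint(0, H * W, (4096,))
    loss = nn.functional.mse_loss(decoder(encode(uv[idx])), target[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()

# "Decompression" with random access: evaluate the network at whatever (u, v) the shader needs.
texel = decoder(encode(torch.tensor([[0.25, 0.75]])))
print(texel.shape)                       # -> torch.Size([1, 9])
```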
 
Still, that compression ratio though. Smaller than the 1080p texture.
Going from lossless compression to degraded compression: is that Nvidia's solution to minimizing VRAM on their cards? :LOL: Looks like it's on the right track, though; data can move much faster in smaller packets. I still think the quality will have to be, if not lossless, at least visually the same.
 
Most of the examples I saw aimed for superb quality at the cost of lower performance.

Would be interesting to see the performance and texture size if they aimed for BC-level high quality.

Still think the quality will have to be, if not lossless, at least visually the same.
Doesn't it just have to beat all the other compression alternatives already in use, especially if the baking is so much faster and generates a much smaller install file? Maybe for some games, having the same resolution for diffuse, normal, displacement, etc. would be an issue, and there seem to be some other issues at the moment too.

Is that Nvidia's solution to minimizing VRAM on their cards?
Real-time decompression makes it sound like it would be possible (even if the decompressed result ends up larger because of the higher quality), but that would come at the cost of reduced performance from the milliseconds added by the decompression stage.
 
This is Nvidia's solution to their lack of VRAM; all this NTC crap sounds like nothing more than a fix for a problem Nvidia made itself. It'll lower image quality, much like DLSS. Will this work on AMD and Intel? Most of the shit Nvidia makes is Nvidia-only. And since it promises to lower installation size as well as VRAM usage, if this doesn't work on AMD and Intel it'll just end up increasing the number of game install variants.
 
This is Nvidia's solution to their lack of VRAM; all this NTC crap sounds like nothing more than a fix for a problem Nvidia made itself. It'll lower image quality, much like DLSS. Will this work on AMD and Intel? Most of the shit Nvidia makes is Nvidia-only. And since it promises to lower installation size as well as VRAM usage, if this doesn't work on AMD and Intel it'll just end up increasing the number of game install variants.
Look at the picture. Read the article. It improves image quality while lowering VRAM requirements. Why does Nvidia owe its competitors the fruits of billions in research dollars for free? Seriously, lol worthy.
 
This is Nvidia's solution to their lack of VRAM; all this NTC crap sounds like nothing more than a fix for a problem Nvidia made itself. It'll lower image quality, much like DLSS. Will this work on AMD and Intel? Most of the shit Nvidia makes is Nvidia-only. And since it promises to lower installation size as well as VRAM usage, if this doesn't work on AMD and Intel it'll just end up increasing the number of game install variants.
I feel you mistook the current BC compression technique (the left side of the image) for NTC on the right side, or else I don't understand the claim you're making about lowered image quality at all. Which paper did you read, which video did you watch, to arrive at that? Can you explain?

Here NTC is on the left:
[Image: NVIDIA-NTC2-768x405.jpg]


It should work on everything:
Unlike common BCx algorithms, which require custom hardware, this algorithm utilizes the matrix multiplication methods, which are now accelerated by modern GPUs
 
Most of the examples I saw aimed for superb quality at the cost of lower performance.

Would be interesting to see the performance and texture size if they aimed for BC-level high quality.


Doesn't it just have to beat all the other compression alternatives already in use, especially if the baking is so much faster and generates a much smaller install file? Maybe for some games, having the same resolution for diffuse, normal, displacement, etc. would be an issue, and there seem to be some other issues at the moment too.


Real-time decompression makes it sound like it would be possible (even if the decompressed result ends up larger because of the higher quality), but that would come at the cost of reduced performance from the milliseconds added by the decompression stage.
Yes, and it looks like it does. Had a brain fart. See what Nvidia is presenting at SIGGRAPH dealing with this.
 
Look at the picture. Read the article. It improves image quality while lowering VRAM requirements.
It's lossy compression, which means you lose some of the original image quality. According to this article, "Researchers observed mild blurring, the removal of fine details, color banding, color shifts, and features leaking between texture channels". Furthermore, game artists won't be able to optimize textures in all the same ways they do today, for instance by lowering the resolution of certain texture maps for less important objects or NPCs. Nvidia says all maps need to be the same size before compression, which is bound to complicate workflows. This sounds even worse when you consider that the benefits of NTC don't apply at larger camera distances. It's basically like DLSS, and it will cause texture filtering artifacts. The image you're looking at shows a worst-case scenario versus NTC. The best-case scenario is that you have more VRAM and actually load a higher-quality texture with BC7. Also, 1 ms to decompress a 4K texture is very bad: 2-4x slower than the highest-quality traditional compression method, and that's on an RTX 4090. Another thing to keep in mind is that Nvidia's tensor cores have gotten much faster each generation, meaning this will be much worse for those on the RTX 20 and 30 series.

Here's the picture you really need to see; it shows how the texture looks before NTC and after. BC7 will use more data, but again, we just need more VRAM, not some special compression that is almost certainly going to be Nvidia-exclusive and will likely reduce performance due to the latency.
[Image: 2023-05-05-image-21.jpg]

Why does Nvidia owe its competitors the fruits of billions in research dollars for free? Seriously, lol worthy.
AMD does it all the fucking time. Also, this isn't some feature that can be easily included in a game. You're talking about texture data, which makes up the majority of game data. You have to ship the game with precompressed textures to make use of Nvidia's NTC, so either you download a special Nvidia version of the game, or the developer includes both and you have an even bigger game to download. Assuming this is Nvidia-exclusive, it just seems like Nvidia is trying to pull an Apple and separate themselves from AMD and Intel so they get their customers locked into their ecosystem. Like any proprietary standard, it'll just end up dead after a number of years. Considering the latency introduced by this compression technology, it will probably not be usable until the RTX 50 series at the earliest. You need fixed-function hardware in the GPU to make this actually work. What this NTC announcement does is give owners of 8GB RTX cards false hope, instead of leaving them angry at Nvidia for including such a pathetic amount of VRAM on such expensive graphics cards. If you're an RTX 20/30/40 series owner thinking this will be released for you, think again.
 
Look at the picture. Read the article. It improves image quality while lowering VRAM requirements. Why does Nvidia owe its competitors the fruits of billions in research dollars for free? Seriously, lol worthy.
It won't work that way; this will have to be done during the game's build process. Any game that uses this technique will have to work on all platforms (I'm 99 percent sure it will anyway) for developers to use it in their engines.
 
This is legitimately super neat, but yeah, count me in with the "I would rather just have the option of using less texture compression" camp.

Asset loading and decompression can be such a humongous performance hit, especially in open-world games that are constantly streaming in data. Why would I want to make that performance hit even worse in exchange for VRAM thrift?

Like, damn. I've got a GPU with 24GB of 1TB/s memory, plenty of fast system RAM, a 2TB PCIe Gen4 NVMe dedicated only to games, and a 200Mbit+ internet connection. I would much, much, much rather have the install size for a game like Cyberpunk be something like 300GB with ridiculous in-game memory usage but smoother performance than deal with constant stuttering while textures get decompressed. /rant
 
Waiting for the guy that says running 480p upscaled to 4K with the new neural textures is the same as running an AMD card at native resolution. It's his thing.

"My card gets 1000 fps while your crappy AMD card can only do 100. Near identical settings."
 
Look at the picture. Read the article. It improves image quality while lowering VRAM requirements. Why does Nvidia owe its competitors the fruits of billions in research dollars for free? Seriously, lol worthy.

I'll wait and see how it works in-game. We heard similar about DLSS, even from places like Digital Foundry talking up how great it was without mentioning the downsides.
 
It's lossy compression, which means you lose some of the original image quality.
They took 256MB and compressed it to 3.8MB. Of course there's going to be loss. You are missing the point.

It looks pretty damn good, and looks better than any other compression while being smaller. That's a win.
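For scale, here's the back-of-the-envelope math on those figures, alongside the fixed rates of block compression (BC1 stores 4 bits per texel and BC7 stores 8, so their ratios against uncompressed 32-bit RGBA8 are constant regardless of content); treat it as a rough comparison, not numbers from the paper:

```python
# Rough ratio math using the figures quoted above; BC bit rates are the standard fixed rates.
original_mb, ntc_mb = 256, 3.8
print(f"NTC: {original_mb / ntc_mb:.0f}:1")            # ~67:1 for this material set

# BC1 packs a 4x4 texel block into 8 bytes (4 bpp), BC7 into 16 bytes (8 bpp).
source_bpp = 32                                        # uncompressed RGBA8
print(f"BC1: {source_bpp // 4}:1, BC7: {source_bpp // 8}:1")   # 8:1 and 4:1, fixed
```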
.. Also, this isn't some feature that can be easily included in a game. You're talking about texture data, which makes up the majority of game data. You have to ship the game with precompressed textures to make use of Nvidia's NTC, so either you download a special Nvidia version of the game, or the developer includes both and you have an even bigger game to download. Assuming this is Nvidia-exclusive...
Now you are just talking out of your ass, about something you know nothing about.
 
I see this type of compression as something ideal for consoles. This gen doesn't even manage true 1080p, let alone 4K, and uses all sorts of dedicated hardware tricks to handle texture loading that complicate multi-platform releases. Consoles also get far more of an outcry over their large install sizes than PCs do, and with AMD pushing handhelds, storage space could be at a huge premium.

It might not be a lossless method, but using a larger lossless format for a 4K texture only to render at 700-900p and then upscale to 1080p or 4K can't be much better...
And if it lets those consoles run at at least a full 1080p with no scaling and a more streamlined engine, then perhaps it's an overall good thing.
The only significant downside I see is that it is not currently compatible with the available multi-render techniques.
 
They took 256MB and compressed it to 3.8MB. Of course there's going to be loss. You are missing the point.

It looks pretty damn good, and looks better than any other compression while being smaller. That's a win.
That image is showing you Nvidia's interpretation of BC7 compression. You can use more data for BC7 and retain more of the original texture. In other words, you need more VRAM.
Now you are just talking out of your ass, about something you know nothing about.
And you do? It's not hard to understand that using tensor cores for decompression, adding a delay of 1 ms or more, is going to massively hurt performance. You don't want this. I really doubt this is more than a demonstration, because realistically you need fixed-function hardware to make this work. Pretty sure the tensor cores are used for ray tracing, so good luck getting this to work alongside ray tracing. This is just Nvidia putting something out to placate all the people upset that the RTX graphics cards they overpaid for have the same VRAM as some GPUs from 2015.

 
And you do? It's not hard to understand that using tensor cores for decompression, adding a delay of 1 ms or more, is going to massively hurt performance. You don't want this. I really doubt this is more than a demonstration, because realistically you need fixed-function hardware to make this work. Pretty sure the tensor cores are used for ray tracing, so good luck getting this to work alongside ray tracing. This is just Nvidia putting something out to placate all the people upset that the RTX graphics cards they overpaid for have the same VRAM as some GPUs from 2015.
It uses tensor cores for compression; the paper states that, due to how the tensor cores handle shared memory spaces, they are a bad choice for decompression, and for that it uses a driver-side modification of standard rasterization calls. In newer titles the GPU is, for the most part, becoming the standard place to do decompression, which is where Nvidia GPUDirect and AMD DirectGMA come in, and hopefully DX12 heaps. Almost all the modern texture formats need matrix multiplication for decompression, and the CPU is bad at that.
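To make the "it's just matrix multiplication" point concrete: once the per-material weights exist, decoding a texel is nothing more than a few small matrix multiplies plus activations, which any shader-capable GPU can do without custom decode hardware. The layer sizes and weights below are invented stand-ins for illustration; in a real renderer this would run per texel in a shader, not in NumPy.

```python
# Illustration only: an NTC-style texel fetch is a handful of small fp16 matrix
# multiplies. Layer sizes and weights here are made up, not NTC's real ones.
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for per-material weights that would ship with the game's assets.
W1 = rng.standard_normal((32, 64)).astype(np.float16); b1 = np.zeros(64, np.float16)
W2 = rng.standard_normal((64, 64)).astype(np.float16); b2 = np.zeros(64, np.float16)
W3 = rng.standard_normal((64, 9)).astype(np.float16);  b3 = np.zeros(9, np.float16)

def decode_texel(features):
    """features: (32,) encoded UV/latent vector -> (9,) stacked material channels."""
    h = np.maximum(features @ W1 + b1, 0)   # matmul + ReLU
    h = np.maximum(h @ W2 + b2, 0)
    return h @ W3 + b3                      # e.g. diffuse, normal, roughness, AO, ...

print(decode_texel(rng.standard_normal(32).astype(np.float16)).shape)  # (9,)
```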
 
They took 256MB and compressed it to 3.8MB. Of course there's going to be loss. You are missing the point.

It looks pretty damn good, and looks better than any other compression while being smaller. That's a win.

Now you are just talking out of your ass, about something you know nothing about.
I think if it's good enough, Unity and Unreal will just include it in their engines and all textures will get packaged that way when you make a build of your project. I would imagine they'll make it pretty easy to use. Thing is, it'll likely be a year or two before it gets adopted en masse, if it's actually worth using over today's compression methods.
 
That image is showing you Nvidia's interpretation of BC7 compression. You can use more data for BC7 and retain more of the original texture. In other words, you need more VRAM.

And you do? It's not hard to understand that using tensor cores for decompression, adding a delay of 1 ms or more, is going to massively hurt performance. You don't want this. I really doubt this is more than a demonstration, because realistically you need fixed-function hardware to make this work. Pretty sure the tensor cores are used for ray tracing, so good luck getting this to work alongside ray tracing. This is just Nvidia putting something out to placate all the people upset that the RTX graphics cards they overpaid for have the same VRAM as some GPUs from 2015.

[Image attachment 569018]

Going by their example, it costs about 0.66 ms.

Which means, assuming the cost holds relatively constant, it's the equivalent of going from 200 fps to 176 fps.

Or 60 fps to 58 fps.

That's the price of giving up fixed-function decoding and filtering.

Results may be better, because they say some of the cost might just get absorbed by this happening in parallel with other work. So maybe in perfect scenarios it costs almost zilch.
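For anyone who wants to plug in their own numbers, that's just frame-time arithmetic, assuming the ~0.66 ms is a flat, fully serialized cost (the worst case; any overlap with other GPU work shrinks it):

```python
# Worst case: treat the quoted ~0.66 ms decode cost as fully serialized with the frame.
def fps_with_overhead(fps, overhead_ms=0.66):
    frame_ms = 1000.0 / fps
    return 1000.0 / (frame_ms + overhead_ms)

for fps in (60, 120, 200):
    print(f"{fps} fps -> {fps_with_overhead(fps):.1f} fps")
# 60 -> 57.7, 120 -> 111.2, 200 -> 176.7
```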

I think if it's good enough, Unity and Unreal will just include it in their engines and all textures will get packaged that way when you make a build of your project. I would imagine they'll make it pretty easy to use. Thing is, it'll likely be a year or two before it gets adopted en masse, if it's actually worth using over today's compression methods.

Unreal at least will likely never mainline it. They basically never add vendor tech to the engine.
 
And you do? It's not hard to understand that using tensor cores for decompression, adding a delay of 1 ms or more, is going to massively hurt performance. You don't want this.
It depends on just how much better the games look for that performance price, and on how much of those milliseconds lands in the net total versus being hidden by parallel work (RT being on could be a case where it does not change the frame rate at all). This could look much better at whatever disk and RAM budget is available on your platform. Like the paper says:
We hope our work will inspire the creation of highly compressed neural representations for use in other areas of real-time rendering, as a means of achieving cinematic quality

This is part of bringing cinematic quality from long renders on supercomputers into the real-time rendering equation on "cheap" hardware. I do not see how a 3070 is relevant; it makes no sense to play at 4K on that level of power in any good-looking future game (or even current or past ones), and that will be even truer in 4-5 years if this becomes a thing.

In other words, you need more VRAM.
A generation of consoles with larger hard drives as well.

It could make a lot of sense for the next Nintendo Switch with specialized hardware.
 
It depends on just how much better the games look for that performance price, and on how much of those milliseconds lands in the net total versus being hidden by parallel work (RT being on could be a case where it does not change the frame rate at all).


A generation of consoles with larger hard drives as well.

It could make a lot of sense for the next Nintendo Switch with specialized hardware.
I never thought of it for the Switch 2, or whatever they call it, but that makes a lot of sense, because you aren't doing ray tracing on a handheld at 7-10 watts; nothing out there does that well. But using those cores for other custom accelerated tasks, be it DLSS, codec acceleration, or whatever else, could make for a really good showcase of what the hardware can be coerced into doing when fully optimized.
 
Going by their example, it costs about 0.66 ms.

Which means, assuming the cost holds relatively constant, it's the equivalent of going from 200 fps to 176 fps.

Or 60 fps to 58 fps.

That's the price of giving up fixed-function decoding and filtering.

Results may be better, because they say some of the cost might just get absorbed by this happening in parallel with other work. So maybe in perfect scenarios it costs almost zilch.



Unreal at least will likely never mainline it. They basically never add vendor tech to the engine.

They could make it a plugin or an add-on of some sort; if it's actually substantially better than what they have, they'll add it. My guess is they'll just make their own that works about as well.
 
It depends on just how much better the games look for that performance price, and on how much of those milliseconds lands in the net total versus being hidden by parallel work (RT being on could be a case where it does not change the frame rate at all). This could look much better at whatever disk and RAM budget is available on your platform. Like the paper says:
We hope our work will inspire the creation of highly compressed neural representations for use in other areas of real-time rendering, as a means of achieving cinematic quality

This is part of bringing cinematic quality from long renders on supercomputers into the real-time rendering equation on "cheap" hardware.


A generation of consoles with larger hard drives as well.

It could make a lot of sense for the next Nintendo Switch with specialized hardware.
I don't think Nintendo would go for it unless it was already included with HW they intended to use.
 
I don't think Nintendo would go for it unless it was already included with HW they intended to use.
Depends on the release date of said Switch; if it's December 2023, it would need to run well on the Nvidia SoC that's already en route.
 
If it's nVidia only... F 'em. We don't need a texture format war. nVidia needs to work to get this made part of the Vulkan or DX spec. And if they can't, then it should die off; to hell with more proprietary crap.
 
If it's nVidia only... F 'em. We don't need a texture format war. nVidia needs to work to get this made part of the Vulkan or DX spec. And if they can't, then it should die off; to hell with more proprietary crap.
It's driver-side, using standard calls within Vulkan and DX; it's only proprietary on the developer side, as it requires tensor cores to handle the compression algorithms, but for the consumer, decompression is handled about the same as most of the existing formats.
 
It's driver-side, using standard calls within Vulkan and DX; it's only proprietary on the developer side, as it requires tensor cores to handle the compression algorithms, but for the consumer, decompression is handled about the same as most of the existing formats.
And to clarify further, even compression doesn't actually require tensor cores; it's just math.
 
If it's nVidia only... F 'em. We don't need a texture format war. nVidia needs to work to get this made part of the Vulkan or DX spec. And if they can't, then it should die off; to hell with more proprietary crap.

Amusingly, a bunch of the BC formats are basically the ancient S3TC formats.

Yes, the S3 from two decades ago that had high-resolution Unreal textures in 1999.
 
And to clarify further, even compression doesn't actually require tensor cores; it's just math.
It doesn't "require" it, you can do it without it, it is just orders of magnitude slower to the point of being impractical without it. Seconds to hours sorts of differences, the tensor cores doing half-precision sub 16-bit are upwards of 100x faster than AMD's lineup and those are 2-3x faster than you can do on a CPU.
So while everything this algorithm could probably be done with out it, I am not sure anybody would or should unless its for shits and giggles.
 
It doesn't "require" it, you can do it without it, it is just orders of magnitude slower to the point of being impractical without it. Seconds to hours sorts of differences, the tensor cores doing half-precision sub 16-bit are upwards of 100x faster than AMD's lineup and those are 2-3x faster than you can do on a CPU.
So while everything this algorithm could probably be done with out it, I am not sure anybody would or should unless its for shits and giggles.
Understood. Just saying: the "gating" factor is having computational resources appropriate to the task. If one were pathologically against the systems that can do this well, you could still do it.
 
Amusingly, a bunch of the BC formats are basically the ancient S3TC formats.

Yes, the S3 from two decades ago that had high-resolution Unreal textures in 1999.
Yeah, ancient as in BC7, which was included in DX11. DX11 ain't no spring chicken, but it isn't as old as S3TC, which BTW was relinquished by HTC, who bought S3 long ago.

It doesn't "require" it, you can do it without it, it is just orders of magnitude slower to the point of being impractical without it. Seconds to hours sorts of differences, the tensor cores doing half-precision sub 16-bit are upwards of 100x faster than AMD's lineup and those are 2-3x faster than you can do on a CPU.
Where are you getting this info that AMD is 100x slower at this?
So while everything in this algorithm could probably be done without them, I am not sure anybody would or should, unless it's for shits and giggles.
The idea is that you don't need tensor cores, you need specialized hardware. You wouldn't use the CPU to do this; that would be silly. Nvidia is using tensor cores because they're otherwise doing nothing.
It depends on just how much better the games look for that performance price, and on how much of those milliseconds lands in the net total versus being hidden by parallel work (RT being on could be a case where it does not change the frame rate at all). This could look much better at whatever disk and RAM budget is available on your platform. Like the paper says:
We hope our work will inspire the creation of highly compressed neural representations for use in other areas of real-time rendering, as a means of achieving cinematic quality
The only way this would work is if everyone agreed on a BC8 built around this, or whatever the new standard would be. On top of that, we'd need new GPU hardware, because again tensor cores are too slow for this and would be busy if you're using ray tracing. But then this would just alienate everyone who doesn't have a new GPU that can use this new texture compression technology.
This is part of bringing cinematic quality from long renders on supercomputers into the real-time rendering equation on "cheap" hardware. I do not see how a 3070 is relevant; it makes no sense to play at 4K on that level of power in any good-looking future game (or even current or past ones), and that will be even truer in 4-5 years if this becomes a thing.
The RTX 3070 was mentioned because it has 8GB of VRAM, and the vast majority of RTX owners have 8GB or less. It's also very likely that Nvidia will continue to release RTX 40 series cards with 8GB of VRAM. The fact that you have graphics cards with the same VRAM as the R9 290 and my Vega 56, which were released a very long time ago, is very sad. This was fine when most games were built for the PS4, but now that the PS5 is out, developers are starting to make use of its hardware, which means you'll need more than 8GB of VRAM in some cases. It's not that you need much more, just 10GB or 12GB instead of 8GB, and yet you're already ready to jump to another texture compression standard because you bought 8GB cards. Intel has 16GB cards for $400, AMD's 6800 and 6900 also have 16GB, and the 6700 has 10GB and 12GB versions.

Also, you don't need a supercomputer farm to make these textures; it's just math, for Gaben's sake. And this is BC1 vs. uncompressed when you use a bit more data. Doesn't look so bad, now does it? Sure, you need to use more data, but it doesn't look noticeably worse.
[Image: brick_compare_dark.png]

Here's uncompressed vs BC7.
[Image: brick_compare_bc7_dark.png]

A generation of consoles with larger hard drives as well.
Most consoles run AMD hardware, and I doubt they need NTC anyway.
It could make a lot of sense for the next Nintendo Switch with specialized hardware.
It could and would make sense, but I feel Nvidia's NTC is just a tech demo meant to make Nvidia 8GB card owners less angry that the RTX 4060 and 4050 will also come with 8GB of VRAM, or even less.
 
Where are you getting this info that AMD is 100x slower at this?
AMD themselves. AMD does not do native sub-16-bit precision; it emulates it. That is why their cards are incompatible with OpenAI's stack and numerous other tasks. The upcoming CDNA architecture supposedly corrects this and brings it back, but will still be a fair way behind; leaks have put it at about half the performance of the existing Nvidia lineup. The MI300 bringing back these features is one of its biggest selling points, and it is highly anticipated for that reason. It is going to be AMD's first real shot at Nvidia's AI dominance.

AMD dominates at 64-bit and 32-bit floating point; they crush Nvidia there, and for the longest time that's what mattered for scientific research and simulation tasks. AI goes the opposite direction, toward low-precision 4-bit and 8-bit integers. The problem for AMD now is that AI tasks are approximating their way to the same values almost as quickly, and if they aren't careful they risk being overtaken, so AMD has made it clear they are bringing that back to CDNA. It probably won't be needed in RDNA any time soon, but they are bringing it to the Ryzen 7040 series, so that's going to be neat.
 
AMD themselves. AMD does not do native sub-16-bit precision; it emulates it. That is why their cards are incompatible with OpenAI's stack and numerous other tasks. The upcoming CDNA architecture supposedly corrects this and brings it back, but will still be a fair way behind; leaks have put it at about half the performance of the existing Nvidia lineup. The MI300 bringing back these features is one of its biggest selling points, and it is highly anticipated for that reason. It is going to be AMD's first real shot at Nvidia's AI dominance.

AMD dominates at 64-bit and 32-bit floating point; they crush Nvidia there, and for the longest time that's what mattered for scientific research and simulation tasks. AI goes the opposite direction, toward low-precision 4-bit and 8-bit integers. The problem for AMD now is that AI tasks are approximating their way to the same values almost as quickly, and if they aren't careful they risk being overtaken, so AMD has made it clear they are bringing that back to CDNA. It probably won't be needed in RDNA any time soon, but they are bringing it to the Ryzen 7040 series, so that's going to be neat.
Lots of useful posts; it'll be interesting to see how easily they make it available for game dev. I'm personally all for better compression; it's been needed for a long, long time now. It sounds like with tensor cores it'll reduce build times as well, which is another huge benefit.
 
Also, you don't need a supercomputer farm to make these textures; it's just math, for Gaben's sake. And this is BC1 vs. uncompressed when you use a bit more data. Doesn't look so bad, now does it? Sure, you need to use more data, but it doesn't look noticeably worse.
It's not about making textures. This is a compression/baking algorithm; we are talking about fitting these assets onto cheap hardware for real-time rendering.
Most consoles run AMD hardware, and I doubt they need NTC anyway.
It is not about NTC specifically; they need heavy compression of assets, and the question is which method offers the best compromise between size, quality, and performance. That has not much, if anything, to do with Nvidia or AMD hardware that I know of. Could you explain?


The idea is that you don't need tensor cores, you need specialized hardware. You wouldn't use the CPU to do this; that would be silly. Nvidia is using tensor cores because they're otherwise doing nothing.
Tensor cores are for fast baking; according to the paper they're not used for the real-time rendering, which relies on matrix operations in shaders.


It could and would make sense, but I feel Nvidia's NTC is just a tech demo meant to make Nvidia 8GB card owners less angry that the RTX 4060 and 4050 will also come with 8GB of VRAM, or even less.
By the time this would end up in an actual PC game (think of the time between the Unreal 5 demo and actual games, or the DirectStorage demo and actual games), I'm not sure how relevant those cards would be. This seems like quite the conspiracy mindset; how long do you think this research has been going on?

And this is BC1 vs. uncompressed when you use a bit more data. Doesn't look so bad, now does it?
Hard to say. This is not about looking at a picture of the diffuse map; this is compressing all the elements of a modern texture (diffuse, normal, displacement, etc.) in one go, so you would need to see it in actual 3D rendered action. You could show me a picture of a lossless JPEG that looks the same as the original once decoded... and? It is all about the size and performance; outside that context, what's the point of looking at it?
 
AMD themselves. AMD does not do native sub-16-bit precision; it emulates it. That is why their cards are incompatible with OpenAI's stack and numerous other tasks. The upcoming CDNA architecture supposedly corrects this and brings it back, but will still be a fair way behind; leaks have put it at about half the performance of the existing Nvidia lineup. The MI300 bringing back these features is one of its biggest selling points, and it is highly anticipated for that reason. It is going to be AMD's first real shot at Nvidia's AI dominance.

AMD dominates at 64-bit and 32-bit floating point; they crush Nvidia there, and for the longest time that's what mattered for scientific research and simulation tasks. AI goes the opposite direction, toward low-precision 4-bit and 8-bit integers. The problem for AMD now is that AI tasks are approximating their way to the same values almost as quickly, and if they aren't careful they risk being overtaken, so AMD has made it clear they are bringing that back to CDNA. It probably won't be needed in RDNA any time soon, but they are bringing it to the Ryzen 7040 series, so that's going to be neat.
OK, but at what point does this matter for games? A quick Google doesn't show me any benchmarks for this, nor any games that make relevant use of it. From what I'm reading it's barely any slower, but again, no benchmarks.
It's not about making textures. This is a compression/baking algorithm; we are talking about fitting these assets onto cheap hardware for real-time rendering.
It's 2023, put more VRAM into the cheap hardware.
It is not about NTC specifically; they need heavy compression of assets, and the question is which method offers the best compromise between size, quality, and performance. That has not much, if anything, to do with Nvidia or AMD hardware that I know of. Could you explain?
From what I've read, NTC uses tensor cores, and AMD hardware doesn't have tensor cores.
By the time this would end up in an actual PC game (think of the time between the Unreal 5 demo and actual games, or the DirectStorage demo and actual games), I'm not sure how relevant those cards would be. This seems like quite the conspiracy mindset; how long do you think this research has been going on?
It is a conspiracy, since lately a lot of people have been pointing out how cheap Nvidia has been with VRAM, and now suddenly we have a VRAM solution. This is literally the equivalent of downloading more RAM. If this took Nvidia a long time to develop, then where are the demo and the driver support for it? They just farted it out.
Hard to say. This is not about looking at a picture of the diffuse map; this is compressing all the elements of a modern texture (diffuse, normal, displacement, etc.) in one go, so you would need to see it in actual 3D rendered action. You could show me a picture of a lossless JPEG that looks the same as the original once decoded... and? It is all about the size and performance; outside that context, what's the point of looking at it?
The point is that the obvious solution is to include more VRAM. I think someone just posted that the RTX 4060 will come with 8GB, so Nvidia has no plans to include more. Nvidia isn't showing you how much data it would take for BC7 to match the image they produced with NTC; they're showing an even worse-looking BC7 image than the NTC one. Why are we humoring Nvidia's NTC, which sounds like a more complicated solution than just including more VRAM? You wouldn't even need that much more: 10GB or 12GB at least for low-end cards, and 16GB for mid-range. This is just Nvidia creating planned obsolescence, because they want to make sure that when new games do make use of more VRAM, you either get worse image quality, as with NTC, or you get unplayable performance and therefore must upgrade.
 
OK, but at what point does this matter for games? A quick Google doesn't show me any benchmarks for this, nor any games that make relevant use of it. From what I'm reading it's barely any slower, but again, no benchmarks.
It doesn't matter for games, which is why AMD doesn't include those functions in RDNA, and until about two years ago it didn't matter for most research either, which again is why they chose to remove it from CDNA. It is, however, absolutely critical for machine learning and AI in general, a field AMD wrote off and now has to backtrack on.
But this is needed; it is not just an "add more VRAM" issue, it's an "add more everything" issue: more VRAM, more bandwidth, more memory channels (consumers are now being hampered by only having two channels for system RAM), more storage space, and faster storage. Would I love to see consumer CPUs with four memory channels, a 24GB minimum for VRAM, and a 64GB four-channel standard? Hell yeah, but that would cut into workstation sales for everybody, and they want to protect those, so it's not going to happen. It is 100% an artificial segmentation problem, and Intel, AMD, and Nvidia don't want to disturb the waters there too much.
 
OK, but at what point does this matter for games? A quick Google doesn't show me any benchmarks for this, nor any games that make relevant use of it. From what I'm reading it's barely any slower, but again, no benchmarks.

It's 2023, put more VRAM into the cheap hardware.

From what I've read, NTC uses tensor cores, and AMD hardware doesn't have tensor cores.

It is a conspiracy, since lately a lot of people have been pointing out how cheap Nvidia has been with VRAM, and now suddenly we have a VRAM solution. This is literally the equivalent of downloading more RAM. If this took Nvidia a long time to develop, then where are the demo and the driver support for it? They just farted it out.

The point is that the obvious solution is to include more VRAM. I think someone just posted that the RTX 4060 will come with 8GB, so Nvidia has no plans to include more. Nvidia isn't showing you how much data it would take for BC7 to match the image they produced with NTC; they're showing an even worse-looking BC7 image than the NTC one. Why are we humoring Nvidia's NTC, which sounds like a more complicated solution than just including more VRAM? You wouldn't even need that much more: 10GB or 12GB at least for low-end cards, and 16GB for mid-range. This is just Nvidia creating planned obsolescence, because they want to make sure that when new games do make use of more VRAM, you either get worse image quality, as with NTC, or you get unplayable performance and therefore must upgrade.

I mean, why make anything better or even research an idea when we can just throw more hardware at it?
 
I mean, why make anything better or even research an idea when we can just throw more hardware at it?
The way things have been going, I give it two or three more architectures before we see low-end trash AMD cards with 128GB of VRAM, because all these fanboys keep crowing that 64GB isn't enough.

Stop giving developers a pass on games that look last-gen yet run like shit on the fastest GPU on the planet.

Edit: To clarify, tech like this is cool and can help enhance an already awesome thing. Imagine being able to download an ultra HD texture pack mod and having this tech help it run on more than just professional-grade hardware.
 
Software and hardware development each have to progress; they go hand in hand, and sometimes one has to progress more than the other for a certain thing.
 