RTX 40 graphics cards look set to get a big increase in shader count and L2 cache

The fallout from last weekend's cyberattack on Nvidia continues. New information seems to arrive daily as the hackers try to force Nvidia to give in to their demands, including the removal of its LHR mining limits.

The first piece of new information indicates that RTX 40 GPUs will feature significantly more shaders than RTX 30 GPUs. At the top of the range, the AD102 GPU includes up to 18,432 CUDA cores. That's a huge jump from the 10,752 maximum of GA102, which you'll find in the fully unlocked (and currently unavailable) RTX 3090 Ti.

AD103 could be a flagship laptop GPU, as the current GA103 is. It could also end up in cards like a hypothetically named RTX 4070. It has the same 10,752 shaders as GA102, and while that figure is likely for a fully unlocked chip, it could mean a next-generation RTX 4070 Ti delivers performance on par with, if not better than, the RTX 3090.

GPUs lower in the range don't get as large an increase in shaders. The 7,680-core AD104 is set to replace the 6,144-core GA104 used in the RTX 3070 Ti. We can expect it to end up in RTX 4060 Ti and 4070 class cards. Of course, launch is still a long way off, and even if the GPUs themselves are already in production, we know nothing about yields. Nvidia will certainly harvest cut-down dies, as it does now. The clock speeds and shader counts for individual models have most likely not been finalized at this stage.

The increased shader counts are not the only Ada Lovelace leak today. On Twitter, @harukase5719 published a good summary and comparison table. Nvidia intends to significantly increase the L2 cache size of its new GPUs. The increase is so large that the current L2 cache sizes look negligible by comparison.

Starting at the top, AD102 includes up to 96 MB of L2 cache, compared to 6 MB in the current GA102. AD103 has up to 64 MB compared to 4 MB for GA103, and AD104 has 48 MB compared to 4 MB for GA104.
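To put the leaked figures in perspective, here is a minimal Python sketch that works out the relative increases. The numbers are simply those quoted in this article (none are confirmed by Nvidia), and the helper names `scale_factor` and `percent_increase` are illustrative choices of ours, not part of any leak.

```python
# Figures as reported in the leak; none are confirmed by Nvidia.
# Each entry maps "current -> next" to (current value, leaked value).
SHADERS = {
    "GA102 -> AD102": (10752, 18432),
    "GA104 -> AD104": (6144, 7680),
}
L2_CACHE_MB = {
    "GA102 -> AD102": (6, 96),
    "GA103 -> AD103": (4, 64),
    "GA104 -> AD104": (4, 48),
}


def scale_factor(old, new):
    """How many times larger the new value is than the old one."""
    return new / old


def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100


for label, (old, new) in SHADERS.items():
    print(f"{label}: {percent_increase(old, new):.0f}% more shaders")

for label, (old, new) in L2_CACHE_MB.items():
    print(f"{label}: {scale_factor(old, new):.0f}x the L2 cache")
```

By these figures, AD102 would carry roughly 71% more shaders than GA102 and AD104 25% more than GA104, while L2 cache would grow 16x, 16x, and 12x for AD102, AD103, and AD104 respectively.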

So let’s recap about Lovelace and Hopper… @kopite7kimi @xinoassassin1 pic.twitter.com/hioRcvn8fb (March 2, 2022)


These increases drastically change the architecture and, if properly optimized, can provide very impressive performance gains, not unlike what AMD achieved with its similar Infinity Cache. A large cache reduces the need for the significant increase in memory bandwidth that would otherwise be required to get the most out of all these shaders. Bus widths of 192, 256, and 384 bits, combined with a huge L2 cache, should be enough to cover the bandwidth requirements of RTX 40 cards. In theory, anyway. We need to test this for ourselves to know for sure.
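For a rough sense of what those bus widths mean, peak memory bandwidth is the bus width in bytes multiplied by the per-pin data rate. The sketch below applies that formula to 192-, 256-, and 384-bit buses (384-bit being the width GA102 uses today). The leak says nothing about memory speeds, so the 21 Gbps figure here is purely an assumption for illustration, and `memory_bandwidth_gbs` is a hypothetical helper name.

```python
def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: (bus width in bits / 8) * per-pin rate in Gbit/s."""
    return bus_width_bits / 8 * data_rate_gbps


# ASSUMPTION: the leak does not state memory speeds. 21 Gbps is a
# placeholder chosen for illustration, not a reported RTX 40 spec.
ASSUMED_DATA_RATE_GBPS = 21.0

for width in (192, 256, 384):
    bandwidth = memory_bandwidth_gbs(width, ASSUMED_DATA_RATE_GBPS)
    print(f"{width}-bit bus at {ASSUMED_DATA_RATE_GBPS:g} Gbps: {bandwidth:.0f} GB/s")
```

Under that assumed data rate, the three buses would peak at 504, 672, and 1,008 GB/s. Raw bandwidth scales only with bus width and memory speed, which is why a much larger L2 cache is an attractive way to feed far more shaders without widening the bus.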

So, Ada Lovelace cards seem to have a bit of everything: more shader cores (or CUDA cores, to use Nvidia’s naming convention), a lot more L2 cache, and higher clock speeds, thanks in part to TSMC’s 5nm process. Unfortunately, there are rumors of a similarly dramatic increase in power consumption, though we think the rumored 850W is all but impossible for a consumer card. Still, don’t be surprised to see 500W or more. That’s scary enough. Both Nvidia and AMD are getting ready to fight each other for the hearts, minds, and dollars of gamers. With AMD Zen 4 and Intel’s 13th Gen Raptor Lake also on the way, the second half of 2022 looks like a good time to upgrade.
