The AMD Vega chips might be the last big Radeon GPUs they ever make | PCGamesN


AMD Vega 10 GPU

The fancy new AMD Vega graphics cards were officially unveiled at a tech day surrounding SIGGRAPH in LA. Despite the fanfare, I've got a feeling this is going to be the company’s last hurrah for the big, monolithic GPU design... because the AMD Navi architecture promises to be something entirely different.


With the launch of the RX Vega consumer GPUs creeping ever closer, AMD have been looking to manage the somewhat over-enthusiastic expectations that have dogged their first high-end graphics card release in years. 

However, back around the launch of the Polaris architecture last year, at a dinner with AMD’s Richard Huddy, he told me it was actually the AMD Navi architecture that was going to be genuinely exciting. Stupidly, I’d thought he actually meant the upcoming Vega architecture, but, looking back on it now, I was the one drinking, not him… Maybe those Vega expectations were being subtly managed even back then. 

When AMD did start talking about the Vega architecture, it was about its high-performance gaming chops. We assumed then that they were targeting the very top of the GPU tech tree. That meant pointing their silicon guns directly at Nvidia’s Pascal architecture and the GTX 1080 Ti, the most powerful consumer GPU around right now. 

But to expect a design team - which hasn’t released a genuinely competitive high-end graphics card for a good few years - to top the performance charts with a brand new architecture was maybe a little ambitious.

AMD’s best GPU efforts now look like they’re only really going to perform at around the same level as what is effectively Nvidia’s second-tier graphics card. A second-tier graphics card that was released some 14 months back. 

AMD Radeon RX Vega

So, how do AMD make the subsequent Navi GPU architecture the big graphical game-changer they say it’s going to be? The answer potentially lies in the successes AMD have had on the CPU side of the business, and with the gloriously titled Infinity Fabric interconnect. This connection is what joins the two quad-core modules of the Ryzen CPUs together and allows them to ostensibly act as a single eight-core chip.

They’re already using the Infinity Fabric to some extent inside the Vega GPU architecture, but it’s the potential for using it to connect multiple slices of graphics silicon inside one GPU package that could be what makes the Navi architecture such a big deal.

AMD Infinity Fabric

AMD’s graphics guru, Raja Koduri, has already gone on record saying that it forms the basis of all their future integrated circuits.

"Infinity Fabric allows us to join different engines together on a die much easier than before," Koduri explains. "As well it enables some really low latency and high-bandwidth interconnects. This is important to tie our different IPs (and partner IPs) together efficiently and quickly. It forms the basis of all of our future ASIC designs.

"We haven't mentioned any multi GPU designs on a single ASIC, like Epyc, but the capability is possible with Infinity Fabric."

In short, if you want to jam a whole bunch of discrete GPU cores into a single package then AMD’s Infinity Fabric is the perfect interconnect for such a job. If the AMD Navi architecture really is going to be a game-changer then this could be the perfect way for them to achieve that feat.

Infinity Fabric beyond the SoC

It also means AMD could get around the difficulty their production partners have had in nailing down subsequent lithography shrinks to help them deliver generation-on-generation performance boosts. AMD’s chief technical officer, Mark Papermaster, has said that making the shift down to 7nm, the next node on from the current 14nm designs, is “the toughest lift I’ve seen in a number of generations.” 

Going beyond the 7nm mark is going to be even harder still. If we end up in a situation where AMD are hamstrung by another stutter in the shift to ever-smaller production lithographies, as happened with the move from 28nm to 14nm, the monolithic GPU design would be an albatross around their necks again. But a multi-chip design, using the Infinity Fabric to connect multiple lower-spec slices of GPU silicon in one package, could deliver the generational performance uplift they’d be looking for with Navi and beyond.

AMD GPU roadmap

And if they can make such a multi-GPU design practically invisible to the operating system, as seems to be more-or-less the case with their ‘glued-together’ Ryzen and Threadripper processors, then that could take care of the traditional multi-GPU problems CrossFire and SLI have endured. 

On the CPU side, the Infinity Fabric interconnect is capable of delivering near-perfect scaling. If it can do the same with GPU silicon then the potential for the AMD Navi architecture is clear. If multiple graphics chips are able to work together in the same manner, Nvidia really ought to be worried, and not just in the gaming world either. Such an efficiently scaling GPU architecture would do wonders in the professional and artificial intelligence spaces too.
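As a back-of-the-envelope illustration of what "near-perfect scaling" means, the sketch below models a multi-die package where each extra die loses a small fraction of its work to cross-die communication. The per-die throughput and overhead figures are made-up assumptions for illustration, not AMD specifications:

```python
# Illustrative only: toy model of multi-die scaling efficiency.
# per_die throughput and link_overhead are arbitrary assumed values.

def effective_throughput(dies, per_die=1.0, link_overhead=0.02):
    """Throughput of an MCM package where every additional die loses a
    small fraction of its work to cross-die communication."""
    return dies * per_die * (1.0 - link_overhead) ** (dies - 1)

def scaling_efficiency(dies, **kw):
    """Achieved throughput as a fraction of perfect linear scaling."""
    return effective_throughput(dies, **kw) / dies

for n in (1, 2, 4):
    print(n, round(scaling_efficiency(n), 3))
```

With a 2% assumed cross-die overhead, a four-die package still retains roughly 94% of ideal linear scaling, which is the kind of figure "near-perfect" implies; a chunkier overhead would erode that quickly.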

AMD have also almost entirely backed off from talking about CrossFire with their latest Vega GPUs, which would seem to feed into this future strategy too. The new cards are CrossFire-capable, and Asus are rumoured to be pulling some more Ares shenanigans with a bespoke multi-GPU card, but AMD themselves have barely mentioned a single word about it during any of their public briefings. 

Infinity Fabric perfect scaling

That’s a marked change from their Polaris patter, where CrossFire was touted as the method for bridging the performance delta between them and Nvidia. It was only when pressed during a roundtable discussion, GamersNexus report, that AMD said the cards would, technically speaking, support CrossFire, but that the industry was largely moving away from multi-GPU configurations. 

To me, this all points to a future where AMD are turning their backs on the traditional monolithic GPU design, instead utilising their impressively lithe Infinity Fabric interconnect to make a swarm of little GPU cores work together for the greater good. And that could make the next generation of AMD graphics chips entirely different beasts compared to their Nvidia rivals, and could well put them a long way ahead. 

So, what do you think? Am I living in a fantasy world or is AMD’s next generation of graphics processors going to be some sort of Epyc, multi-GPU, monster chip going to Infinity and beyond?

hishnash
6 Months ago

Creating an MCM-style GPU that is transparent to the OS will be much harder than it was on the CPU side. The reason is that operating systems have long understood the concept of running a task on the best core for it (based on the memory and the other tasks that task needs).

In the GPU world, however, there is only a very weak concept of memory ownership and even less of a concept of communication between threads. Without this metadata it would be very hard for the GPU to place tasks on cores that are local to each other.

To hamper this even more, unlike on a CPU, when a GPU task is waiting for some data it will typically block that core until it can continue. On a CPU, even during the same cycle, those spare execution slots will be used up by SMT, and on the next cycle another task will be set to run until the first is ready.

Thus, with most current GPU compute tasks, if you were just to run them without any changes on an MCM system, you would have lots of cores blocked waiting to talk to memory attached to other dies, and you would get really poor performance.

However, AMD is working to improve this: HBCC brings CPU-style memory ownership and management to the GPU. And there are parts of the Linux ROCm kernel driver (which will soon be mainlined) that bring real threading, so that the GPU can be aware of parent kernels and of memory shared between kernels.

However, given that this needs a patch to the Linux kernel and needs code to be recompiled, I would not expect it to be as transparent as you say.
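The locality problem described above can be sketched with a toy cost model: a task placed on a die other than the one holding its data pays a cross-die penalty. All the numbers and die assignments here are arbitrary illustrations, not real hardware figures:

```python
# Illustrative only: toy cost model of task placement on a multi-die GPU.
# LOCAL_COST and REMOTE_COST are arbitrary relative latencies.
LOCAL_COST, REMOTE_COST = 1, 5

def total_cost(task_die, data_die):
    """Sum the access cost of each task, given which die it runs on
    (task_die) and which die holds its working set (data_die)."""
    return sum(LOCAL_COST if t == d else REMOTE_COST
               for t, d in zip(task_die, data_die))

data_die = [0, 0, 1, 1]  # die holding each task's working set
naive    = [0, 1, 0, 1]  # scheduler ignorant of data placement
aware    = [0, 0, 1, 1]  # scheduler aware of data placement

print(total_cost(naive, data_die), total_cost(aware, data_die))
```

Even in this four-task toy, placement-blind scheduling triples the total access cost, which is the gap hishnash argues a transparent MCM GPU would have to close without the metadata the hardware currently lacks.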

Dave James
6 Months ago

Yeah, HBCC could very well end up being one of those classic AMD 'fine wine' features that gets better with age. Having a large, high-speed memory system able to be used wherever it's needed ought to be a huge boon for an MCM system.

Also, graphics workloads are already incredibly parallel, and GPU silicon super modular, so multiple discrete chips in the same package, connected via the Infinity Fabric, should be capable of being accessed as one large mass of techie graphical goodness.

Well, so long as the schedulers do their jobs, I guess.

Firerod
6 Months ago

Well, it makes sense: if you could interconnect four RX 580 dies on one PCB operating as a single GPU, it would blow anything out of the water with that near-perfect scaling, unlike CrossFire. It would also mean they could connect two Vega dies together, which would be much better than the onboard CrossFire that the Pro Duo basically is.

hfm
1 Month ago

All of this doesn't fix the gap in performance per watt between Vega and Pascal. It's a rather stark contrast at the moment.

Kiranmhatrejust4girls
1 Month ago

If we think about it, it may not be a contrast at all. Vega, as we know it right now, seems to be running at higher speeds than it really should. Simply put, Vega was hyped as AMD's long-awaited high-end card, supposed to be at least equal to the competition if not better. AMD couldn't make it as big as they hoped, nor could they get the DSBR and the other new features they thought would save face working in time, even after so many delays. What they eventually ended up with was something around GTX 1070 levels. With no other option, they simply overclocked it as hard as possible. I suppose if Vega could architecturally manage a top overclock of 150%, the stock Vega was already sitting at around 120-130%. This seems evident from the fact that Vega can easily be underclocked without losing performance at anywhere near a linear rate. They had to do it just to get somewhere around the GTX 1080. They did the same with Fury.

Pascal, on the other hand, was released with no pressure. Nvidia were so far ahead at the time of release that the overclocking headroom on Pascal, and even on the 980 Ti, was so much greater; they didn't have to push the stock clocks anywhere near the architecture's limits.

I, for one, feel that if Vega 64 had been released as a GTX 1070 Ti competitor, which is perhaps what it should have been, it would be equal to or better than Pascal in performance per watt. Similarly, if Pascal had been pushed into such a scenario, it would also have struggled on the same performance-per-watt metric, just like Vega.

hfm
1 Month ago

Vega 56 is essentially the 1070 competitor, and the perf-per-watt gap is no smaller there. There's a large amount of word salad in there, but the bottom line is that to achieve performance equivalent to anything Nvidia has to offer, they are using much more power. I don't understand where your clock argument is going: perf-per-watt is perf-per-watt. If they could compete with Pascal anywhere on that spectrum, they would be doing it to try to steal market share from that tier of Nvidia's products.

Looking at some AnandTech Vega 56 reviews, it's using more power than a 1080, let alone the Vega 64 that runs neck-and-neck with it. Perhaps driver optimisation will help down the road, but with a gap that large I think we're going to be looking at Navi to hopefully compete, and Nvidia will have a new process node by then as well.
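For what it's worth, the metric being argued over here is trivial to compute. The frame rates and board powers below are placeholder numbers for illustration, not measured review data:

```python
# Illustrative only: how a performance-per-watt comparison works.
# The fps and wattage figures are hypothetical placeholders.

def perf_per_watt(avg_fps, board_power_watts):
    """Average frames per second delivered per watt of board power."""
    return avg_fps / board_power_watts

card_a = perf_per_watt(avg_fps=90, board_power_watts=220)  # hypothetical
card_b = perf_per_watt(avg_fps=88, board_power_watts=150)  # hypothetical

# Even with near-identical frame rates, the card drawing more power
# scores markedly lower on this metric.
print(round(card_a, 3), round(card_b, 3))
```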

DuoBlaze
1 Month ago

I didn’t see one piece of evidence supporting the opinion in the title. Unsurprising from a site which has run Nvidia-sponsored native ads for much of the year.

Dave James
1 Month ago

AMD haven't released any details about the Navi architecture yet, so there isn't any specific evidence of what the GPU designs will be, although Koduri did specifically say Infinity Fabric was being used for all future ASICs. I am essentially speculating that the Infinity Fabric tech could be used to connect lots of small GPUs together to create their next generation of high-end graphics cards, and that this would mark the end of big, monolithic Radeon GPU designs.

This isn't a negative piece either, it's not some biased editorial claiming that Nvidia are the best because they paid us, it's discussing AMD’s impressive interconnect technology and its potential use in future GPUs.

I don't see one piece of evidence supporting your casual accusation of manufacturer bias.