The Graphics Processor That Can Do It All: The NVIDIA Ampere Generation Has Officially Begun • HWzone


The chipmaker's online event met and exceeded expectations - get to know the GA100 core, the A100 accelerator, and the monstrous DGX-A100 system you get when eight of them are combined

Jensen Huang promised us the world's largest video card - and he delivered. The new GA100 core, the first in the Ampere architecture and the company's first built on a 7 nm manufacturing process, is admittedly only slightly larger than the gigantic GV100 core that held the record until now - but with 2.57 times as many transistors, it is a very significant development, designed to accelerate every relevant and recognized processing technique in use today, without exception.

The GA100 core is a whopping 826 square millimeters in size, and on the A100 accelerators it comes with 5 stacks of HBM2 memory (out of 6 possible on the core) totaling 40GB, with an effective bandwidth of 1.6TB per second thanks to the 5,120-bit interface - along with 6,912 active CUDA processing units at a maximum operating frequency of 1,410MHz, 432 new-generation Tensor cores, and 40MB of L2 cache. All of these work together within a huge 400-watt power envelope - for standard-precision floating point (FP32) throughput of 19.5TFLOPS (or 19.5 trillion calculations per second), an improvement of more than 24 percent over the previous peak set by the GV100 core.
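The FP32 figure can be reproduced from the numbers above with simple arithmetic. The sketch below assumes that each CUDA unit completes one fused multiply-add per clock, counted as two floating point operations, which is the usual convention behind peak-throughput figures. The function and variable names are illustrative, and the 15.7TFLOPS baseline for the GV100-based V100 is NVIDIA's published figure, not a number taken from this article.

```python
# Back-of-the-envelope check of the A100 figures quoted above.
# Assumption: one fused multiply-add (2 FLOPs) per CUDA unit per clock.

def peak_tflops(cuda_units: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak throughput in TFLOPS."""
    return cuda_units * flops_per_clock * clock_mhz * 1e6 / 1e12

a100_fp32 = peak_tflops(6912, 1410)

# Assumption: 15.7 TFLOPS is the published FP32 peak of the V100 (GV100 core).
V100_FP32_TFLOPS = 15.7
uplift_percent = (a100_fp32 / V100_FP32_TFLOPS - 1) * 100

# Per-pin data rate implied by 1.6TB/s over a 5,120-bit interface.
implied_gbit_per_pin = 1.6e12 * 8 / 5120 / 1e9

print(f"Peak FP32: {a100_fp32:.2f} TFLOPS")
print(f"Uplift over V100: {uplift_percent:.1f}%")
print(f"Implied per-pin rate: {implied_gbit_per_pin:.2f} Gbit/s")
```

The result rounds to the quoted 19.5TFLOPS, and the uplift lands just above 24 percent, consistent with the claim in the text.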

The journey into the world of 7 nm lithography starts now

On paper, this is a smaller jump than the one we saw in the transition from the Pascal generation to the Volta generation - but the advantage of the A100 is that it supports a much larger number of processing modes, all with performance unprecedented in the field: double-precision floating point (FP64) calculations at 9.7TFLOPS, 8-bit integer calculations (INT8, a feature not supported at all in the Volta cores) at 624TOPS, 16-bit floating point (FP16) tensor calculations at 312TFLOPS, and 32-bit floating point tensor calculations (TF32 - a new development introduced in the current generation) at 156TFLOPS. The chip developer claims to have created, for the first time, a single processing product capable of rendering all other products irrelevant and unnecessary, thereby significantly streamlining powerful computer systems around the world in both architecture and power consumption.
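To put the different processing modes side by side, the short sketch below expresses each quoted figure as a multiple of the 19.5TFLOPS FP32 rate. It takes the article's numbers at face value as peak rates in tera-operations per second; the dictionary keys are just labels chosen for this illustration.

```python
# Peak rates as quoted in the article, in tera-operations per second.
FP32_BASELINE = 19.5
modes = {"FP64": 9.7, "TF32 tensor": 156, "FP16 tensor": 312, "INT8": 624}

# Ratio of each mode's peak rate to the standard FP32 rate.
ratios = {name: value / FP32_BASELINE for name, value in modes.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the FP32 rate")
```

Seen this way, each step down in precision roughly doubles the peak rate, while FP64 runs at about half the FP32 rate.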

NVIDIA is already boasting wins in a number of powerful supercomputer tenders based on Ampere accelerators

The first commercial product of the Ampere era, the DGX-A100, combines eight SXM A100 cards with enhanced NVLink 3 interfaces offering an effective bandwidth of 600GB per second. The recommended price is $199,000, and the system includes a pair of EPYC processors, a Linux operating system, one terabyte of dynamic memory, 17 terabytes of NVMe storage, and power consumption of up to 6,500 watts, with each such system promising to deliver up to 5PFLOPS of FP16 acceleration and up to 10POPS for machine learning.
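The system-level throughput can be cross-checked against the per-card figures quoted earlier. The sketch below multiplies them by eight cards and then applies a factor of two for structured sparsity. That factor is an assumption that is not stated in this article; it is included because the dense per-card numbers alone only reach about half of the headline system figures. The constant names are illustrative.

```python
# Rough cross-check of the DGX-A100 headline numbers from per-card figures.
CARDS = 8
FP16_TENSOR_TFLOPS = 312  # per card, as quoted in the article
INT8_TOPS = 624           # per card, as quoted in the article
SPARSITY_FACTOR = 2       # assumption: structured sparsity doubles the peak rate
PRICE_USD = 199_000
MAX_POWER_WATTS = 6_500

dense_fp16_pflops = CARDS * FP16_TENSOR_TFLOPS / 1000
sparse_fp16_pflops = dense_fp16_pflops * SPARSITY_FACTOR
sparse_int8_pops = CARDS * INT8_TOPS * SPARSITY_FACTOR / 1000

# Share of system price and power per card (includes CPUs, memory, storage).
price_per_card_usd = PRICE_USD / CARDS
watts_per_card_share = MAX_POWER_WATTS / CARDS

print(f"FP16 dense: {dense_fp16_pflops:.2f} PFLOPS, with sparsity: {sparse_fp16_pflops:.2f} PFLOPS")
print(f"INT8 with sparsity: {sparse_int8_pops:.2f} POPS")
print(f"Price share per card: ${price_per_card_usd:,.0f}")
print(f"Power share per card: {watts_per_card_share:.1f} W")
```

Under these assumptions the totals come out just under 5 peta-operations per second for FP16 and just under 10 for INT8.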

The full GA100 core actually includes 128 processing clusters with 8,192 CUDA units in total, of which 108 processing clusters are active in the A100 - and it is not certain that we will ever see an even more advanced product in which all of these hardware units are accessible and active
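The relationship between the full core and the A100 configuration is easy to verify. The sketch below assumes the CUDA units are spread evenly across the processing clusters; the variable names are illustrative.

```python
# Full GA100 core versus the configuration enabled on the A100.
FULL_CLUSTERS = 128
FULL_CUDA_UNITS = 8192
ACTIVE_CLUSTERS = 108

# Assumption: CUDA units are distributed evenly across clusters.
units_per_cluster = FULL_CUDA_UNITS // FULL_CLUSTERS
active_units = ACTIVE_CLUSTERS * units_per_cluster
enabled_fraction = ACTIVE_CLUSTERS / FULL_CLUSTERS

print(f"CUDA units per cluster: {units_per_cluster}")
print(f"Active CUDA units on the A100: {active_units}")
print(f"Share of the full core enabled: {enabled_fraction:.1%}")
```

The active-unit count matches the 6,912 CUDA units quoted for the A100, which leaves roughly one sixth of the full core disabled.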

Will Intel or AMD have a direct answer to this technological assault, or is a new era of flourishing big business under way? We'll find out soon - though we can already admit that the architecture is everything we expected and hoped for, and maybe a little more.

Is $199,000 a bargain price for peak performance in a standard 6U server configuration?
