AMD Unveils World’s First 7nm Datacenter GPUs -- Powering the Next Era of Artificial Intelligence, Cloud Computing and High Performance Computing (HPC)


©2018 Advanced Micro Devices, Inc.  All rights reserved. AMD, the AMD Arrow logo, Radeon, Instinct and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

_____________________________

1 As of Oct 22, 2018. The results calculated for Radeon Instinct MI60 designed with Vega 7nm FinFET process technology resulted in 29.5 TFLOPS half precision (FP16), 14.8 TFLOPS single precision (FP32) and 7.4 TFLOPS double precision (FP64) peak theoretical floating-point performance. This performance increase is achieved with an improved transistor count of 13.2 billion on a smaller die size of 331.46 mm² than previous Gen MI25 GPU products with the same 300W power envelope.

The results calculated for Radeon Instinct MI50 designed with Vega 7nm FinFET process technology resulted in 26.8 TFLOPS peak half precision (FP16), 13.4 TFLOPS peak single precision (FP32) and 6.7 TFLOPS peak double precision (FP64) floating-point performance. This performance increase is achieved with an improved transistor count of 13.2 billion on a smaller die size of 331.46 mm² than previous Gen MI25 GPU products with the same 300W power envelope.

The results calculated for Radeon Instinct MI25 GPU based on the “Vega10” architecture resulted in 24.6 TFLOPS peak half precision (FP16), 12.3 TFLOPS peak single precision (FP32) and 768 GFLOPS peak double precision (FP64) floating-point performance. This performance is achieved with a transistor count of 12.5 billion on a die size of 494.8 mm² with a 300W power envelope.

AMD TFLOPS calculations for Radeon Instinct MI25, MI50, and MI60 GPUs are conducted with the following equation: take the engine clock from the highest DPM state and multiply it by xx CUs per GPU, then multiply that number by xx stream processors, which exist in each CU. That number is then multiplied by 2 FLOPS per clock for FP32 and 4 FLOPS per clock for FP16. To calculate the FP64 TFLOPS rate, a 1/2 rate of the FP32 figure is used for the Vega 7nm products MI50 and MI60, and a 1/16th rate is used for the “Vega10” architecture-based MI25.

TFLOP calculations for MI50 and MI60 GPUs can be found at https://www.amd.com/en/products/professional-graphics/instinct-mi50 and https://www.amd.com/en/products/professional-graphics/instinct-mi60
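
The equation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not AMD's tooling. The function name and the input values (1.5 GHz engine clock, 64 CUs, 64 stream processors per CU) are assumptions chosen for the example, since the footnote leaves the CU and stream processor counts as "xx".

```python
def peak_tflops(engine_clock_ghz, compute_units, stream_processors_per_cu,
                flops_per_clock):
    """Peak theoretical TFLOPS: clock * CUs * SPs per CU * FLOPS per clock."""
    gflops = (engine_clock_ghz * compute_units
              * stream_processors_per_cu * flops_per_clock)
    return gflops / 1000.0  # GFLOPS -> TFLOPS


# Assumed inputs for illustration only; the footnote does not state them.
CLOCK_GHZ = 1.5      # engine clock at the highest DPM state (assumption)
CUS = 64             # compute units per GPU (assumption)
SPS_PER_CU = 64      # stream processors per CU (assumption)

fp32 = peak_tflops(CLOCK_GHZ, CUS, SPS_PER_CU, 2)   # 2 FLOPS per clock
fp16 = peak_tflops(CLOCK_GHZ, CUS, SPS_PER_CU, 4)   # 4 FLOPS per clock
fp64_half_rate = fp32 / 2        # 1/2 rate, as described for MI50 and MI60
fp64_sixteenth_rate = fp32 / 16  # 1/16th rate, as described for MI25

print(f"FP32: {fp32:.3f} TFLOPS")
print(f"FP16: {fp16:.3f} TFLOPS")
print(f"FP64 at 1/2 rate: {fp64_half_rate:.3f} TFLOPS")
print(f"FP64 at 1/16th rate: {fp64_sixteenth_rate:.3f} TFLOPS")
```

With these assumed inputs the FP32, FP16 and 1/16th-rate FP64 results round to the MI25 figures quoted in this footnote (12.3 TFLOPS, 24.6 TFLOPS and 768 GFLOPS).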

TFLOPS per Watt (peak TFLOPS divided by the 300W power envelope)
        MI25    MI50    MI60
FP16    0.082   0.089   0.098
FP32    0.041   0.045   0.049
FP64    0.003   0.022   0.025
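
The per-watt figures can be reproduced from the peak numbers and the 300W power envelope stated earlier in this footnote. The short Python sketch below shows the division. The dictionary layout and function name are illustrative assumptions. The results are peak TFLOPS per watt; multiply by 1000 for GFLOPS per watt.

```python
BOARD_POWER_W = 300  # power envelope stated for MI25, MI50 and MI60

# Peak theoretical TFLOPS as quoted in this footnote (768 GFLOPS = 0.768 TFLOPS).
PEAK_TFLOPS = {
    "MI25": {"FP16": 24.6, "FP32": 12.3, "FP64": 0.768},
    "MI50": {"FP16": 26.8, "FP32": 13.4, "FP64": 6.7},
    "MI60": {"FP16": 29.5, "FP32": 14.8, "FP64": 7.4},
}


def tflops_per_watt(tflops, watts=BOARD_POWER_W):
    """Peak TFLOPS divided by board power, rounded to three decimals."""
    return round(tflops / watts, 3)


per_watt = {
    gpu: {precision: tflops_per_watt(value) for precision, value in rates.items()}
    for gpu, rates in PEAK_TFLOPS.items()
}

for gpu, rates in per_watt.items():
    print(gpu, rates)
```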

Industry supporting documents / web pages:
http://www.tsmc.com/english/dedicatedFoundry/technology/7nm.htm
https://www.globalfoundries.com/sites/default/files/product-briefs/product-brief-7lp-7nm-finfet-technology.pdf

AMD has not independently tested or verified external/third party results/data and bears no responsibility for any errors or omissions therein.
RIV-2

2 Pending

3 As of October 22, 2018. Radeon Instinct™ MI50 and MI60 “Vega 7nm” technology-based accelerators are PCIe Gen 4.0 capable, providing up to 64 GB/s peak bandwidth per GPU card with PCIe Gen 4.0 x16 certified servers. Peak theoretical transport rate performance guidelines are estimated only and may vary. Previous Gen Radeon Instinct compute GPU cards are based on PCIe Gen 3.0, providing up to 32 GB/s peak theoretical transport rate bandwidth performance.

Peak theoretical transport rate performance is calculated by Baud Rate * width in bytes * # directions = GB/s 
PCIe Gen 3: 8 * 2 * 2 = 32 GB/s
PCIe Gen 4: 16 * 2 * 2 = 64 GB/s
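
As a rough sketch, the same formula can be written in Python. The function and parameter names are illustrative assumptions. The example reads "Baud Rate" as the per-lane transfer rate and takes the 2-byte width to correspond to an x16 link; both readings are assumptions, and encoding overhead is not modeled.

```python
def peak_transport_gb_s(rate, width_bytes, directions=2):
    """Peak theoretical transport rate: rate * width in bytes * directions."""
    return rate * width_bytes * directions


pcie_gen3 = peak_transport_gb_s(rate=8, width_bytes=2)
pcie_gen4 = peak_transport_gb_s(rate=16, width_bytes=2)

print(f"PCIe Gen 3: {pcie_gen3} GB/s")
print(f"PCIe Gen 4: {pcie_gen4} GB/s")
```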

Refer to the server manufacturer's PCIe Gen 4.0 compatibility and performance guidelines for potential peak performance of the specified server models. Server manufacturers may vary configuration offerings, yielding different results.

https://pcisig.com/
https://www.chipestimate.com/PCI-Express-Gen-4-a-Big-Pipe-for-Big-Data/Cadence/Technical-Article/2014/04/15
https://www.tomshardware.com/news/pcie-4.0-power-speed-express,32525.html

AMD has not independently tested or verified external/third party results/data and bears no responsibility for any errors or omissions therein.
RIV-5

4 As of Oct 22, 2018. Radeon Instinct™ MI50 and MI60 “Vega 7nm” technology-based accelerators are PCIe® Gen 4.0* capable, providing up to 64 GB/s peak theoretical transport data bandwidth from CPU to GPU per card with PCIe Gen 4.0 x16 certified servers.
Previous Gen Radeon Instinct compute GPU cards are based on PCIe Gen 3.0, providing up to 32 GB/s peak theoretical transport rate bandwidth performance.

Peak theoretical transport rate performance is calculated by Baud Rate * width in bytes * # directions = GB/s per card
PCIe Gen3: 8 * 2 * 2 = 32 GB/s
PCIe Gen4: 16 * 2 * 2 = 64 GB/s
Vega20 to Vega20 xGMI: 25 * 2 * 2 = 100 GB/s per link; 100 GB/s * 2 links per GPU = 200 GB/s

xGMI (also known as Infinity Fabric Link) vs. PCIe Gen3: 200/32 = 6.25x

Radeon Instinct™ MI50 and MI60 “Vega 7nm” technology-based accelerators include dual Infinity Fabric™ Links providing up to 200 GB/s peak theoretical GPU to GPU or Peer-to-Peer (P2P) transport rate bandwidth performance per GPU card. Combined with PCIe Gen 4 compatibility, this provides an aggregate GPU card I/O peak bandwidth of up to 264 GB/s.

Performance guidelines are estimated only and may vary. Previous Gen Radeon Instinct compute GPU cards provide up to 32 GB/s peak PCIe Gen 3.0 bandwidth performance.

Infinity Fabric™ Link technology peak theoretical transport rate performance is calculated by Baud Rate * width in bytes * # directions * # links = GB/s per card

Infinity Fabric Link: 25 * 2 * 2 = 100 GB/s

MI50 and MI60 each have two links:
100 GB/s * 2 links per GPU = 200 GB/s
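
The Infinity Fabric Link figures in this footnote, together with the 264 GB/s aggregate and the 6.25x comparison, follow from the same arithmetic. The Python sketch below is illustrative only; the function and variable names are assumptions, and the inputs are the numbers quoted in this footnote.

```python
def peak_transport_gb_s(rate, width_bytes, directions=2, links=1):
    """Peak theoretical rate: rate * width in bytes * directions * links."""
    return rate * width_bytes * directions * links


xgmi_per_link = peak_transport_gb_s(rate=25, width_bytes=2)            # one link
xgmi_per_card = peak_transport_gb_s(rate=25, width_bytes=2, links=2)   # two links
pcie_gen3 = peak_transport_gb_s(rate=8, width_bytes=2)
pcie_gen4 = peak_transport_gb_s(rate=16, width_bytes=2)

aggregate_io = xgmi_per_card + pcie_gen4       # Infinity Fabric Links plus PCIe Gen 4
xgmi_vs_pcie_gen3 = xgmi_per_card / pcie_gen3  # ratio quoted in the footnote

print(f"Per link: {xgmi_per_link} GB/s, per card: {xgmi_per_card} GB/s")
print(f"Aggregate card I/O: {aggregate_io} GB/s")
print(f"xGMI vs. PCIe Gen3: {xgmi_vs_pcie_gen3}x")
```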

Refer to the server manufacturer's PCIe Gen 4.0 compatibility and performance guidelines for potential peak performance of the specified server model numbers. Server manufacturers may vary configuration offerings, yielding different results.
https://pcisig.com/
https://www.chipestimate.com/PCI-Express-Gen-4-a-Big-Pipe-for-Big-Data/Cadence/Technical-Article/2014/04/15
https://www.tomshardware.com/news/pcie-4.0-power-speed-express,32525.html

AMD has not independently tested or verified external/third party results/data and bears no responsibility for any errors or omissions therein.
RIV-4

5 Calculated on Oct 22, 2018, the Radeon Instinct MI60 GPU resulted in 7.4 TFLOPS peak theoretical double precision floating-point (FP64) performance. AMD TFLOPS calculations are conducted with the following equation: take the engine clock from the highest DPM state and multiply it by xx CUs per GPU, then multiply that number by xx stream processors, which exist in each CU. That number is then multiplied by 2 FLOPS per clock, with a 1/2 rate applied for FP64. TFLOP calculations for MI60 can be found at https://www.amd.com/en/products/professional-graphics/instinct-mi60. External results for the NVIDIA Tesla V100 (16GB card) GPU accelerator show 7 TFLOPS peak double precision (FP64) floating-point performance. Results found at: https://images.nvidia.com/content/technologies/volta/pdf/437317-Volta-V100-DS-NV-US-WEB.pdf. AMD has not independently tested or verified external/third party results/data and bears no responsibility for any errors or omissions therein.
