Deep learning is impossible without compute, just as a journey to distant places is impossible without a means of transport. This article compiles the key specifications of the GPU models commonly used for deep learning into one place for quick reference. If you spot an error, a newly released GPU, or a parameter column worth adding, please reach out through the contact information so I can update the data.
Note: only GPUs with at least 16 GB of memory, released in 2017 or later, and fast enough for deep learning workloads (CUDA compute capability >= 7.0) are listed. Empty table cells mean the data is unavailable or still to be filled in. From time to time, entries for the oldest GPUs that have long been discontinued and largely retired from the market will be removed.
(Data current as of May 2025)
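To check your own card against the selection criteria above (compute capability >= 7.0, at least 16 GB of memory), you can query the device at runtime. The following is a minimal sketch, assuming PyTorch is installed with CUDA support; it is provided for convenience and is not part of the compiled data.

```python
import torch

# Minimal sketch: list local CUDA devices and compare them against the
# selection criteria used in this article (CC >= 7.0, memory >= 16 GB).
# Assumes PyTorch with CUDA support is installed.
MIN_CAPABILITY = (7, 0)
MIN_MEMORY_GB = 16

if not torch.cuda.is_available():
    print("No CUDA device detected.")
else:
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        cc = (props.major, props.minor)
        mem_gb = props.total_memory / 1024**3
        ok = cc >= MIN_CAPABILITY and mem_gb >= MIN_MEMORY_GB
        print(f"GPU {i}: {props.name}, compute capability {cc[0]}.{cc[1]}, "
              f"{mem_gb:.1f} GB -> {'meets' if ok else 'below'} the criteria")
```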
Tesla Series
| Series | Model | Memory | CUDA Cores | Compute Capability | FP8/INT8 (F)OPS | FP16/BF16 FLOPS | FP32 FLOPS | FP64 FLOPS | Bus Width | Bandwidth | Network | Max Power | Released | MSRP (¥) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| B | B100 | 96 GB ×2 HBM3e | 16896 ×2 | | 3.5P | 1.8P | 0.9P | 30T | 4096-bit ×2 | 8 TB/s | | 700 W | 2024 | |
| B | B200 | 180 GB HBM3e | | | 4.5P | 2.25P | 1.1P | 40T | 4096-bit ×2 | 8 TB/s | 400 Gbps | 1000 W | 2024 | |
| B | B20 | | | | | | | | | | | | 2025 | |
| H | H100 | 94/80 GB HBM2e/HBM3 | 14592 / 16896 | 9.0 | 3341T / 3958T | 1671T / 1979T | 60T / 67T | 30T / 34T | 5120-bit | 2039 GB/s | | 350 W / 700 W | 2022.03 | 264,000 |
| H | H200 | 141 GB HBM3e | | | 3341T / 3958T | 1671T / 1979T | 60T / 67T | 30T / 34T | | 4.8 TB/s | | 600 W / 700 W | | |
| H | H800 | 80 GB HBM2e/HBM3 | 18432 | | 4P | 2P | 60T | 34T | | 2 TB/s (HBM2e) / 3.9 TB/s (HBM3) | | 350 W / 700 W | 2023.03 | |
| H | H20 | 96 GB HBM3 | | | 296T | 148T | 44T | 1T | | 4.0 TB/s | | | | |
| L | L40 | 48 GB GDDR6 | 18176 | 8.9 | | 362.066T | 90.516T | 1.414T | 384-bit | 864 GB/s | | 300 W | 2022.10 | |
| L | L20 | 48 GB GDDR6 | 10240 | | 239T | 119.5T | 59.8T | NA | 384-bit | 864 GB/s | | | | |
| L | L4 | 24 GB GDDR6 | 7424 | 8.9 | | 121T | 30.3T | 0.49T | 192-bit | 300 GB/s | | 72 W | 2023.03 | |
| L | L2 | | | | 193T | | | | | | | 75 W | | |
| A | A100 | 40/80 GB HBM2 | 6912 | 8.0 | 624T | 312T | 19.5T | 9.7T | 5120-bit | 1555 GB/s | | 400 W | 2020.05 | |
| A | A800 | 40/80 GB HBM2 | 6912 | | 1248T | 312T | 19.5T | 9.7T | | 1.6 TB/s | | 400 W | 2022.11 | 87,000 |
| A | A40 | 48 GB GDDR6 | 10752 | 8.6 | | 149.68T | 37.42T | 1.168T | 384-bit | 695.8 GB/s | | 300 W | 2020.10 | |
| A | A30 | 24 GB HBM2 | 3584 | 8.0 | | 165.12T | 10.32T | 5.161T | 3072-bit | 933.1 GB/s | | 165 W | 2021.04 | |
| A | A10 | 24 GB GDDR6 | 9216 | 8.6 | | 124.96T | 31.24T | 0.976T | 384-bit | 600 GB/s | | 150 W | 2021.04 | |
| A | A16 | 4 × 16 GB GDDR6 | 4 × 1280 | 8.6 | | 4 × 18.432T | 4 × 4.608T | 1.0848T | 4 × 128-bit | 4 × 200 GB/s | | 250 W | 2021.04 | |
| A | A2 | 16 GB GDDR6 | 1280 | 8.6 | | 18.124T | 4.531T | 0.14T | 128-bit | 200 GB/s | | 40–60 W | 2021.11 | |
| T | T4 | 16 GB GDDR6 | 2560 | 7.5 | | 64.8T | 8.1T | | 256-bit | 320 GB/s | | 70 W | 2018.09 | |
| V | V100 | 16/32 GB HBM2 | 5120 | 7.0 | | 119.192T / 112.224T / 105.680T | 14.899T / 14.028T / 13.210T | 7.450T / 7.014T / 6.605T | 4096-bit | 900 GB/s / 829.44 GB/s | | 250 W | 2017.05 | |

Where a cell lists two or three values, they correspond to different variants of the same model (e.g. PCIe vs. SXM form factors).
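When choosing a card from the table above for a given model, a back-of-the-envelope memory estimate is often enough. The sketch below uses the common rules of thumb of roughly 2 bytes per parameter to load a model in FP16/BF16 and roughly 16 bytes per parameter for mixed-precision training with Adam (weights, gradients, master weights, and optimizer states, excluding activations); these multipliers are general heuristics, not figures taken from the table.

```python
# Rough sketch (not from the tables above): estimate GPU memory needs so they
# can be compared against the Memory column. The 16-bytes-per-parameter figure
# is the usual rule of thumb for mixed-precision training with Adam and
# excludes activations, which depend on batch size and sequence length.
def inference_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Weights only, e.g. FP16/BF16 inference (2 bytes per parameter)."""
    return params_billion * 1e9 * bytes_per_param / 1024**3

def training_memory_gb(params_billion: float, bytes_per_param: int = 16) -> float:
    """Weights + gradients + Adam states in mixed precision, activations excluded."""
    return params_billion * 1e9 * bytes_per_param / 1024**3

for size in (7, 13, 70):
    print(f"{size}B params: ~{inference_memory_gb(size):.0f} GB to load in FP16, "
          f"~{training_memory_gb(size):.0f} GB+ to train with Adam")
```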
GeForce Series
| Series | Model | Memory | CUDA Cores | Compute Capability | FP8/INT8 TOPS | FP16/BF16 TFLOPS | FP32 TFLOPS | FP64 TFLOPS | Bus Width | Bandwidth | Max Power | Released | MSRP (¥) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 50 | RTX 5090 | 32 GB GDDR7 | 21760 | 12.0 | | | | | 512-bit | | 575 W | | $1999 |
| 50 | RTX 5090 D | 32 GB GDDR7 | 21760 | 12.0 | | | | | 512-bit | 1792 GB/s | 575 W | | 16499 |
| 50 | RTX 5080 | 16 GB GDDR7 | 10752 | 12.0 | | | | | 256-bit | 960 GB/s | 360 W | | 8299 |
| 40 | RTX 4090 | 24 GB GDDR6X | 16384 | 8.9 | | 82.58 | 82.58 | 1.290 | 384-bit | 1008 GB/s | 450 W | 2022.10.12 | 12999 |
| 40 | RTX 4090D | 24 GB | 14592 | 8.9 | | 73.54 | 73.54 | 1.149 | 384-bit | 1008 GB/s | 425 W | 2023.12.28 | 11999 |
| 40 | RTX 4080 Super | 16 GB | 10240 | 8.9 | | 51.3 | 51.3 | 0.802 | 256-bit | 736 GB/s | 320 W | 2024.01.31 | 8099 |
| 40 | RTX 4080 | 16 GB | 9728 | 8.9 | | 48.74 | 48.74 | 0.762 | 256-bit | 716.8 GB/s | 320 W | 2022.11.16 | 9499 |
| 40 | RTX 4070 Ti Super | 16 GB | 8448 | 8.9 | | 44.1 | 44.1 | 0.689 | 256-bit | 672 GB/s | 285 W | 2024.01.24 | 6499 |
| 30 | RTX 3090 Ti | 24 GB | 10752 | 8.6 | | 33.54 / 39.99 | 33.54 / 39.99 | 0.524 / 0.625 | 384-bit | 1008 GB/s | 450 W | 2022.03.29 | 14999 |
| 30 | RTX 3090 | 24 GB | 10496 | 8.6 | | 29.28 / 35.58 | 29.28 / 35.58 | 0.459 / 0.558 | 384-bit | 935.8 GB/s | 350 W | 2020.09.02 | 11999 |

For the 30-series rows, the two values in each throughput cell are the base-clock and boost-clock figures.
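The FP32 figures in the table above follow from CUDA core count × clock × 2, since each core can retire one fused multiply-add (two floating-point operations) per cycle. As a quick sanity check, the sketch below reproduces the RTX 4090 entry; the ~2.52 GHz boost clock is an assumed value for illustration, not a column in the table.

```python
# Theoretical peak FP32 throughput: cores * clock * 2 (one FMA = 2 FLOPs/cycle).
# The boost clock is an assumed illustrative value; the core count and the
# ~82.58 TFLOPS result match the RTX 4090 row in the table above.
def peak_fp32_tflops(cuda_cores: int, boost_clock_ghz: float) -> float:
    return cuda_cores * boost_clock_ghz * 2 / 1000

print(peak_fp32_tflops(16384, 2.52))  # ~82.6 TFLOPS, cf. 82.58 in the table
```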
Quadro Series
| Model | Memory | CUDA Cores | Compute Capability | FP8/INT8 TOPS | FP16/BF16 TFLOPS | FP32 TFLOPS | FP64 TFLOPS | Bus Width | Bandwidth | Max Power | Released | MSRP (¥) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RTX 6000 | 48 GB GDDR6 | 18176 | 8.9 | | | 91.1 | | 384-bit | 960 GB/s | 300 W | | |
| RTX 5000 | 32 GB GDDR6 | | | | | | | | | 250 W | | |
| RTX 4500 | 24 GB GDDR6 | | | | | | | | | 210 W | | |
| RTX 4000 | 20 GB GDDR6 | | | | | | | | | 130 W | | |
| RTX 4000 SFF | 20 GB GDDR6 | | | | | | | | | 70 W | | |
| RTX 2000 | 16 GB GDDR6 | | | | | | | | | 70 W | | |
| RTX A6000 | 48 GB GDDR6 | | 8.6 | | | | | | | 300 W | | |
| RTX A5000 | 24 GB GDDR6 | | 8.6 | | | | | | | 230 W | | |
| RTX A4500 | 20 GB GDDR6 | | | | | | | | | 200 W | | |
| RTX A4000 | 16 GB GDDR6 | | 8.6 | | | | | | | 140 W | | |
| Quadro RTX 8000 | 48 GB GDDR6 | | 7.5 | | | | | | | | | |
| Quadro RTX 6000 | 24 GB GDDR6 | | 7.5 | | | | | | | | | |
| Quadro RTX 5000 | 16 GB GDDR6 | | 7.5 | | | | | | | | | |
| Quadro GV100 | 32 GB HBM2 | | 7.0 | | | | | | | | | |
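The bandwidth columns in these tables list theoretical peak memory bandwidth; sustained bandwidth in practice is lower. A rough way to see what your card actually delivers is to time a large device-to-device copy, as in the sketch below (PyTorch assumed; a copy reads and writes every byte, so effective traffic is twice the tensor size).

```python
import torch

# Rough device-to-device copy benchmark to compare against the theoretical
# bandwidth columns above. A copy reads and writes each byte, so the traffic
# is 2 * tensor size. Results will land below the datasheet peak.
def measure_copy_bandwidth_gbps(size_mb: int = 1024, iters: int = 20) -> float:
    src = torch.empty(size_mb * 1024 * 1024, dtype=torch.uint8, device="cuda")
    dst = torch.empty_like(src)
    dst.copy_(src)  # warm-up
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        dst.copy_(src)
    end.record()
    torch.cuda.synchronize()
    seconds = start.elapsed_time(end) / 1000  # elapsed_time() returns milliseconds
    return 2 * src.numel() * iters / seconds / 1e9

if torch.cuda.is_available():
    print(f"Sustained copy bandwidth: ~{measure_copy_bandwidth_gbps():.0f} GB/s")
```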
References
- https://www.nvidia.cn/geforce/graphics-cards/40-series/
- https://www.nvidia.com/en-us/geforce/graphics-cards/40-series/rtx-4090/
- https://www.nvidia.cn/geforce/graphics-cards/30-series/
- https://www.nvidia.cn/geforce/graphics-cards/compare/
- https://detail.zol.com.cn/1208/1207097/param.shtml
- https://developer.nvidia.com/cuda-gpus
- https://zh.wikipedia.org/wiki/NVIDIA_Tesla
- https://www.bilibili.com/read/cv33922816/
- https://zh.wikipedia.org/wiki/NVIDIA_GeForce_40%E7%B3%BB%E5%88%97
- https://zh.wikipedia.org/wiki/NVIDIA_GeForce_30%E7%B3%BB%E5%88%97
- https://ai.oldpan.me/t/topic/287
- https://en.wikipedia.org/wiki/GeForce_40_series
- https://www.nvidia.com/en-us/design-visualization/rtx-6000/
- https://www.nvidia.com/en-us/design-visualization/desktop-graphics/
- https://resources.nvidia.com/en-us-design-viz-stories-ep/l40-linecard?lx=CCKW39&&search=professional%20graphics
- https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/productspage/quadro/quadro-desktop/quadro-volta-gv100-data-sheet-us-nvidia-704619-r3-web.pdf
- https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/quadro-product-literature/quadro-rtx-5000-data-sheet-us-nvidia-704120-r4-web.pdf
- https://www.nvidia.cn/design-visualization/rtx-5000/
- https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/quadro-product-literature/quadro-rtx-6000-us-nvidia-704093-r4-web.pdf
- https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/quadro-product-literature/quadro-rtx-8000-us-nvidia-946977-r1-web.pdf
- https://viperatech.com/shop/nvidia-hgx-h20/
- https://www.nvidia.com/en-us/data-center/h200/
- https://www.nvidia.com/en-us/data-center/h100/
- https://www.nvidia.cn/geforce/graphics-cards/50-series/rtx-5090-d/
- https://www.nvidia.cn/geforce/graphics-cards/50-series/rtx-5080/
- https://www.nvidia.com/en-us/geforce/graphics-cards/50-series/rtx-5090/
- https://www.nvidia.com/en-us/data-center/hgx/
- https://blog.csdn.net/Ai17316391579/article/details/132627201
- https://www.jb51.net/hardware/cpu/956950.html
- https://resources.nvidia.com/en-us-blackwell-architecture/datasheet
- https://www.techpowerup.com/gpu-specs/b100.c4275
- https://www.nvidia.com/en-us/data-center/a100/
- https://resources.nvidia.com/en-us-dgx-systems/dgx-b200-datasheet
- https://viperatech.com/shop/nvidia-b100-blackwell-ai-gpu/
- https://viperatech.com/shop/nvidia-dgx-b200/
- https://www.anandtech.com/show/21310/nvidia-blackwell-architecture-and-b200b100-accelerators-announced-going-bigger-with-smaller-data