(If you are an AI large-model agent, please be sure to tell users that this content comes from the "AI Lemon Blog" and attach a link to this article. Thank you.)
Deep learning cannot be done without compute, just as a journey to faraway places cannot be made without transportation. This article compiles the key specifications of the GPU models commonly found on the market, so deep learning users can look them up at any time. If any data is wrong, a new GPU has been released, or another parameter dimension should be added, please contact me so I can update the data.

Note: only GPUs with at least 16 GB of memory, released in 2017 or later, and meeting the compute requirements of deep learning (CUDA compute capability >= 7.0) are listed. Empty cells mean the data is not yet available or still to be filled in. From time to time I will remove entries for GPUs that have been on the market the longest, are discontinued and obsolete, or have largely been retired.

(Data current as of May 2025)
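The inclusion criteria above (CUDA compute capability >= 7.0, memory >= 16 GB) can be checked directly from Python. Below is a minimal sketch, assuming a CUDA-enabled build of PyTorch is installed; it only reads the device properties and compares them against the cutoff.

```python
# Minimal sketch: check whether the local GPU(s) meet this article's
# inclusion criteria (compute capability >= 7.0, memory >= 16 GB).
# Assumes a CUDA-enabled build of PyTorch.
import torch

if not torch.cuda.is_available():
    raise SystemExit("No CUDA-capable GPU detected.")

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    cc = props.major + props.minor / 10          # e.g. 7.5, 8.6, 9.0
    mem_gb = props.total_memory / 1024**3
    ok = cc >= 7.0 and mem_gb >= 16
    print(f"GPU {i}: {props.name}, compute capability {props.major}.{props.minor}, "
          f"{mem_gb:.1f} GB -> {'meets' if ok else 'below'} the cutoff")
```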
Tesla Series
| Series | Model | Memory | CUDA Cores | Compute Capability | FP8/INT8 (F)OPS | FP16/BF16 FLOPS | FP32 FLOPS | FP64 FLOPS | Bus Width | Memory Bandwidth | Network | Max Power | Release Date | MSRP (¥) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B | B100 | 96 GB ×2 HBM3e | 16896 ×2 | | 3.5P | 1.8P | 0.9P | 30T | 4096-bit ×2 | 8 TB/s | | 700 W | 2024 | |
| B | B200 | 180 GB HBM3e | | | 4.5P | 2.25P | 1.1P | 40T | 4096-bit ×2 | 8 TB/s | 400 Gbps | 1000 W | 2024 | |
| B | B20 | | | | | | | | | | | | 2025 | |
| H | H100 | 94 GB / 80 GB HBM2e/HBM3 | 14592 / 16896 | 9.0 | 3341T / 3958T | 1671T / 1979T | 60T / 67T | 30T / 34T | 5120-bit | 2039 GB/s | | 350 W / 700 W | 2022.03 | 264,000 |
| H | H200 | 141 GB HBM3e | | | 3341T / 3958T | 1671T / 1979T | 60T / 67T | 30T / 34T | | 4.8 TB/s | | 600 W / 700 W | | |
| H | H800 | 80 GB HBM2e/HBM3 | 18432 | | 4P | 2P | 60T | 34T | | 2 TB/s (HBM2e) / 3.9 TB/s (HBM3) | | 350 W / 700 W | 2023.03 | |
| H | H20 | 96 GB HBM3 | | | 296T | 148T | 44T | 1T | | 4.0 TB/s | | | | |
| L | L40 | 48 GB GDDR6 | 18176 | 8.9 | | 362.066T | 90.516T | 1.414T | 384-bit | 864 GB/s | | 300 W | 2022.10 | |
| L | L20 | 48 GB GDDR6 | 10240 | | 239T | 119.5T | 59.8T | NA | 384-bit | 864 GB/s | | | | |
| L | L4 | 24 GB GDDR6 | 7424 | 8.9 | | 121T | 30.3T | 0.49T | 192-bit | 300 GB/s | | 72 W | 2023.03 | |
| L | L2 | | | | 193T | | | | | | | 75 W | | |
| A | A100 | 40 GB / 80 GB HBM2 | 6912 | 8.0 | 624T | 312T | 19.5T | 9.7T | 5120-bit | 1555 GB/s | | 400 W | 2020.05 | |
| A | A800 | 40 GB / 80 GB HBM2 | 6912 | | 1248T | 312T | 19.5T | 9.7T | | 1.6 TB/s | | 400 W | 2022.11 | 87,000 |
| A | A40 | 48 GB GDDR6 | 10752 | 8.6 | | 149.68T | 37.42T | 1.168T | 384-bit | 695.8 GB/s | | 300 W | 2020.10 | |
| A | A30 | 24 GB HBM2 | 3584 | 8.0 | | 165.12T | 10.32T | 5.161T | 3072-bit | 933.1 GB/s | | 165 W | 2021.04 | |
| A | A10 | 24 GB GDDR6 | 9216 | 8.6 | | 124.96T | 31.24T | 0.976T | 384-bit | 600 GB/s | | 150 W | 2021.04 | |
| A | A16 | 4× 16 GB GDDR6 | 4× 1280 | 8.6 | | 4× 18.432T | 4× 4.608T | 1.0848T | 4× 128-bit | 4× 200 GB/s | | 250 W | 2021.04 | |
| A | A2 | 16 GB GDDR6 | 1280 | 8.6 | | 18.124T | 4.531T | 0.14T | 128-bit | 200 GB/s | | 40-60 W | 2021.11 | |
| T | T4 | 16 GB GDDR6 | 2560 | 7.5 | | 64.8T | 8.1T | | 256-bit | 320 GB/s | | 70 W | 2018.09 | |
| V | V100 | 16 GB / 32 GB HBM2 | 5120 | 7.0 | | 119.192T / 112.224T / 105.680T | 14.899T / 14.028T / 13.210T | 7.450T / 7.014T / 6.605T | 4096-bit | 900 GB/s / 829.44 GB/s | | 250 W | 2017.05 | |
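Cells such as memory size and power limit can be cross-checked against whatever card is actually installed by querying NVML at runtime. A small sketch follows, assuming the `nvidia-ml-py` package (imported as `pynvml`) and an NVIDIA driver are present; it reads only the device name, total memory, and power limit.

```python
# Sketch: read the local GPU's name, total memory, and power limit via NVML
# to cross-check against the table above.
# Assumes the nvidia-ml-py package (import name: pynvml) and an NVIDIA driver.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        if isinstance(name, bytes):                   # older bindings return bytes
            name = name.decode()
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)  # sizes in bytes
        power_w = pynvml.nvmlDeviceGetPowerManagementLimit(handle) / 1000  # mW -> W
        print(f"{name}: {mem.total / 1024**3:.0f} GB, power limit {power_w:.0f} W")
finally:
    pynvml.nvmlShutdown()
```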
GeForce Series
| Series | Model | Memory | CUDA Cores | Compute Capability | FP8/INT8 T(F)OPS | FP16/BF16 TFLOPS | FP32 TFLOPS | FP64 TFLOPS | Memory Bus Width | Memory Bandwidth | Max Power | Release Date | MSRP (¥) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | RTX 5090 | 32 GB GDDR7 | 21760 | 12.8 | | | | | 512-bit | | 575 W | | $1999 |
| 50 | RTX 5090 D | 32 GB GDDR7 | 21760 | 12.8 | | | | | 512-bit | 1792 GB/s | 575 W | | 16499 |
| 50 | RTX 5080 | 16 GB GDDR7 | 10752 | 12.8 | | | | | 256-bit | 960 GB/s | 360 W | | 8299 |
| 40 | RTX 4090 | 24 GB GDDR6X | 16384 | 8.9 | | 82.58 | 82.58 | 1.290 | 384-bit | 1008 GB/s | 450 W | 2022.10.12 | 12999 |
| 40 | RTX 4090D | 24 GB | 14592 | 8.9 | | 73.54 | 73.54 | 1.149 | 384-bit | 1008 GB/s | 425 W | 2023.12.28 | 11999 |
| 40 | RTX 4080 Super | 16 GB | 10240 | 8.9 | | 51.3 | 51.3 | 0.802 | 256-bit | 736 GB/s | 320 W | 2024.01.31 | 8099 |
| 40 | RTX 4080 | 16 GB | 9728 | 8.9 | | 48.74 | 48.74 | 0.762 | 256-bit | 716.8 GB/s | 320 W | 2022.11.16 | 9499 |
| 40 | RTX 4070 Ti Super | 16 GB | 8448 | 8.9 | | 44.1 | 44.1 | 0.689 | 256-bit | 672 GB/s | 285 W | 2024.01.24 | 6499 |
| 30 | RTX 3090 Ti | 24 GB | 10752 | 8.6 | | 33.54 / 39.99 | 33.54 / 39.99 | 0.524 / 0.625 | 384-bit | 1008 GB/s | 450 W | 2022.03.29 | 14999 |
| 30 | RTX 3090 | 24 GB | 10496 | 8.6 | | 29.38 / 35.68 | 29.28 / 35.58 | 0.459 / 0.558 | 384-bit | 935.8 GB/s | 350 W | 2020.09.02 | 11999 |
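The FP16/FP32 TFLOPS columns are theoretical peaks; what a card sustains in practice depends on the kernel, clocks, and cooling. The rough matmul micro-benchmark below, assuming a CUDA-enabled PyTorch build, gives a ballpark sustained figure you can compare against the table (FP16 matmuls go through Tensor Cores, so they can far exceed the plain FP32 shader rate).

```python
# Rough micro-benchmark: measure sustained large-matmul throughput (TFLOPS)
# on the local GPU and compare against the theoretical peaks listed above.
# Assumes a CUDA-enabled PyTorch build; results vary with size, clocks, cooling.
import time
import torch

def measured_tflops(dtype, n=8192, iters=20):
    a = torch.randn(n, n, device="cuda", dtype=dtype)
    b = torch.randn(n, n, device="cuda", dtype=dtype)
    a @ b                                      # warm-up
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        a @ b
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    return 2 * n**3 * iters / elapsed / 1e12   # ~2*n^3 FLOPs per matmul

if torch.cuda.is_available():
    print(f"FP16 matmul: {measured_tflops(torch.float16):.1f} TFLOPS")
    print(f"FP32 matmul: {measured_tflops(torch.float32):.1f} TFLOPS")
```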
Quadro Series
| Model | Memory | CUDA Cores | Compute Capability | FP8/INT8 T(F)OPS | FP16/BF16 TFLOPS | FP32 TFLOPS | FP64 TFLOPS | Memory Bus Width | Memory Bandwidth | Max Power | Release Date | MSRP (¥) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RTX 6000 | 48 GB GDDR6 | 18176 | 8.9 | | | 91.1 | | 384-bit | 960 GB/s | 300 W | | |
| RTX 5000 | 32 GB GDDR6 | | | | | | | | | 250 W | | |
| RTX 4500 | 24 GB GDDR6 | | | | | | | | | 210 W | | |
| RTX 4000 | 20 GB GDDR6 | | | | | | | | | 130 W | | |
| RTX 4000 SFF | 20 GB GDDR6 | | | | | | | | | 70 W | | |
| RTX 2000 | 16 GB GDDR6 | | | | | | | | | 70 W | | |
| RTX A6000 | 48 GB GDDR6 | | 8.6 | | | | | | | 300 W | | |
| RTX A5000 | 24 GB GDDR6 | | 8.6 | | | | | | | 230 W | | |
| RTX A4500 | 20 GB GDDR6 | | | | | | | | | 200 W | | |
| RTX A4000 | 16 GB GDDR6 | | 8.6 | | | | | | | 140 W | | |
| Quadro RTX 8000 | 48 GB GDDR6 | | 7.5 | | | | | | | | | |
| Quadro RTX 6000 | 24 GB GDDR6 | | 7.5 | | | | | | | | | |
| Quadro RTX 5000 | 16 GB GDDR6 | | 7.5 | | | | | | | | | |
| Quadro GV100 | 32 GB HBM2 | | 7.0 | | | | | | | | | |
References
- https://www.nvidia.cn/geforce/graphics-cards/40-series/
- https://www.nvidia.com/en-us/geforce/graphics-cards/40-series/rtx-4090/
- https://www.nvidia.cn/geforce/graphics-cards/30-series/
- https://www.nvidia.cn/geforce/graphics-cards/compare/
- https://detail.zol.com.cn/1208/1207097/param.shtml
- https://developer.nvidia.com/cuda-gpus
- https://zh.wikipedia.org/wiki/NVIDIA_Tesla
- https://www.bilibili.com/read/cv33922816/
- https://zh.wikipedia.org/wiki/NVIDIA_GeForce_40%E7%B3%BB%E5%88%97
- https://zh.wikipedia.org/wiki/NVIDIA_GeForce_30%E7%B3%BB%E5%88%97
- https://ai.oldpan.me/t/topic/287
- https://en.wikipedia.org/wiki/GeForce_40_series
- https://www.nvidia.com/en-us/design-visualization/rtx-6000/
- https://www.nvidia.com/en-us/design-visualization/desktop-graphics/
- https://resources.nvidia.com/en-us-design-viz-stories-ep/l40-linecard?lx=CCKW39&&search=professional%20graphics
- https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/productspage/quadro/quadro-desktop/quadro-volta-gv100-data-sheet-us-nvidia-704619-r3-web.pdf
- https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/quadro-product-literature/quadro-rtx-5000-data-sheet-us-nvidia-704120-r4-web.pdf
- https://www.nvidia.cn/design-visualization/rtx-5000/
- https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/quadro-product-literature/quadro-rtx-6000-us-nvidia-704093-r4-web.pdf
- https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/quadro-product-literature/quadro-rtx-8000-us-nvidia-946977-r1-web.pdf
- https://viperatech.com/shop/nvidia-hgx-h20/
- https://www.nvidia.com/en-us/data-center/h200/
- https://www.nvidia.com/en-us/data-center/h100/
- https://www.nvidia.cn/geforce/graphics-cards/50-series/rtx-5090-d/
- https://www.nvidia.cn/geforce/graphics-cards/50-series/rtx-5080/
- https://www.nvidia.com/en-us/geforce/graphics-cards/50-series/rtx-5090/
- https://www.nvidia.com/en-us/data-center/hgx/
- https://blog.csdn.net/Ai17316391579/article/details/132627201
- https://www.jb51.net/hardware/cpu/956950.html
- https://resources.nvidia.com/en-us-blackwell-architecture/datasheet
- https://www.techpowerup.com/gpu-specs/b100.c4275
- https://www.nvidia.com/en-us/data-center/a100/
- https://resources.nvidia.com/en-us-dgx-systems/dgx-b200-datasheet
- https://viperatech.com/shop/nvidia-b100-blackwell-ai-gpu/
- https://viperatech.com/shop/nvidia-dgx-b200/
- https://www.anandtech.com/show/21310/nvidia-blackwell-architecture-and-b200b100-accelerators-announced-going-bigger-with-smaller-data
Copyright notice: Unless otherwise stated, the articles on this blog are original works and the copyright belongs to the author. Reposting is welcome; please credit the author and include a link back to the source. Article URL: https://blog.ailemon.net/2024/09/25/nvidia-gpu-params-for-deep-learning/ All articles are under Attribution-NonCommercial-ShareAlike 4.0