Comments on: NVIDIA Tesla P100 NVLink 16GB GPU Accelerator (Pascal GP100 SXM2) Up Close
https://www.microway.com/hpc-tech-tips/nvidia-tesla-p100-nvlink-16gb-gpu-accelerator-pascal-gp100-sxm2-close/

By: Eliot Eshelman (Fri, 17 Feb 2017 14:16:51 +0000)
In reply to Plyskeen.

The amount of shared memory per SM is indeed 64 KB on GP100. However, each thread block can use at most 48 KB. That is why deviceQuery reports 49152 bytes: it shows the per-block limit, not the per-SM total.
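If you want to check both figures yourself, here is a minimal sketch using the CUDA runtime API (cudaDeviceGetAttribute with the per-block and per-multiprocessor shared memory attributes); the filename and device index 0 are just placeholders:

// query_shmem.cu -- print the per-block and per-SM shared memory limits.
// Build with: nvcc query_shmem.cu -o query_shmem
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int perBlock = 0, perSM = 0;
    // Per-thread-block limit: the 49152-byte figure deviceQuery prints.
    cudaDeviceGetAttribute(&perBlock, cudaDevAttrMaxSharedMemoryPerBlock, 0);
    // Per-SM total: 65536 bytes (64 KB) on GP100.
    cudaDeviceGetAttribute(&perSM, cudaDevAttrMaxSharedMemoryPerMultiprocessor, 0);
    printf("Shared memory per block: %d bytes\n", perBlock);
    printf("Shared memory per SM:    %d bytes\n", perSM);
    return 0;
}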

We keep a short table of this data here:
https://www.microway.com/knowledge-center-articles/in-depth-comparison-of-nvidia-tesla-pascal-gpu-accelerators/

NVIDIA keeps the full table here:
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications__technical-specifications-per-compute-capability

By: Plyskeen (Fri, 17 Feb 2017 12:13:21 +0000)

Silly question – I was under the impression that the amount of shared memory available per SM was raised to 64 KiB on the GP100, yet the deviceQuery output you list here says "Total amount of shared memory per block: 49152 bytes" (so 48 KiB, the same as Kepler, if I remember correctly). Which one is correct? Thanks!
