Comments on: GPU Memory Types – Performance Comparison
https://www.microway.com/hpc-tech-tips/gpu-memory-types-performance-comparison/

By: Eliot Eshelman (Tue, 02 Sep 2014 14:41:59 +0000), in reply to zheng
https://www.microway.com/hpc-tech-tips/gpu-memory-types-performance-comparison/#comment-29

Zheng,

When possible, you’ll get the best performance by calling libraries that have already been tuned. Hopefully, you can use a library such as cuSPARSE. There’s a good list of libraries here:
https://developer.nvidia.com/gpu-accelerated-libraries

I see that there are also papers from NVIDIA and other researchers on methods for sparse matrices. You may be able to re-use some of their methods:
http://www.nvidia.com/object/nvidia_research_pub_001.html
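
To make the difficulty in the question concrete, here is a minimal reference sketch of CSR SpMV (y = A·x) in plain Python. It is only an illustration, not how cuSPARSE or any published method implements it; the function name `spmv_csr` and the small example matrix are hypothetical. The thing to notice is the indirect read `x[col_idx[j]]`: which entries of x a row needs depends on the matrix data, which is why there is no obvious fixed tile of x for a thread block to stage in shared memory.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (reference sketch)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Nonzeros of this row occupy positions row_ptr[row] .. row_ptr[row + 1] - 1.
        for j in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect read: the x entry needed is decided by the column index data.
            acc += values[j] * x[col_idx[j]]
        y[row] = acc
    return y


# Hypothetical 3x3 example matrix:
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)
```

In a simple GPU mapping, each iteration of the outer loop would become one thread (one row per thread), so threads in the same block may read widely scattered entries of x.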

By: zheng (Sun, 31 Aug 2014 13:14:02 +0000)
https://www.microway.com/hpc-tech-tips/gpu-memory-types-performance-comparison/#comment-28

Thanks for your posts; they are really helpful. As you said, shared memory offers very good performance. In practice, though, for something like SpMV on a matrix in compressed sparse row (CSR) format, I cannot find a good way to cache the vector x in shared memory, because shared memory can only be accessed by threads within the same block. Any good ideas? Thanks.
