CUB – a configurable C++ template library of high-performance CUDA primitives: Each new generation of NVIDIA GPUs brings with it a dramatic increase in compute power, and the pace of development over the past several years has…
This resource was prepared by Microway from data provided by NVIDIA and trusted media sources. All NVIDIA GPUs support general purpose computation (GPGPU), but not all GPUs offer the same performance or support the same features. The…
The NVIDIA Tesla K40 is now the leading Tesla GPU for performance. Here are some important use cases where the Tesla K40 might greatly speed up your GPU-accelerated applications. Pick Tesla K40 for Large Data Sets: GPU memory has always been…
NVIDIA’s latest Tesla accelerator is without a doubt the most powerful GPU available. With almost 3,000 CUDA cores and 12GB GDDR5 memory, it wins in practically every* performance test you’ll see. As with the “Kepler” K20 GPUs,…
This article provides in-depth details of the NVIDIA Tesla K-series GPU accelerators (codenamed “Kepler”). “Kepler” GPUs improve upon the previous-generation “Fermi” architecture. For more information on other Tesla GPU architectures, please refer to: Important changes available in…
The debut of NVIDIA’s Kepler architecture in 2012 marked a significant milestone in the evolution of general-purpose GPU computing. In particular, Kepler GK110 (compute capability 3.5) brought unrivaled compute power and introduced a number of new features…
This post is Topic #3 (post 3) in our series Parallel Code: Maximizing your Performance Potential. Many applications contain algorithms which make use of multi-dimensional arrays (or matrices). For cases where threads need to index the higher…
This post is Topic #3 (post 2) in our series Parallel Code: Maximizing your Performance Potential. In my previous post, I provided an introduction to the various types of memory available for use in a CUDA application.…
NVIDIA’s Tesla K20 GPU is currently the de facto standard for high-performance heterogeneous computing. Based upon the Kepler GK110 architecture, these are the GPUs you want if you’ll be taking advantage of the latest advancements available in…
This post is Topic #3 (part 1) in our series Parallel Code: Maximizing your Performance Potential. CUDA devices have several different memory spaces: global, local, texture, constant, shared, and register memory. Each type of memory on the…