Data management in CUDA Programming for High Bandwidth Memory in GPU Accelerators

Grzegorz Korpała

Institute of Metal Forming, Technische Universität Bergakademie Freiberg, Bernhard v. Cotta Str. 4, 09599 Freiberg, Germany.

DOI:

https://doi.org/10.7494/cmms.2016.3.0580

Abstract:

The number of applications that use GPU-accelerated calculations is growing steadily. Converting software to this type of calculation is complex, but it yields large gains in energy efficiency and performance. This publication presents a method of managing data sets in which the data are stored temporarily in the shared memory of the GPU. Loading values into shared memory via massive data access makes it possible to exploit the full power of High Bandwidth Memory. The method is demonstrated with CUDA code for a cellular automaton application, together with the corresponding indexing of the data in global and shared memory. The measured and the theoretical speed-up are described and shown.
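The staging of data in shared memory described in the abstract can be illustrated with a minimal sketch. It is not the code from the publication. It assumes a two-dimensional Game of Life automaton with periodic boundaries, one byte per cell, and a 16 × 16 thread block; the names `TILE`, `life_step` and `life_step_gpu` are hypothetical, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int TILE = 16;  // assumed block edge length, not taken from the paper

// One Game of Life generation on a w x h grid with periodic boundaries.
__global__ void life_step(const unsigned char* in, unsigned char* out, int w, int h)
{
    // Block tile plus a one-cell halo on every side, held in shared memory.
    __shared__ unsigned char tile[TILE + 2][TILE + 2];

    const int x0 = blockIdx.x * TILE;  // global x of the tile's first cell
    const int y0 = blockIdx.y * TILE;  // global y of the tile's first cell

    // Cooperative load: the TILE*TILE threads stride over all (TILE+2)^2
    // entries, mapping each shared index (lx, ly) to a wrapped global index.
    const int tid = threadIdx.y * TILE + threadIdx.x;
    for (int i = tid; i < (TILE + 2) * (TILE + 2); i += TILE * TILE) {
        const int ly = i / (TILE + 2);
        const int lx = i % (TILE + 2);
        const int gx = (x0 + lx - 1 + w) % w;
        const int gy = (y0 + ly - 1 + h) % h;
        tile[ly][lx] = in[gy * w + gx];
    }
    __syncthreads();  // the whole tile must be loaded before any thread reads it

    const int gx = x0 + threadIdx.x;
    const int gy = y0 + threadIdx.y;
    if (gx >= w || gy >= h) return;  // threads outside the grid only help load

    // Count the eight neighbours from shared memory instead of global memory.
    const int lx = threadIdx.x + 1;
    const int ly = threadIdx.y + 1;
    int n = 0;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            n += tile[ly + dy][lx + dx];
    n -= tile[ly][lx];  // the cell itself is not a neighbour

    const bool alive = tile[ly][lx] != 0;
    out[gy * w + gx] = (n == 3 || (alive && n == 2)) ? 1 : 0;
}

// Host wrapper: copies the grid to the device, runs one step, copies it back.
std::vector<unsigned char> life_step_gpu(const std::vector<unsigned char>& grid,
                                         int w, int h)
{
    const size_t bytes = grid.size();
    unsigned char* d_in = nullptr;
    unsigned char* d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, grid.data(), bytes, cudaMemcpyHostToDevice);

    const dim3 block(TILE, TILE);
    const dim3 blocks((w + TILE - 1) / TILE, (h + TILE - 1) / TILE);
    life_step<<<blocks, block>>>(d_in, d_out, w, h);

    std::vector<unsigned char> result(bytes);
    cudaMemcpy(result.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

Under these assumptions, each block issues (16 + 2)² = 324 global loads for its 256 cells, about 1.27 loads per cell, where a kernel that reads every cell's 3 × 3 neighbourhood directly from global memory would issue 9. This ratio only illustrates why the staging helps; it is not the speed-up reported in the publication.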

Cite as:

Korpała, G. (2016). Data management in CUDA Programming for High Bandwidth Memory in GPU Accelerators. Computer Methods in Materials Science, 16(3), 121–126. https://doi.org/10.7494/cmms.2016.3.0580

Keywords:

GPU computation, shared memory, memory management
