One of the most important improvements in the Compute Unified Device Architecture (CUDA) 6.5 release, launched in August 2014, is support for Unified Memory, a feature that simplifies memory management by providing a single pool of managed memory shared between the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU); the system automatically migrates data allocated in Unified Memory between host memory and device memory. In this paper, we analyze this new CUDA feature, its advantages and limitations, by developing two versions of an application that computes the scalar product: one using the Unified Memory concept and the other using the classical approach based on the cudaMalloc function (for memory allocation) and cudaMemcpy (for transferring data between the host and the device).
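The two memory-management approaches compared in the paper can be sketched as follows. This is an illustrative sketch, not the paper's actual code: the names (`dotKernel`, `dotClassic`, `dotUnified`) and the sizes `N` and `BLOCK` are assumptions. The classical version allocates separate host and device buffers and copies data explicitly with cudaMemcpy; the Unified Memory version uses cudaMallocManaged, so both CPU and GPU access the same pointers and the runtime migrates the data automatically.

```cuda
// Sketch: scalar (dot) product with classical vs. Unified Memory management.
// All identifiers and sizes here are illustrative, not from the paper.
#include <cstdio>
#include <cuda_runtime.h>

#define N 1024
#define BLOCK 256

// Each block reduces a[i]*b[i] into one partial sum via shared memory.
__global__ void dotKernel(const float *a, const float *b, float *partial) {
    __shared__ float cache[BLOCK];
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    cache[threadIdx.x] = (tid < N) ? a[tid] * b[tid] : 0.0f;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {  // tree reduction
        if (threadIdx.x < s) cache[threadIdx.x] += cache[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) partial[blockIdx.x] = cache[0];
}

// Classical approach: explicit cudaMalloc + cudaMemcpy in both directions.
float dotClassic(const float *ha, const float *hb) {
    const int blocks = (N + BLOCK - 1) / BLOCK;
    float *da, *db, *dp, hpartial[(N + BLOCK - 1) / BLOCK];
    cudaMalloc(&da, N * sizeof(float));
    cudaMalloc(&db, N * sizeof(float));
    cudaMalloc(&dp, blocks * sizeof(float));
    cudaMemcpy(da, ha, N * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, N * sizeof(float), cudaMemcpyHostToDevice);
    dotKernel<<<blocks, BLOCK>>>(da, db, dp);
    cudaMemcpy(hpartial, dp, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    float sum = 0.0f;
    for (int i = 0; i < blocks; ++i) sum += hpartial[i];
    cudaFree(da); cudaFree(db); cudaFree(dp);
    return sum;
}

// Unified Memory approach: one managed allocation per array, visible to both
// CPU and GPU; the runtime migrates pages, so no cudaMemcpy calls are needed.
float dotUnified() {
    const int blocks = (N + BLOCK - 1) / BLOCK;
    float *a, *b, *partial;
    cudaMallocManaged(&a, N * sizeof(float));
    cudaMallocManaged(&b, N * sizeof(float));
    cudaMallocManaged(&partial, blocks * sizeof(float));
    for (int i = 0; i < N; ++i) { a[i] = 1.0f; b[i] = 2.0f; }  // CPU writes directly
    dotKernel<<<blocks, BLOCK>>>(a, b, partial);
    cudaDeviceSynchronize();  // required before the CPU reads managed data again
    float sum = 0.0f;
    for (int i = 0; i < blocks; ++i) sum += partial[i];
    cudaFree(a); cudaFree(b); cudaFree(partial);
    return sum;
}

int main() {
    printf("unified dot product: %f\n", dotUnified());
    return 0;
}
```

Note the one behavioral difference the sketch highlights: with managed memory the host must synchronize (here via `cudaDeviceSynchronize`) before touching data the kernel wrote, whereas the blocking `cudaMemcpy` in the classical version provides that synchronization implicitly.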