A compiler framework for optimization of affine loop nests for GPGPUs, ICS’08

http://portal.acm.org/citation.cfm?id=1375527.1375562

They characterized the aspects of CUDA that matter most for performance, such as coalescing when accessing global memory and bank conflicts when accessing shared memory. From these they derived the access patterns that give the best performance, and used the polyhedral model to generate parallel code that follows those patterns.
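To make the two effects concrete, here is a minimal sketch in Python (not code from the paper) that models a group of threads where thread t reads word t * stride. The function names are made up for illustration, and the defaults (16 threads per group, 16 banks, 4-byte words, 64-byte segments) are my assumptions about GPUs of that generation. The segment count is a simplified stand-in for the real coalescing rules.

```python
from collections import Counter


def bank_conflict_degree(stride_words, threads=16, banks=16):
    """Largest number of threads that land on the same shared-memory bank.

    Assumes consecutive words map to consecutive banks, and that
    thread t reads word t * stride_words. A result of 1 means no conflict.
    """
    hits = Counter((t * stride_words) % banks for t in range(threads))
    return max(hits.values())


def segments_touched(stride_words, threads=16, word_bytes=4, segment_bytes=64):
    """Number of distinct aligned global-memory segments the group touches.

    Simplified model: fewer segments means better coalescing. The real
    hardware rules are stricter and depend on the GPU generation.
    """
    return len({(t * stride_words * word_bytes) // segment_bytes
                for t in range(threads)})


if __name__ == "__main__":
    for stride in (1, 2, 16, 17):
        print(stride, bank_conflict_degree(stride), segments_touched(stride))
```

Under this model a stride of 1 is the good case for both memories. A stride of 16 sends every thread to the same bank, while a stride of 17 spreads the threads across all banks again, which is why padding a shared-memory array by one column is a common trick.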

Muthu Manikandan