This section collects some tips for developing CUDA applications. It will evolve over time, and if you see anything wrong, please let me know through the Contact form!
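-Write sequential simulated thread code: run the body of each kernel on the CPU, inside ordinary loops over the block and thread indices. This helps especially if you are translating from a sequential program. Validating the simulated version against the original sequential version reduces the probability of logical errors before anything runs on the GPU.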
-Use nvprof, the command-line profiler that ships with the CUDA Toolkit, to see where the time goes: http://devblogs.nvidia.com/parallelforall/cuda-pro-tip-nvprof-your-handy-universal-gpu-profiler/
-Use the open-source library ArrayFire: instead of writing everything from scratch, you can rely on its ready-made implementations of elementary operations such as reduction.
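To make the first tip concrete, here is a minimal sketch in plain C++ of what sequential simulated thread code could look like. Everything in it is an assumption for illustration: the SAXPY operation, the `SimIdx` struct standing in for CUDA's built-in `blockIdx`/`threadIdx`/`blockDim`, and all function names are hypothetical. The per-thread function is written the way a single GPU thread would execute it, a host loop plays the role of the kernel launch, and the result is compared with the original sequential loop.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for CUDA's built-in index variables (1D only).
struct SimIdx {
    int blockIdx;
    int threadIdx;
    int blockDim;
};

// Work done by ONE simulated thread, written as the kernel body would be.
void saxpy_thread(const SimIdx& t, int n, float a, const float* x, float* y) {
    int i = t.blockIdx * t.blockDim + t.threadIdx;  // global thread index
    if (i < n) {                                    // guard for the last block
        y[i] = a * x[i] + y[i];
    }
}

// Sequential "launch": visits every (block, thread) pair one after another.
void simulate_saxpy(int gridDim, int blockDim, int n, float a,
                    const float* x, float* y) {
    assert(gridDim * blockDim >= n);  // the grid must cover all n elements
    for (int b = 0; b < gridDim; ++b) {
        for (int t = 0; t < blockDim; ++t) {
            saxpy_thread(SimIdx{b, t, blockDim}, n, a, x, y);
        }
    }
}

// The original sequential program that is being translated.
void saxpy_reference(int n, float a, const float* x, float* y) {
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Runs both versions on the same input and reports whether they agree.
bool validate_saxpy(int n, int blockDim) {
    std::vector<float> x(n), y_sim(n), y_ref(n);
    for (int i = 0; i < n; ++i) {
        x[i] = 0.5f * static_cast<float>(i);
        y_sim[i] = y_ref[i] = 1.0f + static_cast<float>(i);
    }
    int gridDim = (n + blockDim - 1) / blockDim;  // round up, as in a real launch
    simulate_saxpy(gridDim, blockDim, n, 2.0f, x.data(), y_sim.data());
    saxpy_reference(n, 2.0f, x.data(), y_ref.data());
    return y_sim == y_ref;
}
```

Calling something like `validate_saxpy(1000, 128)` exercises the index arithmetic and the bounds guard, including a partially filled last block. Keep in mind that this kind of simulation runs the threads one after another, so it can catch indexing and logic mistakes but not race conditions or synchronization bugs between threads.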