Programming with the GPU

CUDA C

Introduction

For everyone who knows me or follows me on Twitter, this blogpost comes as no surprise - I love GPUs. I will try to keep this blogpost free of inane GPU worship and be as objective as I can. Over the last year I've come to appreciate these little devices and the amount of power they can pack. I often used libraries for Computer Vision, Machine Learning and Robotics applications that had the option to speed up computation using GPUs (and, mind you, they did!), and sometimes GPU acceleration was absolutely necessary (deep learning with Caffe, TensorFlow, etc.). I had a vague understanding of which parts of code could benefit from GPU computation, but I've always wanted to actually write some of the lower-level programs that make it happen - which led me to try to teach myself CUDA C.

A (very) brief history of GPU programming:
Rendering graphics required an independent computational system -> GPUs entered the market -> games became more demanding -> GPUs became more sophisticated -> general-purpose programming on GPUs began (by tricking the GPU into treating regular data as graphics data for shaders) -> annoying for programmers because of the additional learning curve (learning shading languages) -> NVIDIA launched CUDA C -> Yay! -> GPUs got even more sophisticated -> rising popularity of general-purpose GPU programming in domains like medical imaging, machine learning and machine perception.
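To make that "Yay!" a little more concrete, here is a minimal sketch (my own illustrative example, not taken from any particular library) of what CUDA C looks like: the GPU code is just a C function marked __global__, and you launch it over many threads with the <<<blocks, threads>>> syntax - no shading language required. The vector addition below is about the simplest thing you can do on a GPU.

    #include <stdio.h>
    #include <cuda_runtime.h>

    // Kernel: each thread adds one pair of elements.
    __global__ void add(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            c[i] = a[i] + b[i];
    }

    int main(void)
    {
        const int n = 1024;
        size_t bytes = n * sizeof(float);

        // Host (CPU) arrays
        float *ha = (float *)malloc(bytes);
        float *hb = (float *)malloc(bytes);
        float *hc = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { ha[i] = i; hb[i] = 2 * i; }

        // Device (GPU) arrays
        float *da, *db, *dc;
        cudaMalloc(&da, bytes);
        cudaMalloc(&db, bytes);
        cudaMalloc(&dc, bytes);

        // Copy inputs to the GPU
        cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

        // Launch enough 256-thread blocks to cover n elements
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        add<<<blocks, threads>>>(da, db, dc, n);

        // Copy the result back and check one value
        cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
        printf("c[42] = %f\n", hc[42]);

        cudaFree(da); cudaFree(db); cudaFree(dc);
        free(ha); free(hb); free(hc);
        return 0;
    }

Compiled with nvcc, this runs the add kernel across 1024 threads in parallel - the kind of thing that, before CUDA, would have meant smuggling your data through the graphics pipeline.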