GPU Kernels

Serena Chan | Carl Munoz

In Julia, GPU usage is already optimized for many common operations through CUDA.jl. Simply by applying a function to a CuArray, operations such as broadcasting and map-reduce are executed with GPU-specialized code. Additionally, more complex tasks, such as the operations inside machine learning algorithms like self-attention, have optimized implementations available through cuDNN.jl.
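As a small illustration (a sketch, not taken from the notebook), the snippet below assumes a CUDA-capable GPU and shows how ordinary array code on a CuArray runs on the device without any hand-written kernel:

```julia
using CUDA

x = CUDA.rand(Float32, 1024)     # array resident on the GPU
y = sin.(x) .+ 2f0 .* x          # broadcast fuses into a single GPU kernel
s = sum(y)                       # map-reduce also executes on the GPU
```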

Under the hood, these operations are implemented as GPU kernels: functions written to exploit the CUDA architecture of the GPU.

Today, we will show how to write such kernels ourselves, for cases where an optimized kernel does not already exist.
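As a preview, here is a minimal hand-written kernel sketch (the kernel name and element-wise operation are illustrative, not the notebook's examples): each thread computes its global index and updates one element of the output array.

```julia
using CUDA

# Each thread handles one element; the bounds check guards the last block.
function add_one_kernel!(b, a)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(a)
        @inbounds b[i] = a[i] + 1f0
    end
    return nothing
end

a = CUDA.rand(Float32, 1_000)
b = similar(a)
threads = 256
blocks = cld(length(a), threads)
@cuda threads=threads blocks=blocks add_one_kernel!(b, a)
```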

Downloadable contents

Notebook: julia | pdf | html

Environment files: manifest | project

Julia scripts: demo sandbox | softmax examples