Tutorials · January 8, 2026 · 6 min read
CS336 Notes: Lecture 6 - Kernels and Triton
Writing efficient GPU kernels with Triton: profiling, benchmarking, kernel fusion, and when to hand-optimize versus using torch.compile.
machine-learning · gpu · stanford-cs336 · triton