CUDA
5 mentions across all digests
CUDA is a parallel computing platform and programming model developed by NVIDIA for GPU-accelerated computing. It is central to NVIDIA's 20-year accelerated computing strategy and is used to write production-quality GPU kernels for AI workloads.
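For readers new to the platform, a minimal sketch of a CUDA program is shown below; the vector-add kernel, sizes, and launch configuration are a generic illustration, not code from any of the articles listed here. The programmer writes a `__global__` function that each GPU thread runs on one element, then launches it over a grid of thread blocks.

```cuda
// Minimal illustrative CUDA example: element-wise vector addition.
// Names and sizes are arbitrary; this is not taken from the linked articles.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void vec_add(const float* a, const float* b, float* c, int n) {
    // Each thread computes one output element; the grid covers the array.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    // Managed (unified) memory keeps the sketch short; production code often
    // uses explicit cudaMalloc/cudaMemcpy instead.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vec_add<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  // expect 3.0

    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```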
Custom Kernels for All from Codex and Claude
Taking on CUDA with ROCm: 'One Step After Another'
ROCm's patient, incremental gains illustrate the long-term competitive effort needed to erode NVIDIA's CUDA dominance in GPU computing; ecosystem lock-in rarely yields to overnight alternatives.
Characterizing WebGPU Dispatch Overhead for LLM Inference Across Four GPU Vendors, Three Backends, and Three Browsers
WebGPU dispatch overhead (24–71 μs) is the real bottleneck for LLM inference in browsers, not compute; torch-webgpu provides a PyTorch backend, and the measurements show that prior benchmarks overestimated dispatch costs by ~20×.
An Interview with Nvidia CEO Jensen Huang About Accelerated Computing
We Got Claude to Build CUDA Kernels and teach open models!