Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint

2026-05-18T17:56 · tech

Article URL: https://modal.com/blog/truly-serverless-gpus
Comments URL: https://news.ycombinator.com/item?id=48183038
Points: 17
# Comments: 0