GateGPT: 56k tokens per second Transformer (KV cache) on FPGA at 80 MHz - AllTheNews.today

Article URL: https://twitter.com/fguzmanai/status/2065832668172845209
Comments URL: https://news.ycombinator.com/item?id=48557535
Points: 12
Comments: 1