Google's open-source diffusion language model generates 256 tokens in parallel and self-corrects, hitting 4x speed on one GPU ...
Google says that DiffusionGemma can generate more than 1,000 tokens per second when running on a single H100, a server-grade ...
DiffusionGemma hits 1,000 tokens per second by ditching word-by-word generation entirely. It just doesn't run on most ...
Stable implementation with almost 1,700 tests and enforced 100% test code coverage. Every single method, statement and conditional branch variant in the entire codebase is tested and required to pass ...