
Ryan Whitwam / Ars Technica:
Google releases Multi-Token Prediction drafters for its Gemma 4 models, which use a form of speculative decoding to guess future tokens for faster inference — Google launched its Gemma 4 open models this spring, promising a new level of power and performance for local AI.

Source: TechMeme
Source Link: https://www.techmeme.com/260506/p38#a260506p38