"Speculative decoding is an optimization technique for inference that makes educated guesses about future tokens while generating the current token"TIL Language models can use speculative execution.