@obrhoff it is impossible to guarantee low latency with a local model on the average developer's machine, and that's a pretty large model. That may be a critical factor.