@SuperDicq
We wanted to do this, too! That's one of the reasons I got a laptop with an NPU (as a writeoff lmao) a few weeks ago.
The problem is that those "NPUs" are absolute fucking memes and can basically only do image classification. There's no good way to run even a 1B model on them.
But like hell yeah I want you to do as much as possible on *your* equipment that you pay for lmfao.
As for privacy, customers requiring HIPAA compliance (or similar) can force all inference to happen on our infra. The eval loop also looks for things that appear to be PHI and halts the request if the account is not enabled for HIPAA compliance.
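Roughly something like this, if you're curious. Super simplified sketch, the names and regexes are made up for illustration and not our actual eval code, which does a lot more than a couple of patterns:

```python
import re

# Naive PHI heuristics for the sketch: SSN-looking numbers and MRN-style
# identifiers. Real PHI detection needs far more (names, DOBs, addresses, ...).
PHI_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # SSN-looking numbers
    re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.I),  # medical record numbers
]

def looks_like_phi(text: str) -> bool:
    return any(p.search(text) for p in PHI_PATTERNS)

def check_request(prompt: str, account_hipaa_enabled: bool) -> None:
    """Halt the request if it appears to contain PHI and the account
    is not enabled for HIPAA compliance."""
    if looks_like_phi(prompt) and not account_hipaa_enabled:
        raise PermissionError(
            "Request appears to contain PHI; account is not HIPAA-enabled."
        )
```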
@Cyrillic