He sort of invented it, so you have to think he’s commenting on the concept here, not the implementation.
I have tried a lot of medium and small models, and there it just no good replacement for the larger ones for natural text output. And they won’t run on device.
Still, fine-tuning smaller models can do wonders, so my guess would be that Apple Intelligence is really 20+ small and fine tuned models that kick in based on which action you take.
That’s why it’s on the OS-level. For example, for text, it seems to work in any text app that uses the standard text input api, which Apple controls.
User activates the “AI overlay” on the OS, not in the app, OS reads selected text from App and sends text suggestions back.
The App is (possibly) unaware that AI has been used / activated, and has not received any user information.
Of course, if you don’t trust the OS, don’t use this. And I’m 100% speculating here based on what we saw for the macOS demo.