Actually, really liked the Apple Intelligence announcement. It must be a very exciting time at Apple as they layer AI on top of the entire OS. A few of the major themes.
Step 1 Multimodal I/O. Enable text/audio/image/video capability, both read and write. These are the native human APIs, so to speak.
Step 2 Agentic. Allow all parts of the OS and apps to inter-operate via “function calling”; kernel process LLM that can schedule and coordinate work across them given user queries.
Step 3 Frictionless. Fully integrate these features in a highly frictionless, fast, “always on”, and contextual way. No going around copy pasting information, prompt engineering, or etc. Adapt the UI accordingly.
Step 4 Initiative. Don’t perform a task given a prompt, anticipate the prompt, suggest, initiate.
Step 5 Delegation hierarchy. Move as much intelligence as you can on device (Apple Silicon very helpful and well-suited), but allow optional dispatch of work to cloud.
Step 6 Modularity. Allow the OS to access and support an entire and growing ecosystem of LLMs (e.g. ChatGPT announcement).
Step 7 Privacy. <3
We’re quickly heading into a world where you can open up your phone and just say stuff. It talks back and it knows you. And it just works. Super exciting and as a user, quite looking forward to it.
He sort of invented it, so you have to think he’s commenting on the concept here, not the implementation.
I have tried a lot of medium and small models, and there it just no good replacement for the larger ones for natural text output. And they won’t run on device.
Still, fine-tuning smaller models can do wonders, so my guess would be that Apple Intelligence is really 20+ small and fine tuned models that kick in based on which action you take.