Actually, really liked the Apple Intelligence announcement. It must be a very exciting time at Apple as they layer AI on top of the entire OS. A few of the major themes:

Step 1 Multimodal I/O. Enable text/audio/image/video capability, both read and write. These are the native human APIs, so to speak.
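
A minimal sketch of what those "native human APIs" could look like as types; the names (Modality, Message, MultimodalModel) are illustrative assumptions, not anything Apple has published:

```swift
import Foundation

// Hypothetical sketch: the four modalities modeled as one message type, so the
// same pipeline can both read ("perceive") and write ("generate") any of them.
enum Modality {
    case text(String)
    case audio(Data)   // e.g. a recorded voice query
    case image(Data)   // e.g. a photo or screenshot
    case video(URL)    // streamed rather than held in memory
}

struct Message {
    let parts: [Modality]     // a single turn can mix modalities
    let isUserAuthored: Bool
}

// A model that speaks the "native human APIs" reads and writes such messages.
protocol MultimodalModel {
    func respond(to input: Message) async throws -> Message
}
```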

Step 2 Agentic. Allow all parts of the OS and apps to interoperate via “function calling”; a kernel-process LLM that can schedule and coordinate work across them given user queries.
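
Here is a rough sketch of that function-calling layer, assuming made-up names (Tool, ToolCall, Coordinator) rather than a real Apple API: apps register tools with a central coordinator, and whatever plan the LLM produces gets dispatched to them.

```swift
import Foundation

// A tool an app exposes to the coordinator, e.g. "calendar.createEvent".
struct Tool {
    let name: String
    let run: ([String: String]) async throws -> String
}

// What the model emits for a user query: which tool to call, with what arguments.
struct ToolCall {
    let name: String
    let arguments: [String: String]
}

actor Coordinator {
    private var tools: [String: Tool] = [:]

    func register(_ tool: Tool) { tools[tool.name] = tool }

    // The kernel-process LLM would plan one or more ToolCalls for a query;
    // here we just execute whatever plan it produced, in order.
    func execute(_ plan: [ToolCall]) async throws -> [String] {
        var results: [String] = []
        for call in plan {
            guard let tool = tools[call.name] else { continue }
            results.append(try await tool.run(call.arguments))
        }
        return results
    }
}
```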

Step 3 Frictionless. Fully integrate these features in a highly frictionless, fast, “always on”, and contextual way. No going around copy-pasting information, prompt engineering, etc. Adapt the UI accordingly.

Step 4 Initiative. Don’t just perform a task given a prompt; anticipate the prompt, suggest, initiate.

Step 5 Delegation hierarchy. Move as much intelligence as you can on device (Apple Silicon is very helpful and well suited here), but allow optional dispatch of work to the cloud.
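
A toy sketch of that delegation, where a simple prompt-length check stands in for whatever real capability heuristic decides what stays on device:

```swift
import Foundation

// Anything that can complete a prompt, whether local or remote.
protocol LanguageModel {
    func complete(_ prompt: String) async throws -> String
}

struct Router {
    let onDevice: any LanguageModel
    let cloud: any LanguageModel
    let maxLocalPromptLength = 2_000   // crude stand-in for a real capability check

    func complete(_ prompt: String) async throws -> String {
        if prompt.count <= maxLocalPromptLength {
            return try await onDevice.complete(prompt)   // stays on device
        }
        return try await cloud.complete(prompt)          // optional dispatch to cloud
    }
}
```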

Step 6 Modularity. Allow the OS to access and support an entire and growing ecosystem of LLMs (e.g. ChatGPT announcement).
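
One way to picture the modularity is a registry of interchangeable providers behind a single protocol; again, the protocol and names here are assumptions, not a shipped API:

```swift
import Foundation

// The OS talks to one protocol; providers (the built-in model, ChatGPT, others)
// plug in behind it.
protocol LLMProvider {
    var name: String { get }
    func complete(_ prompt: String) async throws -> String
}

struct ProviderRegistry {
    private(set) var providers: [String: any LLMProvider] = [:]

    mutating func register(_ provider: any LLMProvider) {
        providers[provider.name] = provider
    }

    // The OS (or the user) picks which provider handles a given request.
    func provider(named name: String) -> (any LLMProvider)? {
        providers[name]
    }
}
```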

Step 7 Privacy. <3

We’re quickly heading into a world where you can open up your phone and just say stuff. It talks back and it knows you. And it just works. Super exciting and as a user, quite looking forward to it.

https://x.com/karpathy/status/1800242310116262150?s=46

    • Z4rK@lemmy.world (OP) · 6 months ago

      He sort of invented it, so you have to think he’s commenting on the concept here, not the implementation.

      I have tried a lot of medium and small models, and there is just no good replacement for the larger ones for natural text output. And they won’t run on device.

      Still, fine-tuning smaller models can do wonders, so my guess would be that Apple Intelligence is really 20+ small, fine-tuned models that kick in based on which action you take.
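
      If that guess is right, the routing might look something like the sketch below, with hypothetical action names and a specialist model (or adapter) per action:

      ```swift
      import Foundation

      // Each user action maps to a small fine-tuned specialist rather than one
      // general model. Action names and the Specialist type are illustrative.
      enum UserAction: Hashable {
          case summarizeEmail
          case rewriteTone
          case generateImagePrompt
          case transcribeVoiceNote
      }

      protocol Specialist {
          func run(on input: String) async throws -> String
      }

      struct SpecialistRouter {
          let specialists: [UserAction: any Specialist]

          // Kick in the fine-tuned model matching the action the user just took.
          func handle(_ action: UserAction, input: String) async throws -> String? {
              try await specialists[action]?.run(on: input)
          }
      }
      ```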