Sam Altman says ChatGPT should be ‘much less lazy now’

ChatGPT users previously complained that the chatbot was slacking off and refusing to complete some tasks.

  • AlmightySnoo 🐢🇮🇱🇺🇦@lemmy.world
    10 months ago

    PSA: give open-source LLMs a try, folks. If you’re on Linux or macOS, ollama makes it incredibly easy to try most of the popular open-source LLMs like Mistral 7B, Mixtral 8x7B, CodeLlama, etc. Obviously it’s faster if you have a CUDA/ROCm-capable GPU, but it still works in CPU mode too (albeit slowly if the model is huge), provided you have enough RAM.

    You can combine that with a UI like ollama-webui or a text-based UI like oterm.
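For anyone who would rather script against ollama than use a UI, here is a minimal sketch in Python. It assumes an ollama server is already running locally on its default port (11434) and that the model has been pulled beforehand; the helper names `build_generate_request` and `generate` are made up for illustration and are not part of ollama.

```python
import json
import urllib.request

# Assumption: a local ollama server is listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str,
                           url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text (hypothetical helper)."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


# Usage (needs a running server and a pulled model, e.g. "mistral"):
#   print(generate("mistral", "Summarize what a mixture-of-experts model is."))
```

Setting `"stream": False` asks for one JSON object instead of a stream of chunks, which keeps the sketch short; a real client would probably stream.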

    • JustUseMint@lemmy.world
      10 months ago

      I spent the better part of a day trying to set up llama.cpp with “wizard vicuna unrestricted” and was unable to, and I’ve got quite a tech background. This was at someone’s suggestion; I’m hoping yours is easier lol.

    • akrot@lemmy.world
      10 months ago

      ROCm? Is that even supported now? Last time I checked it was still a dumpster fire. What are the RAM and VRAM reqs for Mixtral 8x7B?

      • AlmightySnoo 🐢🇮🇱🇺🇦@lemmy.world
        10 months ago

        ROCm is decent right now: I can do deep learning stuff and CUDA programming with it on an AMD APU. However, ollama doesn’t work out of the box with APUs yet, though users seem to say that it works with dedicated AMD GPUs.

        As for Mixtral 8x7B, I couldn’t run it on a system with 32GB of RAM and an RTX 2070S with 8GB of VRAM. I’ll probably try with another system soon. [EDIT: I actually got the default version (mixtral:instruct) running with 32GB of RAM and 8GB of VRAM (RTX 2070S).] That same system also runs CodeLlama-34B fine.
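On the RAM/VRAM question, a rough back-of-the-envelope estimate is parameter count times bits per weight. The sketch below assumes Mixtral 8x7B has about 46.7 billion parameters in total and counts the weights only; context cache and runtime overhead are ignored, and real quantized formats use slightly more than their nominal bit width, so treat the numbers as a lower bound.

```python
def estimate_weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: total parameter count of Mixtral 8x7B (all experts included).
MIXTRAL_PARAMS = 46.7e9

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = estimate_weight_memory_gb(MIXTRAL_PARAMS, bits)
    print(f"{label}: ~{gb:.0f} GB")
```

Under these assumptions a 4-bit quantization needs roughly 23 GB for weights, which would be consistent with it fitting in 32GB of RAM plus 8GB of VRAM but not in the VRAM alone.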

        So far I’m happy with Mistral 7B: it’s extremely fast on my RTX 2070S, and it’s not really slow when running in CPU mode on an AMD Ryzen 7. Its speed is okayish (~1 token/sec) when I try it in CPU mode on an old ThinkPad T480 with an 8th-gen i5 CPU.
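If you want to measure tokens/sec on your own hardware rather than eyeball it, ollama's generate response includes `eval_count` (tokens generated) and `eval_duration` (generation time in nanoseconds). A small sketch, where `tokens_per_second` is a hypothetical helper name and the sample dict is made-up illustrative data, not a real measurement:

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed computed from an ollama generate response."""
    # eval_duration is reported in nanoseconds, so convert to seconds first.
    seconds = response["eval_duration"] / 1e9
    return response["eval_count"] / seconds


# Illustrative input only: 50 tokens generated over 50 seconds.
sample = {"eval_count": 50, "eval_duration": 50_000_000_000}
print(f"{tokens_per_second(sample):.1f} tokens/sec")
```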

        • akrot@lemmy.world
          10 months ago

          I have a Ryzen APU, so I was curious. I tried fiddling with it yesterday and managed to up the “VRAM” to 16GB. But xformers and flash-attention aren’t officially supported on iGPUs, and I couldn’t install anything past PyTorch. It’s a step further for sure, but it still needs lots of work.