• tourist@lemmy.world · 5 months ago

    The participants judged GPT-4 to be human a shocking 54 percent of the time.

    ELIZA, which was pre-programmed with responses and didn’t have an LLM to power it, was judged to be human just 22 percent of the time

    Okay, 22% is ridiculously high for ELIZA. I feel like any half-sober adult could clock it as a bot by the third response, if not immediately.

    Try talking to the thing: https://web.njit.edu/~ronkowit/eliza.html

    I refuse to believe that 22% didn’t misunderstand the task or something.
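
    To be clear about how little is going on under the hood: the whole "program" is a handful of regex rules with canned fill-in-the-blank replies. A minimal sketch of the idea in Python (toy rules I made up for illustration, not the actual DOCTOR script):

    ```python
    import random
    import re

    # Toy ELIZA-style rules: a regex and some canned replies.
    # "{0}" gets filled with the captured text, pronouns flipped.
    RULES = [
        (re.compile(r"i need (.*)", re.I),
         ["Why do you need {0}?", "Would it really help you to get {0}?"]),
        (re.compile(r"i am (.*)", re.I),
         ["How long have you been {0}?", "Why do you think you are {0}?"]),
        (re.compile(r"because (.*)", re.I),
         ["Is that the real reason?"]),
    ]
    FALLBACKS = ["Please tell me more.", "How does that make you feel?"]

    # Crude pronoun reflection so the echoed text reads naturally.
    REFLECT = {"i": "you", "me": "you", "my": "your",
               "you": "I", "your": "my", "am": "are"}

    def reflect(text: str) -> str:
        return " ".join(REFLECT.get(w.lower(), w) for w in text.split())

    def respond(user_input: str) -> str:
        # First matching rule wins; otherwise fall back to a stock prompt.
        for pattern, replies in RULES:
            m = pattern.search(user_input)
            if m:
                return random.choice(replies).format(reflect(m.group(1)))
        return random.choice(FALLBACKS)

    if __name__ == "__main__":
        while True:
            line = input("> ")
            if line.lower() in {"quit", "bye"}:
                break
            print(respond(line))
    ```

    That's the whole trick: match a pattern, flip the pronouns, fill a template. Which is why a couple of probing follow-up questions expose it so fast.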

    • Downcount@lemmy.world · 5 months ago

      Okay, 22% is ridiculously high for ELIZA. I feel like any half-sober adult could clock it as a bot by the third response, if not immediately.

      I did some stuff with Eliza back in the day. One time I set up an Eliza database full of insults and hooked it up to my AIM account.

      It went so well that I had to apologize to a lot of people who thought I was drunk or had gone crazy.

      Eliza wasn’t thaaaaat bad.

    • technocrit@lemmy.dbzer0.com · 5 months ago

      It was a 5-minute test. People probably spent 4 of those minutes typing their questions.

      This is pure pseudoscience.