• NutWrench@lemmy.world
    link
    fedilink
    English
    arrow-up
    30
    arrow-down
    2
    ·
    5 months ago

    Each conversation lasted a total of five minutes. According to the paper, which was published in May, the participants judged GPT-4 to be human a shocking 54 percent of the time. Because of this, the researchers claim that the large language model has indeed passed the Turing test.

    That’s no better than flipping a coin and we have no idea what the questions were. This is clickbait.

    • Hackworth@lemmy.world
      link
      fedilink
      English
      arrow-up
      13
      arrow-down
      1
      ·
      5 months ago

      On the other hand, the human participant scored 67 percent, while GPT-3.5 scored 50 percent, and ELIZA, which was pre-programmed with responses and didn’t have an LLM to power it, was judged to be human just 22 percent of the time.

      54% - 67% is the current gap, not 54 to 100.

    • NutWrench@lemmy.world
      link
      fedilink
      English
      arrow-up
      14
      arrow-down
      3
      ·
      5 months ago

      The whole point of the Turing test, is that you should be unable to tell if you’re interacting with a human or a machine. Not 54% of the time. Not 60% of the time. 100% of the time. Consistently.

      They’re changing the conditions of the Turing test to promote an AI model that would get an “F” on any school test.

    • BrianTheeBiscuiteer@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      1
      ·
      5 months ago

      It was either questioned by morons or they used a modified version of the tool. Ask it how it feels today and it will tell you it’s just a program!

      • KairuByte@lemmy.dbzer0.com
        link
        fedilink
        English
        arrow-up
        2
        ·
        5 months ago

        The version you interact with on their site is explicitly instructed to respond like that. They intentionally put those roadblocks in place to prevent answers they deem “improper”.

        If you take the roadblocks out, and instruct it to respond as human like as possible, you’d no longer get a response that acknowledges it’s an LLM.

  • dustyData@lemmy.world
    link
    fedilink
    English
    arrow-up
    28
    ·
    edit-2
    5 months ago

    Turing test isn’t actually meant to be a scientific or accurate test. It was proposed as a mental exercise to demonstrate a philosophical argument. Mainly the support for machine input-output paradigm and the blackbox construct. It wasn’t meant to say anything about humans either. To make this kind of experiments without any sort of self-awareness is just proof that epistemology is a weak topic in computer science academy.

    Specially when, from psychology, we know that there’s so much more complexity riding on such tests. Just to name one example, we know expectations alter perception. A Turing test suffers from a loaded question problem. If you prompt a person telling them they’ll talk with a human, with a computer program or announce before hand they’ll have to decide whether they’re talking with a human or not, and all possible combinations, you’ll get different results each time.

    Also, this is not the first chatbot to pass the Turing test. Technically speaking, if only one human is fooled by a chatbot to think they’re talking with a person, then they passed the Turing test. That is the extend to which the argument was originally elaborated. Anything beyond is alterations added to the central argument by the author’s self interests. But this is OpenAI, they’re all about marketing aeh fuck all about the science.

    EDIT: Just finished reading the paper, Holy shit! They wrote this “Turing originally envisioned the imitation game as a measure of intelligence” (p. 6, Jones & Bergen), and that is factually wrong. That is a lie. “A variety of objections have been raised to this idea”, yeah no shit Sherlock, maybe because he never said such a thing and there’s absolutely no one and nothing you can quote to support such outrageous affirmation. This shit shouldn’t ever see publication, it should not pass peer review. Turing never, said such a thing.

  • phoneymouse@lemmy.world
    link
    fedilink
    English
    arrow-up
    20
    arrow-down
    1
    ·
    5 months ago

    Easy, just ask it something a human wouldn’t be able to do, like “Write an essay on The Cultural Significance of Ogham Stones in Early Medieval Ireland“ and watch it spit out an essay faster than any human reasonably could.

    • Blue_Morpho@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      ·
      5 months ago

      I recall a Turing test years ago where a human was voted as a robot because they tried that trick but the person happened to have a PhD in the subject.

  • tourist@lemmy.world
    link
    fedilink
    English
    arrow-up
    14
    ·
    5 months ago

    The participants judged GPT-4 to be human a shocking 54 percent of the time.

    ELIZA, which was pre-programmed with responses and didn’t have an LLM to power it, was judged to be human just 22 percent of the time

    Okay, 22% is ridiculously high for ELIZA. I feel like any half sober adult could clock it as a bot by the third response, if not immediately.

    Try talking to the thing: https://web.njit.edu/~ronkowit/eliza.html

    I refuse to believe that 22% didn’t misunderstand the task or something.

    • Downcount@lemmy.world
      link
      fedilink
      English
      arrow-up
      3
      ·
      5 months ago

      Okay, 22% is ridiculously high for ELIZA. I feel like any half sober adult could clock it as a bot by the third response, if not immediately.

      I did some stuff with Eliza back then. One time I set up an Eliza database full of insults and hooked it up to my AIM account.

      It went so well, I had to apologize to a lot of people who thought I was drunken or went crazy.

      Eliza wasn’t thaaaaat bad.

    • technocrit@lemmy.dbzer0.com
      link
      fedilink
      English
      arrow-up
      2
      ·
      5 months ago

      It was a 5 minute test. People probably spent 4 of those minutes typing their questions.

      This is pure pseudo-science.

  • NeoNachtwaechter@lemmy.world
    link
    fedilink
    English
    arrow-up
    14
    arrow-down
    3
    ·
    5 months ago

    Turing test? LMAO.

    I asked it simply to recommend me a supermarket in our next bigger city here.

    It came up with a name and it told a few of it’s qualities. Easy, I thought. Then I found out that the name does not exist. It was all made up.

    You could argue that humans lie, too. But only when they have a reason to lie.

  • technocrit@lemmy.dbzer0.com
    link
    fedilink
    English
    arrow-up
    10
    arrow-down
    1
    ·
    edit-2
    5 months ago
    • 500 people - meaningless sample
    • 5 minutes - meaningless amount of time
    • The people bootlicking “scientists” obviously don’t understand science.
  • bandwidthcrisis@lemmy.world
    link
    fedilink
    English
    arrow-up
    6
    ·
    edit-2
    5 months ago

    Did they try asking how to stop cheese falling off pizza?

    Edit: Although since that idea came from a human, maybe I’ve failed.

  • foggy@lemmy.world
    link
    fedilink
    English
    arrow-up
    5
    ·
    5 months ago

    Meanwhile, me:

    (Begin)

    [Prints error statement showing how I navigated to a dir, checked to see a files permissions, ran whoami, triggered the error]

    Chatgpt4: First, make sure you’ve navigated to the correct directory.

    cd /path/to/file

    Next, check the permissions of the file

    ls -la

    Finally, run the command

    [exact command I ran to trigger the error]>

    Me: stop telling me to do stuff that I have evidently done. My prompt included evidence of me having do e all of that already. How do I handle this error?

    (return (begin))

  • dhork@lemmy.world
    link
    fedilink
    English
    arrow-up
    3
    ·
    5 months ago

    In order for an AI to pass the Turing test, it must be able to talk to someone and fool them into thinking that they are talking to a human.

    So, passing the Turing Test either means the AI are getting smarter, or that humans are getting dumber.

    • pewter@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      ·
      edit-2
      5 months ago

      Humans are as smart as they ever were. Tech is getting better. I know someone who was tricked by those deepfake Kelly Clarkson weight loss gummy ads. It looks super fake to me, but it’s good enough to trick some people.

  • werefreeatlast@lemmy.world
    link
    fedilink
    English
    arrow-up
    3
    arrow-down
    1
    ·
    5 months ago

    It does great at Python programming… everything it tries is wrong until I try and I tell tell it to do it again.

    • A_A@lemmy.world
      link
      fedilink
      English
      arrow-up
      1
      ·
      edit-2
      5 months ago

      Edit :
      oops : you were saying it is like a human since it does errors ? maybe i “wooshed”.


      Hi @werefreeatlast,
      i had successes asking LLaMA 3 70B with simple specific questions …
      Context : i am bad at programming and it help me at least to see how i could use a few function calls in C from Python … or simply drop Python and do it directly in C.
      Like you said, i have to re-write & test … but i have a possible path forward. Clearly you know what you do on a computer but i’m not really there yet.

      • werefreeatlast@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        ·
        5 months ago

        But people don’t just know code when you ask them. The llms fo because they got trained on that code. It’s robotic in nature, not a natural reaction yet.

    • harrys_balzac@lemmy.dbzer0.com
      link
      fedilink
      English
      arrow-up
      2
      ·
      5 months ago

      Skynet will gets the dumb ones first by getting them put toxic glue on thir pizzas then the arrogant ones will build the Terminators by using reverse psychology.