Poisoned AI went rogue during training and couldn't be taught to behave again in 'legitimately scary' study

L4sBot@lemmy.world · 10 months ago

Poisoned AI went rogue during training and couldn't be taught to behave again in 'legitimately scary' study

irotsoma@lemmy.world · 10 months ago

The problem is that these LLMs are built with the wrong driving motivator. They’re driven to find one right way whereas the reality is that there is rarely a single right way and computers don’t need to have a single right way like humans tend towards. The LLM shouldn’t be driven to be “right” in its learning model. It should be trained on known good data only as a base, and then given the other data to serve context rather than allowing that data to modify the underlying system. This is more like how biological creatures work in teaching a child to be “good” or “evil” and to know the basic things needed to survive and serve their purpose, and then the stuff they learn in adulthood serves to help them apply those base concepts to the world.