This is an interesting topic that I remember reading about almost a decade ago - the trans-human AI-in-a-box experiment. Even a kill-switch may not be enough against a trans-human AI that can literally (in theory) out-think humans. I’m a dev, though nowhere near AI dev, but from what little I know, a true general-purpose AI would also be somewhat of a mystery box, similar to how actual neural network behavior is sometimes unpredictable, almost by definition. So controlling an actual full AI may be difficult enough, let alone a true trans-human AI that may develop out of AI self-improvement.
Also, on an unrelated note, I’m pleasantly surprised to see no mention of ChatGPT or any of the image-generating algorithms - I think it’s a bit of a misnomer to call those AI; the best comparison I’ve heard is that “ChatGPT is auto-complete on steroids”. But I suppose that’s why we have to start using terms like general-purpose AI, instead of just AI, to describe what I’d say is true AI.
You don’t do what Google seems to have done - inject diversity artificially into prompts.
You solve this by training the AI on actual, accurate, diverse data for the given prompt. For example, for “american woman” you definitely could find plenty of pictures of American women from all sorts of racial backgrounds, and use that to train the AI. For “german 1943 soldier” the accurate historical images are obviously far less likely to contain racially diverse people in them.
If Google has indeed already done that, and then still had to artificially force racial diversity, then their trained model is bad: it can’t handle the fact that a single prompt can match many different images, and instead falls back to the most prominent or average example in its training set.
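A toy sketch of the difference I mean, in Python. The group names and counts are made up purely for illustration, and this is nothing like how a real image model works internally; it only contrasts always returning the most common match with sampling in proportion to the training data.

```python
import random
from collections import Counter

# Hypothetical counts of training images per group for ONE prompt.
# These numbers are invented for the example, not real data.
TRAINING_COUNTS = {"group_a": 60, "group_b": 25, "group_c": 15}


def pick_most_common(counts):
    """Always return the single most frequent group (the 'most prominent' output)."""
    return max(counts, key=counts.get)


def sample_proportionally(counts, rng):
    """Return a group with probability proportional to its share of the data."""
    labels = list(counts)
    weights = [counts[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def generate(counts, n, rng, proportional):
    """Produce n outputs and tally how often each group appears."""
    picks = [
        sample_proportionally(counts, rng) if proportional else pick_most_common(counts)
        for _ in range(n)
    ]
    return Counter(picks)


if __name__ == "__main__":
    rng = random.Random(0)
    print("most common only:", generate(TRAINING_COUNTS, 1000, rng, proportional=False))
    print("proportional:    ", generate(TRAINING_COUNTS, 1000, rng, proportional=True))
```

With the first strategy every output lands in the same group no matter how varied the data is; with the second, the outputs roughly follow whatever mix the training data for that prompt actually has, so accurate data gives accurate variety without touching the prompt.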