
I'm not an expert in this space, but I can see the value. It allows an endless loop of generating novel scenarios and evaluating an AI agent's performance within each scenario (for example, "go up the stairs"). A world with one minute of coherence is about enough to judge whether the AI's actions moved in the right direction. When you later want to run an agent on a real task in the real world, with video input, you can run the same policy it learned in dream-world simulation. The real world has coherence, so the agent's actions just need to string together well enough, minute by minute, to work toward a goal.
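To make the loop concrete, here's a minimal sketch of what "generate a scenario, roll out the policy for a minute, score it" could look like. `DreamWorld`, `Policy`, and the staircase task are all hypothetical stand-ins, not any real API from the product being discussed:

```python
# Minimal sketch of evaluating a policy over short rollouts in generated worlds.
# DreamWorld and Policy are hypothetical illustrations, not a real API.

class DreamWorld:
    """Toy generated scenario: a staircase that stays coherent for ~60 steps."""
    def __init__(self, seed):
        self.seed = seed
        self.height = 0
        self.steps = 0

    def act(self, action):
        self.steps += 1
        if action == "up":
            self.height += 1
        done = self.steps >= 60  # coherence horizon: one "minute"
        return self.height, done  # observation, episode over?

class Policy:
    """Trivial policy standing in for a learned one: always climbs."""
    def act(self, observation):
        return "up"

def evaluate(policy, n_scenarios=5):
    """Average final height across freshly generated scenarios."""
    total = 0
    for seed in range(n_scenarios):
        world, obs, done = DreamWorld(seed), 0, False
        while not done:
            obs, done = world.act(policy.act(obs))
        total += obs
    return total / n_scenarios

print(evaluate(Policy()))  # prints 60.0: every rollout climbs all 60 steps
```

The point of the loop is that each seed is a *new* scenario, so the score reflects whether the behaviour generalizes rather than whether one level was memorized.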

You could use real video games for this, but I guess there'd be a risk of overfitting: the agent might learn too precisely what a staircase looks like in Minecraft, yet fail to generalize to the staircase in your home. If they can simulate dream worlds (and, presumably, worlds seeded from real photos), they can train their agents this way.

This would only train high-level decision policies (i.e., WASD-style inputs). For something like a robot, lower-level motor control loops would still be needed to execute those commands.
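That split might look roughly like this: the learned policy emits a discrete, game-style command, and a separate controller (hand-engineered or trained elsewhere) expands it into motor targets. All names here are illustrative, not from any real robotics stack:

```python
# Sketch of the hierarchy: a high-level policy picks discrete WASD-style
# commands; a low-level controller turns each one into motor targets.
# Both functions are hypothetical illustrations.

def high_level_policy(observation):
    """Learned in simulation: chooses a discrete, game-style action."""
    return "W"  # move forward

def low_level_controller(command, dt=0.01, ticks=10):
    """Hand-engineered or separately trained: expands one command into a
    short sequence of (forward_velocity, turn_rate) targets per tick."""
    mapping = {"W": (1.0, 0.0), "S": (-1.0, 0.0), "A": (0.0, -1.0), "D": (0.0, 1.0)}
    forward, turn = mapping[command]
    return [(forward * dt, turn * dt) for _ in range(ticks)]

# One high-level decision fans out into ten low-level control ticks.
targets = low_level_controller(high_level_policy(None))
```

The design point is that the simulator only needs to be faithful at the command level; physical execution details live entirely in the lower loop.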

Of course, you could just do your training in the real world directly, since it already has coherence and plenty of environmental variety. But learning involves lots of failure, and real-world failures would probably cost even more than this expensive simulator.

Despite the claims, I don't think it does much for AI safety. It can help avoid hilarious disasters like an AI-in-training crashing a speedboat onto the riverbank, but there's not much here that addresses the deeper problem of value alignment. It also seems like an effective way to train robo-killbots that perceive the world as a dreamlike first-person shooter.


