Not surprising. It always seemed likely to me that training models on model-generated data would amplify the model's own biases in a feedback loop (second-order effects?). Similar to how repeatedly applying a linear system stretches inputs in the direction of its largest eigenvector.
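To make the eigenvector analogy concrete, here's a minimal power-iteration sketch (my illustration, not anything from a paper; I use a symmetric matrix so the dominant eigenvalue is real and convergence is guaranteed): whatever vector you start with, repeated application of the same linear map collapses it onto the dominant eigenvector direction.

```python
import numpy as np

# Power iteration: repeatedly applying a fixed linear map to any
# starting vector rotates it toward the dominant eigenvector.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2  # symmetric, so eigenvalues are real

v = rng.standard_normal(5)
v /= np.linalg.norm(v)

for _ in range(100):
    v = A @ v
    v /= np.linalg.norm(v)  # renormalize so the vector doesn't blow up

# Compare against the eigenvector of the largest-magnitude eigenvalue.
eigvals, eigvecs = np.linalg.eigh(A)
dominant = eigvecs[:, np.argmax(np.abs(eigvals))]
print(abs(v @ dominant))  # ~1.0: v has aligned with the dominant direction
```

The feedback-loop worry is the same shape: each training round is one application of the map, and diversity in the directions orthogonal to the "dominant" mode gets squeezed out.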
Now wait till the generated content is indistinguishable from human content (to humans) and it will be hard to figure out what's in your training set.