That's what the refiner in auto1111 is for: taking an image the last 10% of the way and touching it up with an alternative model.
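To put rough numbers on that "last 10%": here's a minimal sketch, assuming the 10% refers to sampling steps. `split_steps` and `switch_at` are made-up names for illustration, not auto1111's actual code or settings.

```python
def split_steps(total_steps: int, switch_at: float = 0.9) -> tuple[int, int]:
    """Split a run into (base_steps, refiner_steps).

    Hypothetical helper: `switch_at` is the fraction of the run handled by
    the base model before handing off to the refiner model.
    """
    if total_steps < 1:
        raise ValueError("total_steps must be at least 1")
    if not 0.0 <= switch_at <= 1.0:
        raise ValueError("switch_at must be between 0 and 1")
    # round() avoids losing a step to floating-point truncation
    base_steps = round(total_steps * switch_at)
    return base_steps, total_steps - base_steps


if __name__ == "__main__":
    base, refiner = split_steps(30, 0.9)
    print(f"base model: {base} steps, refiner: {refiner} steps")
```

So with 30 steps and a 0.9 switch point, the base model would do 27 steps and the refiner the last 3.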
I actually use Flux to generate the image for the sake of prompt adherence, then pull it in as a canny/depth ControlNet input with more established models like realvis, unstableXL, etc.