Dude, it's the second sentence of the first paragraph:
> For my purposes, the Majestic Million dataset felt like the perfect fit as it is ranked by the number of links that point to that domain (as well as taking into account diversity of the origin domains as well).
Moreover, the author’s conclusion is that the dataset is bad:
> While I had expected some cleanliness issues, I wasn’t expecting to see this level of quality problems from a dataset that I’ve seen referenced pretty extensively across the web