To me it sounds like it could (and likely would) backfire, by replacing judgment with numbers. Who is giving the confidence score? What confidence score does each confidence score receive? Why are those scores more valid than the judgment of the expert in that very narrow domain? If that expert is the one giving the scores, are they not just gatekeeping? Et cetera. I don't want to see researchers rewriting their papers because their cumulative source score is 68.17 when it should be 72.5 or higher.
Also, there have been times when established archeology was wrong, and this seems like it would produce a bias towards what we currently think is true.
For example, theories on how the Polynesian migration came about are still in flux, to the point where people have tried to prove one theory by actually sailing to the different islands using only traditional wayfinding.
I would phrase it differently: supporting judgement with numbers. It's not about altering conclusions, but about making the factual basis and the associated reasoning behind them more transparent.
The analogy would be trying some exotic food and having a list of ingredients. Yes, it's good to listen to a local about how it tastes (and whether it cures all diseases), but if the list says 50% sugar, that's a data point worth knowing.