Quantifying accuracy improvement in sets of pooled judgments: does dialectical bootstrapping work?

Quantifying accuracy improvement in sets of pooled judgments: does dialectical bootstrapping work?