How to evaluate systems against human judgment in the presence of disagreement?
