Research Article | Open Access
DOI: 10.1145/2766462.2767754

Impact of Surrogate Assessments on High-Recall Retrieval

Published: 09 August 2015

Abstract

We are concerned with the effect of using a surrogate assessor to train a passive (i.e., batch) supervised-learning method to rank documents for subsequent review, where the effectiveness of the ranking will be evaluated using a different assessor deemed to be authoritative. Previous studies suggest that surrogate assessments may be a reasonable proxy for authoritative assessments for this task. Nonetheless, concern persists in some application domains, such as electronic discovery, that errors in surrogate training assessments will be amplified by the learning method, materially degrading performance. We demonstrate, through a re-analysis of data used in previous studies, that, with passive supervised-learning methods, using surrogate assessments for training can substantially impair classifier performance, relative to using the same deemed-authoritative assessor for both training and assessment. In particular, using a single surrogate to replace the authoritative assessor for training often yields a ranking that must be traversed to a much greater depth to achieve the same level of recall as the ranking that would have resulted had the authoritative assessor been used for training. We also show that steps can be taken to mitigate, and sometimes overcome, the impact of surrogate assessments for training: relevance assessments may be diversified through the use of multiple surrogates, and a more liberal view of relevance can be adopted by having the surrogate label borderline documents as relevant. By taking these steps, rankings derived from surrogate assessments can match, and sometimes exceed, the performance of the ranking that would have been achieved had the authority been used for training. Finally, we show that our results still hold when the roles of surrogate and authority are interchanged, indicating that the results may simply reflect differing conceptions of relevance between surrogate and authority, as opposed to the authority having special skill or knowledge that the surrogate lacks.
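
As a concrete illustration of the experimental setting (an assumed sketch, not the authors' implementation), the Python fragment below trains a passive (batch) classifier on one assessor's training labels, ranks the review collection, and measures how deep a reviewer must go in the ranking to reach a target recall as judged by the authoritative assessor. The TF-IDF features, the logistic-regression learner, the 75% recall target, and the union rule for pooling multiple surrogates' labels are all assumptions made for this example.

```python
# Illustrative sketch only (assumed setup, not the paper's code): train on one
# assessor's labels, rank the collection, and measure the review depth needed
# to reach a target recall under the authoritative assessor's labels.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def rank_with_training_labels(train_texts, train_labels, pool_texts):
    """Fit a passive (batch) classifier and score the unreviewed pool."""
    vectorizer = TfidfVectorizer(sublinear_tf=True)
    x_train = vectorizer.fit_transform(train_texts)
    x_pool = vectorizer.transform(pool_texts)
    model = LogisticRegression(max_iter=1000)
    model.fit(x_train, train_labels)
    return model.predict_proba(x_pool)[:, 1]  # higher score = more likely relevant


def depth_for_recall(scores, authority_labels, target_recall=0.75):
    """Smallest number of top-ranked documents containing target_recall of
    the documents the authoritative assessor deems relevant."""
    order = np.argsort(-np.asarray(scores))
    relevant = np.asarray(authority_labels)[order]
    needed = np.ceil(target_recall * relevant.sum())
    return int(np.argmax(np.cumsum(relevant) >= needed)) + 1


def union_of_surrogates(surrogate_label_sets):
    """Liberal pooling of several surrogates' labels: a document counts as
    relevant for training if any surrogate marked it relevant."""
    return (np.sum(surrogate_label_sets, axis=0) > 0).astype(int)


# Hypothetical comparison: a ratio depth_surrogate / depth_authority greater
# than 1 means the surrogate-trained ranking must be reviewed more deeply to
# reach the same recall.
# depth_authority = depth_for_recall(
#     rank_with_training_labels(train_texts, authority_train_labels, pool_texts),
#     authority_pool_labels)
# depth_surrogate = depth_for_recall(
#     rank_with_training_labels(train_texts, surrogate_train_labels, pool_texts),
#     authority_pool_labels)
```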

Published In

SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval
August 2015
1198 pages
ISBN: 9781450336215
DOI: 10.1145/2766462

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 09 August 2015

Author Tags

1. assessor error
2. ediscovery
3. electronic discovery
4. evaluation
5. recall
6. relevance ranking
7. supervised learning

Conference

SIGIR '15

Acceptance Rates

SIGIR '15 Paper Acceptance Rate: 70 of 351 submissions, 20%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%

Cited By

• (2024) Unbiased Validation of Technology-Assisted Review for eDiscovery. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2677-2681. DOI: 10.1145/3626772.3657903. Online publication date: 10-Jul-2024.
• (2024) Beyond the Bar: Generative AI as a Transformative Component in Legal Document Review. 2024 IEEE International Conference on Big Data (BigData), pp. 4779-4788. DOI: 10.1109/BigData62323.2024.10826089. Online publication date: 15-Dec-2024.
• (2024) Comparison of Tools and Methods for Technology-Assisted Review. Information Management, pp. 106-126. DOI: 10.1007/978-3-031-64359-0_9. Online publication date: 18-Jul-2024.
• (2024) Limitations of the Utility of Categorization in eDiscovery Review Efforts. Information Management, pp. 301-311. DOI: 10.1007/978-3-031-64359-0_24. Online publication date: 18-Jul-2024.
• (2019) A Regularization Approach to Combining Keywords and Training Data in Technology-Assisted Review. Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law, pp. 153-162. DOI: 10.1145/3322640.3326713. Online publication date: 17-Jun-2019.
• (2019) Variations in Assessor Agreement in Due Diligence. Proceedings of the 2019 Conference on Human Information Interaction and Retrieval, pp. 243-247. DOI: 10.1145/3295750.3298945. Online publication date: 8-Mar-2019.
• (2018) A Dataset and an Examination of Identifying Passages for Due Diligence. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 465-474. DOI: 10.1145/3209978.3210015. Online publication date: 27-Jun-2018.
• (2016) Total Recall. Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval, pp. 45-48. DOI: 10.1145/2970398.2970430. Online publication date: 12-Sep-2016.
• (2016) Impact of Review-Set Selection on Human Assessment for Text Classification. Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, pp. 861-864. DOI: 10.1145/2911451.2914709. Online publication date: 7-Jul-2016.
• (2016) Retrieving patents with inverse patent category frequency. Proceedings of the 2016 International Conference on Big Data and Smart Computing (BigComp), pp. 109-114. DOI: 10.1109/BIGCOMP.2016.7425808. Online publication date: 18-Jan-2016.
