Application of aboutness to functional benchmarking in information retrieval
Experimental approaches are widely used to benchmark the performance of information retrieval (IR) systems, with recall and precision computed as performance indicators. Although these measures are good at assessing the retrieval effectiveness of an IR system, they do not explore deeper aspects such as its underlying functionality, nor do they explain why the system performs as it does. Recently, inductive (i.e., theoretical) evaluation of IR systems has been proposed to circumvent the controversies of the experimental methods. Several studies have adopted the inductive approach, but they mostly focus on theoretical modeling of IR properties using a metalogic. In this article, we propose to use inductive evaluation for functional benchmarking of IR models as a complement to traditional experiment-based performance benchmarking. We define a functional benchmark suite in two stages: the evaluation criteria, based on the notion of "aboutness," and the formal evaluation methodology that uses those criteria. The proposed benchmark has been successfully applied to evaluate various well-known classical and logic-based IR models. The functional benchmarking results allow us to compare and analyze the functionality of the different IR models.
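For readers unfamiliar with the experimental indicators mentioned in the abstract, the following is a minimal sketch of how set-based precision and recall are typically computed for a single query. It is not taken from the article: the function name `precision_recall` and the document identifiers are hypothetical, and the convention of returning 0.0 for an empty set is an assumption made for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: iterable of document ids returned by the system
    relevant:  iterable of document ids judged relevant
    Both names and the 0.0 fallback for empty sets are illustrative assumptions.
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved

    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four documents retrieved, three judged relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Numbers of this kind summarize how well a system retrieves, but they say nothing about which functional properties of the underlying model produced the result, which is the gap the abstract's functional benchmarking is meant to address.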
ACM Digital Library