Iterative Metric Learning for Imbalance Data Classification

Nan Wang, Xibin Zhao, Yu Jiang, Yue Gao

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 2805-2811. https://doi.org/10.24963/ijcai.2018/389

In many classification applications, such as software defect prediction and medical diagnosis, the amount of data varies significantly across categories. Under such circumstances, a proper method is needed to address the imbalance among the data. However, most existing methods focus on improving the performance of classifiers rather than on finding an effective data space for classification. In this paper, we propose a method named Iterative Metric Learning (IML) to explore the correlations among imbalanced data and construct an effective data space for classification. Given imbalanced training data, it is important to select a subset of training samples for each testing sample. Thus, we aim to find a more stable neighborhood for each testing sample using an iterative metric learning strategy. To evaluate the effectiveness of the proposed method, we have conducted experiments on two groups of datasets, i.e., the NASA Metrics Data Program (NASA) dataset and the UCI Machine Learning Repository (UCI) dataset. Experimental results and comparisons with state-of-the-art methods show that the proposed method achieves better performance.
Keywords:
Machine Learning: Classification
Machine Learning: Machine Learning
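
The abstract describes the core loop only at a high level: learn a metric, use it to pick a neighborhood of training samples for a testing sample, and repeat until that neighborhood is stable. The sketch below illustrates that loop and should not be read as the authors' implementation. It assumes a simple stand-in metric (per-feature weights from a Fisher-style ratio) in place of whatever metric learner the paper uses, and a majority vote over the final neighborhood in place of the paper's classifier. The names `learn_diagonal_metric`, `iterative_neighborhood`, and `predict`, and the parameters `k` and `max_iter`, are hypothetical.

```python
import numpy as np


def learn_diagonal_metric(X, y, eps=1e-8):
    """Stand-in metric learner (an assumption, not the paper's method).

    Returns non-negative per-feature weights: between-class scatter divided
    by within-class scatter, normalized to sum to 1.
    """
    n_features = X.shape[1]
    classes = np.unique(y)
    if len(classes) < 2:
        # Only one class present: no separation to learn, use uniform weights.
        return np.full(n_features, 1.0 / n_features)
    overall_mean = X.mean(axis=0)
    between = np.zeros(n_features)
    within = np.zeros(n_features)
    for c in classes:
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    weights = between / (within + eps)
    return weights / (weights.sum() + eps)


def iterative_neighborhood(X_train, y_train, x_test, k=10, max_iter=10):
    """Alternate metric learning and neighbor selection until the
    neighborhood of x_test stops changing (or max_iter is reached)."""
    idx = np.arange(len(X_train))  # start from the full training set
    for _ in range(max_iter):
        # Learn the metric on the current neighborhood only.
        w = learn_diagonal_metric(X_train[idx], y_train[idx])
        # Weighted Euclidean distance from x_test to every training sample.
        dist = np.sqrt((((X_train - x_test) ** 2) * w).sum(axis=1))
        new_idx = np.sort(np.argsort(dist, kind="stable")[:k])
        if np.array_equal(new_idx, idx):
            break  # neighborhood is stable
        idx = new_idx
    return idx


def predict(X_train, y_train, x_test, k=10):
    """Majority vote over the stable neighborhood (stand-in classifier)."""
    idx = iterative_neighborhood(X_train, y_train, x_test, k=k)
    labels, counts = np.unique(y_train[idx], return_counts=True)
    return labels[np.argmax(counts)]
```

Selecting the neighborhood per testing sample means the metric is fitted to the local region around that sample rather than to the globally imbalanced training set, which is the motivation the abstract gives for the iterative strategy.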