Abstract
Australia’s critical water pipes break on average 7,000 times per year. Accurately identifying which pipes are at risk of failure could save Australia’s water utilities and the community up to $700 million a year in reactive repairs and maintenance. However, ranking these pipes by their calculated risk gives mixed results because of their heterogeneous attribute types, data incompleteness and data imbalance. This paper describes our experience in improving the classification and ranking of these data via local metric learning. Distance metric learning is a powerful tool that can improve the performance of similarity-based classification. In general, global metric learning techniques do not consider local data distributions, and hence do not perform well on complex or heterogeneous data. Local metric learning methods address this problem but are usually expensive in runtime and memory. This paper proposes a fuzzy-based local metric learning approach that outperforms recently proposed local metric methods, while still being faster than popular global metric learning methods in most cases. Extensive experiments on Australian water pipe datasets demonstrate the effectiveness and performance of our proposed approach.
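To illustrate why the choice of distance metric matters for similarity-based classification, the following is a minimal sketch, not the approach proposed in the paper. It assumes a Mahalanobis-style distance defined by a positive semi-definite matrix M; the toy points, the labels and the "learned" matrix are all made up for illustration, and the function names are hypothetical.

```python
def mahalanobis_sq(x, y, M):
    """Squared distance (x - y)^T M (x - y) for a positive semi-definite M."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def nearest_label(query, points, labels, M):
    """1-nearest-neighbour prediction under the metric defined by M."""
    best = min(range(len(points)),
               key=lambda i: mahalanobis_sq(query, points[i], M))
    return labels[best]


# Toy data (assumed): feature 0 is informative, feature 1 is large-scale noise.
points = [(0.0, 5.0), (1.0, 0.0)]
labels = ["low_risk", "high_risk"]
query = (0.2, 0.5)

identity = [[1.0, 0.0], [0.0, 1.0]]   # plain Euclidean distance
learned = [[1.0, 0.0], [0.0, 0.01]]   # hypothetical metric down-weighting feature 1

euclid_pred = nearest_label(query, points, labels, identity)
metric_pred = nearest_label(query, points, labels, learned)
print(euclid_pred, metric_pred)
```

Under the Euclidean metric the noisy feature dominates and the query is matched to the second point; under the down-weighted metric it is matched to the first. A global method learns one such M for the whole dataset, whereas a local method would use different matrices in different regions of the feature space.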
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ghanavati, M., Wong, R.K., Chen, F., Wang, Y., Fong, S. (2016). Effective Local Metric Learning for Water Pipe Assessment. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_45
DOI: https://doi.org/10.1007/978-3-319-31753-3_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31752-6
Online ISBN: 978-3-319-31753-3
eBook Packages: Computer Science, Computer Science (R0)