Abstract
Bioacoustics is a powerful tool for monitoring biodiversity. In this paper, we investigate an automatic segmentation model for real-world bioacoustic scenes in order to infer hidden states, referred to as song units. Unlike in human speech recognition, the number of these acoustic units is often unknown. To tackle this challenging problem, we propose a bioacoustic segmentation based on the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM), a Bayesian non-parametric (BNP) model. Our approach focuses on unsupervised learning from bioacoustic sequences: it simultaneously finds the structure of the hidden song units and infers the unknown number of hidden states. We investigate two real bioacoustic scenes: whale songs and multi-species bird songs. We learn the models using Markov Chain Monte Carlo (MCMC) sampling techniques on Mel Frequency Cepstral Coefficients (MFCC). Our results, scored by a bioacoustic expert, show that the model generates correct song unit segmentation. This study offers new insights for the unsupervised analysis of complex soundscapes and illustrates its potential for chunking non-human animal signals into structured units. This can yield new representations of the calls of a target species, as well as a structuring of inter-species calls. It gives experts a tractable approach for efficient bioacoustic research, as requested in Kershenbaum et al. (Biol Rev 91(1):13–52, 2016).
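As a rough illustration of why a Bayesian non-parametric prior does not require the number of song units to be fixed in advance, the sketch below draws state weights with the truncated stick-breaking construction of the Dirichlet process (Sethuraman, 1994). It covers only the top-level weights, not the full HDP-HMM or the MCMC sampler used in the chapter; the concentration value `gamma`, the truncation level, and the helper `effective_states` with its threshold are assumptions chosen for illustration.

```python
import random


def stick_breaking(gamma, truncation, rng):
    """Truncated stick-breaking weights: beta_k = v_k * prod_{j<k} (1 - v_j)."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to any state
    for _ in range(truncation):
        v = rng.betavariate(1.0, gamma)  # fraction broken off the remaining stick
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


def effective_states(weights, threshold=0.01):
    """Hypothetical helper: count states whose prior weight exceeds a threshold."""
    return sum(w > threshold for w in weights)


rng = random.Random(0)  # fixed seed so the draw is repeatable
weights, leftover = stick_breaking(gamma=2.0, truncation=50, rng=rng)
print("states above threshold:", effective_states(weights))
print("total mass:", sum(weights) + leftover)
```

A small `gamma` concentrates the mass on a few states, while a larger value spreads it over many, so the number of states that matter is governed by the prior and the data rather than set by hand.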
Notes
- 1.
MFCCs are features that represent and compress the short-term power spectrum of a sound. They are computed on the Mel frequency scale.
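The Mel scale mentioned in the note on MFCCs can be sketched as follows. The chapter does not state which Mel formula was used, so the common variant m = 2595 log10(1 + f/700) is assumed here, and `mel_band_edges` is a hypothetical helper showing how filter band edges end up equally spaced in Mel rather than in Hz.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to Mel (assumed 2595 * log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    """Return the n_bands + 2 edge frequencies (Hz), equally spaced in Mel."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


edges = mel_band_edges(0.0, 8000.0, 24)
print([round(e) for e in edges[:5]], "...", round(edges[-1]))
```

The edges are closely spaced at low frequencies and widen towards high frequencies, which is how the Mel scale compresses the spectrum before the cepstral coefficients are computed.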
References
Bartcus, M., Chamroukhi, F., & Glotin, H. (2015, July). Hierarchical Dirichlet Process Hidden Markov Model for Unsupervised Bioacoustic Analysis. In 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–7. IEEE.
Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statistica sinica, 639–650.
Kershenbaum, A., Blumstein, D.T., Roch, M.A., Akçay, Ç., Backus, G., Bee, M.A., Bohn, K., Cao, Y., Carter, G., Cäsar, C. and Coen, M. (2016). Acoustic sequences in non-human animals: a tutorial review and prospectus. Biological Reviews, 91(1), pp.13–52.
Rabiner, L. and Juang, B. (1986). An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1), pp.4–16.
Schwarz, G. (1978). Estimating the dimension of a model. The annals of statistics, 6(2), pp.461–464.
Akaike, H. (1974). A new look at the statistical model identification. IEEE transactions on automatic control, 19(6), 716–723.
Teh, Y.W., Jordan, M.I., Beal, M.J. and Blei, D.M. (2006). Hierarchical Dirichlet Processes. Journal of the American Statistical Association, 101(476), pp.1566–1581.
Beal, M. J., Ghahramani, Z., & Rasmussen, C. E. (2002). The infinite hidden Markov model. In Advances in neural information processing systems, pp. 577–584.
Fox, E. B., Sudderth, E. B., Jordan, M. I., & Willsky, A. S. (2008, July). An HDP-HMM for systems with state persistence. In Proceedings of the 25th international conference on Machine learning, pp. 312–319. ACM.
Helweg, D.A., Cat, D.H., Jenkins, P.F., Garrigue, C. and McCauley, R.D. (1998). Geographic Variation in South Pacific Humpback Whale Songs. Behaviour, 135(1), pp.1–27.
Medrano, L., Salinas, M., Salas, I., Guevara, P.L.D., Aguayo, A., Jacobsen, J. and Baker, C.S. (1994). Sex identification of humpback whales, Megaptera novaeangliae, on the wintering grounds of the Mexican Pacific Ocean. Canadian journal of zoology, 72(10), pp.1771–1774.
Frankel, A.S., Clark, C.W., Herman, L. and Gabriele, C.M. (1995). Spatial distribution, habitat utilization, and social interactions of humpback whales, Megaptera novaeangliae, off Hawai’i, determined using acoustic and visual techniques. Canadian Journal of Zoology, 73(6), pp.1134–1146.
Baker, C.S. and Herman, L.M. (1984). Aggressive behavior between humpback whales (Megaptera novaeangliae) wintering in Hawaiian waters. Canadian journal of zoology, 62(10), pp.1922–1937.
Garland, E.C., Goldizen, A.W., Rekdahl, M.L., Constantine, R., Garrigue, C., Hauser, N.D., Poole, M.M., Robbins, J. and Noad, M.J. (2011). Dynamic horizontal cultural transmission of humpback whale song at the ocean basin scale. Current Biology, 21(8), pp.687–691.
Catchpole, C.K. and Slater, P.J.B. (1995). Birdsong: Biological Themes and Variations. Cambridge University Press.
Kroodsma, D. E., & Miller, E. H. (Eds.). (1996). Ecology and evolution of acoustic communication in birds pp. 269–281. Comstock Pub.
Pace, F., Benard, F., Glotin, H., Adam, O. and White, P. (2010). Subunit definition and analysis for humpback whale call classification. Applied Acoustics, 71(11), pp.1107–1112.
Picot, G., Adam, O., Bergounioux, M., Glotin, H. and Mayer, F.X. (2008, October). Automatic prosodic clustering of humpback whales song. In New Trends for Environmental Monitoring Using Passive Systems, 2008, pp. 1–6. IEEE.
Glotin, H., LeCun, Y., Artieres, T., Mallat, S., Tchernichovski, O., & Halkias, X. (2013). Neural information processing scaled for bioacoustics, from neurons to big data. USA. http://sabiod.org/NIPS4B2013_book.pdf.
Deroussen F., Jiguet F. (2006). La sonotheque du Museum: Oiseaux de France. Nashvert Production, Charenton, France.
Baum, L.E., Petrie, T., Soules, G. and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The annals of mathematical statistics, 41(1), pp.164–171.
Biernacki, C., Celeux, G. and Govaert, G. (2000). Assessing a mixture model for clustering with the integrated completed likelihood. IEEE transactions on pattern analysis and machine intelligence, 22(7), pp.719–725.
Ferguson, T.S. (1973). A Bayesian analysis of some nonparametric problems. The annals of statistics, pp.209–230.
Pitman, J. (1995). Exchangeable and partially exchangeable random partitions. Probability theory and related fields, 102(2), pp.145–158.
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2003). Bayesian Data Analysis, (Chapman & Hall/CRC Texts in Statistical Science).
Strehl, A. and Ghosh, J. (2002). Cluster ensembles—a knowledge reuse framework for combining multiple partitions. Journal of machine learning research, 3(Dec), pp.583–617.
Jordan, M.I., Ghahramani, Z., Jaakkola, T.S. and Saul, L.K. (1999). An introduction to variational methods for graphical models. Machine learning, 37(2), pp.183–233.
Foti, N., Xu, J., Laird, D., & Fox, E. (2014). Stochastic variational inference for hidden Markov models. In Advances in neural information processing systems, pp.3599–3607.
Acknowledgements
We would like to thank the Provence-Alpes-Côte d’Azur region and NortekMed for their financial support of Vincent ROGER. We also thank GDR CNRS MADICS http://sabiod.org/EADM for its support. We thank G. Pavan for his expertise, J. Sueur, F. Deroussen and F. Jiguet for the co-organisation of the challenges, and M. Roch for her collaboration.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Roger, V., Bartcus, M., Chamroukhi, F., Glotin, H. (2018). Unsupervised Bioacoustic Segmentation by Hierarchical Dirichlet Process Hidden Markov Model. In: Joly, A., Vrochidis, S., Karatzas, K., Karppinen, A., Bonnet, P. (eds) Multimedia Tools and Applications for Environmental & Biodiversity Informatics. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-76445-0_7
DOI: https://doi.org/10.1007/978-3-319-76445-0_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76444-3
Online ISBN: 978-3-319-76445-0
eBook Packages: Computer Science (R0)