Barwise Music Structure Analysis with the Correlation Block-Matching Segmentation Algorithm

Axel Marmoret; Jérémy E. Cohen; Frédéric Bimbot

doi:10.5334/tismir.167


Transactions of the International Society for Music Information Retrieval

Volume 6 (2023): Issue 1

By: Axel Marmoret, Jérémy E. Cohen and Frédéric Bimbot

Open Access

Nov 2023

Bellman, R. (1952). On the theory of dynamic programming. Proceedings of the National Academy of Sciences, 38(8):716–719. DOI: 10.1073/pnas.38.8.716
Bengio, Y., Courville, A., and Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8):1798–1828. DOI: 10.1109/TPAMI.2013.50
Böck, S. and Davies, M. E. (2020). Deconstruct, analyse, reconstruct: How to improve tempo, beat, and downbeat estimation. In International Society for Music Information Retrieval Conference (ISMIR), pages 574–582.
Böck, S., Davies, M. E., and Knees, P. (2019). Multitask learning of tempo and beat: Learning one to improve the other. In International Society for Music Information Retrieval Conference (ISMIR), pages 486–493.
Böck, S., Korzeniowski, F., Schlüter, J., Krebs, F., and Widmer, G. (2016a). Madmom: A new Python audio and music signal processing library. In Proceedings of the 24th ACM International Conference on Multimedia, pages 1174–1178. DOI: 10.1145/2964284.2973795
Böck, S., Krebs, F., and Widmer, G. (2016b). Joint beat and downbeat tracking with recurrent neural networks. In International Society for Music Information Retrieval Conference (ISMIR), pages 255–261.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. (2009). Introduction to Algorithms. MIT Press, 3rd edition.
de Berardinis, J., Vamvakaris, M., Cangelosi, A., and Coutinho, E. (2020). Unveiling the hierarchical structure of music by multi-resolution community detection. Transactions of the International Society for Music Information Retrieval, 3(1):82–97. DOI: 10.5334/tismir.41
Foote, J. (2000). Automatic audio segmentation using a measure of audio novelty. In IEEE International Conference on Multimedia and Expo, pages 452–455. DOI: 10.1109/ICME.2000.869637
Fuentes, M., McFee, B., Crayencour, H. C., Essid, S., and Bello, J. P. (2019). A music structure informed downbeat tracking system using skip-chain conditional random fields and deep learning. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 481–485. DOI: 10.1109/ICASSP.2019.8682870
Goto, M., Hashiguchi, H., Nishimura, T., and Oka, R. (2002). RWC Music Database: Popular, Classical and Jazz Music Databases. In International Conference on Music Information Retrieval (ISMIR), pages 287–288.
Grill, T. and Schlüter, J. (2015). Music boundary detection using neural networks on combined features and two-level annotations. In International Society for Music Information Retrieval Conference (ISMIR), pages 531–537. DOI: 10.1109/EUSIPCO.2015.7362593
Hung, Y.-N., Wang, J.-C., Song, X., Lu, W.-T., and Won, M. (2022). Modeling beats and downbeats with a time-frequency transformer. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 401–405. DOI: 10.1109/ICASSP43922.2022.9747048
Jensen, K. (2006). Multiple scale music segmentation using rhythm, timbre, and harmony. EURASIP Journal on Advances in Signal Processing, 2007:1–11. DOI: 10.1155/2007/73205
Marmoret, A., Cohen, J., and Bimbot, F. (2022a). as_seg: Module for computing and segmenting autosimilarity matrices. https://gitlab.inria.fr/amarmore/autosimilarity_segmentation.
Marmoret, A., Cohen, J. E., Bertin, N., and Bimbot, F. (2020). Uncovering audio patterns in music with nonnegative Tucker decomposition for structural segmentation. In International Society for Music Information Retrieval Conference (ISMIR), pages 788–794.
Marmoret, A., Cohen, J. E., and Bimbot, F. (2022b). Barwise compression schemes for audio-based music structure analysis. In 19th Sound and Music Computing Conference (SMC 2022).
Mauch, M., Noland, K. C., and Dixon, S. (2009). Using musical structure to enhance automatic chord transcription. In International Society for Music Information Retrieval Conference (ISMIR), pages 231–236.
McCallum, M. C. (2019). Unsupervised learning of deep features for music segmentation. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 346–350. DOI: 10.1109/ICASSP.2019.8683407
McFee, B. and Ellis, D. (2014a). Analyzing song structure with spectral clustering. In International Society for Music Information Retrieval Conference (ISMIR), pages 405–410.
McFee, B. and Ellis, D. P. (2014b). Learning to segment songs with ordinal linear discriminant analysis. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5197–5201. DOI: 10.1109/ICASSP.2014.6854594
McFee, B., Nieto, O., Farbood, M. M., and Bello, J. P. (2017). Evaluating hierarchical structure in music annotations. Frontiers in Psychology, 8:1337. DOI: 10.3389/fpsyg.2017.01337
Nieto, O. and Bello, J. P. (2016). Systematic exploration of computational music structure research. In International Society for Music Information Retrieval Conference (ISMIR), pages 547–553.
Nieto, O., Mysore, G. J., Wang, C.-I., Smith, J. B., Schlüter, J., Grill, T., and McFee, B. (2020). Audio-based music structure analysis: Current trends, open challenges, and applications. Transactions of the International Society for Music Information Retrieval, 3(1). DOI: 10.5334/tismir.78
Ong, B. S. and Herrera, P. (2005). Semantic segmentation of music audio. In Proceedings of the International Computer Music Conference, page 61.
Oyama, T., Ishizuka, R., and Yoshii, K. (2021). Phase-aware joint beat and downbeat estimation based on periodicity of metrical structure. In International Society for Music Information Retrieval Conference (ISMIR), pages 493–499.
Paulus, J., Müller, M., and Klapuri, A. (2010). State of the art report: Audio-based music structure analysis. In International Society for Music Information Retrieval Conference (ISMIR), pages 625–636.
Raffel, C., McFee, B., Humphrey, E. J., Salamon, J., Nieto, O., Liang, D., and Ellis, D. P. W. (2014). mir_eval: A transparent implementation of common MIR metrics. In International Society for Music Information Retrieval Conference (ISMIR), pages 367–372.
Salamon, J., Nieto, O., and Bryan, N. J. (2021). Deep embeddings and section fusion improve music segmentation. IEEE Signal Processing Letters, 24(3):279–283. DOI: 10.1109/LSP.2017.2657381
Sargent, G., Bimbot, F., and Vincent, E. (2016). Estimating the structural segmentation of popular music pieces under regularity constraints. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(2):344–358. DOI: 10.1109/TASLP.2016.2635031
Serrà, J., Müller, M., Grosche, P., and Arcos, J. L. (2014). Unsupervised music structure annotation by time series structure features and segment similarity. IEEE Transactions on Multimedia, 16(5):1229–1240. DOI: 10.1109/TMM.2014.2310701
Shiu, Y., Jeong, H., and Kuo, C.-C. J. (2006). Similarity matrix processing for music structure analysis. In Proceedings of the 1st ACM Workshop on Audio and Music Computing Multimedia, pages 69–76. DOI: 10.1145/1178723.1178734
Smith, J. B., Burgoyne, J. A., Fujinaga, I., De Roure, D., and Downie, J. S. (2011). Design and creation of a large-scale database of structural annotations. In International Society for Music Information Retrieval Conference (ISMIR), pages 555–560.
Turnbull, D., Lanckriet, G. R., Pampalk, E., and Goto, M. (2007). A supervised approach for detecting boundaries in music using difference features and boosting. In International Conference on Music Information Retrieval (ISMIR), pages 51–54.
Ullrich, K., Schlüter, J., and Grill, T. (2014). Boundary detection in music structure analysis using convolutional neural networks. In International Society for Music Information Retrieval Conference (ISMIR), pages 417–422.
Wang, J.-C., Smith, J. B., Lu, W.-T., and Song, X. (2021). Supervised metric learning for music structure features. In International Society for Music Information Retrieval Conference (ISMIR), pages 730–737.


DOI: https://doi.org/10.5334/tismir.167 | Journal eISSN: 2514-3298


Language: English

Submitted on: Mar 30, 2023

Accepted on: Nov 2, 2023

Published on: Nov 30, 2023

Published by: Ubiquity Press

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: Music Structure Analysis, Audio Signals, Barwise Music Processing, Self-Similarity Matrix Segmentation

© 2023 Axel Marmoret, Jérémy E. Cohen, Frédéric Bimbot, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.
