
Barwise Music Structure Analysis with the Correlation Block-Matching Segmentation Algorithm
By: Axel Marmoret, Jérémy E. Cohen and Frédéric Bimbot
DOI: https://doi.org/10.5334/tismir.167 | Journal eISSN: 2514-3298
Language: English
Submitted on: Mar 30, 2023
Accepted on: Nov 2, 2023
Published on: Nov 30, 2023
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year
© 2023 Axel Marmoret, Jérémy E. Cohen, Frédéric Bimbot, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.