
The Story Behind the RWC Music Database: An Interview with Masataka Goto

Open Access | Jun 2025

References

  1. Bittner, R. M., Salamon, J., Tierney, M., Mauch, M., Cannam, C., and Bello, J. P. (2014). MedleyDB: A multitrack dataset for annotation‑intensive MIR research. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Taipei, Taiwan (pp. 155–160).
  2. Böck, S., Krebs, F., and Widmer, G. (2016). Joint beat and downbeat tracking with recurrent neural networks. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), New York City, New York, USA (pp. 255–261).
  3. Boulanger, R., Mathews, M., Vercoe, B., and Dannenberg, R. (1990). Conducting the MIDI orchestra, Part 1: Interviews with Max Mathews, Barry Vercoe, and Roger Dannenberg. Computer Music Journal, 14(2), 34–46.
  4. Cannam, C., Landone, C., and Sandler, M. B. (2010). Sonic Visualiser: An open source application for viewing, analysing, and annotating music audio files. In Proceedings of the International Conference on Multimedia, Florence, Italy (pp. 1467–1468).
  5. Cheng, T., and Goto, M. (2023). Transformer‑based beat tracking with a low‑resolution encoder and a high‑resolution decoder. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Milan, Italy (pp. 466–473).
  6. Cho, T., and Bello, J. P. (2011). A feature smoothing method for chord recognition using recurrence plots. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Miami, Florida, USA (pp. 651–656).
  7. Davies, M. E. P., Hamel, P., Yoshii, K., and Goto, M. (2014). AutoMashUpper: Automatic creation of multi‑song music mashups. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(12), 1726–1737.
  8. Dessein, A., Cont, A., and Lemaitre, G. (2010). Real‑time polyphonic music transcription with non‑negative matrix factorization and beta‑divergence. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Utrecht, The Netherlands (pp. 489–494).
  9. Ewert, S., Müller, M., and Grosche, P. (2009). High resolution audio synchronization using chroma onset features. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Taipei, Taiwan (pp. 1869–1872).
  10. Goto, M. (2000). A robust predominant‑F0 estimation method for real‑time detection of melody and bass lines in CD recordings. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (Vol. 2, pp. 757–760).
  11. Goto, M. (2003a). A chorus‑section detecting method for musical audio signals. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hong Kong, China (pp. 437–440).
  12. Goto, M. (2003b). Music scene description project: Toward audio‑based real‑time music understanding. In Proceedings of the International Conference on Music Information Retrieval (ISMIR) (pp. 231–232).
  13. Goto, M. (2003c). SmartMusicKIOSK: Music listening station with chorus‑search function. In Proceedings of the ACM Symposium on User Interface Software and Technology (UIST) (pp. 31–40).
  14. Goto, M. (2004). Development of the RWC music database. In Proceedings of the International Congress on Acoustics (ICA) (pp. 553–556).
  15. Goto, M. (2006a). AIST Annotation for the RWC Music Database. In Proceedings of the International Conference on Music Information Retrieval (ISMIR) (pp. 359–360).
  16. Goto, M. (2006b). A chorus section detection method for musical audio signals and its application to a music listening station. IEEE Transactions on Audio, Speech, and Language Processing, 14(5), 1783–1794.
  17. Goto, M., and Dannenberg, R. B. (2019). Music interfaces based on automatic music signal analysis: New ways to create and listen to music. IEEE Signal Processing Magazine, 36(1), 74–81.
  18. Goto, M., Hashiguchi, H., Nishimura, T., and Oka, R. (2002). RWC Music Database: Popular, classical and jazz music databases. In Proceedings of the International Conference on Music Information Retrieval (ISMIR), Paris, France (pp. 287–288).
  19. Goto, M., Hashiguchi, H., Nishimura, T., and Oka, R. (2003). RWC Music Database: Music genre database and musical instrument sound database. In Proceedings of the International Conference on Music Information Retrieval (ISMIR), Baltimore, Maryland, USA (pp. 229–230).
  20. Goto, M., and Muraoka, Y. (1997). Issues in evaluating beat tracking systems. In Working Notes of the IJCAI‑97 Workshop on Issues in AI and Music ‑ Evaluation and Assessment (pp. 9–16).
  21. Goto, M., Yoshii, K., Fujihara, H., Mauch, M., and Nakano, T. (2011). Songle: A web service for active music listening improved by user contributions. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Miami, Florida, USA (pp. 311–316).
  22. Grosche, P., and Müller, M. (2011). Extracting predominant local pulse information from music recordings. IEEE Transactions on Audio, Speech, and Language Processing, 19(6), 1688–1701.
  23. Harte, C., and Sandler, M. B. (2005). Automatic chord identification using a quantised chromagram. In Proceedings of the Audio Engineering Society Convention, Barcelona, Spain.
  24. Itoyama, K., Kitahara, T., Komatani, K., Ogata, T., and Okuno, H. G. (2006). Automatic feature weighting in automatic transcription of specified part in polyphonic music. In Proceedings of the International Conference on Music Information Retrieval (ISMIR), Victoria, Canada (pp. 172–175).
  25. Joder, C., Essid, S., and Richard, G. (2011). A conditional random field framework for robust and scalable audio‑to‑score matching. IEEE Transactions on Audio, Speech, and Language Processing, 19(8), 2385–2397.
  26. Korzeniowski, F., and Widmer, G. (2016). Feature learning for chord recognition: The deep chroma extractor. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), New York City, New York, USA (pp. 37–43).
  27. Mauch, M., and Dixon, S. (2014). pYIN: A fundamental frequency estimator using probabilistic threshold distributions. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy (pp. 659–663).
  28. McFee, B., Kim, J. W., Cartwright, M., Salamon, J., Bittner, R. M., and Bello, J. P. (2019). Open‑source practices for music signal processing research: Recommendations for transparent, sustainable, and reproducible audio research. IEEE Signal Processing Magazine, 36(1), 128–137.
  29. Nakano, T., and Goto, M. (2009). VocaListener: A singing‑to‑singing synthesis system based on iterative parameter estimation. In Proceedings of the 6th Sound and Music Computing Conference (SMC) (pp. 343–348).
  30. Otsu, N. (1993). Real world computing program ‑ overview. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Nagoya, Japan (pp. 1065–1068).
  31. Paulus, J., and Klapuri, A. P. (2009). Music structure analysis using a probabilistic fitness measure and a greedy search algorithm. IEEE Transactions on Audio, Speech, and Language Processing, 17(6), 1159–1170.
  32. Roads, C., and Minsky, M. (1980). Interview with Marvin Minsky. Computer Music Journal, 4(3), 25–39.
  33. Ryynänen, M., and Klapuri, A. (2005). Polyphonic music transcription using note event modeling. In Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, New York, USA (pp. 319–322).
  34. Serra, X. (2014). Creating research corpora for the computational study of music: The case of the CompMusic project. In Proceedings of the AES International Conference on Semantic Audio, London, UK.
  35. Wang, J.‑C., Hung, Y.‑N., and Smith, J. B. L. (2022). To catch a chorus, verse, intro, or anything else: Analyzing a song with structural functions. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (pp. 416–420).
  36. Wu, Y., and Li, W. (2019). Automatic audio chord recognition with MIDI‑trained deep feature and BLSTM‑CRF sequence decoding model. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(2), 355–366.
DOI: https://doi.org/10.5334/tismir.261 | Journal eISSN: 2514-3298
Language: English
Submitted on: Apr 1, 2025
Accepted on: May 17, 2025
Published on: Jun 19, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Meinard Müller, Stefan Balke, Masataka Goto, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.