
GiantMIDI-Piano: A Large-Scale MIDI Dataset for Classical Piano Music
By: Qiuqiang Kong, Bochen Li, Jitong Chen and Yuxuan Wang
References
- Bainbridge, D., and Bell, T. (2001). The challenge of optical music recognition. Computers and the Humanities, 35(2):95–121. DOI: 10.1023/A:1002485918032
- Bryner, B. (2002). The Piano Roll: A Valuable Recording Medium of the Twentieth Century. PhD thesis, Department of Music, University of Utah.
- Cancino-Chacón, C. E., Grachten, M., Goebl, W., and Widmer, G. (2018). Computational models of expressive music performance: A comprehensive and critical review. Frontiers in Digital Humanities, 5:25. DOI: 10.3389/fdigh.2018.00025
- Casey, M. A., Veltkamp, R., Goto, M., Leman, M., Rhodes, C., and Slaney, M. (2008). Content-based music information retrieval: Current directions and future challenges. Proceedings of the IEEE, 96(4):668–696. DOI: 10.1109/JPROC.2008.916370
- Choi, K., Fazekas, G., Cho, K., and Sandler, M. (2017). A tutorial on deep learning for music information retrieval. arXiv preprint arXiv:1709.04396.
- Conklin, D., and Witten, I. H. (1995). Multiple viewpoint systems for music prediction. Journal of New Music Research, 24(1):51–73. DOI: 10.1080/09298219508570672
- Duan, Z., Pardo, B., and Zhang, C. (2010). Multiple fundamental frequency estimation by modeling spectral peaks and non-peak regions. IEEE Transactions on Audio, Speech, and Language Processing, 18(8):2121–2133. DOI: 10.1109/TASL.2010.2042119
- Emiya, V., Bertin, N., David, B., and Badeau, R. (2010). MAPS: A piano database for multipitch estimation and automatic transcription of music. Technical Report inria-00544155, INRIA.
- Forte, A. (1973). The Structure of Atonal Music. Yale University Press.
- Foscarin, F., McLeod, A., Rigaux, P., Jacquemard, F., and Sakai, M. (2020). ASAP: A dataset of aligned scores and performances for piano transcription. In International Society for Music Information Retrieval (ISMIR) Conference.
- Glorot, X., Bordes, A., and Bengio, Y. (2011). Deep sparse rectifier neural networks. In Proceedings of the Conference on Artificial Intelligence and Statistics, pages 315–323.
- Good, M. (2001). MusicXML: An internet-friendly format for sheet music. In XML Conference and Expo, pages 03–04.
- Hashida, M., Matsui, T., and Katayose, H. (2008). A new music database describing deviation information of performance expressions. In International Conference on Music Information Retrieval (ISMIR), pages 489–494.
- Hawthorne, C., Elsen, E., Song, J., Roberts, A., Simon, I., Raffel, C., Engel, J., Oore, S., and Eck, D. (2018). Onsets and frames: Dual-objective piano transcription. In International Society for Music Information Retrieval (ISMIR) Conference.
- Hawthorne, C., Stasyuk, A., Roberts, A., Simon, I., Huang, C. A., Dieleman, S., Elsen, E., Engel, J., and Eck, D. (2019). Enabling factorized piano music modeling and generation with the MAESTRO dataset. International Conference on Learning Representations (ICLR).
- Huang, C.-Z. A., Hawthorne, C., Roberts, A., Dinculescu, M., Wexler, J., Hong, L., and Howcroft, J. (2019). The Bach Doodle: Approachable music composition with machine learning at scale. In International Society for Music Information Retrieval (ISMIR) Conference.
- Huang, C.-Z. A., Vaswani, A., Uszkoreit, J., Simon, I., Hawthorne, C., Shazeer, N., Dai, A. M., Hoffman, M. D., Dinculescu, M., and Eck, D. (2018). Music Transformer: Generating music with long-term structure. In International Conference on Learning Representations (ICLR).
- Ioffe, S., and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proceedings of the International Conference on Machine Learning (ICML).
- Kim, J. W., and Bello, J. P. (2019). Adversarial learning for improved onsets and frames music transcription. In International Society for Music Information Retrieval (ISMIR) Conference.
- Kong, Q., Cao, Y., Iqbal, T., Wang, Y., Wang, W., and Plumbley, M. D. (2020). PANNs: Large-scale pretrained audio neural networks for audio pattern recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28: 2880–2894.
- Kong, Q., Li, B., Song, X., Wan, Y., and Wang, Y. (2021). High-resolution piano transcription with pedals by regressing onsets and offsets times. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29:3707–3717. DOI: 10.1109/TASLP.2021.3121991
- Kwon, T., Jeong, D., and Nam, J. (2020). Polyphonic piano transcription using autoregressive multi-state note model. In International Society for Music Information Retrieval (ISMIR) Conference.
- Li, B., Liu, X., Dinesh, K., Duan, Z., and Sharma, G. (2018). Creating a multitrack classical music performance dataset for multimodal music analysis: Challenges, insights, and applications. IEEE Transactions on Multimedia, 21(2):522–535. DOI: 10.1109/TMM.2018.2856090
- Meredith, D. (2016). Computational Music Analysis. Springer. DOI: 10.1007/978-3-319-25931-4
- Nakamura, E., Yoshii, K., and Katayose, H. (2017). Performance error detection and post-processing for fast and accurate symbolic music alignment. In International Society for Music Information Retrieval (ISMIR) Conference, pages 347–353.
- Nienhuys, H.-W., and Nieuwenhuizen, J. (2003). Lilypond, a system for automated music engraving. In Proceedings of the XIV Colloquium on Musical Informatics, volume 1, pages 167–171.
- Niwattanakul, S., Singthongchai, J., Naenudorn, E., and Wanapu, S. (2013). Using of Jaccard coefficient for keywords similarity. In Proceedings of the International MultiConference of Engineers and Computer Scientists (IMECS), pages 380–384.
- Raffel, C. (2016). Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching. PhD thesis, Columbia University.
- Rebelo, A., Fujinaga, I., Paszkiewicz, F., Marcal, A. R., Guedes, C., and Cardoso, J. S. (2012). Optical music recognition: State-of-the-art and open issues. International Journal of Multimedia Information Retrieval, 1(3): 173–190.
- Repp, B. H. (1996). The art of inaccuracy: Why pianists’ errors are difficult to hear. Music Perception, 14(2): 161–183.
- Roland, P. (2002). The music encoding initiative (MEI). In Proceedings of the First International Conference on Musical Applications Using XML, pages 55–59.
- Sapp, C. S. (2005). Online database of scores in the Humdrum file format. In International Conference on Music Information Retrieval (ISMIR), pages 664–665.
- Shi, Z., Sapp, C. S., Arul, K., McBride, J., and Smith, J. O. III. (2019). SUPRA: Digitizing the Stanford University Piano Roll Archive. In International Society for Music Information Retrieval (ISMIR) Conference, pages 517–523.
- Smith, D., and Wood, C. (1981). The ‘USI’, or Universal Synthesizer Interface. In Audio Engineering Society Convention 70.
- Volk, A., Wiering, F., and van Kranenburg, P. (2011). Unfolding the potential of computational musicology. In International Conference on Informatics and Semiotics in Organisations (ICISO), pages 137–144.
- Yang, L., Chou, S., and Yang, Y. (2017). MidiNet: A convolutional generative adversarial network for symbolic-domain music generation. In International Society for Music Information Retrieval (ISMIR) Conference, pages 324–331.
DOI: https://doi.org/10.5334/tismir.80 | Journal eISSN: 2514-3298
Language: English
Submitted on: Oct 25, 2020
Accepted on: Feb 1, 2022
Published on: May 12, 2022
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year
© 2022 Qiuqiang Kong, Bochen Li, Jitong Chen, Yuxuan Wang, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.