Audio and Music Analysis on the Web using Essentia.js

Albin Correya; Jorge Marcos-Fernández; Luis Joglar-Ongay; Pablo Alonso-Jiménez; Xavier Serra; Dmitry Bogdanov

doi:10.5334/tismir.111

Transactions of the International Society for Music Information Retrieval

Volume 4 (2021): Issue 1

Open Access

Nov 2021

Adenot, P. and Choi, H. (2021). Web audio API, W3C candidate recommendation snapshot. Retrieved March 31, 2021, from https://www.w3.org/TR/webaudio.
Alonso-Jiménez, P., Bogdanov, D., Pons, J., and Serra, X. (2020a). TensorFlow audio models in Essentia. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020). DOI: 10.1109/ICASSP40776.2020.9054688
Alonso-Jiménez, P., Bogdanov, D., and Serra, X. (2020b). Deep embeddings with Essentia models. In International Society for Music Information Retrieval Conference (ISMIR 2020) Late Breaking Demo.
Alonso-Jiménez, P., Joglar-Ongay, L., Serra, X., and Bogdanov, D. (2019). Automatic detection of audio problems for quality control in digital music distribution. In Audio Engineering Society Convention 146.
Austin, C. (2014). CppCon 2014: Embind and Emscripten: Blending C++11, JavaScript, and the Web Browser. Retrieved March 31, 2021, from https://www.youtube.com/watch?v=Dsgws5zJiwk.
Bernardo, F., Kiefer, C., and Magnusson, T. (2019). An AudioWorklet-based signal engine for a live coding language ecosystem. In Web Audio Conference (WAC 2019), pages 77–82.
Bertin-Mahieux, T., Ellis, D. P. W., Whitman, B., and Lamere, P. (2011). The Million Song Dataset. In International Society for Music Information Retrieval Conference (ISMIR 2011).
Bierman, G., Abadi, M., and Torgersen, M. (2014). Understanding TypeScript. In European Conference on Object-Oriented Programming (ECOOP 2014). DOI: 10.1007/978-3-662-44202-9_11
Böck, S., Korzeniowski, F., Schlüter, J., Krebs, F., and Widmer, G. (2016). madmom: A new Python audio and music signal processing library. In ACM International Conference on Multimedia (MM 2016). DOI: 10.1145/2964284.2973795
Bogdanov, D., Wack, N., Gómez, E., Gulati, S., Herrera, P., Mayor, O., Roma, G., Salamon, J., Zapata, J., and Serra, X. (2013). Essentia: An audio analysis library for music information retrieval. In International Society for Music Information Retrieval Conference (ISMIR 2013). DOI: 10.1145/2502081.2502229
Brossier, P. M. (2006). The aubio library at MIREX 2006. In Music Information Retrieval Evaluation Exchange (MIREX 2006).
Cartwright, M., Seals, A., Salamon, J., Williams, A., Mikloska, S., MacConnell, D., Law, E., Bello, J. P., and Nov, O. (2017). Seeing sound: Investigating the effects of visualizations and complexity on crowdsourced audio annotations. Proceedings of the ACM on Human-Computer Interaction, 1(CSCW): 1–21. DOI: 10.1145/3134664
Choi, H. (2018). AudioWorklet: The future of web audio. In International Computer Music Conference Proceedings (ICMC 2018).
Collins, N. and Knotts, S. (2019). A JavaScript musical machine listening library. In International Computer Music Conference (ICMC 2019).
Correya, A., Alonso-Jiménez, P., Marcos-Fernández, J., Serra, X., and Bogdanov, D. (2021). Essentia TensorFlow models for audio and music processing on the web. In Web Audio Conference (WAC 2021).
Correya, A., Bogdanov, D., Joglar-Ongay, L., and Serra, X. (2020). Essentia.js: A JavaScript library for music and audio analysis on the web. In International Society for Music Information Retrieval Conference (ISMIR 2020).
Fiala, J., Segal, N., and Rawlinson, H. A. (2015). Meyda: an audio feature extraction library for the Web Audio API. In Web Audio Conference (WAC 2015).
Fonseca, E., Pons Puig, J., Favory, X., Font Corbera, F., Bogdanov, D., Ferraro, A., Oramas, S., Porter, A., and Serra, X. (2017). Freesound Datasets: A platform for the creation of open audio datasets. In International Society for Music Information Retrieval Conference (ISMIR 2017).
Font, F., Roma, G., and Serra, X. (2013). Freesound technical demo. In ACM International Conference on Multimedia (MM 2013). DOI: 10.1145/2502081.2502245
Haas, A., Rossberg, A., Schuff, D. L., Titzer, B. L., Holman, M., Gohman, D., Wagner, L., Zakai, A., and Bastien, J. (2017). Bringing the web up to speed with WebAssembly. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). DOI: 10.1145/3062341.3062363
Herrera, D., Chen, H., Lavoie, E., and Hendren, L. (2018). Numerical computing on the web: Benchmarking for the future. In ACM SIGPLAN International Symposium on Dynamic Languages (DLS 2018). DOI: 10.1145/3276945.3276968
ITP NYU (2018). ml5.js: Friendly machine learning for the Web. Retrieved March 31, 2021, from https://ml5js.org.
Jillings, N., Bullock, J., and Stables, R. (2016). JSXtract: A realtime audio feature extraction library for the Web. In International Society for Music Information Retrieval Conference (ISMIR 2016) Late Breaking Demo.
Jillings, N., Moffat, D., De Man, B., and Reiss, J. D. (2015). Web Audio Evaluation Tool: A browser-based listening test environment. In Sound and Music Computing Conference (SMC 2015).
Joglar-Ongay, L. (2020a). Applications of Essentia on the web. Master’s thesis, Universitat Pompeu Fabra. DOI: 10.5281/zenodo.4091073
Joglar-Ongay, L. (2020b). Sónar+D CCCB 2020 Workshop: How to automatically detect quality problems in your music collection. Retrieved April 15, 2021, from https://www.youtube.com/watch?v=NR9-hVLs4b8.
Kleimola, J. and Larkin, O. (2015). Web audio modules. In Sound and Music Computing Conference (SMC 2015).
Law, E., West, K., Mandel, M. I., Bay, M., and Downie, J. S. (2009). Evaluation of algorithms using games: The case of music tagging. In International Society for Music Information Retrieval Conference (ISMIR 2009).
Lazzarini, V., Costello, E., Yi, S., and Fitch, J. (2014). Csound on the Web. In Linux Audio Conference (LAC 2014).
Lazzarini, V., Costello, E., Yi, S., and Fitch, J. (2015). Extending Csound to the Web. In Web Audio Conference (WAC 2015).
Letz, S., Orlarey, Y., and Fober, D. (2017). Compiling Faust audio DSP code to WebAssembly. In Web Audio Conference (WAC 2017).
Mahadevan, A., Freeman, J., Magerko, B., and Martinez, J. C. (2015). EarSketch: Teaching computational music remixing in an online web audio based learning environment. In Web Audio Conference (WAC 2015). DOI: 10.1145/2676723.2691869
Mathieu, B., Essid, S., Fillon, T., Prado, J., and Richard, G. (2010). YAAFE, an easy to use and efficient audio feature extraction software. In International Society for Music Information Retrieval Conference (ISMIR 2010).
Matuszewski, B. and Schnell, N. (2017). LFO – a graph-based modular approach to the processing of data streams. In Web Audio Conference (WAC 2017).
McFee, B., Raffel, C., Liang, D., Ellis, D. P., McVicar, M., Battenberg, E., and Nieto, O. (2015). librosa: Audio and music signal analysis in Python. In Python in Science Conference (SciPy 2015). DOI: 10.25080/Majora-7b98e3ed-003
Moffat, D., Ronan, D., and Reiss, J. D. (2015). An evaluation of audio feature extraction toolboxes. In International Conference on Digital Audio Effects (DAFx 2015).
MTG UPF (2021). MusicCritic: An automatic assessment system for musical exercises. Retrieved March 31, 2021, from https://musiccritic.upf.edu.
Ning, E. (2020). ONNX.js – A JavaScript library to run ONNX models in browsers and Node.js. Retrieved March 31, 2021, from https://www.w3.org/2020/06/machine-learning-workshop/talks/onnx_js_a_javascript_library_to_run_onnx_models_in_browsers_and_node_js.html.
Pons, J. and Serra, X. (2019). musicnn: Pre-trained convolutional neural networks for music audio tagging. In International Society for Music Information Retrieval Conference (ISMIR 2019) Late Breaking Demo.
Porter, A., Bogdanov, D., Kaye, R., Tsukanov, R., and Serra, X. (2015). AcousticBrainz: A community platform for gathering music information obtained from audio. In International Society for Music Information Retrieval Conference (ISMIR 2015).
Roberts, A., Hawthorne, C., and Simon, I. (2018). Magenta.js: A JavaScript API for augmenting creativity with deep learning. In Joint Workshop on Machine Learning for Music (ICML).
Schoeffler, M., Stöter, F.-R., Edler, B., and Herre, J. (2015). Towards the next generation of web-based experiments: A case study assessing basic audio quality following the ITU-R recommendation BS.1534 (MUSHRA). In Web Audio Conference (WAC 2015).
Schreiber, H. and Müller, M. (2019). Musical tempo and key estimation using convolutional neural networks with directional filters. In Sound and Music Computing Conference (SMC 2019).
Smilkov, D., Thorat, N., Assogba, Y., Yuan, A., Kreeger, N., Yu, P., Zhang, K., Cai, S., Nielsen, E., Soergel, D., Bileschi, S., Terry, M., Nicholson, C., Gupta, S. N., Sirajuddin, S., Sculley, D., Monga, R., Corrado, G., Viégas, F. B., and Wattenberg, M. (2019). TensorFlow.js: Machine learning for the web and beyond. In Conference on Systems and Machine Learning (SysML 2019).
Stack Overflow (2021). Stack Overflow Annual Developer Survey. Retrieved March 31, 2021, from https://insights.stackoverflow.com/survey.
Thompson, L., Cannam, C., and Sandler, M. (2017). Piper: Audio feature extraction in browser and mobile applications. In Web Audio Conference (WAC 2017).
W3C TAG (2013). Web Audio API Design Review. Retrieved March 31, 2021, from https://github.com/w3ctag/design-reviews/blob/master/2013/07/WebAudio.md.
West, K., Kumar, A., Shirk, A., Zhu, G., Downie, J. S., Ehmann, A., and Bay, M. (2010). The networked environment for music analysis (NEMA). In IEEE World Congress on Services (SERVICES 2010). DOI: 10.1109/SERVICES.2010.113
Zakai, A. (2011). Emscripten: An LLVM-to-JavaScript compiler. In ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2011). DOI: 10.1145/2048147.2048224

DOI: https://doi.org/10.5334/tismir.111 | Journal eISSN: 2514-3298

Language: English

Submitted on: Apr 24, 2021

Accepted on: Sep 2, 2021

Published on: Nov 22, 2021

Published by: Ubiquity Press

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: software, web audio, audio analysis, music signal processing, music audio classification, deep learning

© 2021 Albin Correya, Jorge Marcos-Fernández, Luis Joglar-Ongay, Pablo Alonso-Jiménez, Xavier Serra, Dmitry Bogdanov, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.
