MGPHot: A Dataset of Musicological Annotations for Popular Music (1958–2022)

Open Access | May 2025

Figures & Tables

Table 1

Some public music datasets used in MIR, with song counts, attributes, categories, and curation details.

Dataset | #Songs | #Attributes | #Categories | Curated by experts | Attribute scores | Chart data
GTZAN (Tzanetakis and Cook, 2002) | 1,000 | 10 | 1 | No | No | No
Ballroom (Gouyon et al., 2006) | 698 | 8 | 1 | No | No | No
MagnaTagATune (Law et al., 2009) | 5,405 | 188 | 1 | No | No | No
MGPHot | 21,320 | 58 | 7 | Yes | Yes | Yes
MTG‑Jamendo (Bogdanov et al., 2019) | 55,701 | 195 | 3 | No | No | No
FMA (Defferrard et al., 2017a) | 106,574 | 163 | 1 | No | No | No
MuMu (Oramas et al., 2017) | 147,295 | 250 | 1 | No | No | No
MSD500 (Won et al., 2021) | 158,323 | 500 | 7 | No | No | No
MSD‑last.fm (Bertin‑Mahieux et al., 2011) | 505,216 | 522,366 | 1 | No | No | No
Audioset (Gemmeke et al., 2017) | 2,084,320 | 527 | 7 | No | No | No

Figure 1

Number of tracks per year in the Billboard Hot 100 charts and the subset successfully mapped. The X‑axis represents the chart year, and the Y‑axis indicates the number of tracks.

Table 2

List of MGPHot attributes.

Category | Name | Description
Vocals | Vocal Register | Describes the vocal range of lead vocal performance on a scale from low to high.
Vocals | Vocal Timbre Thin to Full | Expresses the timbre from thin and wispy to full and resonant.
Vocals | Vocal Breathiness | Indicates breathiness in the vocal delivery, characterized by airiness in the voice.
Vocals | Vocal Smoothness | Indicates smoothness, reflecting the absence of roughness or raspiness.
Vocals | Vocal Grittiness | Reflects the presence of roughness or raspiness in vocal delivery.
Vocals | Vocal Nasality | Measures nasality, the pinched or ‘plugged‑up’ quality in vocal delivery.
Vocals | Vocal Accompaniment | Indicates the importance of non‑lead vocal accompaniment in a track.
Harmony | Minor/Major Key Tonality | Indicates whether the tonality is minor, major, or ambiguous.
Harmony | Harmonic Sophistication | Captures the complexity of harmony, from simple to complex chromatic notes.
Rhythm | Tempo | Describes the song’s tempo and how other factors affect the perceived speed.
Rhythm | Cut Time Feel | Reflects the presence of a ‘cut time’ feel, where the rhythm is felt in half‑time.
Rhythm | Triple Meter | Indicates the presence of a triple meter, such as 3/4 time.
Rhythm | Compound Meter | Indicates the presence of compound meter, combining triple and duple rhythms.
Rhythm | Odd Meter | Reflects the presence of odd meters, such as 5 or 7 beats per measure.
Rhythm | Swing Feel | Measures swing feel, where the first 8th note is longer than the second.
Rhythm | Shuffle Feel | Similar to swing feel, but with more pronounced articulation of each note.
Rhythm | Syncopation Low to High | Indicates syncopation, where rhythm emphasizes offbeats or anticipations.
Rhythm | Backbeat | Measures the dominance of a backbeat rhythm, with emphasis on beats 2 and 4.
Rhythm | Danceability | Rates how suitable the song is for dancing, from low to high.
Instrumentation | Drum Set | Indicates the presence and dominance of a drum set in the song.
Instrumentation | Drum Aggressiveness | Reflects the aggressiveness of the drum set performance.
Instrumentation | Synthetic Drums | Indicates the presence of synthetic drums, often programmed.
Instrumentation | Percussion | Reflects the dominance of percussion in the song, excluding drums.
Instrumentation | Electric Guitar | Indicates the presence and dominance of electric guitar(s).
Instrumentation | Electric Guitar Distortion | Measures the degree of guitar distortion, from clean to ‘dirty’.
Instrumentation | Acoustic Guitar | Indicates the presence of acoustic guitar(s).
Instrumentation | String Ensemble | Reflects the presence and dominance of a string ensemble in the song.
Instrumentation | Horn Ensemble | Indicates the presence of a horn ensemble, from small to large.
Instrumentation | Piano | Indicates the presence of a piano in the song.
Instrumentation | Organ | Reflects the presence of an organ in the instrumentation.
Instrumentation | Rhodes | Indicates the presence of a Fender Rhodes or other electric piano.
Instrumentation | Synthesizer | Reflects the presence of synthesizers in the instrumentation.
Instrumentation | Synth Timbre | Describes synthesizer timbres, from ambient to robotic or industrial.
Instrumentation | Bass Guitar | Reflects the presence and dominance of a bass guitar.
Instrumentation | Reed Instrument | Reflects the presence of reed instruments like saxophones or clarinets.
Lyrics | Angry Lyrics | Measures the presence and dominance of angry lyrics in the song.
Lyrics | Sad Lyrics | Measures the presence and dominance of sad lyrics.
Lyrics | Happy/Joyful Lyrics | Reflects the presence of happy or joyful lyrics.
Lyrics | Humorous Lyrics | Indicates the presence of humorous or funny lyrics.
Lyrics | Love/Romance Lyrics | Measures the presence of romantic or love‑themed lyrics.
Lyrics | Social/Political Lyrics | Indicates the presence of lyrics about social or political issues.
Lyrics | Abstract Lyrics | Measures the presence of abstract or whimsical lyrics.
Lyrics | Explicit Lyrics | Measures the explicitness of lyrics, from clean to very explicit.
Sonority | Live Recording | Indicates whether the song was recorded live or in a studio.
Sonority | Audio Production | Measures the quality of the audio production, from poor to excellent.
Sonority | Aural Intensity | Measures the song’s overall loudness or softness.
Sonority | Acoustic Sonority | Indicates the presence of acoustic instruments or voices.
Sonority | Electric Sonority | Measures the presence of electric instruments.
Sonority | Synthetic Sonority | Reflects the presence of synthetic instruments like synthesizers.
Composition | Focus on Lead Vocal | Reflects the importance of lead vocals to the overall track.
Composition | Focus on Lyrics | Measures the importance of lyrics in the overall track.
Composition | Focus on Melody | Indicates the importance of melody in the track.
Composition | Focus on Vocal Accompaniment | Reflects the importance of backing vocals in the track.
Composition | Focus on Rhythmic Groove | Indicates how important the rhythmic groove is to the track.
Composition | Focus on Musical Arrangements | Reflects the importance of the arrangement and orchestration.
Composition | Focus on Form | Measures the importance of the song’s form or structure.
Composition | Focus on Riffs | Reflects the importance of instrumental riffs in the track.
Composition | Focus on Performance | Measures the importance of instrumental performance in the track.

Figure 2

Trend curve of all the attributes by category across the years.

Figure 3

Similarity matrix of MGPHot year embeddings. Darker means more similar, lighter means less similar.
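
As a rough illustration of how a year-by-year similarity matrix of this kind could be built, the sketch below computes pairwise cosine similarity between per-year vectors. The caption does not state which similarity measure or embedding was used, so cosine similarity, the toy `year_embeddings` values, and the function names are all assumptions made for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Square matrix whose entry [i][j] compares embedding i with embedding j."""
    return [[cosine_similarity(u, v) for v in embeddings] for u in embeddings]


# Hypothetical toy data: three "years", each a vector of four attribute scores.
# These numbers are invented and do not come from MGPHot.
year_embeddings = [
    [0.9, 0.1, 0.4, 0.2],
    [0.8, 0.2, 0.5, 0.2],
    [0.1, 0.9, 0.2, 0.7],
]

S = similarity_matrix(year_embeddings)
for row in S:
    print(" ".join(f"{value:.3f}" for value in row))
```

In this toy example the first two years come out more similar to each other than either is to the third, which in a plot like Figure 3 would appear as a darker cell.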

Figure 4

Foote novelty metric for the 58 attributes.
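
For readers unfamiliar with the Foote novelty metric, the following minimal sketch slides a checkerboard kernel along the main diagonal of a self-similarity matrix and records the response at each position. It uses a plain (untapered) checkerboard with zero padding at the borders; the kernel size, any tapering, and the padding actually used for the figure are not given in the caption, so `half_width` and the toy matrix below are assumptions.

```python
def foote_novelty(S, half_width):
    """Checkerboard-kernel novelty along the diagonal of a square matrix S.

    Same-side quadrants (past-past, future-future) count positively and
    cross quadrants count negatively. Cells outside S are skipped, which is
    equivalent to zero padding and produces edge effects near the borders.
    """
    n = len(S)
    novelty = []
    for i in range(n):
        total = 0.0
        for a in range(-half_width, half_width):
            for b in range(-half_width, half_width):
                r, c = i + a, i + b
                if 0 <= r < n and 0 <= c < n:
                    sign = 1.0 if (a < 0) == (b < 0) else -1.0
                    total += sign * S[r][c]
        novelty.append(total)
    return novelty


# Hypothetical toy matrix: two homogeneous blocks of four "years" each,
# fully similar within a block and dissimilar across blocks.
n = 8
toy_S = [[1.0 if (i < 4) == (j < 4) else 0.0 for j in range(n)] for i in range(n)]

novelty = foote_novelty(toy_S, half_width=2)
peak_index = novelty.index(max(novelty))
print(novelty, peak_index)
```

On this toy input the novelty curve peaks at index 4, the first position of the second block, which is the kind of boundary such a curve is meant to expose.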

Figure 5

Foote novelty metric for the attributes in each category.

Figure 6

Pearson correlation between category distances according to the Mantel test, with p < 0.05 in all cases.
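
A Mantel test of the kind named in this caption can be sketched as follows: correlate the upper triangles of two distance matrices, then estimate a p-value by repeatedly permuting the rows and columns of one matrix together. This is a generic, two-sided permutation version written with the standard library only. The number of permutations, the seed, the two-sided choice, and the toy matrices are assumptions and may differ from the procedure behind the figure.

```python
import random


def upper_triangle(D):
    """Entries above the main diagonal of a square matrix, row by row."""
    n = len(D)
    return [D[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def mantel_test(D1, D2, permutations=999, seed=0):
    """Return (observed r, two-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(D1)
    x = upper_triangle(D1)
    r_obs = pearson(x, upper_triangle(D2))
    hits = 0
    idx = list(range(n))
    for _ in range(permutations):
        rng.shuffle(idx)
        # Permute rows and columns of D2 with the same index order.
        permuted = [[D2[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if abs(pearson(x, upper_triangle(permuted))) >= abs(r_obs):
            hits += 1
    # The +1 terms count the observed arrangement as one permutation.
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy distance matrices over eight items (not MGPHot data).
n = 8
D_a = [[float(abs(i - j)) for j in range(n)] for i in range(n)]
D_b = [[float((i - j) ** 2) for j in range(n)] for i in range(n)]

r, p = mantel_test(D_a, D_b)
print(f"r = {r:.3f}, p = {p:.3f}")
```

Permuting whole rows and columns, rather than shuffling individual cells, keeps each permuted matrix a valid distance matrix, which is the reason a Mantel test is used instead of an ordinary correlation test on the flattened entries.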

DOI: https://doi.org/10.5334/tismir.236 | Journal eISSN: 2514-3298
Language: English
Submitted on: Nov 6, 2024
Accepted on: Mar 22, 2025
Published on: May 28, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Sergio Oramas, Fabien Gouyon, Steve Hogan, Camilo Landau, Andreas Ehmann, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.