Improving Motif Discovery of Symbolic Polyphonic Music with Motif Note Identification

By: Jun-You Wang, Yu-Chia Kuo and Li Su
Open Access | Sep 2025

Figures & Tables

Figure 1

A visualization of the motif‑discovery task. The excerpt comes from the first four bars of Beethoven’s Sonata No. 1 in F Minor, Op. 2, No. 1. Red and blue notes represent two different motifs, while gray notes are non‑motif notes that do not belong to an occurrence of any motif.

Figure 2

An overview of the proposed motif discovery framework, which divides the motif‑discovery task into motif note identification (MNID) and repeated pattern discovery (RPD). During training (left‑hand side), pseudo‑labeling is adopted to address the low‑resource issue for MNID: the intersection of a pre‑trained melody identification (denoted as MeloID) model’s prediction and the MNID model’s prediction is treated as the pseudo‑label. During testing (right‑hand side), we first use an MNID model to identify motif notes in the input musical score. Then, we apply an RPD algorithm, such as SIATEC (Meredith et al., 2002) or CSA (Hsiao et al., 2023), to discover repeated patterns among the identified motif notes.
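The two stages described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes each model's output is a per-note boolean mask and that an RPD routine is any callable taking a list of notes; the function names `intersect_pseudo_labels` and `discover_motifs`, the `(onset, pitch)` note tuples, and the pass-through `rpd` stand-in are hypothetical and are not taken from the paper's code.

```python
from typing import Callable, List, Sequence, Tuple

Note = Tuple[float, int]  # assumed note format: (onset in beats, MIDI pitch)


def intersect_pseudo_labels(
    melody_pred: Sequence[bool], mnid_pred: Sequence[bool]
) -> List[bool]:
    """Training side: a note becomes a pseudo-labeled motif note only if
    both the MeloID prediction and the MNID prediction select it."""
    if len(melody_pred) != len(mnid_pred):
        raise ValueError("both predictions must cover the same notes")
    return [bool(a) and bool(b) for a, b in zip(melody_pred, mnid_pred)]


def discover_motifs(
    notes: Sequence[Note],
    is_motif_note: Sequence[bool],
    rpd: Callable[[List[Note]], object],
):
    """Testing side: keep only the notes flagged by MNID, then hand them
    to a repeated pattern discovery routine supplied by the caller."""
    motif_notes = [n for n, keep in zip(notes, is_motif_note) if keep]
    return rpd(motif_notes)


if __name__ == "__main__":
    notes = [(0.0, 60), (1.0, 62), (2.0, 64)]
    labels = intersect_pseudo_labels([True, True, False], [True, False, False])
    print(labels)
    # A pass-through stand-in for SIATEC or CSA, used only for illustration.
    print(discover_motifs(notes, [True, False, True], rpd=lambda ns: ns))
```

Taking the intersection rather than the union keeps only notes on which both models agree, which is the conservative choice the caption describes.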

Figure 3

Comparison of the melody identification (MeloID) and motif note identification (MNID) results on two examples from the Mozart Piano Sonata dataset. (a) and (b) show the first 10 beats of the first movement of Mozart’s Piano Sonata No. 6 in D Major, KV284. (c) and (d) show the first 10 beats of the second movement of Mozart’s Piano Sonata No. 1 in C Major, KV279.

Table 1

Hyperparameters of the CSA algorithm and the values used in the experiments on both datasets.

| Name | Definition | BPS‑Motif | JKU‑PDD |
|---|---|---|---|
| θ | The cardinal score threshold that decides whether to merge two motifs | 0.5 | 0.7 |
| m | The minimum note number for any motif occurrence | 4 | 4 |
| θ_o | The onset tolerance for matching two vectors (between two note pairs) | 0 | 0 |
| θ_p | The pitch tolerance for matching two vectors (between two note pairs) | 1 | 3 |
| γ_o | The inter‑onset interval threshold for the compactness condition | 2 | 2 |
| γ_p | The inter‑pitch interval threshold for the compactness condition | | |
| δ | The maximum duration of motifs/patterns | 1 | 2 |
Table 2

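MNID results on both the BPS‑Motif dataset and the JKU‑PDD. Each entry is given as mean ± standard deviation across five folds. Note that, strictly speaking, the results on JKU‑PDD may not reflect the actual motif note identification results, as the ground truths here are not always motif notes. Please refer to Section 4.1 for a detailed discussion.

| MNID | Setting | BPS‑Motif Accuracy | BPS‑Motif F1‑score | BPS‑Motif Precision | BPS‑Motif Recall | JKU‑PDD Accuracy | JKU‑PDD F1‑score | JKU‑PDD Precision | JKU‑PDD Recall |
|---|---|---|---|---|---|---|---|---|---|
| Skyline | N/A | 0.776 ± 0.029 | 0.671 ± 0.055 | 0.591 ± 0.060 | 0.822 ± 0.036 | 0.869 ± 0.131 | 0.716 ± 0.300 | 0.634 ± 0.328 | 0.999 ± 0.003 |
| CNN | | 0.785 ± 0.021 | 0.652 ± 0.043 | 0.604 ± 0.061 | 0.738 ± 0.057 | 0.849 ± 0.100 | 0.681 ± 0.273 | 0.594 ± 0.294 | 0.960 ± 0.048 |
| CNN | PL | 0.777 ± 0.019 | 0.629 ± 0.046 | 0.596 ± 0.075 | 0.698 ± 0.088 | 0.860 ± 0.088 | 0.678 ± 0.277 | 0.599 ± 0.303 | 0.920 ± 0.075 |
| CNN | MI | 0.802 ± 0.028 | 0.660 ± 0.050 | 0.636 ± 0.059 | 0.713 ± 0.077 | 0.873 ± 0.078 | 0.693 ± 0.265 | 0.625 ± 0.313 | 0.928 ± 0.062 |
| MidiBERT | | 0.823 ± 0.028 | 0.706 ± 0.053 | 0.669 ± 0.065 | 0.776 ± 0.071 | 0.828 ± 0.128 | 0.662 ± 0.292 | 0.571 ± 0.316 | 0.981 ± 0.012 |
| MidiBERT | PL | 0.831 ± 0.026 | 0.716 ± 0.056 | 0.679 ± 0.058 | 0.782 ± 0.066 | 0.825 ± 0.131 | 0.660 ± 0.293 | 0.567 ± 0.317 | 0.982 ± 0.013 |
| MidiBERT | MI | 0.839 ± 0.029 | 0.721 ± 0.063 | 0.701 ± 0.076 | 0.763 ± 0.061 | 0.866 ± 0.088 | 0.683 ± 0.280 | 0.599 ± 0.318 | 0.944 ± 0.041 |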
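As a reading aid for Table 2, the sketch below shows one common way to compute note-level accuracy, precision, recall, and F1-score for a binary motif/non-motif labeling, and to summarize them across folds. It is not the authors' evaluation code: the function names are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the table caption does not say which estimator was used.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Note-level metrics, where 1 marks a motif note and 0 a non-motif note."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "f1": f1, "precision": precision, "recall": recall}


def summarize_folds(folds: List[Dict[str, float]]) -> Dict[str, Tuple[float, float]]:
    """Return (mean, standard deviation) per metric across folds.
    Assumption: population standard deviation; the paper may use another estimator."""
    return {k: (mean(f[k] for f in folds), pstdev([f[k] for f in folds])) for k in folds[0]}


if __name__ == "__main__":
    fold_a = binary_metrics([1, 1, 0, 0], [1, 0, 1, 0])
    fold_b = binary_metrics([1, 1, 0, 0], [1, 1, 0, 0])
    for name, (avg, std) in summarize_folds([fold_a, fold_b]).items():
        print(f"{name}: {avg:.3f} ± {std:.3f}")
```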
Table 3

The total RPD runtime on the BPS‑Motif dataset in minutes using one Intel i9‑13900KF CPU.

| RPD | Runtime (min) |
|---|---|
| SIATEC | 340.9 |
| SIATEC_CS | 4076.9 |
| CSA | 978.3 |
Table 4

Motif discovery results on the BPS‑Motif dataset. PL denotes pseudo‑labeling; MI denotes the intersection of the melody line and the pseudo‑labels. The subscripts est, occ, and thr indicate the establishment, occurrence, and three‑layer measurements, respectively. The last three rows are the oracle setting, which assumes 100% accuracy of motif note identification.

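| MNID | Setting | RPD | P_est | R_est | F_est | P_occ | R_occ | F_occ | P_thr | R_thr | F_thr |
|---|---|---|---|---|---|---|---|---|---|---|---|
| N/A | | SIATEC | 0.180 | 0.644 | 0.280 | 0.210 | 0.277 | 0.224 | 0.041 | 0.300 | 0.071 |
| Skyline | N/A | SIATEC | 0.226 | 0.660 | 0.333 | 0.410 | 0.275 | 0.308 | 0.065 | 0.335 | 0.107 |
| CNN | | SIATEC | 0.240 | 0.631 | 0.344 | 0.403 | 0.221 | 0.268 | 0.073 | 0.319 | 0.116 |
| CNN | PL | SIATEC | 0.236 | 0.609 | 0.336 | 0.399 | 0.240 | 0.281 | 0.072 | 0.307 | 0.112 |
| CNN | MI | SIATEC | 0.249 | 0.643 | 0.354 | 0.445 | 0.251 | 0.307 | 0.076 | 0.331 | 0.121 |
| MidiBERT | | SIATEC | 0.257 | 0.653 | 0.364 | 0.446 | 0.263 | 0.319 | 0.081 | 0.337 | 0.126 |
| MidiBERT | PL | SIATEC | 0.258 | 0.654 | 0.366 | 0.440 | 0.266 | 0.320 | 0.082 | 0.341 | 0.127 |
| MidiBERT | MI | SIATEC | 0.258 | 0.656 | 0.366 | 0.433 | 0.260 | 0.309 | 0.083 | 0.342 | 0.130 |
| N/A | | CSA | 0.553 | 0.847 | 0.663 | 0.130 | 0.570 | 0.204 | 0.127 | 0.263 | 0.169 |
| Skyline | N/A | CSA | 0.555 | 0.806 | 0.652 | 0.312 | 0.464 | 0.354 | 0.214 | 0.357 | 0.264 |
| CNN | | CSA | 0.521 | 0.761 | 0.610 | 0.330 | 0.410 | 0.349 | 0.202 | 0.334 | 0.246 |
| CNN | PL | CSA | 0.516 | 0.718 | 0.593 | 0.327 | 0.388 | 0.343 | 0.198 | 0.320 | 0.238 |
| CNN | MI | CSA | 0.540 | 0.748 | 0.619 | 0.327 | 0.387 | 0.340 | 0.211 | 0.340 | 0.255 |
| MidiBERT | | CSA | 0.576 | 0.811 | 0.668 | 0.332 | 0.439 | 0.362 | 0.224 | 0.363 | 0.272 |
| MidiBERT | PL | CSA | 0.580 | 0.818 | 0.673 | 0.336 | 0.436 | 0.363 | 0.228 | 0.359 | 0.275 |
| MidiBERT | MI | CSA | 0.591 | 0.803 | 0.676 | 0.350 | 0.430 | 0.372 | 0.231 | 0.358 | 0.276 |
| Oracle | N/A | SIATEC | 0.288 | 0.732 | 0.410 | 0.536 | 0.361 | 0.418 | 0.105 | 0.443 | 0.165 |
| Oracle | N/A | SIATEC_CS | 0.327 | 0.583 | 0.413 | 0.491 | 0.312 | 0.356 | 0.185 | 0.322 | 0.231 |
| Oracle | N/A | CSA | 0.734 | 0.866 | 0.789 | 0.443 | 0.540 | 0.468 | 0.353 | 0.480 | 0.400 |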
Table 5

Motif discovery results on JKU‑PDD. Note that, in this experiment, we use the annotations of the monophonic version as the ground truth, as they more closely resemble motif annotations.

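| MNID | Setting | RPD | P_est | R_est | F_est | P_occ | R_occ | F_occ | P_thr | R_thr | F_thr |
|---|---|---|---|---|---|---|---|---|---|---|---|
| N/A | | SIATEC_CS | 0.210 | 0.323 | 0.251 | 0.000 | 0.000 | 0.000 | 0.220 | 0.277 | 0.243 |
| Skyline | N/A | SIATEC_CS | 0.415 | 0.663 | 0.492 | 0.380 | 0.631 | 0.472 | 0.403 | 0.534 | 0.439 |
| CNN | | SIATEC_CS | 0.446 | 0.626 | 0.511 | 0.410 | 0.653 | 0.501 | 0.420 | 0.548 | 0.460 |
| CNN | PL | SIATEC_CS | 0.418 | 0.670 | 0.492 | 0.395 | 0.665 | 0.494 | 0.387 | 0.536 | 0.426 |
| CNN | MI | SIATEC_CS | 0.433 | 0.707 | 0.520 | 0.502 | 0.762 | 0.601 | 0.403 | 0.577 | 0.454 |
| MidiBERT | | SIATEC_CS | 0.359 | 0.589 | 0.431 | 0.239 | 0.327 | 0.276 | 0.360 | 0.487 | 0.402 |
| MidiBERT | PL | SIATEC_CS | 0.398 | 0.625 | 0.469 | 0.290 | 0.467 | 0.357 | 0.383 | 0.492 | 0.412 |
| MidiBERT | MI | SIATEC_CS | 0.387 | 0.668 | 0.468 | 0.396 | 0.606 | 0.478 | 0.372 | 0.513 | 0.417 |
| N/A | | CSA | 0.155 | 0.404 | 0.214 | 0.140 | 0.355 | 0.180 | 0.062 | 0.269 | 0.095 |
| Skyline | N/A | CSA | 0.281 | 0.567 | 0.350 | 0.371 | 0.489 | 0.407 | 0.235 | 0.527 | 0.295 |
| CNN | | CSA | 0.261 | 0.509 | 0.324 | 0.295 | 0.414 | 0.340 | 0.222 | 0.505 | 0.277 |
| CNN | PL | CSA | 0.251 | 0.476 | 0.306 | 0.273 | 0.392 | 0.310 | 0.222 | 0.462 | 0.271 |
| CNN | MI | CSA | 0.261 | 0.504 | 0.324 | 0.256 | 0.388 | 0.298 | 0.232 | 0.511 | 0.288 |
| MidiBERT | | CSA | 0.245 | 0.568 | 0.321 | 0.383 | 0.490 | 0.414 | 0.200 | 0.535 | 0.263 |
| MidiBERT | PL | CSA | 0.267 | 0.576 | 0.340 | 0.398 | 0.477 | 0.418 | 0.220 | 0.539 | 0.277 |
| MidiBERT | MI | CSA | 0.253 | 0.496 | 0.315 | 0.249 | 0.412 | 0.309 | 0.212 | 0.480 | 0.263 |
| Oracle | N/A | SIATEC | 0.209 | 0.508 | 0.280 | 0.715 | 0.684 | 0.699 | 0.204 | 0.527 | 0.270 |
| Oracle | N/A | SIATEC_CS | 0.589 | 0.661 | 0.620 | 0.553 | 0.680 | 0.606 | 0.621 | 0.656 | 0.637 |
| Oracle | N/A | CSA | 0.263 | 0.407 | 0.310 | 0.300 | 0.419 | 0.337 | 0.245 | 0.434 | 0.295 |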
Table 6

The average number of distinct motifs per song in the ground truth and in the RPD algorithms’ predictions on the BPS‑Motif dataset and the JKU‑PDD. The oracle MNID is employed.

| RPD | BPS‑Motif | JKU‑PDD |
|---|---|---|
| SIATEC | 13365.5 | 2799.8 |
| SIATEC_CS | 34.9 | 15.6 |
| CSA | 27.3 | 43.2 |
| Ground‑truth | 8.2 | 4.6 |
Figure 4

Illustration of motif discovery results for Beethoven’s Piano Sonata No. 10 in G Major, 1st movement, mm. 8–18. From top to bottom: the original score, a piano roll with the ground‑truth motif annotation, a piano roll with the results of the common structure algorithm, and a piano roll with the results of MNID (MidiBERT with MI) followed by the common structure algorithm. In the piano‑roll representation, each note is drawn as an onset (a circular dot) and a duration (a horizontal line), and each bar line is drawn as a vertical blue dotted line. Gray notes are non‑motif notes. Motif notes are shown in colors other than gray, and notes belonging to the same motif share the same color. The notes of each occurrence are enclosed in a translucent gray box, and the motif name is marked at the beginning of each occurrence. For the ground truth, the motif names are lowercase letters that follow the dataset annotation. For the motif discovery results, the motif names are uppercase letters assigned in alphabetical order by time of first occurrence.

Figure 5

Illustration of motif discovery results for Beethoven’s Piano Sonata No. 16 in G Major, 1st movement, mm. 74–81. See the caption of Figure 4 for details of the notation.

DOI: https://doi.org/10.5334/tismir.250 | Journal eISSN: 2514-3298
Language: English
Submitted on: Jan 7, 2025
Accepted on: Aug 1, 2025
Published on: Sep 18, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Jun-You Wang, Yu-Chia Kuo, Li Su, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.