ComMU
ComMU has 11,144 MIDI samples, each a short note sequence created by professional composers and annotated with 12 corresponding metadata. We propose combinatorial music generation, a new task that generates diverse, high-quality music from metadata alone using an auto-regressive language model. Here are ComMU's 12 metadata:
- BPM, genre, key, instrument, track-role, time signature, pitch range, number of measures, chord progression, min velocity, max velocity, and rhythm.
Examples of the dataset
- #bpm: 100, #key: C major, #time_signature: 4/4, #number_of_measures: 8, #genre: cinematic, #rhythm: standard, #track-role: accompaniment, #pitch_range: mid_low, #instruments: piano, #min_velocity: 36, #max_velocity: 40, #chord_progression: F → C → Am → G
- #bpm: 120, #key: A minor, #time_signature: 3/4, #number_of_measures: 16, #genre: cinematic, #rhythm: standard, #track-role: main_melody, #pitch_range: mid_high, #instruments: string, #min_velocity: 70, #max_velocity: 70, #chord_progression: Am → Em7 → FM7 → Em7 → Dm7 → CM7 → Bm7b5 → E7 → Am → Em7 → FM7 → Em7 → Dm7 → CM7 → Bm7b5 → Am
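The example lines above follow a simple `#name: value` layout, so they can be turned into a dictionary with a few lines of Python. The following is a minimal sketch, not part of the ComMU tooling: the function names `parse_metadata` and `parse_chords` are hypothetical, and values are kept as strings. Fields are located by their `#name:` labels rather than by commas, so a missing separator is tolerated and chord names such as F#m7b5 are not mistaken for a field.

```python
import re

# A field label looks like "#bpm:" or "#track-role:" (letters, "_" or "-", then ":").
# "F#m7b5" does not match because "#m" is not followed by a colon.
FIELD_LABEL = re.compile(r"#([A-Za-z_-]+):")


def parse_metadata(line: str) -> dict:
    """Split one example line into {field name: raw string value}."""
    labels = list(FIELD_LABEL.finditer(line))
    fields = {}
    for i, label in enumerate(labels):
        # The value runs from the end of this label to the start of the next one.
        end = labels[i + 1].start() if i + 1 < len(labels) else len(line)
        value = line[label.end():end].strip().rstrip(",").strip()
        fields[label.group(1)] = value
    return fields


def parse_chords(progression: str) -> list:
    """Split 'F → C → Am → G' into ['F', 'C', 'Am', 'G']."""
    return [chord.strip() for chord in progression.split("→") if chord.strip()]


example = (
    "#bpm: 100, #key: C major, #time_signature: 4/4, #number_of_measures: 8, "
    "#genre: cinematic, #rhythm: standard, #track-role: accompaniment, "
    "#pitch_range: mid_low, #instruments: piano, #min_velocity: 36, "
    "#max_velocity: 40, #chord_progression: F → C → Am → G"
)
meta = parse_metadata(example)
chords = parse_chords(meta["chord_progression"])
print(meta["key"], chords)
```

Numeric fields such as bpm and the velocities can be converted with `int(...)` once parsed; the sketch leaves them as strings to stay close to the text shown on this page.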
Pipeline of data collection
Combinatorial music generation
As shown above, the process of combinatorial music generation is divided into two stages. In stage 1, a note sequence is generated from a set of metadata. In stage 2, those note sequences are combined to produce a complete piece of music. ComMU is utilized to solve stage 1.
Stage 1
Audio samples are automatically generated from the described metadata alone.
- Common metadata of the 5 clips are as follows:
- #bpm: 130, #key: A minor, #time_signature: 4/4
- #number_of_measures: 8, #genre: new age, #rhythm: standard
- #chord_progression: Am → F → C → G → Am → F → C → G
- #track-role: accompaniment, #pitch_range: mid_low, #instrument: piano, #min_velocity: 40, #max_velocity: 50
- #track-role: main_melody, #pitch_range: mid, #instrument: piano, #min_velocity: 60, #max_velocity: 70
- #track-role: pad, #pitch_range: mid_low, #instrument: piano, #min_velocity: 70, #max_velocity: 80
- #track-role: pad, #pitch_range: mid_low, #instrument: string, #min_velocity: 2, #max_velocity: 127
- #track-role: riff, #pitch_range: mid_high, #instrument: piano, #min_velocity: 70, #max_velocity: 80
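The listing above pairs one set of common metadata with five track-specific settings. A minimal sketch of how the five complete 12-field metadata sets could be assembled is shown below. The helper `build_clip_metadata` and the dictionary layout are assumptions made for illustration, not the ComMU code; the velocity check assumes the standard MIDI range of 0 to 127.

```python
# Fields shared by all 5 clips (values copied from the list above).
common = {
    "bpm": 130,
    "key": "A minor",
    "time_signature": "4/4",
    "number_of_measures": 8,
    "genre": "new age",
    "rhythm": "standard",
    "chord_progression": ["Am", "F", "C", "G", "Am", "F", "C", "G"],
}

# Fields that differ per clip, as (track-role, pitch_range, instrument, min, max).
tracks = [
    ("accompaniment", "mid_low", "piano", 40, 50),
    ("main_melody", "mid", "piano", 60, 70),
    ("pad", "mid_low", "piano", 70, 80),
    ("pad", "mid_low", "string", 2, 127),
    ("riff", "mid_high", "piano", 70, 80),
]

EXPECTED_FIELDS = set(common) | {
    "track-role", "pitch_range", "instrument", "min_velocity", "max_velocity",
}


def build_clip_metadata(common, track):
    """Merge the shared fields with one track's fields and sanity-check the result."""
    role, pitch_range, instrument, min_vel, max_vel = track
    clip = {
        **common,
        "track-role": role,
        "pitch_range": pitch_range,
        "instrument": instrument,
        "min_velocity": min_vel,
        "max_velocity": max_vel,
    }
    missing = EXPECTED_FIELDS - set(clip)
    if missing:
        raise ValueError(f"missing metadata fields: {sorted(missing)}")
    if not 0 <= min_vel <= max_vel <= 127:
        raise ValueError("velocities must satisfy 0 <= min <= max <= 127")
    return clip


clips = [build_clip_metadata(common, track) for track in tracks]
print(len(clips), "clips,", len(clips[0]), "fields each")
```

Each resulting dictionary carries all 12 metadata, which is the only input stage 1 is described as needing.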
Stage 2
A human composer combines the 5 audio samples above, spending only 3-4 minutes to create the full song below.
The following pieces randomly combine the 5 samples generated in stage 1 while preserving only the track-role of each sample. Although they differ from the piece arranged by the composer above, they sound harmonious because the chord progression is consistent across the samples.
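The page does not describe how the random combination is performed, so the following is only one possible reading, sketched under stated assumptions: a song is a sequence of sections, each section activates a random non-empty subset of track-roles, and every active role plays one clip that was generated for that role. The function `random_arrangement`, the `name` field, and the section model are all hypothetical. The check at the top reflects the one property the text does state, namely that all clips share the same chord progression.

```python
import random
from collections import defaultdict


def random_arrangement(clips, num_sections, seed=None):
    """Return a list of sections; each maps an active track-role to a clip name."""
    # Clips can only be layered freely if they follow the same chords.
    progressions = {tuple(clip["chord_progression"]) for clip in clips}
    if len(progressions) != 1:
        raise ValueError("all clips must share one chord progression")

    # Group clip names by track-role so a clip is never moved to another role.
    by_role = defaultdict(list)
    for clip in clips:
        by_role[clip["track-role"]].append(clip["name"])
    roles = sorted(by_role)

    rng = random.Random(seed)
    arrangement = []
    for _ in range(num_sections):
        # Pick how many roles play in this section, then which ones.
        active = rng.sample(roles, rng.randint(1, len(roles)))
        arrangement.append({role: rng.choice(by_role[role]) for role in active})
    return arrangement


progression = ["Am", "F", "C", "G", "Am", "F", "C", "G"]
clips = [
    {"name": "clip_1", "track-role": "accompaniment", "chord_progression": progression},
    {"name": "clip_2", "track-role": "main_melody", "chord_progression": progression},
    {"name": "clip_3", "track-role": "pad", "chord_progression": progression},
    {"name": "clip_4", "track-role": "pad", "chord_progression": progression},
    {"name": "clip_5", "track-role": "riff", "chord_progression": progression},
]

for section in random_arrangement(clips, num_sections=4, seed=0):
    print(section)
```

Because the selection never looks at anything except the track-role, any harmony in the result comes from the shared chord progression rather than from the arrangement logic.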
Stage 1
Below is another set of samples in a different genre. Audio samples are automatically generated from the described metadata alone.
- Common metadata of the 5 clips are as follows:
- #bpm: 80, #key: A minor, #time_signature: 4/4
- #number_of_measures: 8, #genre: cinematic, #rhythm: standard
- #track-role: main_melody, #pitch_range: mid_high, #instrument: violin, #min_velocity: 1, #max_velocity: 127
- #chord_progression: Am → Gmaj7 → Fmaj7 → G → Cmaj7 → Dm7 → Am → Bmaj7 → E → Am
- #track-role: main_melody, #pitch_range: mid_high, #instrument: piano, #min_velocity: 25, #max_velocity: 60
- #chord_progression: Am → Gmaj7 → Fmaj7 → G → Cmaj7 → Dm7 → Am → Bmaj7 → E → Am
- #track-role: accompaniment, #pitch_range: mid_low, #instrument: piano, #min_velocity: 25, #max_velocity: 55
- #chord_progression: Am → Gmaj7 → Fmaj7 → G → Cmaj7 → Dm7 → Am → Bmaj7 → E → Am
- #track-role: main_melody, #pitch_range: mid_high, #instrument: string, #min_velocity: 1, #max_velocity: 127
- #chord_progression: Dm7 → Em7 → Fmaj7 → Cmaj7 → Am7 → Em7 → Fmaj7 → Cmaj7 → F#m7b5 → Gsus4 → E7
- #track-role: accompaniment, #pitch_range: mid_low, #instrument: piano, #min_velocity: 25, #max_velocity: 55
- #chord_progression: Dm7 → Em7 → Fmaj7 → Cmaj7 → Am7 → Em7 → Fmaj7 → Cmaj7 → F#m7b5 → Gsus4 → E7
Stage 2
A human composer combines the 5 audio samples above, spending only 3-4 minutes to create the full song below.
Ground truth vs. Generated
Our system does not reconstruct ground-truth samples; it generates samples that are original.
Ground truth | Generated |
Diversity of Generated Music
We can check the diversity of music generated from the same metadata. The corresponding metadata for each set of examples are listed here.
- Piano Example: #bpm: 80, #key: A minor, #time_signature: 4/4, #number_of_measures: 8, #genre: cinematic, #rhythm: standard, #track-role: main_melody, #pitch_range: mid, #instrument: piano, #min_velocity: 25, #max_velocity: 60, #chord_progression: Dm7 → Em7 → Asus4 → Am → Em7 → Dmaj7 → Asus4 → Am → Dm7 → Em7 → Asus4 → Em7 → F#m7b5 → Em7 → Asus4 → A
- Violin Example: #bpm: 80, #key: A minor, #time_signature: 4/4, #number_of_measures: 8, #genre: cinematic, #rhythm: standard, #track-role: main_melody, #pitch_range: mid, #instrument: violin, #min_velocity: 1, #max_velocity: 127, #chord_progression: Am → Gmaj7 → Fmaj7 → G → Cmaj7 → Dm7 → Am → Bmaj7 → E → Am
Piano Example | Violin Example |
Multi-track with Track-role
The figure shows that the track-role can provide precise guidance for the generated music.
Piano with 4 track-roles
All metadata are the same except for track-role
- Main Melody
- Sub Melody
- Accompaniment
- Riff
String with 4 track-roles
All metadata are the same except for track-role
- Main Melody
- Sub Melody
- Accompaniment
- Riff
License
The ComMU dataset is released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.