MIDI Gen AI

About

MIDI Gen AI is a large language model (LLM) based on the GPT-2 architecture and fine-tuned on textual representations of MIDI files. It was trained on the Lakh MIDI dataset, a collection of 176,581 unique MIDI files, 45,129 of which have been matched and aligned to entries in the Million Song Dataset. The size and variety of this dataset let the model learn to generate a wide range of MIDI sequences, including melodies, chords, and drum patterns.

Users can upload a MIDI file, and the AI generates two new MIDI sequences. Each output sequence consists of the beginning of the uploaded MIDI file followed by a novel continuation. The generated sequences can be played in the browser using html-midi-player or downloaded as MIDI files. There are two generation parameters: temperature and top_k. Higher values of either parameter increase the randomness and diversity of the generated sequences, while lower values lead to more repetitive and predictable sequences.
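
To illustrate how temperature and top_k typically shape the output, here is a minimal sketch of top-k sampling with temperature over a list of raw token scores (logits). It is not taken from the MIDI Gen AI code base: the function name sample_top_k, its default values, and the toy logits are assumptions made for illustration, and the actual application may rely on a library's built-in generation routine instead.

```python
import math
import random


def sample_top_k(logits, temperature=1.0, top_k=50, rng=random):
    """Pick one token index from raw logits (hypothetical helper)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # top_k: keep only the k highest-scoring token indices.
    k = max(1, min(top_k, len(logits)))
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # temperature: dividing by a value below 1 sharpens the distribution,
    # dividing by a value above 1 flattens it.
    scaled = [logits[i] / temperature for i in top]

    # Softmax weights over the surviving candidates
    # (subtracting the maximum keeps exp() numerically stable).
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # Draw one candidate in proportion to its weight.
    return rng.choices(top, weights=weights, k=1)[0]


# Toy example: four candidate tokens with made-up scores.
toy_logits = [0.1, 2.0, -1.0, 1.5]
conservative = sample_top_k(toy_logits, temperature=0.5, top_k=1)
adventurous = sample_top_k(toy_logits, temperature=1.5, top_k=4)
```

With top_k=1 only the highest-scoring token can ever be chosen, which is why small values tend to produce repetitive output. Raising top_k admits more candidates, and raising temperature makes the lower-scoring ones more likely to be picked.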

For more details on the design of MIDI Gen AI, see the repository README on GitHub.