14 Building and training a music Transformer


This chapter covers

  • Representing music with control messages and velocity values
  • Tokenizing music into a sequence of indexes
  • Building and training a music Transformer
  • Generating musical events using the trained Transformer
  • Converting musical events back to a playable MIDI file
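To give a first taste of the tokenization idea listed above, here is a minimal sketch in plain Python (no MIDI library). The event names, vocabulary sizes, and quantization steps are illustrative assumptions for this sketch, not the exact scheme the chapter builds; they mimic the performance-style events (note-on, note-off, velocity, time shift) that later sections develop in full.

```python
# Toy vocabulary for a performance-style music representation.
# Event names and ranges are illustrative assumptions:
#   NOTE_ON_<pitch> / NOTE_OFF_<pitch> for 128 MIDI pitches,
#   VELOCITY_<v> for 32 quantized velocity buckets,
#   TIME_SHIFT_<t> for time shifts of t * 10 ms, t in 1..100.
events = (
    [f"NOTE_ON_{p}" for p in range(128)]
    + [f"NOTE_OFF_{p}" for p in range(128)]
    + [f"VELOCITY_{v}" for v in range(32)]
    + [f"TIME_SHIFT_{t}" for t in range(1, 101)]
)
token_to_id = {e: i for i, e in enumerate(events)}
id_to_token = {i: e for e, i in token_to_id.items()}

# A short performance: set velocity, press middle C (pitch 60),
# wait 500 ms, then release the key.
performance = ["VELOCITY_20", "NOTE_ON_60", "TIME_SHIFT_50", "NOTE_OFF_60"]
ids = [token_to_id[e] for e in performance]       # events -> indexes
decoded = [id_to_token[i] for i in ids]           # indexes -> events
assert decoded == performance                     # round-trip is lossless
```

Once every musical event maps to an integer index like this, a music piece becomes an ordinary sequence of tokens, and the Transformer machinery from earlier chapters applies unchanged.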

Sad that your favorite musician is no longer with us? Sad no more: generative AI can bring them back to the stage!

Take, for example, Layered Reality, a London-based company that’s working on a project called Elvis Evolution.1 The goal? To resurrect the legendary Elvis Presley using AI. The company feeds a vast array of Elvis’ official archival material, including video clips, photographs, and music, into a sophisticated computer model, and the resulting AI Elvis learns to mimic his singing, speaking, dancing, and walking with remarkable accuracy. The result? A digital performance that captures the essence of the late King himself.

14.1 Introduction to the music Transformer

14.1.1 Performance-based music representation

14.1.2 The music Transformer architecture

14.1.3 Training the music Transformer

14.2 Tokenizing music pieces

14.2.1 Downloading training data

14.2.2 Tokenizing MIDI files

14.2.3 Preparing the training data

14.3 Building a GPT to generate music

14.3.1 Hyperparameters in the music Transformer

14.3.2 Building a music Transformer

14.4 Training and using the music Transformer

14.4.1 Training the music Transformer

14.4.2 Music generation with the trained Transformer

Summary