Inference#
JAX:
# Generate 4 seconds of audio
mrt jax generate
MLX:
# Generate 4 seconds of audio
mrt mlx generate --bits=8
To print MusicCoCa tokens for a prompt directly without generating audio:
from magenta_rt.musiccoca import MusicCoCa
m = MusicCoCa()
print(m.tokenize(m.embed('a jazz piano trio')).tolist())
# Get tokens from audio
from magenta_rt.audio import Waveform
wav = Waveform.from_file("jazz_piano_trio.wav")
print(m.tokenize(m.embed(wav)).tolist())
Bulk generation#
Bulk-generate 60s audio clips from MusicCoCa prompts for listener evaluation:
python scripts/bulk_generate.py --size=mrt2_base
Outputs are saved to outputs/eval_audio/<size>/.