Waveform Mashup – A Self Portrait with Data

I created a digital self portrait in the form of a ‘waveform mashup’.

The waveform mashup is a visual abstraction of waveform data from my ‘Top Songs of 2017’ as defined by Spotify.


This project visualizes waveforms vertically rather than horizontally, so that the left and right sides of the waveform align with the viewer's ears, instead of using left ➡️ right as an axis to emphasize chronology.

traditional waveform data represented chronologically from left to right

The vertical waveforms of many songs are given a low opacity and layered on top of each other to create the ‘mashup’ itself. For example, here is a mashup of only 3 songs, with a slightly higher opacity:

3 song waveforms layered on top of each other

As more songs are added, the visualization becomes more abstract. The overall goal of this project is to convert this mashup back into audio for an aural self-portrait. The current status of the interactive web version can be found (here- link to be updated).


Processing the data was one of the most fun parts of this project for me. I used a variety of tools to convert the songs in my Spotify account into the data used above.

Getting the MP3 Data

Starting with data created by Spotify based on my account usage:

slightly embarrassing to post…

I first used a simple online tool by Joel Lehman called ‘Simple Playlist Exporter’ that converts the playlist information into CSV format.

log in to your Spotify account with this tool and click the playlist you want to download

I then used my dear old friend Microsoft Excel to do a quick analysis of the data and add other information to the list. I added:

  • A song ID (001 – 100) for data management purposes
  • Genre (self-defined) which broke into 3 categories (Hip-Hop/Rap, Pop/Electronic, and Punk/Rock)
  • YouTube links to each song – a time-intensive manual process 💤
  • and finally the duration of each song – a boring manual process recorded and sped up to 400% for your viewing pleasure:

I used the wonderful youtube-dl tool to batch-download an MP3 file for each song and give each one a numeric label. The command looks like this:

youtube-dl -x --audio-format 'mp3' -o "%(autonumber)s-%(title)s.%(ext)s" --batch-file='youtube-link-list.txt'

This moment was magical for me, as the MP3s filed into a folder one after the other:

copyright disclaimer, I used this content to create a mashup for artistic, non-commercial purposes

Some quick summary statistics from the data set:

self-assigned genre breakdown

Pivot tables are your friend, people:

song length analysis – average song length is 3m 43s

Getting the waveform data

I created my visualization using D3, based on John Magoonian III’s D3 waveform visualizer – thank you John.

This D3 script takes JSON data in the following format as input:


The key element of this JSON is the “data” field, which contains the left and right waveforms as negative and positive numbers, respectively.

To get data in this format for my MP3s, I used the BBC’s audiowaveform tool. It converts audio files (MP3, WAV, etc.) into .dat files:

audiowaveform -i song.mp3 -o song.dat

which then can also be converted into JSON files:

audiowaveform -i song.dat -o song.json

I created a shell script to run these commands on all of the MP3s in my folder. I also had to cleanse the data by removing space and comma characters from the file names. You can find all of these scripts (here link to be updated).
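A minimal sketch of what such a batch script could look like, not the actual script from this project. The function names `clean_name` and `convert_all` are made up for illustration, and it assumes that every MP3 sits in one folder and that no two files end up with the same name once spaces and commas are removed.

```shell
#!/bin/sh
# Hypothetical helper: strip space and comma characters from a file name.
clean_name() {
  printf '%s\n' "$1" | tr -d ' ,'
}

# Hypothetical helper: rename every MP3 in a folder to its cleaned name,
# then run the two audiowaveform commands shown above on each file.
convert_all() {
  dir=${1:-.}
  for f in "$dir"/*.mp3; do
    [ -e "$f" ] || continue                    # nothing matched the glob
    clean="$dir/$(clean_name "$(basename "$f")")"
    [ "$f" = "$clean" ] || mv -- "$f" "$clean" # rename only when needed
    audiowaveform -i "$clean" -o "${clean%.mp3}.dat"
    audiowaveform -i "${clean%.mp3}.dat" -o "${clean%.mp3}.json"
  done
}

# Usage (assumes audiowaveform is installed and on the PATH):
#   convert_all ./mp3s
```

Renaming before converting means the .dat and .json files inherit the cleaned names, so the D3 script never has to deal with spaces or commas in a file path.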


A red flag I did not notice at this point was that each JSON file was about 500 kB, making the total for 100 songs around 50 MB. This created issues once I had successfully linked my JSON data to the D3 script.

In order to make a visualization that can be replicated elsewhere, I will need to cut down on the file size by re-doing the data processing at a lower sample rate.

However, after waiting a very long time for the page to load, at least some of the data came through into the visualization shown at the beginning of this post:
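One possible way to do that, sketched under assumptions rather than tested on this data set: audiowaveform has a `--pixels-per-second` option and a `-b` option for 8-bit or 16-bit output, so the JSON can be regenerated at a coarser resolution. The value of 20 pixels per second and the helper names `downsample` and `estimate_values` are illustrative only, and the estimate assumes the "data" array holds two values (one negative, one positive) per pixel.

```shell
#!/bin/sh
# Hypothetical helper: regenerate a song's JSON at a coarser resolution.
# The default of 20 pixels per second is an arbitrary illustrative value.
downsample() {
  input=$1
  pps=${2:-20}
  audiowaveform -i "$input" -o "${input%.*}.json" \
    --pixels-per-second "$pps" -b 8
}

# Rough count of numbers in the "data" array:
# duration in seconds x pixels per second x two values per pixel.
estimate_values() {
  echo $(( $1 * $2 * 2 ))
}

# The average song in this data set is 3m 43s, i.e. 223 seconds.
estimate_values 223 20    # prints 8920

# Usage (assumes audiowaveform is installed and on the PATH):
#   downsample 001-song.mp3
```

At a few characters per value, roughly 9,000 numbers per song would come to tens of kilobytes instead of about 500 kB, though the right resolution depends on how much detail the layered visualization needs.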

This post will be updated as the project progresses
