Lyrics API: Artist Word Choice

Using data from the musixmatch API, I have created a tool that shows what words musical artists use most in their lyrics. It uses a combination of jQuery and D3 libraries to work as a single-page “app”. Try it here.

The web “app” works by finding the artist you search for, and running through every lyric it can find by the artist to develop a lexicon. Once that lexicon is built, you can click on the box that is created to view statistics about their most-used words.

You can also view the most-used words overall by clicking the “All Artists” button.

The code for the project can be found on my GitHub here. It uses an Express app running on Node.js. A few notable aspects of the project:

Use of dictionaries
Using a dictionary was key to building a lexicon. Each word becomes a key and its running count becomes the value, through lines like:

// Add a word to the lexicon, or bump its count if it is already there
function lexicalize(word, lexPosition) {
    if (lexicon[word]) {
        console.log("this word is not new to us");
        lexicon[word] = lexicon[word] + 1;
    } else {
        console.log("this word is new to us");
        lexicon[word] = 1;
    }
}

Dictionary lookups seem to run much faster than searching a list, and are easier to query in a “lookup style” with if (lexicon[word]), replacing the need to do things like lexiconList.indexOf().
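To make the comparison concrete, here is a rough sketch of the two approaches side by side. The function names countWithList() and countWithDict() and the sample words are made up for illustration and are not from the project. The dictionary version starts from Object.create(null), which is my own choice here, so that no inherited property names get in the way of the lookup.

```javascript
// List approach: two parallel arrays, scanned with indexOf() for every word
function countWithList(words) {
    const wordList = [];
    const countList = [];
    for (const word of words) {
        const i = wordList.indexOf(word); // walks the list each time
        if (i === -1) {
            wordList.push(word);
            countList.push(1);
        } else {
            countList[i] += 1;
        }
    }
    return { wordList, countList };
}

// Dictionary approach: one direct lookup per word
function countWithDict(words) {
    const lexicon = Object.create(null); // assumption: a prototype-less object
    for (const word of words) {
        if (lexicon[word]) {
            lexicon[word] += 1;
        } else {
            lexicon[word] = 1;
        }
    }
    return lexicon;
}

const sampleWords = ["love", "me", "love", "you", "love"];
const byList = countWithList(sampleWords);
const byDict = countWithDict(sampleWords);

console.log(byList.countList[byList.wordList.indexOf("love")]); // 3
console.log(byDict.love); // 3
```

Both give the same counts, but the list version has to search the whole list for each word, while the dictionary version goes straight to the key.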

It was also handy to make temporary sorted lists based on the lexicon’s values, which is how the top data is found. See the sortDictByFreq() function:

function sortDictByFreq(dict, entries) {
    // Build an array of [word, count] pairs from the dictionary
    let topEntries = Object.keys(dict).map(function(key) {
        return [key, dict[key]];
    });

    // Sort the pairs by count (the second element), highest first
    topEntries.sort(function(first, second) {
        return second[1] - first[1];
    });

    // Keep only the first 'entries' pairs
    topEntries = topEntries.slice(0, entries);
    return topEntries;
}
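As a quick usage sketch, here is the function applied to a tiny lexicon. The sample counts are made up, and the function is repeated in condensed form only so the snippet runs on its own.

```javascript
// Same logic as sortDictByFreq(), condensed so this snippet is self-contained
function sortDictByFreq(dict, entries) {
    const topEntries = Object.keys(dict).map(function(key) {
        return [key, dict[key]];
    });
    topEntries.sort(function(first, second) {
        return second[1] - first[1];
    });
    return topEntries.slice(0, entries);
}

// Made-up word counts for illustration
const sampleLexicon = { love: 12, baby: 7, the: 30, yeah: 9 };
const topTwo = sortDictByFreq(sampleLexicon, 2);

console.log(JSON.stringify(topTwo)); // [["the",30],["love",12]]
```

Because the sorting happens on a separate array of pairs, the lexicon itself is left untouched and can keep counting.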

AJAX (a la jQuery)
AJAX (Asynchronous JavaScript and XML), done here through jQuery, is essential to the success of this single-page app.

There are actually multiple AJAX calls that fire off per search. The design of the API keeps the lyrics and artist information a little detached. Once an artist is searched, first their albums are identified, then the songs on those albums, and then the lyrics for each of those songs. Like so:

[Image: data table illustration for musixmatch API queries]

This chain of lookups runs through searchArtist() to get the artist ID, then getAlbums() to get each album ID, then getSongs() to get each song ID from every album, and finally getLyrics() on each of the songs.
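The shape of that chain can be sketched roughly as below. Only the four function names come from the project. Their signatures, the fakeApi object, and the collectLyrics() wrapper are assumptions for illustration, and the stand-in data replaces the real AJAX calls so the snippet runs without network access.

```javascript
// Stand-in data so the sketch runs offline.
// The real app would make an AJAX call to musixmatch inside each function.
const fakeApi = {
    artists: { "Example Artist": 1 },
    albums: { 1: [10, 11] },
    songs: { 10: [100, 101], 11: [102] },
    lyrics: { 100: "la la la", 101: "oh yeah", 102: "la di da" }
};

// Each function returns a Promise to mimic an asynchronous API call.
// Bodies and signatures are assumed; only the names come from the project.
function searchArtist(name) { return Promise.resolve(fakeApi.artists[name]); }
function getAlbums(artistId) { return Promise.resolve(fakeApi.albums[artistId] || []); }
function getSongs(albumId) { return Promise.resolve(fakeApi.songs[albumId] || []); }
function getLyrics(songId) { return Promise.resolve(fakeApi.lyrics[songId] || ""); }

// Walk the chain: artist -> albums -> songs -> lyrics
async function collectLyrics(artistName) {
    const artistId = await searchArtist(artistName);
    const allLyrics = [];
    for (const albumId of await getAlbums(artistId)) {
        for (const songId of await getSongs(albumId)) {
            allLyrics.push(await getLyrics(songId));
        }
    }
    return allLyrics;
}

collectLyrics("Example Artist").then(function(lyrics) {
    console.log(lyrics.length); // 3
});
```

Even this toy catalogue takes seven calls: one artist search, one album list, two song lists, and three lyric fetches.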

This means using the app creates hundreds – if not thousands – of API calls to musixmatch. While developing this, I was really spamming the API. Look how many calls I made in one day of development:

jQuery is at the heart of this single-page application, bringing the page to life on clicks, presses, and hovers. jQuery code in this project is also responsible for the cool count-up animation on the word counters.

The reactive hovering and click toggle are also possible thanks to jQuery:

D3 Data Visualization
I’ve only scratched the surface of D3, but I now understand its power as a tool. Its main use in this project is to make the little charts on the screen.

The labels and shapes are all generated by D3, using data from the topEntries array generated by the sortDictByFreq() function.

Design choices
On the design side, I was inspired by my classmates’ work and opted for a simple look with clear directions. I took lessons from Steve Krug’s writing in Don’t Make Me Think.

I specifically focused on making the directions clear and the search bar big. Also, for every user action there is a designed reaction with minimal lag.

Final notes and warnings
Keep in mind this was my first time using a lot of these tools and libraries. There was minimal user testing. There will be bugs. Here are some known issues and opportunities for improvement:

1. The artist search is dumb/brutal. It just queries the artist name and takes the first result it sees. For example, artists such as Drake, Lil Purp, The Beatles, and Selena Gomez don’t produce ideal results.

2. The API only provides the first 30% of each song’s lyrics. While this is probably enough for a representative sample, it is not all of the lyrics.

3. If a word added to a lexicon happens to be the name of a native JavaScript function (e.g. pop, slice, append), the word will not register properly in the dictionary and won’t be counted.
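One likely cause of the third issue is inherited property names. This is a guess, since the post doesn't show how the lexicon is created, and which names collide depends on that. For a plain object made with {}, names like constructor and toString are inherited, so the if (lexicon[word]) check finds a function instead of nothing. Here is a minimal sketch of the problem and one possible fix, using a hypothetical makeCounter() helper that is not part of the project:

```javascript
// Hypothetical helper: returns a counting function bound to the given lexicon
function makeCounter(lexicon) {
    return function lexicalize(word) {
        if (lexicon[word]) {
            lexicon[word] = lexicon[word] + 1;
        } else {
            lexicon[word] = 1;
        }
    };
}

// Plain object: "constructor" is inherited from Object.prototype, so the
// check passes and "+ 1" is applied to a function, which yields a string
const plain = {};
makeCounter(plain)("constructor");
console.log(typeof plain.constructor); // string

// Prototype-less object: nothing is inherited, so every word starts undefined
const safe = Object.create(null);
const count = makeCounter(safe);
count("constructor");
count("constructor");
console.log(safe.constructor); // 2
```

A Map would be another way to avoid the collision, at the cost of swapping the bracket lookups for get() and set().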

Lastly, a big thank you to Calli Higgins for teaching and to my classmates in the API mashups course. Additionally, thanks to Daniel Shiffman for his videos, and Brian Jenney for their example musixmatch code.
