In this fourth article in the series looking at our song lyrics dataset, we will begin to consider the meaning of the lyrics, rather than just treating the words as abstract objects. A simple technique for quantifying the meaning of texts is known as ‘sentiment analysis’.

Tag: text analysis

Song Lyrics 3: Repetition and Compression

We all know that a good song depends on repetition – both of the tune and the lyrics. Too much repetition and it is just boring; too little, and it can lack structure. This article looks at different aspects of repetition in song lyrics.
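One rough way to put a number on repetition, in the spirit of the 'compression' in the title, is to see how much a general-purpose compressor can shrink a lyric. The sketch below is only an illustration under my own assumptions: the function name `compression_ratio` and the sample strings are made up here, and the full article may measure repetition differently.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Return compressed size / original size (smaller = more repetitive)."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must not be empty")
    # Level 9 asks zlib for its strongest (slowest) compression.
    return len(zlib.compress(raw, 9)) / len(raw)


# Hypothetical sample texts, not taken from the lyrics dataset.
repetitive = "la la la " * 50
varied = ("The quick brown fox jumps over the lazy dog while "
          "seven wizards quietly judge box after box of vexing gems")

print(round(compression_ratio(repetitive), 3))
print(round(compression_ratio(varied), 3))
```

A heavily repeated chorus compresses to a small fraction of its length, whereas a short passage with little repetition barely shrinks at all. Very short texts can even come out slightly larger than the original because of the compressor's fixed overhead, so the ratio is best compared between lyrics of similar length.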

Song Lyrics 2: n-grams
In the previous article in this series we looked at counting the frequency of words in a dataset of song lyrics. This time we will look at combinations of words – or n-grams.
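To make the idea concrete, here is a minimal Python sketch of counting n-grams, assuming a very simple tokeniser (lower-case runs of letters and apostrophes). The helper names `tokenize` and `ngrams` and the sample lyric are illustrative only; the full article may well use different tools.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-case words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return list(zip(*(tokens[i:] for i in range(n))))


# Hypothetical sample line, not taken from the dataset.
lyric = "let it be let it be let it be let it be"
tokens = tokenize(lyric)

word_counts = Counter(tokens)               # single words (1-grams)
bigram_counts = Counter(ngrams(tokens, 2))  # pairs of adjacent words
trigram_counts = Counter(ngrams(tokens, 3))

print(word_counts.most_common(3))
print(bigram_counts.most_common(3))
print(trigram_counts.most_common(3))
```

Counting single words is just the n = 1 case, so the same helper covers both this article and the previous one.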

Song Lyrics 1: Counting Words

This is the first of a series of articles about analysing text data. The statistical music historian might be interested in many sorts of text – from lists and catalogues through to complex ‘free format’ writing in tweets, record reviews, composer biographies, or encyclopedias. For these articles I will consider a dataset of song lyrics, taken from the LyricWiki website [since I wrote this post, LyricWiki has disappeared, although there are several other sources of song lyrics that could be used].

Reading the Musical Times

A few weeks ago I noticed that those nice people at JSTOR have a scheme whereby researchers can apply to access large chunks of their data in order to carry out quantitative research projects. I sent off an application to see if they would let me have all copies of The Musical Times and Singing Class Circular in order to carry out a statistical analysis of the text (a technique which I will cover at some point in a future article). Lo and behold, after a couple of emails and a few days, I received a link from JSTOR to download the data I had asked for.