Song Lyrics 1: Counting Words

This is the first of a series of articles about analysing text data. The statistical music historian might be interested in many sorts of text – from lists and catalogues through to complex ‘free format’ writing in tweets, record reviews, composer biographies, or encyclopedias. For these articles I will consider a dataset of song lyrics, taken from the LyricWiki website [since I wrote this post, LyricWiki has disappeared, although there are several other sources of song lyrics that could be used].


Reading the Musical Times

A few weeks ago I noticed that those nice people at JSTOR have a scheme whereby researchers can apply to access large chunks of their data in order to carry out quantitative research projects. I sent off an application to see if they would let me have all copies of The Musical Times and Singing Class Circular in order to carry out a statistical analysis of the text (a technique which I will cover at some point in a future article). Lo and behold, after a couple of emails and a few days, I received a link from JSTOR to download the data I had asked for.
