The Google Books Ngram Viewer is a powerful tool for analysing historical text data. It uses the enormous corpus of books scanned by Google to analyse the frequency of words and phrases over time. An n-gram is just a sequence of n consecutive words – so a single word is a 1-gram, a pair of words a 2-gram, etc. The Google viewer has data for up to 5-grams.
This has potential uses in many fields – including musicology. Here we will use the ngram viewer to analyse the rise and fall of ragtime music.
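To make the idea of an n-gram concrete, here is a minimal Python sketch that splits a piece of text into n-grams and counts them. It is only an illustration: the function names `ngrams` and `ngram_counts` are my own, and the simple lower-casing and whitespace splitting are assumptions that will not match how Google actually tokenises its corpus.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text, as a list of tuples.

    Assumption: words are separated by whitespace and case is ignored.
    Real corpora need more careful handling of punctuation and hyphens.
    """
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_counts(text, n):
    """Count how often each n-gram occurs in text."""
    return Counter(ngrams(text, n))


if __name__ == "__main__":
    sample = "the maple leaf rag is a rag"
    print(ngrams(sample, 2))              # all 2-grams, in order
    print(ngram_counts(sample, 1).most_common(1))  # the most frequent 1-gram
```

For example, `ngrams("the maple leaf rag", 2)` gives `[('the', 'maple'), ('maple', 'leaf'), ('leaf', 'rag')]`. Dividing such counts by the total number of n-grams published in each year would give the kind of relative frequency that the viewer plots over time.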
This article looks at the types of repertoire included in eighteenth-century London concerts. As discussed in the first article of this series, information on the works performed is encoded, in a complicated way, in the “programme” field of the dataset.
The data is based on concert advertisements in newspapers, so there is considerable variation in the detail provided. Some advertisements spell out details of all of the works and who will perform them, but it is more typical for the focus to be on the performers, with the works often vaguely specified, such as “a concerto by Handel” (if you are lucky, it will say what instrument it is for).
I recently stumbled across this page on Wikipedia, listing music students and their teachers. This is an ideal dataset to explore as a network diagram, or “graph”, in which a set of points (or “nodes”) are connected by lines (or “edges”). Here, the nodes are individuals, and there is an edge between them if one taught the other.
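As a rough sketch of how such a network might be represented, the Python below stores each teacher-to-student link as an edge and then follows the edges to find everyone in a teacher's "lineage". The three pairs are well-known examples chosen purely for illustration, not data taken from the Wikipedia page, and the names `build_graph` and `descendants` are my own. I have also assumed that edges are directed, from teacher to student.

```python
from collections import defaultdict

# Illustrative (teacher, student) pairs, not extracted from the Wikipedia list.
EDGES = [
    ("Haydn", "Beethoven"),
    ("Beethoven", "Czerny"),
    ("Czerny", "Liszt"),
]


def build_graph(edges):
    """Map each person (node) to the set of people they taught."""
    graph = defaultdict(set)
    for teacher, student in edges:
        graph[teacher].add(student)
        graph[student]  # touch the key so students with no pupils are still nodes
    return graph


def descendants(graph, teacher):
    """Return everyone reachable from teacher by following teaching edges."""
    seen = set()
    stack = [teacher]
    while stack:
        person = stack.pop()
        for student in graph.get(person, ()):
            if student not in seen:
                seen.add(student)
                stack.append(student)
    return seen


if __name__ == "__main__":
    graph = build_graph(EDGES)
    print(sorted(graph))                        # the nodes
    print(sorted(descendants(graph, "Haydn")))  # Haydn's pupils, their pupils, and so on
```

Keeping the direction of each edge makes it possible to ask who descends from whom. If all that matters is whether two people are connected, each pair could simply be stored in both directions.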
One of the things that seems to distinguish ‘classical’ from ‘popular’ music is that the same classical composers and works can remain at the top for very long periods of time – decades, even centuries – whereas popular songs and artists can reach the top of the charts, sell millions of records, and disappear within a matter of months. But is this difference real?
Christmas music is everywhere at the moment, so I thought I would look at the history of it. In the British Library Music Catalogue, of the one million or so total publications, almost 10,000 – very nearly 1% – have the words ‘Christmas’, ‘Noel’ or ‘Weihnacht’ in the title. This chart shows the proportion by publication date…
Many datasets of composers tell us relatively little about them, so we sometimes have to guess details from the information available – such as the composer’s name. Forenames, for example, are often a good indicator of gender, as described in this previous article. Titles – associated with the church, aristocracy or royalty – can also reveal gender, and tell us about occupation or social class. This article looks at what names can tell us about nationality – based on a recent attempt to identify Italian composers among the many obscure and unknown names listed in the British Library’s music catalogue.
Following this previous article, a friend got in touch to thank me for disproving some astrological ‘nonsense’. I replied that I had not disproved anything – I had just failed to find supporting evidence – but it did get me wondering about the nature of the conclusions that can be drawn from this sort of analysis.
Suppose, for the sake of argument, that people born under Aquarius do show a significantly higher propensity to become composers than those born under Virgo. Consider these three possible explanations…
If you go to the British Library online catalogue, search for music scores published in each year from 1650 to 1920, and plot the number of ‘hits’ by year, the result looks like this.
In what ways can statistical techniques be used to investigate topics in historical musicology? I think there are four main approaches – hypothesis testing, quantification, modelling and exploration. Their use depends on the topic, the data, and the type of question you are trying to answer.
These four types often overlap. It is hard to do modelling without some exploration and quantification, for example. Also, after you have spent so long collecting the data, cleaning it, and getting it into a form for statistical analysis, why not squeeze the most out of it and do some general exploration after testing your hypotheses?