Why Quantify Music History?

There is a shocking absence of statistics in books on music history. Generations of music historians have shown little interest in using statistical analysis to quantify their subject.

But why should it be considered outrageous that music historians have not embraced the tools and techniques that would enable them to quantify music history? After all, there are plenty of excellent accounts of the history of music, all based on thorough and rigorous scholarship and a deep knowledge of the subject. Is this not enough?

There are four main reasons why this situation is unsatisfactory. Firstly, most other historical disciplines routinely use quantitative research alongside qualitative methods – economic and social history, archaeology, book history or the history of art, for example. Statistical techniques are taught to history students, and history journals often include debates among academics regarding the finer points of each other’s quantitative methodologies, such as their approaches to sampling and the way that they have allowed for bias. Music history is conspicuous for being among the few historical fields built almost entirely on qualitative research. The typical music historian researches, in considerable detail, a particular composer, performer, work, event or institution. Sophisticated methods have been developed to squeeze the maximum amount of information from the varied sources used in historical musicology. More than two centuries of thorough, enthusiastic and ingenious research have produced an extraordinary amount of knowledge about a vast range of topics in music history. However, in the absence of quantitative techniques, it is not always easy to make out the big picture of music history from all of these detailed snippets. Our overall narrative of music history is created by attempts to piece together all of these small pieces of qualitative research into a coherent whole. In a field as complex and diverse as music, this makes for an endless and fascinating multiplicity of stories. But it does not necessarily result in a balanced historical account.

The second reason for disappointment at the current situation is that statistical techniques have a great deal to tell us about the history of music. They can reveal things that would be impossible to find using only the traditional qualitative approaches to historical musicology. How, for example, would we have discovered that composers tend to relocate to new cities once every fourteen years on average, independently of where and when they were born? Or that piano works that are well known tend to have more sharps in their key signatures than obscure pieces?[1] Many of these findings would perhaps even fail to be considered as valid lines of investigation by music historians, who are wise enough to know the limitations of their methods and will understandably (though often unconsciously) avoid questions that appear to be intractable.

As well as revealing new things, statistics can fill out our understanding of what we already know. Trends in music can be properly quantified and put into context, so that we can judge how significant they are, and decide to what extent they are supported by hard evidence. Backing up our conclusions with hard evidence would be an important step in bringing historical musicology into line not only with other historical disciplines but with many other branches of musicology. There is, for example, a great deal of sophisticated quantitative research going on in areas such as music analysis and composition, performance research, the study of music perception and psychology, and the design and development of instruments, concert halls and recording technology. Strangely, few of the talented mathematicians, statisticians and computer programmers working in these branches of musicology have so far turned their attention to music history.

The use of statistical techniques alongside qualitative research might also help to avoid some of the mistakes that are all too easy to make in the absence of an objective assessment of quantitative evidence. Whereas qualitative techniques can give us a thorough and robust understanding of the details of music history, they do not, as a rule, lend themselves to reliable generalisation. It is in the extrapolation of their conclusions that qualitative researchers can get into trouble. Quantitative methods, on the other hand, whilst they do not reveal much about the details of individual cases, are ideal for getting to grips with the large scale trends and patterns. The dangerous temptation for quantitative researchers is in trying to determine cause and effect or to ascribe reasons for their findings: this is where qualitative methods are needed. As such, the two approaches work best when they are used together in order to counteract each other’s weaknesses.

The third reason for dismay at the predominance of qualitative research in music history is that, almost by definition, it has resulted in a very narrow and unrepresentative body of knowledge. Whilst the world of music is extraordinarily large and diverse, qualitative researchers, quite naturally, tend to be drawn to those topics that are most interesting and which are represented by a plentiful supply of sources. Over a long period, the weight of scholarly opinion and the scholarship itself tend to reinforce these preferences at the expense of the less interesting and less well documented works, composers, and events. There are, thankfully, a respectable handful of researchers diligently unearthing the more obscure corners of music history, but there is also a depressing tendency for many new books and articles, university syllabuses and doctoral theses to stick to the tried and tested formula of examining the lives and works of a handful of famous composers.

One of the things I hope to demonstrate on this site is the astonishing scale of historical musical activity. As a simple example, even a well-read specialist in the history of music is unlikely to have heard of more than a few hundred composers, yet the number of composers who had works in print in the early years of the twentieth century was perhaps around 90,000. A few of these were very prolific and perhaps made a reasonable living from their compositions, but most were probably amateurs or supported themselves in other ways. On even a generous assessment of the scope of the received history of music, it is clear that it covers less than 1% of the people involved. Of course, there is little possibility of ever finding out much about the lives and works of these obscure composers, but quantitative techniques are at least able to give them a collective voice in the grand narrative of music history. This perspective also highlights the extraordinary achievements (whether by hard work, talent or good fortune) of the tiny proportion of composers who did manage to emerge from all of this chaos to achieve a degree of enduring fame.
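The "less than 1%" figure can be checked with a line or two of arithmetic. The following is only a minimal sketch: the 90,000 total is the rough estimate quoted above, while the counts of 200, 500 and 800 known composers are hypothetical stand-ins for "a few hundred", chosen purely for illustration, and the function name `coverage_share` is likewise my own.

```python
def coverage_share(known: int, total: int) -> float:
    """Return the fraction of `total` composers represented by `known`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return known / total


# Rough estimate from the text: composers with works in print
# in the early years of the twentieth century.
TOTAL_IN_PRINT = 90_000

# Hypothetical values for "a few hundred" composers known to a specialist.
for known in (200, 500, 800):
    share = coverage_share(known, TOTAL_IN_PRINT)
    print(f"{known:>4} known composers -> {share:.2%} of {TOTAL_IN_PRINT:,}")
```

Even the most generous of these assumed counts stays below one percent of the estimated total, which is all the argument requires.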

A desire to understand the 99% of musical activity missed by traditional research can be seen as part of the recent trend among historians away from history based on the lives of great men and towards one based on the lives of ordinary people. The chances are that music historians have already successfully identified the most talented composers and performers, and the majority of the finest and most innovative compositions, and it is likely that many (but by no means all) of these obscure and unstudied composers and works are, in themselves, unremarkable. But that is not the point. History is not just about kings and queens and the great explorers, generals, artists and scientists. It is surely also about what life was like for ordinary people, how they made a living, how they were affected by the changes going on around them, and how that compares with other groups, in other places and at other times. In musical terms, whilst it is undeniably important that we endeavour to discover as much as we can about the lives and works of Johann Sebastian Bach, Igor Stravinsky and Elvis Presley, it is a great shame that we know so little about the myriad ‘little guys’ who failed to make it to the top but have nevertheless formed the bedrock of musical activity.

The fourth reason, and in some ways the most shocking aspect of the near-absence of statistics from music history, is that there are enormous quantities of historical data that are suitable for statistical analysis. Large numbers of music enthusiasts and institutions have spent the last few centuries collecting music, cataloguing it, compiling dictionaries, biographies, encyclopedias and guides on musical topics, and producing lists and databases of composers, works, events, publishers, publications, performers, instruments, recordings, and many other component parts of the musical world.[2] Multiply this activity by all the different regions, periods, genres and musical styles, and the end result is a bewildering array of datasets, each of which can be viewed as a snapshot, from a particular perspective, of part of the population of composers and their works.

Accessing much of this data is straightforward, although it usually needs careful preparation before it can be used for statistical analysis. Analysing it presents a few challenges, but does not usually require any great degree of sophistication. Interpreting the results, however, can be difficult, and often leads us into deep and subtle issues that have profound implications for historical musicology as a whole, whether qualitative or quantitative in its origins. These difficulties are related to two forms of bias that are inherent in many of the sources used to study music history. The first of these is an asymmetry between the information we have about well-known composers and what we know about their obscure counterparts. This asymmetry can easily and unavoidably result in unequal treatment of these two groups when they appear together in a statistical investigation, and this can cause real problems of interpretation. The second bias is a geographical one that privileges the history of those regions where music collecting and scholarship have been most active (such as, in the nineteenth century, Britain and Germany) over that of those countries where, for whatever reason, musical activity has to a greater extent gone unrecorded. Although, on the face of it, this second form of bias is easier to take into account in a quantitative analysis, it has a deep, widespread and insidious corollary in its impact on aesthetics and musical taste that endures to the present day. Is, for example, the music of the great Germanic composers really ‘better’ than that of their unknown contemporaries from Iberia or the Balkans (as one might reasonably infer based on its greater representation in concert performances, broadcasts and music syllabuses), or is its apparent greatness simply a reflection of the higher level of exposure, familiarity, and scholarly attention that is today’s legacy of the patterns of musical scholarship in the nineteenth century?

The richest sources of readily available data relate to western classical music from the last three or four centuries, and this will form the subject matter of much of this site. However, there are also many datasets covering popular music (the record sales charts, for example), jazz discographies, the numerous collections of folk music from different nations, catalogues of non-western music (such as the extensive literature on Indian music), and the large databases of medieval and renaissance manuscript sources. Much of this data is freely available, although there are many restricted or commercial sources that are also of great value for this sort of analysis. A lot of interesting data from the twentieth century, for example, is still subject to copyright restrictions so is not yet in the public domain.

I have argued that quantitative techniques can present a more objective view of music history as they are less prone to the tendency of qualitative research to cluster around the most well-known topics. That is not to say that they are immune to the influence of the researcher. As in any research, the researcher chooses the topics of interest, picks the data sources that appear most relevant (whilst being both available and in a usable format) and selects the analytical techniques that, in his or her judgement, are most appropriate for the job given the available time, resources and expertise. The interpretation and presentation of the results also require judgements to be made on what to include or leave out, what is most or least important, and how best to communicate to the intended audience. When the stakes are high – in medical or scientific research – protocols are in place to prevent statisticians from having too much undeclared latitude on any of these issues. In historical musicology, where the stakes are rather low and there is no established tradition of statistical research, I have no qualms about having made the most of my freedom to pick the topics, datasets, statistical techniques and styles of visualisation that appear on this site, and to present them in a way that illustrates the potential value of quantitative techniques in this field, as well as some of their hazards and difficulties. My hope is that this site will make a small contribution towards reducing the scarcity of statistics in music history research, and that the examples presented here will encourage and inspire others to explore the history of music in quantitative terms.

Cite this article as: Gustar, A.J. 'Why Quantify Music History?' in Statistics in Historical Musicology, 21st July 2017, https://musichistorystats.com/why-quantify-music-history/.


  1. These examples are both from case studies in my PhD thesis.
  2. See the Datasets pages for some examples.
