Song Lyrics 4: Sentiment Analysis - Statistics in Historical Musicology

In this fourth article in the series looking at our song lyrics dataset, we will begin to consider the meaning of the lyrics, rather than just treating the words as abstract objects. A simple technique for quantifying the meaning of texts is known as ‘sentiment analysis’.

The basic idea of simple sentiment analysis is to count the number of ‘positive’ words in a text, and the number of ‘negative’ words, and to calculate a sentiment score, such as (positive – negative) / (positive + negative). This will give a number between +1 (entirely positive) and -1 (entirely negative).
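As a rough illustration of the formula above, here is a minimal Python sketch. The two word sets are tiny made-up examples, not entries checked against any real lexicon, and the function name `sentiment_score` is my own. Returning `None` when a text contains no sentiment words is also an assumption, since the formula is undefined in that case.

```python
import re

# Hypothetical toy word lists, for illustration only (not a real lexicon).
POSITIVE = {"love", "happy", "sweet"}
NEGATIVE = {"cry", "lonely", "pain"}


def sentiment_score(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (pos - neg) / (pos + neg), or None if no lexicon words occur."""
    # Lower-case the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for w in words if w in positive)
    neg = sum(1 for w in words if w in negative)
    total = pos + neg
    if total == 0:
        return None  # the score is undefined without any sentiment words
    return (pos - neg) / total
```

A text containing only positive words scores +1, one with equal numbers of positive and negative words scores 0, and one containing only negative words scores -1.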

Before discussing some of the pros and cons of this technique, let’s apply the idea to all of the songs in our dataset, grouped by year.¹ The following chart shows the sentiment scores calculated from the total number of positive and negative words for the song lyrics from each year.
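Scoring each year from its total word counts, as described above, is not the same as averaging the scores of individual songs. A minimal sketch of the pooled approach might look like this, assuming each song has already been reduced to a hypothetical `(year, positive_count, negative_count)` tuple. The function name and data layout are illustrative, not the code used for the article.

```python
from collections import defaultdict


def yearly_scores(songs):
    """Pool positive/negative counts by year, then score each year.

    `songs` is an iterable of (year, positive_count, negative_count) tuples.
    """
    totals = defaultdict(lambda: [0, 0])
    for year, pos, neg in songs:
        totals[year][0] += pos
        totals[year][1] += neg
    # Years with no sentiment words at all are skipped (score undefined).
    return {
        year: (p - n) / (p + n)
        for year, (p, n) in sorted(totals.items())
        if p + n > 0
    }


# Made-up counts, purely to show the calculation.
example = [(1979, 8, 2), (1979, 1, 1), (1980, 1, 3)]
scores = yearly_scores(example)
```

Pooling gives long songs more weight than short ones, which seems a reasonable choice when the aim is to describe the lyrics of a year as a whole.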

It seems that song lyrics were, on average, neutral until about 1980, when they suddenly became more negative. Sentiment then continued downwards until the late 1990s. We will come back to this interesting finding, but first we need to look in more detail at what the sentiment analysis is actually doing.

The ‘Bing’ in the title of the chart refers to a particular lexicon of positive and negative words developed by Bing Liu in 2004 for analysing customer reviews.² Bing’s lexicon contains 2,005 positive words and 4,781 negative words – this partly explains why the overall scores above are generally quite negative.

Other sentiment lexicons are available. One is the ‘Loughran’ list, developed by Loughran & McDonald in 2011, primarily for analysing financial reports.³ The Loughran lexicon has just 354 positive words against 2,355 negative words. As well as ‘positive’ and ‘negative’, it also has categories ‘constraining’, ‘litigious’, ‘superfluous’ and ‘uncertainty’, reflecting the purpose for which it was developed. Running the song lyrics through the Loughran lexicon results in lower scores overall, and a less pronounced decline in sentiment over time compared with Bing:

Both Bing and Loughran assign words as either positive or negative. The AFINN lexicon, developed for an analysis of ‘microblogs’, gives words a score between +5 (very positive) and -5 (very negative).⁴ So, for example, ‘breathtaking’ and ‘superb’ (scoring +5) are rated as more positive than ‘agree’ and ‘laugh’ (scoring +1).
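With a valued lexicon such as AFINN, a text can be scored by averaging the values of the words it contains that appear in the lexicon. The sketch below is one plausible way of doing this, not necessarily the calculation behind the charts in this article. The four positive values are those quoted above, while the two negative entries are hypothetical values added only so that the example has something to subtract.

```python
import re

TOY_SCORES = {
    # Values quoted in the article text.
    "breathtaking": 5, "superb": 5, "agree": 1, "laugh": 1,
    # Hypothetical values, not checked against the AFINN list.
    "sad": -2, "hate": -3,
}


def mean_valence(text, scores=TOY_SCORES):
    """Return the mean score of the lexicon words in `text`, or None if none occur."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [scores[w] for w in words if w in scores]
    if not hits:
        return None
    return sum(hits) / len(hits)
```

Unlike the simple positive/negative count, a single strongly rated word here can outweigh several mildly rated ones.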

As the chart below shows, the AFINN lexicon produces roughly the same overall shape as Bing, but the overall level is less negatively biased, despite AFINN still having about twice as many negative words (1,599) as positive (878).

Each of these three lexicons was designed for a specific purpose, none of which is related to song lyrics, so we must treat them with caution. There are even differences in how some words are categorised. ‘Intense’, for example, is rated negative in Bing but positive in AFINN, whereas ‘defeated’ is positive in Bing but negative in AFINN. ‘Exonerate’ is positive in AFINN and Bing, but negative in Loughran.

A more general lexicon is the NRC Emotion Lexicon, which was generated by crowdsourcing. As its name suggests, in addition to general ‘positive’ and ‘negative’ words, NRC also assigns words to eight basic emotions: anger, fear, anticipation, trust, surprise, sadness, joy, and disgust. The positive/negative analysis shows the same pattern as Bing and AFINN, but at a noticeably higher level, as NRC has a better balance of positive and negative words (2,312 to 3,324). The emotion analysis is shown in the following chart:

The ‘mean score’ here is the proportion of words assigned to that emotion, among those that are assigned to any emotion (so the eight categories should add up to 100%, apart from the fact that some words are assigned to more than one emotion). Anger, disgust and fear are all steadily on the rise, whilst joy and trust are falling (with anticipation, sadness and surprise roughly constant).
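The ‘mean score’ calculation described above can be sketched as follows. The word-to-emotion mapping is a made-up toy example, not the real NRC assignments, and the function name is my own. The denominator is the number of words assigned to at least one emotion, so the proportions can sum to more than 1 when words carry several emotions.

```python
import re
from collections import Counter

# Hypothetical toy mapping, for illustration only (NOT the NRC lexicon).
TOY_EMOTIONS = {
    "love": {"joy", "trust"},
    "ghost": {"fear"},
    "fight": {"anger", "fear"},
}


def emotion_proportions(text, lexicon=TOY_EMOTIONS):
    """Return each emotion's share of the words that carry any emotion."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    matched = 0  # number of words assigned to at least one emotion
    for w in words:
        emotions = lexicon.get(w)
        if emotions:
            matched += 1
            counts.update(emotions)  # a word may count towards several emotions
    if matched == 0:
        return {}
    return {emotion: n / matched for emotion, n in counts.items()}
```

For the text ‘love love fight’, three words carry an emotion: ‘joy’ and ‘trust’ each get 2/3, and ‘anger’ and ‘fear’ each get 1/3, giving a total of 2 rather than 1.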

We can show this using a similar chart to those used in previous articles. Here is the ranking of emotion words by decade, with text height reflecting proportions relative to the most common.

The 1980s saw a sudden increase in ‘fear’ words. Perhaps more significantly, the 1980s also heralded a much better balance between the emotions: in the 1950s, 60s and 70s, ‘anger’, ‘disgust’ and ‘surprise’ were used much less than the most common emotion ‘joy’ (indicated by the small text), but after 1980 the gap between the most and least common emotions is much narrower. So perhaps 1980 is better described not as the moment that songs became more negative, but as the point at which they became more representative of the full range of emotions.

A potential problem with sentiment analysis is that it takes no account of the relationships between words. In particular, the words ‘not’ or ‘no’ can completely change the meaning of an adjacent word. Similarly, modifiers such as ‘very’ and ‘barely’ can make significant differences.⁵ There are more sophisticated sentiment analysis techniques that attempt to work at a sentence-by-sentence level, rather than word-by-word, taking account of negators, modifiers, and other structural factors that can affect the meaning. These techniques can work well in the right circumstances, but tend to be much slower.⁶
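To illustrate the negation problem, here is a deliberately crude sketch that flips the polarity of a sentiment word when the word immediately before it is a negator or ends in ‘n't’. It is not how the more sophisticated sentence-level techniques work. The word lists and the function name are hypothetical, and looking only one word back is an assumption made to keep the example short.

```python
import re

# Hypothetical toy word lists, for illustration only.
POSITIVE = {"love", "happy"}
NEGATIVE = {"cry", "lonely"}
NEGATORS = {"not", "no", "never"}


def negation_aware_counts(text):
    """Count positive and negative words, flipping any that follow a negator."""
    # Normalise curly apostrophes so that "don’t" and "don't" are treated alike.
    words = re.findall(r"[a-z']+", text.lower().replace("’", "'"))
    pos = neg = 0
    for i, w in enumerate(words):
        if w in POSITIVE:
            polarity = 1
        elif w in NEGATIVE:
            polarity = -1
        else:
            continue
        prev = words[i - 1] if i > 0 else ""
        if prev in NEGATORS or prev.endswith("n't"):
            polarity = -polarity  # 'not happy' is counted as negative
        if polarity > 0:
            pos += 1
        else:
            neg += 1
    return pos, neg
```

With these rules, ‘I am not happy’ counts as one negative word and ‘I don't cry’ as one positive word, whereas plain word counting would give the opposite result in each case.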

There is obviously a lot of scope for applying sentiment and emotion analysis to explore data in different ways. Rather than an analysis by year, for example, we could compare artists. As an example, here is a plot of some top artists (‘top’ in terms of the number of songs in the dataset and whether I have heard of them or not) according to their scores on ‘joy’ (vertical axis) and ‘fear’ (horizontal):

Many other such charts can be produced, and many conclusions drawn, some of them interesting. As with other text analysis techniques, it can be dangerous to take these results too seriously, but they can nevertheless be useful ways of exploring data and of highlighting possible questions for further research.

Cite this article as: Gustar, A.J. 'Song Lyrics 4: Sentiment Analysis' in Statistics in Historical Musicology, 19th August 2019, https://musichistorystats.com/song-lyrics-4-sentiment-analysis/.

¹ For further details of the dataset, see the first article in the series.
² Minqing Hu & Bing Liu, “Mining and summarizing customer reviews”, Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD-2004), Seattle, Washington, USA, Aug 22-25, 2004.
³ Loughran, T. & McDonald, B. (2011), “When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10‐Ks”, The Journal of Finance, 66: 35-65. doi:10.1111/j.1540-6261.2010.01625.x
⁴ Finn Årup Nielsen, “A new ANEW: evaluation of a word list for sentiment analysis in microblogs”, in Matthew Rowe, Milan Stankovic, Aba-Sah Dadzie, Mariann Hardey (editors), Proceedings of the ESWC2011 Workshop on ‘Making Sense of Microposts’: Big things come in small packages, CEUR Workshop Proceedings, Volume 718: 93-98, May 2011.
⁵ About 2% of the words in song lyrics are ‘not’, ‘no’ or end in ‘…n’t’.
⁶ The R packages qdap and sentimentr both have this capability, for example.
