Indonesians’ Song Lyrics Topic Modelling Using Latent Dirichlet Allocation

Published in IEEE 2018 5th International Conference on Information Science and Control Engineering (ICISCE), 2018

Lyrics play an important role in giving a song its identity and storyline, and they are also considered one of the factors that influence a song's popularity. However, manually assessing the topics of a large number of songs is difficult, especially for Indonesian songs, whose lyrics are highly poetic. Understanding the intrinsic value of lyrics across many songs is a challenge, particularly when the lyrics are written in a complex language such as Bahasa. This paper aims to identify and interpret the topics in Indonesian song lyrics. The songs were obtained from the daily Spotify Top 200 chart from January 2017 to January 2018, yielding 193 distinct songs with lyrics in Bahasa. Latent Dirichlet Allocation (LDA) was used for topic modeling. With 10 topics, chosen based on perplexity results, LDA provides a suitable way to interpret the topics across many songs by reporting the top words in each topic and the topic probabilities for each document (song).
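
To illustrate the workflow described in the abstract (fit LDA, compare perplexity across candidate topic counts, then read off top words per topic and topic probabilities per song), here is a minimal sketch in plain Python. It is not the paper's implementation: the collapsed Gibbs sampler, the hyperparameters `alpha` and `beta`, the function names, and the tiny toy corpus are all assumptions made for illustration, and the perplexity is computed on the training documents themselves.

```python
import math
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative, not the paper's code)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    # Count tables: document-topic, topic-word, and per-topic totals.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    # Smoothed estimates: theta = topics per document, phi = words per topic.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
         for v in range(n_words)]
        for k in range(n_topics)
    ]
    return vocab, theta, phi


def perplexity(docs, vocab, theta, phi):
    """exp(-average log-likelihood per token); lower is better."""
    word_id = {w: i for i, w in enumerate(vocab)}
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(docs):
        for w in doc:
            v = word_id[w]
            p = sum(theta[d][k] * phi[k][v] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


def top_words(vocab, phi, n=3):
    """Return the n most probable words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in phi
    ]


# Toy, invented "lyrics" (already tokenised); NOT the paper's Spotify data.
toy_docs = [
    ["cinta", "hati", "rindu", "cinta", "sayang"],
    ["hati", "rindu", "cinta", "sayang", "hati"],
    ["malam", "bintang", "langit", "malam", "bulan"],
    ["langit", "bulan", "bintang", "malam", "langit"],
]

# Compare candidate topic counts by perplexity, as the abstract describes.
scores = {}
for k in (2, 3, 4):
    vocab_k, theta_k, phi_k = fit_lda(toy_docs, n_topics=k)
    scores[k] = perplexity(toy_docs, vocab_k, theta_k, phi_k)

vocab, theta, phi = fit_lda(toy_docs, n_topics=2)
words_per_topic = top_words(vocab, phi, n=3)
print(scores)
print(words_per_topic)
print(theta[0])
```

Each row of `theta` plays the role of the abstract's "topic probabilities for each document or song", and `words_per_topic` corresponds to the "top words in every topic". In practice perplexity is usually evaluated on held-out documents; training-set perplexity is used here only to keep the sketch short.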
