A Quantitative Analysis of the Dissemination of Religious Notions in Contemporary English Texts (15.06.2022)
Aim of this Research Project
The project has set itself the task of quantitatively examining central religious concepts such as sacred, ritual, purity, and religion in terms of their prevalence, use, and meaning in various social fields such as politics and art. From a theoretical perspective, the project’s focus lies on the interplay between religion and various non-religious parts of society, with a special focus on (de)sacralization processes (see Krech 2018) and the concept of the “religioid” coined by Simmel (see Tyrell 2018).
Empirically, this project will build on the religious studies text corpus ReligionML, which includes documents from different social fields and annotates them to enable data-driven research in the study of religion (more details on ReligionML and the first annotation attempts can be found on the ReligionML website). For several reasons, the initial phase of the project will concentrate on gathering, tagging, and examining Twitter data (for more details, see the following parts and the section on Data in particular).
This research project aims to contribute to research on the interplay between religion and other societal spheres (such as politics)[1] by analyzing public discourses about and from within religion. The overall theoretical assumption behind this attempt is “that phenomena of the religious field are defined in relation to other religious constituents and other social and cultural facts outside the religious field” (quote from the introduction of the forthcoming ER special issue on “Religion and Images,” following Krech 2020).
To analyze the ways in which these interdependences between the various social fields have shaped religion (and the other way round), this project builds upon the concept of the “religioid” coined by Georg Simmel, which describes the emergence of religious semantics and of religion as a whole from (non-religious) social processes:
Als Ausgangspunkt dient die empirische Herleitung von Religion aus sozialen Vorgängen mittels des Differenzierungsparadigmas und der daraus resultierenden Unterscheidung von Religion als autonomer Vorstellungswelt und „Religioidem“ […], das bereits in Vergesellschaftungsprozessen vorzufinden ist. [Krech (2018), 337] [2]
Complementing the concept of the religioid, the project also draws on the process of (de)sacralization, which can be seen as an extension of the concept of secularization. In a linguistic context, secularization is commonly understood as the deconstruction of the religious origin of concepts and their interpretation in a non-religious context (see Schlette 2009). Desacralization, by contrast, is understood together with its counterpart, sacralization, as part of a broader phenomenon oscillating between the poles of increasing (sacralization) and decreasing (desacralization) religious interpretations (Krech 2018, 445). In this context, sacralization is understood as the process of religious interpretations of “vormals religiös insignifikante Entität[en]” (Krech 2018, 443),[3] by religious actors (e.g. religious institutions) and non-religious participants alike:
Unter Sakralisierung ist in der Religionsforschung ganz allgemein zu verstehen, dass sich die kollektive Verständigung über das Heilige vom religiösen in den nicht-religiösen Bereich ausweitet oder verlagert. [Schlette and Krech (2018), 437] [4]
The process of (de)sacralization can shift the meaning of concepts from the religious to the non-religious realm, but it can also take the opposite path by linking formerly non-religious phenomena to the realm of religion (see the example of civil religion in Bellah 1967).
The project “A Quantitative Analysis of the Dissemination of Religious Notions in Contemporary English Texts” will attempt to quantitatively analyze religioid uses of certain terms and the process of (de)sacralization using texts from various social fields. Potential social fields that could be examined in this context are found, among others, in the section “Religion im gesellschaftlichen Kontext” in the Handbuch Religionssoziologie (Pollack et al. 2018, 657–859). For several reasons (among others: availability, copyright issues, relevance), the project is currently limited to the three fields of politics, art, and media.[5]
The methodology is based on the quantitative analysis of the collected and annotated data, which will initially consist of Twitter data but also include other sources at a later stage (see section Data below). The data will be tagged according to the annotation schemata described on the ReligionML Corpus website, thereby both iteratively enhancing the annotation schemata and facilitating a later integration into the ReligionML corpus.
The exploratory data analysis will be performed by applying machine learning algorithms from the field of unsupervised learning (see Géron 2019) and other statistical measures (such as the phi coefficient) to the annotated data. The clusters resulting from these steps will be analyzed using established methods from the field of corpus linguistics to achieve a better understanding of the use and meaning of the terms under investigation (including keyword analyses, collocation analyses, and word frequency lists, see Brezina 2018).
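As an illustration of one of the statistical measures named above, the following minimal sketch computes the phi coefficient for the co-occurrence of two terms across a set of documents. The function name `phi_coefficient`, the whitespace tokenization, and the toy documents are assumptions made for this example only; they are not the project's actual pipeline or data.

```python
import math


def phi_coefficient(docs, term_a, term_b):
    """Phi coefficient for the co-occurrence of two terms across documents.

    Each document contributes to one cell of a 2x2 contingency table:
    both terms present, only term_a, only term_b, or neither.
    """
    n11 = n10 = n01 = n00 = 0
    for doc in docs:
        # Naive tokenization (assumption): lowercase and split on whitespace.
        tokens = set(doc.lower().split())
        has_a = term_a in tokens
        has_b = term_b in tokens
        if has_a and has_b:
            n11 += 1
        elif has_a:
            n10 += 1
        elif has_b:
            n01 += 1
        else:
            n00 += 1
    # Product of the four marginal totals of the contingency table.
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # Undefined when a term occurs in all or in no documents; 0.0 is a
        # convention chosen for this sketch.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


# Hypothetical toy documents, invented for illustration.
toy_docs = [
    "the sacred ritual began",
    "a sacred ritual of purity",
    "politics and art today",
    "art in the gallery",
]

print(phi_coefficient(toy_docs, "sacred", "ritual"))
print(phi_coefficient(toy_docs, "sacred", "art"))
```

Values near 1 indicate that two terms tend to appear in the same documents, values near -1 that they tend to exclude each other, and values near 0 that there is no linear association.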
The quantitative methods will be expanded to include a qualitative level of analysis focusing on selected textual examples drawn from the quantitative analysis. The mixed methods approach used here builds on a procedure I developed in my dissertation (Jurczyk 2022) that will hopefully be extended during this project.[6]
An example in this context is the analysis of (English) terms such as "sacred" or "holy" in documents of different provenance and genres (Twitter, newspaper articles, forums, art, politics, etc.), which are collected and annotated in the ReligionML Corpus and analyzed and discussed under the questions addressed in the theory section (I took a first step in this direction in my dissertation, see Jurczyk 2022).
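A term-centred analysis of this kind could start from a simple collocation count. The sketch below counts the words that appear within a fixed window around a node term such as "sacred". The function name `collocates`, the regular-expression tokenizer, the window size, and the example sentences are all assumptions for illustration; the project's own corpus tooling may differ.

```python
import re
from collections import Counter


def collocates(texts, node, window=2):
    """Count words occurring within `window` tokens of each hit of `node`."""
    counts = Counter()
    for text in texts:
        # Simple tokenizer (assumption): lowercase runs of letters/apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, tok in enumerate(tokens):
            if tok != node:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


# Hypothetical example sentences, invented for illustration.
sample_texts = [
    "The sacred duty of voters",
    "A sacred duty, they said",
]

for word, freq in collocates(sample_texts, "sacred", window=1).most_common():
    print(word, freq)
```

Running the same count separately per genre (for example tweets versus newspaper articles) would allow the collocates of a term to be compared across social fields.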
At the beginning of the project, data acquisition will focus on tweets gathered via the Twitter API v2 that include certain religious terms, such as “religion,” “holy,” “sacred,” and others. These tweets will be annotated according to the schemata discussed on the ReligionML Corpus website, meaning that they will be tagged at the sentence (document) and word level. In the long run, the annotated tweets should be archived and processed in a structured manner as part of the ReligionML Corpus.
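The term-based tweet collection described above can be sketched as a query builder for the Twitter API v2 recent search endpoint. The source does not say which v2 endpoint or filters the project uses, so the choice of the recent search endpoint, the restriction to English, the exclusion of retweets, and the helper names `build_query` and `build_params` are assumptions. The sketch only builds request parameters and makes no network call; an actual request would additionally need a bearer token.

```python
# Recent search endpoint of the Twitter API v2 (choice of endpoint is an
# assumption; the source only names "Twitter API v2").
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_query(terms, lang="en", exclude_retweets=True):
    """Combine search terms into a single v2 search query string."""
    # Multi-word terms are quoted so they match as exact phrases.
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    query = "(" + " OR ".join(quoted) + ")"
    if lang:
        query += f" lang:{lang}"
    if exclude_retweets:
        query += " -is:retweet"
    return query


def build_params(terms, max_results=100):
    """Build the query parameters for a recent search request."""
    return {
        "query": build_query(terms),
        "max_results": max_results,
        "tweet.fields": "created_at,lang",
    }


# Terms taken from the examples named in the text.
religious_terms = ["religion", "holy", "sacred"]
print(SEARCH_URL)
print(build_params(religious_terms))
```

Keeping the query construction separate from the request itself makes it easy to log exactly which terms and filters produced a given batch of tweets.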
At a later point of the project, other (digitized) contemporary and historical sources (such as newspapers, books, Wikipedia, etc.) should also be included in order to complement the analysis with a diachronic layer and text genres that represent additional parts of the public sphere.
There are two major reasons for focusing on Twitter data first:
Firstly, Twitter has the enormous advantage of being widely used by religious actors (both private persons and institutions) as well as persons communicating about religion(s) between various societal spheres. It covers a broad spectrum of what is known as ‘public religion’ (Casanova 2012, 2008) and the public discourse about religion (Neumaier 2018, 834). Therefore, Twitter data provides an excellent (and constantly growing) empirical basis and thus starting point for a research project that aims to examine the relation between religion and other parts of society. Yet, the concentration on Twitter data also adds several potential pitfalls to the analysis because the communication on Twitter follows certain rules and forms a distinct genre of communication (see Bouvier and Rosenbaum 2020). Thus, it is essential during the first stages of the examination to keep in mind that Twitter communication is only partly representative of the public discourse on religion and to complement the Twitter analysis by integrating other sources at a later stage (see again the idea behind the ReligionML Corpus).
Secondly, from a copyright and pragmatic[7] point of view, working with already digitized and freely available, copyright-free source material (such as Twitter data) has the advantage of saving a lot of time while also avoiding potential legal issues.
[1] The term “social sphere” is applied with a vague reference to Bourdieu (2021) and Niklas Luhmann’s social systems (2021). It takes up the observation that modern differentiated societies consist of various social spheres (or fields/systems; I use these terms synonymously throughout the project) that can be understood as autonomous systems which interact with (and influence) each other.
[2] “The starting point is the empirical derivation of religion from social processes by means of the differentiation paradigm and the resulting distinction between religion as an autonomous world of imagination and a ‘religioid’ [...], which is already found in processes of socialization.”
[3] “formerly religiously insignificant entity[ies].”
[4] “In religious studies, sacralization is generally understood to mean that the collective understanding of the sacred expands or shifts from the religious to the non-religious realm.”
[5] For the relation between religion and politics, see Willems (2018). For the relation between religion and art, see Krech (2018). For the relation between religion and media/online discourses, see Neumaier (2018, 2016).
[6] For an intriguing discussion of mixed methods approaches in digital humanities and their critique, see Kleymann (2022).
[7] The keyword here is automated acquisition of data through web scraping.