AI for Scholarship: How Machine Learning Can Transform the Humanities

In a previous blog, I explored how AI will speed up scientific research. In this blog, I will examine the overlooked potential that AI has to transform the Humanities. This connection may not be clear at first, since most of these fields do not include an element of science or math. They are more preoccupied with developing theories than with testing hypotheses through experimentation. Subjects like Literature, Philosophy, History, Languages and Religious Studies (and Theology) rely heavily on the interpretation and qualitative analysis of texts. In such an environment, how could mathematical algorithms be of any use?

Before addressing the question above, we must first look at the field of Digital Humanities, which created a bridge from ancient texts to modern computation. The field dates back to the 1940s, before the emergence of Artificial Intelligence. Ironically, and interestingly relevant to this blog, the first project in this area was a collaboration between an English professor, a Jesuit Priest and IBM to create a concordance for Thomas Aquinas’ writings. As digital technology advanced and texts became digitized, the field has continued to grow in importance. Its primary purpose is both to apply digital methods to the Humanities and to reflect on their use. That is, its practitioners are interested not only in digitizing books but also in evaluating how the use of a digital medium affects human understanding of these texts.

Building on the foundation of Digital Humanities, the connection with AI becomes clear. Once computers can ingest these texts, text mining and natural language processing become possible. With the recent advances in machine learning algorithms, the falling cost of computing power and the availability of open source tools, the conditions are ripe for an AI revolution in the Humanities.

How can that happen? The use of machine learning in combination with Natural Language Processing can open avenues of meaning that were not possible before. For centuries, these academic subjects have relied on the accumulated analysis of texts performed by humans. Yet, human capacity to interpret, analyze and absorb texts is finite. Humans do a great job at capturing meaning and nuances in texts of hundreds or even a few thousand pages. As the volume increases, however, machine learning can detect patterns that are not apparent to a human reader. This can be especially critical in applications such as author attribution (determining who the writer was when that information is not clear or in question) and the analysis of cultural trends, semantics, tone and relationships between disparate texts.
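
To make author attribution a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not an established method from any particular study: it compares how often each text uses a handful of common function words and picks the candidate author whose profile is closest. The word list and the names profile, cosine and attribute are hypothetical choices for this example; real stylometric work uses far longer word lists and much larger samples.

```python
import math
import re
from collections import Counter

# Hypothetical, very short list of English function words (an assumption
# for illustration; real studies use hundreds of such words).
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "is", "it", "but", "not"]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(a, b):
    """Cosine similarity between two equal-length frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def attribute(disputed, candidates):
    """Return the name of the candidate whose profile best matches the disputed text.

    candidates maps an author name to a sample of that author's known writing.
    """
    target = profile(disputed)
    return max(candidates, key=lambda name: cosine(target, profile(candidates[name])))
```

In use, one would call attribute(disputed_text, {"Author A": sample_a, "Author B": sample_b}) and treat the answer as a hint worth investigating rather than a verdict.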

Theology is a field that is particularly poised to benefit from this combination. For those unfamiliar with Theological studies, it is a long and lonely road. Brave souls aiming to master the field must undergo more schooling than physicians. In most cases, aspiring scholars must complete a five-year doctorate program on top of 2-4 years of master's-level studies. Part of the reason is that the field has accumulated an inordinate amount of primary sources and countless interpretations of these texts. They were written in multiple ancient and modern languages and span thousands of years. In short, when reams of texts can become Big Data, machine learning can do wonders to synthesize, analyze and correlate large bodies of texts.

To be clear, that does not mean that machine learning will replace painstaking scholarly work. Quite the opposite: it has the potential to speed up and automate some tasks so scholars can focus on high-level abstract thinking, where humans still hold a vast advantage over machines. If anything, it should make their lives easier and possibly shorten the time it takes to master the field.

Along these lines of augmentation, I am thinking about a possible project. What if we could apply machine learning algorithms to a theologian's body of work and compare the results to the scholarship that interprets it? Could we find new avenues of meaning that could complement or challenge prevailing scholarship on the topic?
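
As a very rough first step toward such a comparison, the Python sketch below ranks the terms that a primary text emphasizes more than the commentary written about it. Everything here is an assumption for illustration: the function names term_frequencies and underexplored_terms are hypothetical, and a plain frequency gap is a crude stand-in for the machine learning methods a real project would need. It also does nothing to filter out common words or to handle multiple languages.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Map each word in the text to its share of all words in that text."""
    words = re.findall(r"[a-z]+", text.lower())
    total = len(words) or 1  # avoid division by zero on empty input
    return {w: c / total for w, c in Counter(words).items()}


def underexplored_terms(primary, scholarship, top_n=5):
    """Rank terms by how much more the primary text uses them than the scholarship.

    A large positive gap suggests a theme the commentary may have neglected.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    p = term_frequencies(primary)
    s = term_frequencies(scholarship)
    gaps = {w: freq - s.get(w, 0.0) for w, freq in p.items()}
    ranked = sorted(gaps, key=lambda w: (-gaps[w], w))
    return ranked[:top_n]
```

Terms surfaced this way would only be candidates for a closer reading; deciding whether they point to a genuinely neglected theme would remain the scholar's job.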

I am curious to see what such an experiment could uncover.