The previous post outlined the basics of text mining and explored some possible avenues for analysis. I learnt a lot in the process and hopefully was able to convey at least some of the potential for digital exploitation of historic texts. In this post I want to label each offence with as precise a date as possible and also assign some basic categories to the allegations. I can do the first by extracting days, months, or years from the text to new columns and then pairing these with the original text.
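As a rough illustration of the date-extraction idea, here is a minimal sketch in Python (not the code used in the post itself). It assumes that dates appear in the text in a modern form such as "12 June 1381", "March 1380", or just "1379"; real court records often date events by regnal year or feast day, which this would not catch. The names `DATE_RE` and `extract_date` are hypothetical, as are the sample sentences.

```python
import re

# Assumption: month names are spelled out in full, in English.
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# An optional "day month" or "month" prefix, followed by a four-digit year.
DATE_RE = re.compile(
    rf"\b(?:(?:(?P<day>\d{{1,2}})(?:st|nd|rd|th)?\s+)?"
    rf"(?P<month>{MONTHS})\s+)?(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date found as new 'columns', paired with the text."""
    match = DATE_RE.search(text)
    if match is None:
        return {"day": None, "month": None, "year": None, "text": text}
    day = match.group("day")
    return {
        "day": int(day) if day else None,
        "month": match.group("month"),  # None when only a year is given
        "year": int(match.group("year")),
        "text": text,
    }


# Hypothetical sample entries, invented for illustration only.
entries = [
    "On 12 June 1381 the defendant broke into the close.",
    "In March 1380 he took two oxen.",
    "In 1379 they assaulted the bailiff.",
    "No date given.",
]
rows = [extract_date(entry) for entry in entries]
```

Each row keeps the original text alongside whatever date parts could be found, so an offence with only a year is still labelled as precisely as the source allows.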
This post is based on a talk I gave to Information Services staff at the Templeman Library in March 2019. Umberto Eco, in his 1977 How to Write a Thesis, spends over twenty pages outlining the various ways in which you should cite works pertinent to your research on index cards. While there is still a need to understand many of the principles that Eco outlines, there is now a variety of software that should ease the practicalities of managing your references.
I was really looking forward to giving a paper on text mining medieval court records at the International Medieval Congress at Leeds this year, in session 836, which was organised by Dr Claire Kennan and Dr Emma J. Wells. I had intended to use the process as an excuse to learn text mining using R. I thought it might provide some material for my thesis, and I am quite evangelical about the possibilities of text mining historical documents.