Lunchtime Seminar Series: Tuesday 18th March: Prof. Sophia Ananiadou

If you were unable to make the seminar and would like to view a copy of the presentation, along with a recording of the paper and questions, please CLICK HERE.

(Once the presentation has loaded, the audio will run continuously as you click through the slides. Unfortunately, the builders finished their lunch midway through the questions, so they might not be very clear!)




An Introduction to Text Mining

Text mining (TM) is the process of discovering and extracting knowledge from unstructured textual data. This includes the recognition of entities in the texts, e.g., diseases, symptoms and drugs, together with the identification of relationships that occur amongst them, e.g., which drugs have been used to treat a particular disease. Based on the knowledge extracted, associations can be found amongst the pieces of information extracted from many different texts, e.g., how successful different drugs are in treating particular diseases, and under what conditions.

TM is becoming increasingly important with the advent of “big data”. The sheer volume of available digital textual data means that, without suitable TM tools to assist in the search for relevant information and the discovery of trends, there is a danger that much important information will be overlooked. Big data has resulted not only from the exponential growth in the rate at which new scientific papers are being published, but also from increased efforts to digitise historical documents. The availability of digitised historical archives provides researchers with a potentially rich source of data for studying trends over long periods of time, such as changes in treatments and in the understanding of diseases. The search for and study of relevant relationships between entities that occur within these documents can be greatly aided by the availability of powerful TM tools.

This talk provides an introduction to some of the techniques used in the development of TM systems, and examines a number of different tools that have been developed for application to biomedical text. We consider how such tools could be used and adapted to assist in the study and discovery of information within historical medical documents.
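As a rough illustration of the entity recognition and relationship identification described above, the following Python sketch tags drugs and diseases using a tiny hand-made lexicon and counts how often a drug and a disease are mentioned in the same sentence. The lexicon, the example sentences and the function names are all invented for illustration; they are not taken from the talk or from any of the tools it covers, and real TM systems rely on far richer terminologies and trained models rather than simple word lookup.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption for this sketch, not a real resource).
LEXICON = {
    "aspirin": "DRUG",
    "quinine": "DRUG",
    "headache": "DISEASE",
    "malaria": "DISEASE",
}

def find_entities(sentence):
    """Return (term, type) pairs for lexicon terms found in the sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [(tok, LEXICON[tok]) for tok in tokens if tok in LEXICON]

def drug_disease_pairs(texts):
    """Count drug/disease pairs that co-occur within the same sentence."""
    counts = Counter()
    for text in texts:
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            entities = find_entities(sentence)
            drugs = {term for term, kind in entities if kind == "DRUG"}
            diseases = {term for term, kind in entities if kind == "DISEASE"}
            for drug in drugs:
                for disease in diseases:
                    counts[(drug, disease)] += 1
    return counts

# Invented example texts standing in for a document collection.
texts = [
    "Quinine was given for malaria. The patient also reported a headache.",
    "Aspirin relieved the headache.",
    "Malaria was again treated with quinine.",
]
pairs = drug_disease_pairs(texts)
for (drug, disease), n in pairs.most_common():
    print(f"{drug} / {disease}: {n} sentence(s)")
```

Co-occurrence in a sentence is only a crude stand-in for a genuine "treats" relationship, but aggregating such counts over many documents gives a flavour of how associations can be surfaced across a large collection.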


