Lehrstuhl für Computerphilologie und Neuere Deutsche Literaturgeschichte

    Workshop "Introduction to the TXM Content Analysis Platform"

    DARIAH-DE Workshop "Introduction to the TXM Content Analysis Platform"

    The Department for Literary Computing, Würzburg University, will organize a DARIAH-DE Workshop called "Introduction to the TXM Content Analysis Platform".

    Workshop outline

    The objective of the "Introduction to TXM" tutorial is to introduce the participants to the methodology of text analysis through working with the TXM software directly on their own laptop computers. At the end of the tutorial, the participants will be able to input their own textual corpora (Unicode encoded raw texts or XML/TEI tagged texts) into TXM and to analyze them with the panel of content analysis tools available: word patterns frequency lists, kwic concordances and text browsing, rich full text search engine syntax (allowing to express various sequences of word forms, part of speech and lemma combinations constrained by XML structures), statistically specific sub-corpus vocabulary analysis, statistical collocation analysis, etc.). The portal version of TXM, allowing the on line access and analysis of corpora, will also be introduced.

    Keynote by Michael Piotrowski (Mainz): "Natural Language Processing for Historical Texts"

    Together with the increasing availability of historical texts in digital form, there is a growing interest in applying natural language processing (NLP) methods and tools to historical texts. The potential applications range from general-purpose text retrieval to very specific analyses for particular research questions. However, the specific linguistic properties of historical texts---the lack of standardized orthography in particular---pose special challenges for NLP. This talk aims to give an introduction to NLP for historical texts and an overview of the state of the art in this field.

    Basic information

    • Time: February 6-7, 2014
    • Place: Würzburg University: Zentrum für Sprachen, Hubland Nord, Matthias Lexer Weg 25, Raum 01.002
    • Workshop leader: Serge Heiden (Lyon)
    • Local organizer: Christof Schöch (Würzburg)

    For further information:

    Practical matters

    The workshop is aimed at younger as well as experienced scholars from the humanities dealing with textual data. It is being organized by the Department for Literary Computing at Würzburg University, Germany, under the auspices of the DARIAH-DE initiative. It will be held by Serge Heiden, leader of the TXM project. It will take place on Feb. 6-7, 2014 at Würzburg University.

    The maximum number of participants is limited to 15 and places will be filled on a first-come first-serve basis. The deadline for applications is January 20th, 2014. There is no fee for participating in this event. You will need to bring along your own laptop computer. Access to the internet will be available. Please contact Dr. Christof Schöch for further details and to sign up for the event at

    Preliminary programme

    Thursday, February 6

    • 13:30-14:00 - Welcome
    • 14:00-15:30 – Introduction to TXM
    • 15:30-16:00 – Coffee break
    • 16:00-17:30 – Hands-on Session 1
    • 18:00-19:30 – Lecture by Dr. Michael Piotrowski (Mainz)
    • 20:00- - Dinner

    Friday, Feb 7

    • 9:00-11:00 – Hands-on Session 2
    • 11:00-11:30 – Coffee break
    • 11:30-13:00 – Hands-on Session 3
    • 13:00-13:30 – Final discussion and farewell

    Note: This event is being organized by DARIAH-DE with funding provided by the German Federal Ministry of Education and Research (BMBF) under the identifier 01UG1110J.