2008 LTTTT2ExamplePipelineDoc


Subject Headings: LT TTT2 System, System Document.

Quotes

1. Introduction

  • This documentation is intended to provide a detailed description of the pipelines provided in the LT-TTT2 distribution. The pipelines are implemented as Unix shell scripts and contain calls to processing steps which are applied to a document in sequence in order to add layers of XML mark-up to that document.
  • This document does not contain any explanation of lxtransduce grammars or XPath expressions. For an introduction to the lxtransduce grammar rule formalism, see the tutorial documentation (tutorial.html) in TTT2/doc/tutorial/. See also http://www.cogsci.ed.ac.uk/~richard/ltxml2/lxtransduce-manual.html as well as the documentation for the LT-XML2 programs at http://www.cogsci.ed.ac.uk/~richard/ltxml2/.
  • LT-TTT2 includes some software not originating in Edinburgh which has been included with kind permission of the authors. Specifically, the part-of-speech (POS) tagger is the C&C tagger and the lemmatiser is morpha. See Sections 5 and 6 for more information and conditions of use.
  • LT-TTT2 also includes some resource files which have been derived from a variety of sources including UMLS, Wikipedia, Project Gutenberg, Berkeley and the Alexandria Digital Library Gazetteer. See Sections 4, 6 and 7 for more information and conditions of use.
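The idea of a pipeline as a sequence of steps, each adding a layer of XML mark-up, can be illustrated with a minimal Python sketch. This is not how LT-TTT2 is implemented (its pipelines are Unix shell scripts calling LT-XML2 programs). The stage functions tokenise and mark_capitalised, the cap attribute and the run_pipeline helper are all hypothetical names chosen for illustration only.

```python
import xml.etree.ElementTree as ET


def tokenise(root):
    """Hypothetical stage: wrap each whitespace-separated token in a <w> element."""
    # list() so that the tree is not modified while it is being iterated over.
    for text in list(root.iter("text")):
        words = (text.text or "").split()
        text.text = None
        for word in words:
            w = ET.SubElement(text, "w")
            w.text = word
    return root


def mark_capitalised(root):
    """Hypothetical stage: add cap="yes" to tokens starting with an upper-case letter."""
    for w in root.iter("w"):
        if w.text and w.text[0].isupper():
            w.set("cap", "yes")
    return root


def run_pipeline(xml_string, stages):
    """Apply each stage in order; every stage receives the output of the previous one."""
    root = ET.fromstring(xml_string)
    for stage in stages:
        root = stage(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    doc = "<document><text>The children have frozen the frozen peas.</text></document>"
    print(run_pipeline(doc, [tokenise, mark_capitalised]))
```

Each stage only adds elements or attributes and leaves the existing mark-up in place, so later stages can rely on what earlier stages produced. This mirrors the layered design described above, under the stated assumptions.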

Lemmatiser

<document> <text>

The planning committee were always having big arguments. The children have frozen the frozen peas.

</text> </document>

it is output like this (again modulo white space):

<document> <text>

<w p="DT" id="w3" c="w" pws="yes">The</w>
<w p="NN" id="w7" c="w" pws="yes" l="planning" vstem="plan">planning</w>
<w p="NN" id="w16" c="w" pws="yes" l="committee">committee</w>
<w p="VBD" id="w26" c="w" pws="yes" l="be">were</w>
<w p="RB" id="w31" c="w" pws="yes">always</w>
<w p="VBG" id="w38" c="w" pws="yes" l="have">having</w>
<w p="JJ" id="w45" c="w" pws="yes">big</w>
<w p="NNS" id="w49" c="w" pws="yes" l="argument" vstem="argue">arguments</w>
<w p="." id="w58" pws="no" sb="true" c=".">.</w>
<w p="DT" id="w60" c="w" pws="yes">The</w>
<w p="NNS" id="w64" c="w" pws="yes" l="child">children</w>
<w
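As a rough illustration of how the lemmatiser output might be consumed downstream, the following Python sketch reads the w elements and collects each word with its p, l and vstem attributes. It assumes, based only on the excerpt above, that p holds the POS tag, l the lemma and vstem a verbal stem, and that l may be absent on some words. The read_words function is a hypothetical helper, and the sample is a shortened copy of the excerpt with closing tags added so that it parses.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the excerpt above; the closing tags are added here
# because the quoted output is truncated.
SAMPLE = """<document><text>
<w p="NN" id="w7" c="w" pws="yes" l="planning" vstem="plan">planning</w>
<w p="VBD" id="w26" c="w" pws="yes" l="be">were</w>
<w p="RB" id="w31" c="w" pws="yes">always</w>
<w p="NNS" id="w64" c="w" pws="yes" l="child">children</w>
</text></document>"""


def read_words(xml_string):
    """Return one dict per <w> element (hypothetical helper, not part of LT-TTT2)."""
    root = ET.fromstring(xml_string)
    rows = []
    for w in root.iter("w"):
        rows.append({
            "word": w.text,
            "pos": w.get("p"),          # assumed to be the POS tag
            "lemma": w.get("l"),        # None when no l attribute is present
            "vstem": w.get("vstem"),    # None when no vstem attribute is present
        })
    return rows


if __name__ == "__main__":
    for row in read_words(SAMPLE):
        print(row)
```

Returning None for a missing attribute, rather than falling back to the surface form, keeps the distinction between words the lemmatiser annotated and words it left untouched.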

Chunker

<document> <text>

The planning committee were always having big arguments. The children have frozen the frozen peas.

</text> </document>

it is output like this (again modulo white space):

<document> <text>

<w p="DT" id="w3" c="w" pws="yes">The</w>
<w p="NN" id="w7" c="w" pws="yes" l="planning" vstem="plan">planning</w>
<w p="NN" id="w16" c="w" pws="yes" l="committee">committee</w>
<w p="VBD" id="w26" c="w" pws="yes" l="be">were</w>
<w p="RB" id="w31" c="w" pws="yes">always</w>
<w p="VBG" id="w38" c="w" pws="yes" l="have">having</w>
<w p="JJ" id="w45" c="w" pws="yes">big</w>
<w p="NNS" id="w49" c="w" pws="yes" l="argument" vstem="argue">arguments</w>
<w p="." id="w58" pws="no" sb="true" c=".">.</w>
<w p="DT" id="w60" c="w" pws="yes">The</w>
<w p="NNS" id="w64" c="w" pws="yes" l="child">children</w>
<w

Author: Claire Grover
Title: LT-TTT2 Example Pipelines Documentation
URL: http://www.ltg.ed.ac.uk/software/lt-ttt2/pipeline-doc/file/at download
Year: 2008