PPLRE Annotator

From GM-RKB

A PPLRE Annotator is the Annotation System used in the PPLRE Project to add explicit Syntactic Structure and Shallow Semantic Structure to the PPLRE Corpus.



Current Version

The current version of the annotator is located at /home/shared/PSort/PPLRE/bin/Annotator/v2.5.1


Sample Input/Output

The following sample refers to the PubMed abstract with PSID=7181.

Sample Input

Here is the relevant portion from the source file.

 cat /home/shared/PSort/PPLRE/data/corpusHome/corpus/7181/2_AnnotatorFiles/v2.3/sourcetext.txt
 <ABSTRACT>

Virulent mycobacteria utilize surface-exposed polyketides to interact with host cells, but the mechanism by which these hydrophobic molecules are transported across the cell envelope to the surface of the bacteria is poorly understood. Phthiocerol dimycocerosate (PDIM), a surface-exposed polyketide lipid necessary for <italic>Mycobacterium tuberculosis</italic> virulence, is the product of several polyketide synthases including PpsE. Transport of PDIM requires MmpL7, a member of the MmpL family of RND permeases. Here we show that a domain of MmpL7 biochemically interacts with PpsE, the first report of an interaction between a biosynthetic enzyme and its cognate transporter. Overexpression of the interaction domain of MmpL7 acts as a dominant negative to PDIM synthesis by poisoning the interaction between synthase and transporter. This suggests that MmpL7 acts in complex with the synthesis machinery to efficiently transport PDIM across the cell membrane. Coordination of synthesis and transport may not only be a feature of MmpL-mediated transport in <italic>M. tuberculosis,</italic> but may also represent a general mechanism of polyketide export in many different microorganisms.

  </ABSTRACT>

Sample Output

Below are the two output files generated by the PPLRE Annotator that subsequent processes reference most often: sentences.txt and annotated.tab.

sentences.txt file

Drawn from PPLRE Corpus 7181.a

Virulent mycobacteria utilize surface-exposed polyketides to interact with host cells, but the mechanism by which these hydrophobic molecules are transported across the cell envelope to the surface of the bacteria is poorly understood. Phthiocerol dimycocerosate (PDIM), a surface-exposed polyketide lipid necessary for Mycobacterium tuberculosis virulence, is the product of several polyketide synthases including PpsE. Transport of PDIM requires MmpL7, a member of the MmpL family of RND permeases. Here we show that a domain of MmpL7 biochemically interacts with PpsE, the first report of an interaction between a biosynthetic enzyme and its cognate transporter.
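Comparing the source text with sentences.txt, the `<ABSTRACT>` wrapper and the inline `<italic>` markup are absent from the output. The following is a minimal sketch of that kind of clean-up, not the annotator's actual implementation; the function name `abstract_plain_text` and the regular expressions are assumptions made for illustration.

```python
import re


def abstract_plain_text(source: str) -> str:
    """Return the text inside <ABSTRACT>...</ABSTRACT> with inline tags removed.

    Hypothetical helper: the real annotator may clean its input differently.
    """
    # Take the abstract body if the wrapper is present, else the whole input.
    match = re.search(r"<ABSTRACT>(.*?)</ABSTRACT>", source, flags=re.DOTALL)
    body = match.group(1) if match else source
    # Drop inline markup such as <italic> and </italic>.
    body = re.sub(r"</?[A-Za-z][^>]*>", "", body)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(body.split())


sample = (
    "<ABSTRACT>\n\n"
    "necessary for <italic>Mycobacterium tuberculosis</italic> virulence.\n\n"
    "  </ABSTRACT>"
)
print(abstract_plain_text(sample))
```

Applied to the sample abstract, a step like this would leave the species name as plain text, which matches what sentences.txt shows.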


annotated.tab file

Drawn from PPLRE Corpus 7181.a

Columns: RowID, Token, Stemmed, POS, Parse Tree, Concept Type, Concept ID, Predicate, SRL-1, SRL-2, SRL-3, SRL-4
1 Virulent Virulent JJ (S1(S(S(NP* (mono_cell (0 (2 * (A0* * * *
2 mycobacteria mycobacteria NNS *) ) ) ) * *) * * *
3 utilize utilize VB (VP* Functional_Concept C0042153 4 utilize (V*) * * *
4 surface-exposed surface-exposed JJ (NP* Spatial_Concept C0205148 4 * (A1* * * *
5 polyketides polyketides NNS *) Clinical_Attribute C0332157 4 * *) * * *
6 to to TO (S(VP* - - - * (AM-PNC* * * *
7 interact interact VB (VP* Molecular_Function C0687133 4 interact * (V*) * *
8 with with IN (PP* - - - * * (A2* * *
9 host host NN (NP* (cell_type (1 (2 * * * * *
10 cells cell NNS *))))))) ) ) ) * *) *) * *
11 , , , * - - - * * * * *
12 but but CC * - - - * * * * *
13 the the DT (S(NP(NP* - - - * * * (AM-MNR* (A1*
14 mechanism mechanism NN *) Functional_Concept C0441712 4 * * * *) *
15 by by IN (SBAR(WHPP* - - - * * * * *
16 which which WDT (WHNP*)) - - - * * * * *
17 these these DT (S(NP* - - - * * * (A1* *
18 hydrophobic hydrophobic JJ * (peptide (2 (2 * * * * *
19 molecules molecule NNS *) ) ) ) * * * *) *
20 are are AUX (VP* - - - * * * * *
21 transported transport VBN (VP* Cell_Function C0005528 4 transported * * (V*) *
22 across across IN (PP* - - - * * * * *
23 the the DT (NP* - - - * * * * *
24 cell cell NN * (LOCALIZATION (GO0005618 (3 * * * * *
25 envelope envelope NN *)) ) ) ) * * * * *
26 to to TO (PP* - - - * * * * *
27 the the DT (NP(NP* - - - * * * * *
28 surface surface NN *) Spatial_Concept C0205148 4 * * * * *
29 of of IN (PP* - - - * * * * *
30 the the DT (NP* - - - * * * * *
31 bacteria bacteria NNS *))))))))) mono_cell 4 2 * * * * *)
32 is is AUX (VP* - - - * * * * *
33 poorly poorly RB (ADVP*) Qualitative_Concept C0205169 4 * * * * (AM-MNR*)
34 understood understand VBN (VP*))) Mental_Process C0162340 4 understood * * * (V*)
35 . . . *)) - - - * *
36
37 Phthiocerol Phthiocerol NNP (S1(S(NP(NP* (Lipid (C0070976 (4 * *
38 dimycocerosate dimycocerosate NNP *) ) ) ) * *
39 ( ( -LRB- (PRN* - - - * *
40 PDIM PDIM NNP (NP*) other_organic_compound 6 2 * *
41 ) ) -RRB- *) - - - * *
42 , , , * - - - * *
43 a a DT (NP(NP* - - - * *
44 surface-exposed surface-exposed JJ * lipid 7 2 * *
45 polyketide polyketide JJ * - - - * *
46 lipid lipid NN *) 14742191 1 5 * *
47 necessary necessary JJ (ADJP* Lipid C0023779 4 * *
48 for for IN (PP* 639871 3 * *
49 Mycobacterium Mycobacterium NNP (NP* (ORGANISM (1773 (3 * *
50 tuberculosis tuberculosis FW * ) ) ) * *
51 virulence virulence NN *)))) Biologic_Function C0042765 4 * *
52 , , , *) - - - * *
53 is is AUX (VP* - - - * *
54 the the DT (NP(NP* - - - * *
55 product product NN *) 3707459 1 5 * *
56 of of IN (PP* - - - * *
57 several several JJ (NP(NP* Quantitative_Concept C0443302 4 * (A2*
58 polyketide polyketide JJ * (PROTEIN (localID_0 (1 * *
59 synthases synthases NNS *) ) ) ) * *)
60 including include VBG (PP* Functional_Concept C0332257 4 including (V*)
61 PpsE PpsE NNP (NP*)))))) PROTEIN localID_1 1 * (A1*)
62 . . . *)) - - - * *
63
64 Transport Transport NNP (S1(S(NP(NP*) Cell_Function C0005528 4 * (A1*
65 of of IN (PP* - - - * *
66 PDIM PDIM NNP (NP*))) PROTEIN localID_2 1 * *)
67 requires require VBZ (VP* 2602586 1 5 requires (V*)
68 MmpL7 MmpL7 NNP (NP(NP*) PROTEIN localID_3 1 * (A0*
69 , , , * - - - * *
70 a a DT (NP(NP* - - - * *
71 member member NN *) Population_Group C0680022 4 * *
72 of of IN (PP* - - - * *
73 the the DT (NP(NP* - - - * *
74 MmpL MmpL NNP * (PROTEIN (localID_4 (1 * *
75 family family NN *) ) ) ) * *
76 of of IN (PP* - - - * *
77 RND RND JJ (NP* (PROTEIN (localID_5 (1 * *
78 permeases permeases NNS *))))))) ) ) ) * *)
79 . . . *)) - - - * * *
80
81 Here Here RB (S1(S(ADVP*) 109485 1 5 * (AM-LOC*) *
82 we we PRP (NP*) - - - * (A0*) *
83 show show VBP (VP* 656725 2 5 show (V*) *
84 that that IN (SBAR* - - - * (A1* *
85 a a DT (S(NP(NP* - - - * * (A1*
86 domain domain NN *) 8437765 2 5 * * *
87 of of IN (PP* - - - * * *
88 MmpL7 MmpL7 NNP (NP*))) PROTEIN localID_3 1 * * *)
89 biochemically biochemically RB (ADVP*) Functional_Concept C0205474 4 * * (AM-ADV*)
90 interacts interact VBZ (VP* Molecular_Function C0687133 4 interacts * (V*)
91 with with IN (PP* - - - * * (A2*
92 PpsE PpsE NNP (NP(NP*) PROTEIN localID_1 1 * * *
93 , , , * - - - * * *
94 the the DT (NP(NP* - - - * * *
95 first first JJ * Quantitative_Concept C0205435 4 * * *
96 report report NN *) Intellectual_Product C0684224 4 * * *
97 of of IN (PP* - - - * * *
98 an an DT (NP(NP* - - - * * *
99 interaction interaction NN *) Molecular_Function C0687133 4 * * *
100 between between IN (PP* - - - * * *
101 a a DT (NP(NP* - - - * * *
102 biosynthetic biosynthetic JJ * - - - * * *
103 enzyme enzyme NN *) PROTEIN 638536 3 * * *
104 and and CC * - - - * * *
105 its its PRP$ (NP* - - - * * *
106 cognate cognate JJ * 2042649 1 5 * * *
107 transporter transporter NN *)))))))))))) PROTEIN localID_6 1 * *) *)
108 . . . *)) - - - * * *
109
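The rows above can be read programmatically. The sketch below is a hedged illustration rather than part of the PPLRE tooling: it assumes a row holding only a RowID (such as 36) marks a sentence boundary, and that the SRL columns use a bracketed start/end notation resembling the CoNLL-2005 shared task format, with flat, non-nested spans. It reads only the first four fields (RowID, Token, Stemmed, POS), because the later columns do not split cleanly on whitespace in this rendering; for the real tab-delimited file, splitting on tab characters would be the safer choice. The function names `split_sentences` and `decode_spans` are hypothetical.

```python
from typing import List, Tuple

Row = Tuple[int, str, str, str]  # (RowID, Token, Stemmed, POS)


def split_sentences(lines: List[str]) -> List[List[Row]]:
    """Group rows into sentences; a row with only a RowID ends a sentence."""
    sentences: List[List[Row]] = []
    current: List[Row] = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if len(fields) == 1:  # assumed separator row, e.g. "36"
            if current:
                sentences.append(current)
                current = []
            continue
        row_id, token, stem, pos = fields[:4]
        current.append((int(row_id), token, stem, pos))
    if current:
        sentences.append(current)
    return sentences


def decode_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Turn bracketed tags such as "(A0*", "*", "*)" into (label, text) pairs.

    Assumes spans within one column are flat (not nested).
    """
    spans: List[Tuple[str, str]] = []
    label = None
    words: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag.startswith("("):  # a span opens on this token
            label = tag[1:].split("*")[0]
            words = []
        if label is not None:
            words.append(token)
            if tag.endswith(")"):  # the span closes on this token
                spans.append((label, " ".join(words)))
                label = None
    return spans


# Tokens 1-10 of the sample with their SRL-1 tags (predicate "utilize").
tokens = ["Virulent", "mycobacteria", "utilize", "surface-exposed",
          "polyketides", "to", "interact", "with", "host", "cells"]
srl_1 = ["(A0*", "*)", "(V*)", "(A1*", "*)", "(AM-PNC*", "*", "*", "*", "*)"]
for role, text in decode_spans(tokens, srl_1):
    print(role, "=", text)
```

Under these assumptions, the SRL-1 column for the first sentence decodes to an A0 span ("Virulent mycobacteria"), the verb "utilize", an A1 span ("surface-exposed polyketides"), and an AM-PNC span ("to interact with host cells").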