PPLRE Corpus 8611.a.0-2

From GM-RKB


  • Paper Title: Preformed dimeric state of the sensor protein VirA is involved in plant--Agrobacterium signal transduction


NOTES

  • Illustrates the case where the organism is infrequently mentioned. In fact, the organism and the protein are never mentioned in the same sentence.
  • Is an example where the relation is spread over three sentences.
  • Sophisticated coreference resolution of the phrase "(vir) genes" in the second sentence with "vir genes" in the third "sentence" would be helpful.
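
The first note can be checked mechanically. The following is a minimal sketch, assuming sentences carry inline ORGANISM, PROTEIN and LOCATION tags in the style of the Tagged NER list on this page. The helper name `entity_types` and the shortened sample sentences are illustrative assumptions, not part of the PPLRE tooling.

```python
import re

# Matches <TYPE>text</TYPE>. The closing slash is treated as optional so that
# hand-edited markup with a missing slash is still recognised (an assumption).
ENTITY = re.compile(r"<(ORGANISM|PROTEIN|LOCATION)>(.*?)</?\1>")


def entity_types(sentence):
    """Return the set of entity types tagged in one sentence."""
    return {m.group(1) for m in ENTITY.finditer(sentence)}


# Shortened excerpts of the three abstract sentences, keyed by sentence id.
sentences = {
    "8611.0": "induce the expression of <ORGANISM>Agrobacterium tumefaciens</ORGANISM> virulence (vir) genes",
    "8611.1": "Two of the vir genes, <PROTEIN>virA</PROTEIN> and <PROTEIN>virG</PROTEIN>, control the induction",
    "8611.2": "<PROTEIN>virA</PROTEIN> encodes a <LOCATION>membrane-bound</LOCATION> sensor kinase protein",
}

# Sentence ids in which an organism and a protein are tagged together.
cooccur = [sid for sid, s in sentences.items()
           if {"ORGANISM", "PROTEIN"} <= entity_types(s)]
print(cooccur)
```

On these excerpts the list is empty, which is the situation the note describes: the organism and the proteins have to be linked across sentences.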

Original Text

Plant signal molecules such as acetosyringone and certain monosaccharides induce the expression of Agrobacterium tumefaciens virulence (vir) genes, which are required for the processing, transfer, and possibly integration of a piece of the bacterial plasmid DNA (T-DNA) into the plant genome. Two of the vir genes, virA and virG, belonging to the bacterial two-component regulatory system family, control the induction of vir genes by plant signals. virA encodes a membrane-bound sensor kinase protein and virG encodes a cytoplasmic regulator protein.


Sentence Boundary Detection

  • Sentence Boundary Detection
  • <PSID=8611.0>Plant signal molecules such as acetosyringone and certain monosaccharides induce the expression of Agrobacterium tumefaciens virulence (vir) genes, which are required for the processing, transfer, and possibly integration of a piece of the bacterial plasmid DNA (T-DNA) into the plant genome.
  • <PSID=8611.1> Two of the vir genes, virA and virG, belonging to the bacterial two-component regulatory system family, control the induction of vir genes by plant signals.
  • <PSID=8611.2> virA encodes a membrane-bound sensor kinase protein and virG encodes a cytoplasmic regulator protein.
    • Some of the examples below manually separate the two sentences.
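
As an illustration of how the `<PSID=...>` markers above could be consumed, here is a minimal sketch that splits marked-up text into (document id, sentence id, sentence) triples. The function name `split_psid` and the shortened sample text are assumptions made for this example.

```python
import re

# A marker looks like <PSID=8611.0> or <PSID=8611.title>.
PSID = re.compile(r"<PSID=(\d+)\.(\w+)>\s*")


def split_psid(text):
    """Split text carrying PSID markers into (doc_id, sent_id, sentence) triples."""
    parts = PSID.split(text)
    # re.split with two capture groups yields:
    # [prefix, doc, sent, body, doc, sent, body, ...]
    return [(parts[i], parts[i + 1], parts[i + 2].strip())
            for i in range(1, len(parts), 3)]


marked = ("<PSID=8611.1> Two of the vir genes control the induction of vir genes. "
          "<PSID=8611.2> virA encodes a membrane-bound sensor kinase protein.")

for doc_id, sent_id, sentence in split_psid(marked):
    print(doc_id, sent_id, sentence)
```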

Named Entity Recognition

  • PPLRE, Named Entity Recognition
  • Simple (v2.3) NER:
    • Plant signal molecules such as acetosyringone and certain monosaccharides induce the expression of Agrobacterium tumefaciens virulence (vir) genes, which are required for the processing, transfer, and possibly integration of a piece of the bacterial plasmid DNA (T-DNA) into the plant genome.
    • Two of the vir genes, virA and virG, belonging to the bacterial two-component regulatory system family, control the induction of vir genes by plant signals.
    • virA encodes a membrane-bound sensor kinase protein and virG encodes a cytoplasmic regulator protein.
  • Tagged (v2.3) NER
    • <PSID=8611.0> Plant signal molecules such as acetosyringone and certain monosaccharides induce the expression of <ORGANISM>Agrobacterium tumefaciens</ORGANISM> virulence (vir) genes, which are required for the processing, transfer, and possibly integration of a piece of the bacterial plasmid DNA (T-DNA) into the plant genome.</SENTENCE>
    • <PSID=8611.1> Two of the vir genes, <PROTEIN>virA</PROTEIN> and <PROTEIN>virG</PROTEIN>, belonging to the bacterial two-component regulatory system family, control the induction of vir genes by plant signals.</SENTENCE>
    • <PSID=8611.2> <PROTEIN>virA</PROTEIN> encodes a <LOCATION>membrane-bound</LOCATION> sensor kinase protein and <PROTEIN>virG</PROTEIN> encodes a <LOCATION>cytoplasmic</LOCATION> regulator protein.</SENTENCE>
    • <PSID=8611.title> Preformed dimeric state of the sensor protein <PROTEIN>VirA</PROTEIN> is involved in plant--Agrobacterium signal transduction.</SENTENCE>
  • Joined Single-Concept Words

Plant signal_molecules such as acetosyringone and certain monosaccharides induce the expression of Agrobacterium_tumefaciens virulence (vir) genes, which are required for the processing, transfer, and possibly integration of a piece of the bacterial_plasmid_DNA_T-DNA into the plant_genome. Two of the vir genes, virA and virG, belonging_to the bacterial two-component regulatory system family, control the induction of vir genes by plant signals. virA encodes a membrane-bound sensor_kinase protein and virG encodes a cytoplasmic regulator protein.


Preformed dimeric state of the sensor protein VirA is involved in plant--Agrobacterium signal_transduction.
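
The joining step shown above can be approximated with a phrase list. This is a minimal sketch, assuming the multi-word concepts are known in advance. The function `join_concepts` and the phrase list are hypothetical, and the sketch does not fold parenthesised abbreviations into the joined token the way "bacterial_plasmid_DNA_T-DNA" does above.

```python
import re


def join_concepts(text, phrases):
    """Replace each multi-word phrase with a single underscore-joined token."""
    # Longest phrases first, so "bacterial plasmid DNA" wins over "plasmid DNA".
    for phrase in sorted(phrases, key=len, reverse=True):
        words = phrase.split()
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b"
        joined = "_".join(words)
        text = re.sub(pattern, lambda m, j=joined: j, text)
    return text


phrases = ["signal molecules", "Agrobacterium tumefaciens", "plant genome"]
sentence = ("Plant signal molecules induce the expression of "
            "Agrobacterium tumefaciens virulence (vir) genes.")
print(join_concepts(sentence, phrases))
```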


Dependency Tree

Sentence 1

nn(signalMolecules-2, Plant-1) nsubj(induce-9, signalMolecules-2) dep(signalMolecules-2, such-3) amod(monosaccharides-8, acetosyringone-5) conj_and(acetosyringone-5, certain-7) prep_as(signalMolecules-2, monosaccharides-8) det(expression-11, the-10) dobj(induce-9, expression-11) nn(genes-18, AgrobacteriumTumefaciens-13) dep(genes-18, virulence-14) dep(genes-18, vir-16) prep_of(expression-11, genes-18) nsubjpass(required-22, which-20) auxpass(required-22, are-21) rcmod(genes-18, required-22) det(processing-25, the-24) prep_for(required-22, processing-25) conj_and(processing-25, transfer-27) advmod(integration-31, possibly-30) conj_and(processing-25, integration-31) det(piece-34, a-33) prep_of(integration-31, piece-34) det(bacterialPlasmidDNA-37, the-36) prep_of(piece-34, bacterialPlasmidDNA-37) det(plantGenome-40, the-39) prep_into(required-22, plantGenome-40)

    • organism <- virGenes <- expression.

Sentence 2

nsubj(control-19, Two-1) det(genes-5, the-3) nn(genes-5, vir-4) prep_of(Two-1, genes-5) conj_and(genes-5, virA-7) conj_and(genes-5, virG-9) appos(Two-1, belonging-11) det(family-17, the-12) amod(family-17, bacterial-13) amod(family-17, two-14) amod(family-17, regulatory-15) nn(family-17, system-16) dep(belonging-11, family-17) det(induction-21, the-20) dobj(control-19, induction-21) nn(genes-24, vir-23) prep_of(induction-21, genes-24) nn(signals-27, plant-26) prep_by(control-19, signals-27)

    • (virA) (virG) <- virGenes <- two

Sentence 3

nsubj(encodes-2, virA-1) det(protein-6, a-3) amod(protein-6, membrane-4) nn(protein-6, sensor-5) dobj(encodes-2, protein-6) amod(encodes-9, virG-8) conj_and(protein-6, encodes-9) det(protein-13, a-10) amod(protein-13, cytoplasmic-11) nn(protein-13, regulator-12) dep(encodes-9, protein-13)

    • virA <- encodes -> protein -> encodes -> (virG) ^ (protein -> cytoplasm)
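
The chains noted under each sentence can be recovered from the dependency strings. The sketch below assumes the `relation(head-index, dependent-index)` notation used above, parses the Sentence 3 string into triples, and walks the undirected graph from virA to virG. The names `parse_deps` and `shortest_path` are illustrative, and breadth-first search is a choice made for this example, not necessarily how the page's chains were produced.

```python
import re
from collections import defaultdict, deque

# One dependency looks like: nsubj(encodes-2, virA-1)
DEP = re.compile(r"(\w+)\((\S+)-(\d+), (\S+)-(\d+)\)")


def parse_deps(line):
    """Parse a dependency string into (relation, head, dependent) triples."""
    return [(rel, (head, int(hi)), (dep, int(di)))
            for rel, head, hi, dep, di in DEP.findall(line)]


def shortest_path(triples, start, goal):
    """Breadth-first search over the dependencies, ignoring edge direction."""
    graph = defaultdict(set)
    for _, head, dep in triples:
        graph[head].add(dep)
        graph[dep].add(head)
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


sentence3 = ("nsubj(encodes-2, virA-1) det(protein-6, a-3) amod(protein-6, membrane-4) "
             "nn(protein-6, sensor-5) dobj(encodes-2, protein-6) amod(encodes-9, virG-8) "
             "conj_and(protein-6, encodes-9) det(protein-13, a-10) "
             "amod(protein-13, cytoplasmic-11) nn(protein-13, regulator-12) "
             "dep(encodes-9, protein-13)")

triples = parse_deps(sentence3)
path = shortest_path(triples, ("virA", 1), ("virG", 8))
print(" - ".join(word for word, _ in path))
```

Tokens are kept as (word, index) pairs because the same word form, such as "encodes" or "protein", occurs more than once in the sentence.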

Compressed

                            encodes
                            /     \
       virGenes          virA    protein
      /   |    \                    \
organism virA  virG               encodes
                                  /     \
                               virG    location


Raw Annotated

Plant Plant NN (S1(S(NP(NP* Clinical_Attribute C1148460 4 * (A0* *
signal signal NN * (protein_family_or_group (0 (2 * * *
molecules molecule NNS *) ) ) ) * * *
such such JJ (PP* 1610028 1 5 * * *
as as IN * - - - * * *
acetosyringone acetosyringone JJ (NP* other_organic_compound 1 2 * * *
and and CC * - - - * * *
certain certain JJ * Qualitative_Concept C0205423 4 * * *
monosaccharides monosaccharide NNS *))) Carbohydrate C0026492 4 * *) *
induce induce VB (VP* Functional_Concept C0205263 4 induce (V*) *
the the DT (NP(NP* - - - * (A1* *
expression expression NN *) Therapeutic_or_Preventive_Procedure C0185117 4 * * *
of of IN (PP* - - - * * *
Agrobacterium Agrobacterium NNP (NP(NP* (ORGANISM (358 (3 * * (A1*
tumefaciens tumefaciens NNS * ) ) ) * * *
virulence virulence NN * Biologic_Function C0042765 4 * * *
( ( -LRB- (PRN* - - - * * *
vir vir NN * (PROTEIN (localID_0 (1 * * *
) ) -RRB- *) ) ) ) * * *
genes gene NNS *) - - - * * *)
, , , * - - - * * *
which which WDT (SBAR(WHNP*) - - - * * (R-A1*)
are are AUX (S(VP* - - - * * *
required require VBN (VP* 2602586 1 5 required * (V*)
for for IN (PP* PROTEIN 639871 3 * * (AM-PNC*
the the DT (NP(NP* - - - * * *
processing processing NN *) Body_Part,_Organ,_or_Organ_Component C1184743 4 * * *
, , , * - - - * * *
transfer transfer NN (NP*) Mental_Process C0040671 4 * * *
, , , * - - - * * *
and and CC * - - - * * *
possibly possibly RB (ADVP*) Qualitative_Concept C0332149 4 * * *
integration integration NN (NP(NP*) Amino_Acid,_Peptide,_or_Protein C0309311 4 * * *
of of IN (PP* - - - * * *
a a DT (NP(NP* - - - * * *
piece piece NN *) 6941165 4 5 * * *
of of IN (PP* - - - * * *
the the DT (NP(NP* - - - * * *
bacterial bacterial JJ * (PROTEIN (localID_1 (1 * * *
plasmid plasmid JJ * * * plasmid * *
DNA DNA NN *) * * DNA * *
( ( -LRB- (PRN* * * ( * *
T-DNA T-DNA NNP (NP*) ) ) ) * * *)
) ) -RRB- *) - - - * * *
into into IN (PP* - - - * * *
the the DT (NP* - - - * * *
plant plant NN * (Gene_or_Genome (C0242965 (4 * * *
genome genome NN *))))))))))))))))) ) ) ) * *) *
. . . *)) - - - * * * *
Two Two CD (S1(S(NP(NP(NP* Quantitative_Concept C0205448 4 * (A0* (A0* *
fo fo NNS *) other C0332282 4 * * * *
the the DT (NP* - - - * * * *
vir vir NN * PROTEIN localID_0 1 * * * *
genes gene NNS * - - - * * * *
, , , * - - - * * * *
virA virA NN * PROTEIN localID_2 1 * * * *
and and CC * - - - * * * *
virG virG NN *)) PROTEIN localID_3 1 * *) * *
, , , * - - - * * * *
belonging belong VBG (VP* (02695676 (1 (5 belonging (V*) * *
to to TO (PP* ) ) ) * (A1* * *
the the DT (NP* - - - * * * *
bacterial bacterial JJ * Functional_Concept C0521009 4 * * * *
two-component two-component JJ * Quantitative_Concept C0205448 4 * * * *
regulatory regulatory JJ * Manufactured_Object C0449432 4 * * * *
system system NN * Regulation_or_Law C0220905 4 * * * *
family family NN *))) Functional_Concept C0449913 4 * *) *) *
, , , *) - - - * * * *
control control VBP (VP* Functional_Concept C0243148 4 control * (V*) *
the the DT (NP(NP* - - - * * (A1* (A0*
induction induction NN *) 7351347 1 5 * * * *
of of IN (PP* - - - * * * *
vir vir JJ (NP* PROTEIN localID_0 1 * * * *
genes gene NNS *)) - - - * * * *
by by IN (PP* - - - * * * *
plant plant NN (NP(NP(NP* Clinical_Attribute C1148460 4 * * * *
signals. signals. NN * - - - * * * *
virA virA NN * PROTEIN localID_2 1 * * * *)
encodes encodes NNS *) Mental_Process C0679058 4 encodes * * (V*)
a a DT (NP* - - - * * * (A1*
membrane-bound membrane-bound JJ * LOCALIZATION GO0005886 3 * * * *
sensor sensor JJ * (PROTEIN (localID_4 (1 * * * *
kinase kinase NN * ) ) ) * * * *
protein protein NN *)) - - - * * * *
and and CC * - - - * * * *
virG virG CD (NP(NP* PROTEIN localID_3 1 * * * *
encodes encodes NNS *) - - - * * * *
a a DT (NP* - - - * * * *
cytoplasmic cytoplasmic JJ * (PROTEIN (localID_5 (1 * * * *
regulator regulator NN * ) ) ) * * * *
protein protein NN *)))))) - - - * * *) *)
. . . *)) - - - * * * * *
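
A minimal sketch for reading the listing above, assuming it is whitespace-separated with one token per line and that the first three columns are the surface token, its lemma, and its part-of-speech tag. That column reading is an assumption inferred from rows such as "molecules molecule NNS". The remaining columns are left uninterpreted, and `read_tokens` is a hypothetical helper.

```python
def read_tokens(lines):
    """Read token, lemma and POS tag from whitespace-separated annotation rows."""
    rows = []
    for line in lines:
        cols = line.split()
        if len(cols) < 3:
            continue  # skip blank or truncated rows
        rows.append({"token": cols[0], "lemma": cols[1], "pos": cols[2]})
    return rows


# Three rows copied from the listing above.
sample = """\
virA virA NN * PROTEIN localID_2 1 * * * *)
encodes encodes NNS *) Mental_Process C0679058 4 encodes * * (V*)
a a DT (NP* - - - * * * (A1*
""".splitlines()

rows = read_tokens(sample)
nouns = [r["token"] for r in rows if r["pos"].startswith("NN")]
print(nouns)
```

Filtering on the tag column makes tagging slips easy to spot: in these rows the verb "encodes" carries the plural-noun tag NNS.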