2002 MiningMedlineAbstractsSentencesOrPhrases

From GM-RKB

Subject Headings: Relation Recognition from Text Algorithm, Biochemical Text Mining.

Notes

Cited By

Quotes

Abstract

A growing body of works address automated mining of biochemical knowledge from digital repositories of scientific literature, such as MEDLINE. Some of these works use abstracts as the unit of text from which to extract facts. Others use sentences for this purpose, while still others use phrases. Here we compare abstracts, sentences, and phrases in MEDLINE using the standard information retrieval performance measures of recall, precision, and effectiveness, for the task of mining interactions among biochemical terms based on term co-occurrence. Results show statistically significant differences that can impact the choice of text unit.
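The comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the function names (`split_units`, `cooccurring_pairs`, `evaluate`), the naive regex sentence splitter, the substring term matching, and the toy abstract, term list, and gold-standard pair are all hypothetical. Effectiveness is computed here with van Rijsbergen's E-measure, E = 1 − 1/(α/P + (1−α)/R), where lower is better; the abstract does not say which formulation the paper uses. Phrase-level units are omitted because they would need a parser or a chunking heuristic.

```python
import re
from itertools import combinations


def split_units(abstract, unit):
    """Split an abstract into text units: the whole abstract or its sentences."""
    if unit == "abstract":
        return [abstract]
    if unit == "sentence":
        # Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.
        return [s for s in re.split(r"(?<=[.!?])\s+", abstract.strip()) if s]
    raise ValueError(f"unsupported unit: {unit}")


def cooccurring_pairs(abstracts, terms, unit):
    """Return every term pair that co-occurs in at least one text unit."""
    found = set()
    for abstract in abstracts:
        for text in split_units(abstract, unit):
            lowered = text.lower()
            # Simple case-insensitive substring match (assumption).
            present = sorted(t for t in terms if t.lower() in lowered)
            found.update(combinations(present, 2))
    return found


def evaluate(predicted, gold, alpha=0.5):
    """Return (precision, recall, effectiveness) for predicted vs. gold pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision == 0.0 or recall == 0.0:
        effectiveness = 1.0  # worst possible value of the E-measure
    else:
        effectiveness = 1.0 - 1.0 / (alpha / precision + (1 - alpha) / recall)
    return precision, recall, effectiveness


# Toy data, invented for illustration only.
abstracts = [
    "Gibberellin induces amylase expression. Abscisic acid was also measured."
]
terms = ["gibberellin", "amylase", "abscisic acid"]
gold = {("amylase", "gibberellin")}  # pairs stored in sorted order

for unit in ("abstract", "sentence"):
    predicted = cooccurring_pairs(abstracts, terms, unit)
    precision, recall, effectiveness = evaluate(predicted, gold)
    print(f"{unit}: P={precision:.2f} R={recall:.2f} E={effectiveness:.2f}")
```

On this toy input the abstract-level unit proposes all three term pairs, while the sentence-level unit proposes only the pair that shares a sentence. This shows the general trade-off between larger units (more candidate pairs, typically more false positives) and smaller ones; it says nothing about the paper's actual measurements.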



References

  • 1. Christian Blaschke, M. Andrade, C. Ouzounis, and A. Valencia, “Automatic extraction of biological information from scientific text: protein-protein interactions” AAAI Conference on Intelligent Systems in Molecular Biology, 60-67 (1999).
  • 2. M. Craven and J. Kumlien, “Constructing biological knowledge bases by extracting information from text sources” 7th International Conference on Intelligent Systems for Molecular Biology (ISMB-99).
  • 3. W. Conover, Practical Nonparametric Statistics, 2nd Ed. (Wiley, NY, 1980).
  • 4. J. Dickerson, D. Berleant, Z. Cox, W. Qi, D. Ashlock, and E. Wurtele, “Creating metabolic network models using text mining and expert knowledge” Atlantic Symposium on Computational Biology and Genome Information Systems & Technology (CBGIST 2001), 26-30.
  • 5. K. Humphreys, G. Demetriou, and R. Gaizauskas, “Two applications of information extraction to biological science journal articles: enzyme interactions and protein structures” Pacific Symposium on Biocomputing 5, 502-513 (2000).
  • 6. S.-K. Ng and M. Wong, “Toward routine automatic pathway discovery from on-line scientific text abstracts” Genome Informatics 10, 104-112 (1999).
  • 7. T. Ono, H. Hishigaki, A. Tanigami, and T. Takagi, “Automated extraction of information on protein-protein interactions from the biological literature” Bioinformatics 17, 155-161 (2001).
  • 8. PUBMED interface to MEDLINE, U.S. National Library of Medicine, http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed.
  • 9. T. Rindflesch, L. Hunter, and A. Aronson, “Mining molecular binding terminology from biomedical text” Proceedings of the AMIA ’99 Annual Symposium.
  • 10. T. Rindflesch, L. Tanabe, J. Weinstein, L. Hunter, “EDGAR: extraction of drugs, genes and relations from the biomedical literature” Pacific Symposium on Biocomputing 5, 514-525 (2000).
  • 11. W. Salamonsen, K. Mok, P. Kolatkar, and S. Subbiah, “BioJAKE: a tool for the creation, visualization and manipulation of metabolic pathways” Pacific Symposium on Biocomputing 4, 392-400 (1999).
  • 12. T. Sekimizu, H. Park, and Jun'ichi Tsujii, “Identifying the interaction between genes and gene products based on frequently seen verbs in MEDLINE abstracts” Genome Informatics (Universal Academy Press, Inc., 1998).
  • 13. J. Shaffer, “Modified sequentially rejective multiple test procedures” Journal of the American Statistical Association 81, 826-831 (1986).
  • 14. H. Shatkay, S. Edwards, W. Wilbur, and M. Boguski, “Genes, themes, and microarrays: using information retrieval for large-scale gene analysis” 8th International Conference on Intelligent Systems for Mol. Bio. (ISMB 2000), La Jolla, Aug. 19-23.
  • 15. W. Shaw, R. Burgin, and P. Howell, “Performance standards and evaluations in IR test collections: cluster-based retrieval models” Information Processing and Management 33 (1), 1-14 (1997).
  • 16. B. Stapley and G. Benoit, “Biobibliometrics: information retrieval and visualization from co-occurrences of gene names in MEDLINE abstracts” Pacific Symposium on Biocomputing 5, 529-540 (2000).
  • 17. L. Tanabe, U. Scherf, L. Smith, J. Lee, L. Hunter, and J. Weinstein, “MedMiner: an internet text-mining tool for biomedical information, with application to gene expression profiling” BioTechniques 27, 1210-1217 (1999).
  • 18. J. Thomas, D. Milward, C. Ouzounis, S. Pulman, and M. Carroll, “Automatic extraction of protein interactions from scientific abstracts” Pacific Symposium on Biocomputing 5, 538-549 (2000).
  • 19. C. Van Rijsbergen, Information Retrieval, Butterworths (1979).
  • 20. P. Westfall and S. Young, Resampling-based Multiple Testing: Examples and Methods for P-Value Adjustment (Wiley, New York, 1993).
  • 21. R. Willmott, P. Rushton, R. Hooley, and C. Lazarus, “DNase1 footprints suggest the involvement of at least three types of transcription factors in the regulation of alpha-Amy2/A by gibberellin” Plant Molecular Biology 38 (5), 817-825 (1998).
  • 22. L. Wong, “A protein interaction extraction system” Pacific Symposium on Biocomputing 6 (2001).

Pacific Symposium on Biocomputing 7:326-337 (2002)


2002 MiningMedlineAbstractsSentencesOrPhrases
Author: J. Ding, D. Berleant, D. Nettleton, E. Wurtele
Title: Mining Medline: Abstracts, Sentences, or Phrases?
URL: http://psb.stanford.edu/psb-online/proceedings/psb02/ding.pdf
Year: 2002