2009 SecondGenerationAMIRAToolsforAr


Subject Headings: Phrase Chunking System.

Notes

Cited By

Quotes

Abstract

In this paper, we address the problem of processing Modern Standard Arabic. We present the second generation of tools that process Arabic (AMIRA). AMIRA is a successor suite to the ASVMTools. The AMIRA toolkit includes a clitic tokenizer (TOK), part of speech tagger (POS) and base phrase chunker (BPC) - shallow syntactic parser. The technology of AMIRA is based on supervised learning with no explicit dependence on explicit modeling or knowledge of deep morphology. AMIRA is based on using a unified framework casting each of the component problems as a classification task. The underlying technology employs Support Vector Machines in a sequence modeling framework using the YAMCHA toolkit. The system is very fast and robust and allows for a number of variable user settings depending on the disambiguation granularity. The AMIRA toolkit has been widely used for different NLP (MT, IE, IR, NER, etc.) applications due to its speed and high performance.
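The abstract's idea of casting base phrase chunking as a per-token classification task can be sketched as follows. This is a minimal illustration and not AMIRA's or YAMCHA's actual code: the IOB tag encoding, the window width of 2, the feature names (`w[..]`, `p[..]`, `t[..]`), the helper names, and the English example sentence are all assumptions made for readability. Each token becomes one training instance whose features are drawn from a window of surrounding tokens, their POS tags, and the chunk tags already assigned to preceding tokens; such instances could then be fed to any classifier, for example an SVM.

```python
from typing import Dict, List, Tuple

# A chunk is a (phrase label, tokens) pair; the label "O" marks tokens outside any phrase.
Chunk = Tuple[str, List[str]]


def chunks_to_iob(chunks: List[Chunk]) -> Tuple[List[str], List[str]]:
    """Flatten bracketed chunks into parallel token and IOB tag lists."""
    tokens: List[str] = []
    tags: List[str] = []
    for label, words in chunks:
        for position, word in enumerate(words):
            tokens.append(word)
            if label == "O":
                tags.append("O")
            else:
                # First token of a phrase gets B-, the rest get I-.
                tags.append(("B-" if position == 0 else "I-") + label)
    return tokens, tags


def window_features(
    tokens: List[str],
    pos: List[str],
    prev_tags: List[str],
    i: int,
    width: int = 2,
) -> Dict[str, str]:
    """Build the feature dict for token i (hypothetical feature layout).

    prev_tags holds the chunk tags already decided for tokens 0..i-1,
    which is what makes the per-token classifier a sequence model.
    """
    feats: Dict[str, str] = {}
    for offset in range(-width, width + 1):
        j = i + offset
        inside = 0 <= j < len(tokens)
        feats[f"w[{offset}]"] = tokens[j] if inside else "<PAD>"
        feats[f"p[{offset}]"] = pos[j] if inside else "<PAD>"
    for back in range(1, width + 1):
        j = i - back
        feats[f"t[-{back}]"] = prev_tags[j] if j >= 0 else "<PAD>"
    return feats


# Illustrative English sentence (an assumption; the toolkit itself targets Arabic).
example_chunks: List[Chunk] = [
    ("NP", ["The", "system"]),
    ("VP", ["processes"]),
    ("NP", ["Arabic", "text"]),
    ("O", ["."]),
]
example_pos = ["DT", "NN", "VBZ", "JJ", "NN", "."]

example_tokens, example_tags = chunks_to_iob(example_chunks)
# Features for the third token, using the gold tags of the two tokens before it.
example_feats = window_features(example_tokens, example_pos, example_tags[:2], 2)
```

At prediction time, `prev_tags` would hold the classifier's own earlier decisions rather than gold tags, so the sentence is labeled left to right one token at a time.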

References


Page: 2009 SecondGenerationAMIRAToolsforAr
Author: Mona T. Diab
Title: Second Generation AMIRA Tools for Arabic Processing: Fast and Robust Tokenization, POS Tagging, and Base Phrase Chunking
Year: 2009