n-Gram Generation System

From GM-RKB

An n-Gram Generation System is a tuple generation system that can solve an n-Gram Generation Task (to produce an n-gram set for a sequence record).
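
As an illustration of the definition above, the following is a minimal Python sketch of how such a system might produce n-gram sets from a sequence record. The function names `generate_ngrams` and `ngram_sets`, the `max_n` parameter, and the choice to represent each n-gram as a tuple are assumptions made for this example; they are not taken from any particular implementation.

```python
from typing import Dict, Iterator, Sequence, Set, Tuple, TypeVar

T = TypeVar("T")


def generate_ngrams(items: Sequence[T], n: int) -> Iterator[Tuple[T, ...]]:
    """Yield each contiguous run of n items as a tuple, in sequence order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n has no n-grams, so the range is empty.
    for start in range(len(items) - n + 1):
        yield tuple(items[start:start + n])


def ngram_sets(items: Sequence[T], max_n: int = 3) -> Dict[int, Set[Tuple[T, ...]]]:
    """Map each n from 1 to max_n to the set of distinct n-grams (hypothetical helper)."""
    return {n: set(generate_ngrams(items, n)) for n in range(1, max_n + 1)}


if __name__ == "__main__":
    tokens = "the cat sat on the mat".split()
    for bigram in generate_ngrams(tokens, 2):
        print(bigram)
```

Here the sequence record is a list of word tokens, but the same functions accept any sequence, such as a string of characters. Collecting the tuples into a set discards repeats and ordering; a system that needs frequencies would count the tuples instead.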



References

2008

  • http://code.prashanthellina.com/code/generate_ngrams.py
    • QUOTE: The “generate_ngrams.py” script creates uni, bi and tri-grams of whatever text is piped into it. The following command pipes all the txt files through both the scripts to create the ngrams file.
      for i in `find gutenberg_txt/ -name "*.txt"`; do cat $i | python remove_gutenberg_text.py | grep -i -v "project gutenberg" | python generate_ngrams.py >> gutenberg_ngrams; done