One Billion Word Language Modelling Benchmark Task


A One Billion Word Language Modelling Benchmark Task is an NLP Benchmark Task that evaluates the performance of language modeling systems.
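The figure of merit reported in the tables below is perplexity on the benchmark's held-out data: the exponential of the average negative log-probability the model assigns to each held-out token. As a minimal illustration (not part of the benchmark distribution; the function name and interface are assumptions for this example), the following sketch computes perplexity from a list of per-token log-probabilities.

```python
import math

def corpus_perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    Perplexity = exp(-(1/N) * sum_i log p(w_i | history)), the quantity
    reported for each model in the benchmark tables below.
    """
    n = len(token_log_probs)
    avg_neg_log_prob = -sum(token_log_probs) / n
    return math.exp(avg_neg_log_prob)

# Example: a model assigning probability 0.01 to every held-out token
# has perplexity 100.
print(corpus_perplexity([math.log(0.01)] * 1000))  # -> ~100.0
```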

Model | Perplexity
Interpolated KN 5-gram, 1.1B n-grams | 67.6
All models | 43.8
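The "All models" row refers to a combination of the individual models, obtained by linearly interpolating their predicted probabilities with weights tuned on held-out data. A minimal sketch, assuming each model exposes a hypothetical prob(word, history) method (the interface is an assumption for this example):

```python
def interpolated_prob(models, weights, word, history):
    """Linear interpolation: p(w | h) = sum_k lambda_k * p_k(w | h).

    `models` is a list of objects with a hypothetical prob(word, history)
    method; `weights` are non-negative interpolation weights summing to 1,
    typically tuned to minimize perplexity on held-out data.
    """
    return sum(w * m.prob(word, history) for w, m in zip(weights, models))
```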
Model | Num. Params [billions] | Training Time [hours] | CPUs | Perplexity
Interpolated KN 5-gram, 1.1B n-grams (KN) | 1.76 | 3 | 100 | 67.6
Katz 5-gram, 1.1B n-grams | 1.74 | 2 | 100 | 79.9
Stupid Backoff 5-gram (SBO) | 1.13 | 0.4 | 200 | 87.9
Interpolated KN 5-gram, 15M n-grams | 0.03 | 3 | 100 | 243.2
Katz 5-gram, 15M n-grams | 0.03 | 2 | 100 | 127.5
Binary MaxEnt 5-gram (n-gram features) | 1.13 | 1 | 5000 | 115.4
Binary MaxEnt 5-gram (n-gram + skip-1 features) | 1.8 | 1.25 | 5000 | 107.1
Hierarchical Softmax MaxEnt 4-gram (HME) | 6 | 3 | 1 | 101.3
Recurrent NN-256 + MaxEnt 9-gram | 20 | 60 | 24 | 58.3
Recurrent NN-512 + MaxEnt 9-gram | 20 | 120 | 24 | 54.5
Recurrent NN-1024 + MaxEnt 9-gram | 20 | 240 | 24 | 51.3
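The Stupid Backoff 5-gram (SBO) row in the table above uses the scoring scheme of Brants et al. (2007): the relative frequency of the full n-gram when it was observed, otherwise a fixed multiplier times the score of the shortened context. The scores are not normalized probabilities. A minimal sketch, with counts stored in a hypothetical dictionary keyed by token tuples (the data layout is an assumption for this example):

```python
def stupid_backoff_score(counts, ngram, alpha=0.4):
    """Stupid Backoff score S(w | context) for the token tuple `ngram`.

    counts: dict mapping token tuples of any order to raw corpus counts.
    Returns the relative frequency when the full n-gram was seen,
    otherwise backs off to the shortened context scaled by `alpha`
    (0.4, as in Brants et al., 2007). Scores are not true probabilities.
    """
    if len(ngram) == 1:
        total = sum(c for k, c in counts.items() if len(k) == 1)
        return counts.get(ngram, 0) / total if total else 0.0
    context = ngram[:-1]
    if counts.get(ngram, 0) > 0 and counts.get(context, 0) > 0:
        return counts[ngram] / counts[context]
    # Drop the oldest context word and back off.
    return alpha * stupid_backoff_score(counts, ngram[1:], alpha)
```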


References

2014

(Chelba et al., 2014) ⇒ Ciprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, and Tony Robinson. (2014). "One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling." In: Proceedings of INTERSPEECH 2014.