Distributional Word Embedding Modeling Task

From GM-RKB

A Distributional Word Embedding Modeling Task is a continuous word vector space model training task whose goal is to create a continuous dense distributional word model (also called a distributional continuous dense word model).



References

2013

  • http://google-opensource.blogspot.be/2013/08/learning-meaning-behind-words.html
    • QUOTE: Word2vec uses distributed representations of text to capture similarities among concepts. For example, it understands that Paris and France are related the same way Berlin and Germany are (capital and country), and not the same way Madrid and Italy are. This chart shows how well it can learn the concept of capital cities, just by reading lots of news articles -- with no human supervision:
      [Figure: word2vec country and capital vectors projected by PCA]
      The model not only places similar countries next to each other, but also arranges their capital cities in parallel. The most interesting part is that we didn’t provide any supervised information before or during training. Many more patterns like this arise automatically in training.

      This has a very broad range of potential applications: knowledge representation and extraction; machine translation; question answering; conversational systems; and many others. We’re open sourcing the code for computing these text representations efficiently (on even a single machine) so the research community can take these models further.
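
The capital-and-country regularity described in the quote above can be illustrated with simple vector arithmetic. The following is a minimal sketch, not the word2vec implementation: the 4-dimensional vectors in `TOY_VECTORS` are hand-made for illustration (a trained model would learn vectors with far more dimensions from a corpus), and the helper names `cosine` and `analogy` are hypothetical.

```python
import math

# Hypothetical, hand-made 4-dimensional embeddings (illustration only).
# The first three dimensions loosely encode the country; the last one
# loosely encodes "is a capital city".
TOY_VECTORS = {
    "france":  [0.9, 0.1, 0.0, 0.1],
    "paris":   [0.9, 0.1, 0.1, 0.9],
    "germany": [0.1, 0.9, 0.0, 0.1],
    "berlin":  [0.1, 0.8, 0.1, 1.0],
    "italy":   [0.0, 0.1, 0.9, 0.1],
    "rome":    [0.1, 0.0, 0.9, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors=TOY_VECTORS):
    """Answer "a is to b as c is to ?" using the offset b - a + c."""
    query = [vb - va + vc
             for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three input words, then pick the most similar word.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))


if __name__ == "__main__":
    print(analogy("france", "paris", "germany"))
```

With these toy vectors, the offset from a country to its capital is roughly the same for every pair, so adding the France-to-Paris offset to Germany lands nearest to Berlin. That is the sense in which the capital cities are arranged "in parallel".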