spaCy NLP System: Difference between revisions
Revision as of 20:45, 23 December 2019
A spaCy NLP System is a Python/Cython-based natural language processing library.
- Example(s):
- v2.0.11 (2018-04-04).
- ...
- v1.x (~2015).
- Counter-Example(s):
- See: NER System, MIT License, Syntactic Parsing System, Natural Language Toolkit.
References
2018a
- https://github.com/explosion/spaCy
- QUOTE: spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pre-trained statistical models and word vectors, and currently supports tokenization for 20+ languages. It features the fastest syntactic parser in the world, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license.
2018
- (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/SpaCy Retrieved:2018-5-23.
- spaCy (/speɪˈsiː/) is an open-source software library for advanced Natural Language Processing, written in the programming languages Python and Cython. It offers the fastest syntactic parser in the world.[1][2][3] The library is published under the MIT license and currently offers statistical neural network models for English, German, Spanish, Portuguese, French, Italian, Dutch and multi-language NER, as well as tokenization for various other languages.[4]
Unlike NLTK, which is widely used for teaching and research, spaCy focuses on providing software for production usage. [5] As of version 1.0, spaCy also supports deep learning workflows that allow connecting statistical models trained by popular machine learning libraries like TensorFlow, Keras, Scikit-learn or PyTorch. spaCy's machine learning library, Thinc, is also available as a separate open-source Python library. On November 7, 2017, version 2.0 was released. It features convolutional neural network models for part-of-speech tagging, dependency parsing and named entity recognition, as well as API improvements around training and updating models, and constructing custom processing pipelines.
- ↑ Choi et al. (2015). It Depends: Dependency Parser Comparison Using A Web-based Evaluation Tool.
- ↑ "Google’s new artificial intelligence can’t understand these sentences. Can you?". https://www.washingtonpost.com/news/wonk/wp/2016/05/18/googles-new-artificial-intelligence-cant-understand-these-sentences-can-you/. Retrieved 2016-12-18.
- ↑ "Facts & Figures | spaCy Usage Documentation". https://spacy.io/usage/facts-figures. Retrieved 2017-11-08.
- ↑ "Models & Languages | spaCy Usage Documentation". https://spacy.io/usage/models#languages. Retrieved 2017-11-08.
- ↑ Cite error: no text was provided for the reference named "Bird-Klein-Loper-Baldridge".
2018b
- (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/SpaCy#Main_features Retrieved:2018-5-23.
- Non-destructive tokenization
- Named entity recognition
- Support for over 25 languages
- Statistical models for 8 languages
- Pre-trained word vectors
- Part-of-speech tagging
- Labelled dependency parsing
- Syntax-driven sentence segmentation
- Text classification
- Built-in visualizers for syntax and named entities
- Deep learning integration
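The first feature in the list above, non-destructive tokenization, can be illustrated with a small plain-Python sketch. This is not spaCy's tokenizer or its API: the `Token` type and the `tokenize` and `detokenize` functions are hypothetical names introduced here, and the splitting rule (words versus single punctuation marks) is a deliberately simple assumption. The sketch only shows the underlying idea, which is that each token records the whitespace that followed it, so the original text can be rebuilt exactly from the tokens.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    """Hypothetical token record (not spaCy's Token class)."""
    text: str        # the token's characters, without trailing whitespace
    whitespace: str  # whitespace that followed the token in the original text
    start: int       # character offset of the token in the original text


def tokenize(text: str) -> List[Token]:
    """Split text into tokens without discarding any character."""
    tokens = []
    # Group 1: a run of word characters, a single punctuation mark,
    #          or whitespace at the very start of the text.
    # Group 2: any whitespace that directly follows the token.
    for match in re.finditer(r"(\w+|[^\w\s]|^\s+)(\s*)", text):
        tokens.append(Token(match.group(1), match.group(2), match.start(1)))
    return tokens


def detokenize(tokens: List[Token]) -> str:
    """Rebuild the original text from tokens and their stored whitespace."""
    return "".join(tok.text + tok.whitespace for tok in tokens)


if __name__ == "__main__":
    sample = "Hello, world!  It's  spaCy."
    toks = tokenize(sample)
    print([tok.text for tok in toks])
    print(detokenize(toks) == sample)
```

Because whitespace and offsets are stored rather than thrown away, annotations made on tokens can always be mapped back to character positions in the source text, which is the practical benefit usually attributed to this design.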