= ℓ1 Norm Minimization Task =
An [[ℓ1 Norm Minimization Task]] is a [[minimization task]] that requires finding an [[l1-norm solution]] to an [[underdetermined linear system]] <math>b=Ax</math>.<br />
* <B>Context:</B><br />
** It can be solved by an [[ℓ1 Minimization System]] (that implements an [[ℓ1 Minimization Algorithm]]).<br />
** It can range from being a [[Constrained ℓ1 Minimization Task]] to being an [[Unconstrained ℓ1 Minimization Task]].<br />
* <B>See:</B> [[Covariance Matrix]], [[ℓ2 Minimization]], [[ℓ1 Norm]].<br />
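The task can be posed as a [[linear program]]. A minimal sketch in Python, assuming <code>numpy</code>/<code>scipy</code> and an illustrative random system (the problem sizes and solver choice are assumptions, not part of this page's sources):<br />
<pre>
# Minimum l1-norm solution of an underdetermined system b = A x, posed as the
# linear program: minimize sum(t) subject to -t <= x <= t and A x = b.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 10, 30                         # more unknowns than equations
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n)

c = np.concatenate([np.zeros(n), np.ones(n)])       # decision vector z = [x, t]
A_ub = np.block([[ np.eye(n), -np.eye(n)],          #  x - t <= 0
                 [-np.eye(n), -np.eye(n)]])         # -x - t <= 0
b_ub = np.zeros(2 * n)
A_eq = np.hstack([A, np.zeros((m, n))])             # A x = b (t not involved)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
              bounds=[(None, None)] * (2 * n))
x_l1 = res.x[:n]
print("l1 norm of the recovered solution:", np.abs(x_l1).sum())
</pre>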
----<br />
----<br />
== References ==<br />
<br />
=== 2013 ===<br />
* ([[2013_FastMinimizationAlgorithmsforRo|Yang et al., 2013]]) &rArr; [[Allen Y Yang]], [[Zihan Zhou]], [[Arvind Ganesh Balasubramanian]], [[S Shankar Sastry]], and [[Yi Ma]]. ([[2013]]). “[http://arxiv.org/pdf/1007.3753 Fast ℓ1-Minimization Algorithms for Robust Face Recognition].” In: Image Processing, IEEE Transactions on, 22(8). <br />
** QUOTE: [[ℓ1 Norm Minimization Task|<i>l</i>1-minimization]] refers to [[finding the minimum]] [[l1-norm solution]] to an [[underdetermined linear system]] <math>b=Ax</math>. </s> Under certain conditions as described in [[compressive sensing theory]], the [[minimum solution|minimum]] [[l1-norm solution]] is also the [[sparsest solution]]. </s><br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]

= ℓ1 Norm Minimization Algorithm =
An [[ℓ1 Norm Minimization Algorithm]] is a [[norm minimization algorithm]] that can be applied by an [[ℓ1 Minimization System]] (to solve an [[ℓ1 Minimization Task]]).<br />
* <B>Context:</B><br />
** It can range from being a [[Constrained ℓ1 Minimization Algorithm]] to being an [[Unconstrained ℓ1 Minimization Algorithm]].<br />
* <B>See:</B> [[ℓ1 Norm Minimization Algorithm|ℓ1 Minimization Algorithm]].<br />
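For the unconstrained form <math>\min_x \tfrac{1}{2}\|Ax-b\|_2^2 + \lambda\|x\|_1</math>, one well-known algorithm is iterative soft-thresholding (ISTA). A minimal sketch, assuming <code>numpy</code> and illustrative problem sizes:<br />
<pre>
# Iterative Soft-Thresholding (ISTA) for min_x 0.5*||Ax - b||^2 + lam*||x||_1.
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam=0.1, n_iters=500):
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)             # gradient of the smooth least-squares term
        x = soft_threshold(x - step * grad, lam * step)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
x_true = rng.standard_normal(50) * (rng.random(50) < 0.1)   # sparse ground truth
x_hat = ista(A, A @ x_true)
print("nonzero entries in the estimate:", int(np.count_nonzero(np.abs(x_hat) > 1e-6)))
</pre>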
----<br />
----<br />
[[Category:Concept]]

= ℓ1 Norm Distance Function =
An [[ℓ1 Norm Distance Function|ℓ1 norm distance function]] is a [[Minkowski distance function]] with <math>d=1</math> (that represents the [[shortest]] [[distance]] in [[unit step]]s along each [[axis]] between two [[point]]s).<br />
* <B>AKA:</B> [[ℓ1 Norm Distance Function|Taxicab Geometry]], [[Manhattan/Rectilinear Distance]], <math>\ell_1</math>.<br />
* <B>Context:</B><br />
** It can be defined as <math>\|\mathbf{x}\|_1 := \sum_{i=1}^{n} |x_i|.</math><br />
** It can be a part of an [[L1 Norm Metric Space]].<br />
** It can be computed as the [[Sum]] of the [[Absolute Difference|absolute difference]]s in each [[Dimension]] (see the sketch below).<br />
** It can (often) be an input to [[L1-Norm Regularization]].<br />
* <B>Example(s):</B> <br />
** <math>\ell_1 ((1,1),(2,3)) \Rightarrow 3</math>.<br />
* <B>Counter-Example(s):</B> <br />
** [[L2 Norm Distance Function]]<br />
* <B>See:</B> [[Set Distance Function]]; [[Case-Based Learning]]; [[Nearest Neighbor]], [[Lp Space]], [[Absolute Difference]], [[Cartesian Coordinate]].<br />
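A minimal sketch of the function in plain Python (the helper name <code>l1_distance</code> is illustrative), reproducing the example above:<br />
<pre>
def l1_distance(p, q):
    """Manhattan (l1) distance: sum of absolute differences along each axis."""
    return sum(abs(p_i - q_i) for p_i, q_i in zip(p, q))

assert l1_distance((1, 1), (2, 3)) == 3   # matches the example above
</pre>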
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 2015 ===<br />
* (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/taxicab_geometry Retrieved:2015-11-22.<br />
** '''Taxicab geometry''', considered by [[Hermann Minkowski]] in 19th century Germany, is a form of [[geometry]] in which the usual distance function of [[metric space|metric]] or [[Euclidean geometry]] is replaced by a new metric in which the [[distance]] between two points is the sum of the [[absolute difference]]s of their [[Cartesian coordinate]]s. The taxicab metric is also known as '''rectilinear distance''', '''''L''<sub>1</sub> distance''' or '''<math> \ell_1 </math> norm''' (see [[Lp space|''L''<sup>''p''</sup> space]]), '''city block distance''', '''Manhattan distance''', or '''Manhattan length''', with corresponding variations in the name of the geometry. <ref> [http://www.nist.gov/dads/HTML/manhattanDistance.html Manhattan distance] </ref> The latter names allude to the [[Commissioners' Plan of 1811|grid layout of most streets]] on the island of [[Manhattan]], which causes the shortest path a car could take between two intersections in the [[Borough (New York City)|borough]] to have length equal to the intersections' distance in taxicab geometry.<br />
<references/><br />
<br />
=== 2011 ===<br />
* ([[Craw, 2011c]]) ⇒ Susan Craw. ([[2011]]). “Manhattan Distance.” In: ([[Sammut & Webb, 2011]]) p.639<br />
<BR><br />
* http://en.wikipedia.org/wiki/L1_norm#Formal_description<br />
** QUOTE: The taxicab distance, <math>d_1</math>, between two [[vector]]s <math>\mathbf{p}, \mathbf{q}</math> in an ''n''-dimensional [[real number|real]] [[vector space]] with fixed [[Cartesian coordinate system]], is the sum of the lengths of the projections of the [[line segment]] between the points onto the [[coordinate axes]]. More formally, <math>d_1(\mathbf{p}, \mathbf{q}) = \|\mathbf{p} - \mathbf{q}\|_1 = \sum_{i=1}^n |p_i-q_i|,</math> where <math>\mathbf{p}=(p_1,p_2,\dots,p_n)\text{ and }\mathbf{q}=(q_1,q_2,\dots,q_n)\,</math> are [[Euclidean vector|vector]]s. For example, in the [[plane (mathematics)|plane]], the taxicab distance between <math>(p_1,p_2)</math> and <math>(q_1,q_2)</math> is <math>| p_1 - q_1 | + | p_2 - q_2 |.</math> <P> Taxicab distance depends on the [[rotation]] of the coordinate system, but does not depend on its [[reflection (mathematics)|reflection]] about a coordinate axis or its [[translation (geometry)|translation]]. Taxicab geometry satisfies all of [[Hilbert's axiom]]s (a formalization of [[Euclidean geometry]]) except for the [[Congruence (geometry)|side-angle-side axiom]], as one can generate two triangles each with two sides and the angle between them the same, and have them not be congruent.<br />
<references/><br />
<br />
=== 2010 ===<br />
* http://en.wikipedia.org/wiki/Norm_%28mathematics%29#Taxicab_norm_or_Manhattan_norm<br />
** <math>\|\boldsymbol{x}\|_1 := \sum_{i=1}^{n} |x_i|.</math><br />
<br />
=== 2009 ===<br />
* ([[Weisstein, 2009-11-02]]) ⇒ Eric W. Weisstein. ([[2009]]). “L1-Norm." From MathWorld - A Wolfram Web Resource. http://mathworld.wolfram.com/L1-Norm.html <br />
** A vector norm defined for a vector <math>\mathbf{x}=[x_1, x_2, ..., x_n]</math>, with complex entries by <math>|x|_1=\sum_{r=1}^n|x_r|</math>. The <math>L^1</math>-norm <math>|x|_1</math> of a vector <math>x</math> is ...<br />
<br />
=== 2008 ===<br />
* ([[2008_SparseInvarianceCovarianceEstimation|Friedman et al., 2008]]) ⇒ [[Jerome H. Friedman]], [[Trevor Hastie]], and [[Robert Tibshirani]]. ([[2008]]). “[http://arxiv.org/pdf/0708.3517 Sparse Inverse Covariance Estimation with the Graphical Lasso].” In: Biostatistics, 9(3). [http://dx.doi.org/10.1093/biostatistics/kxm045 doi:10.1093/biostatistics/kxm045].<br />
<br />
=== 1990 ===<br />
* ([[Horn & Johnson, 1990]]) ⇒ R. A. Horn, and C. R. Johnson. ([[1990]]). “Norms for Vectors and Matrices." Ch. 5 in Matrix Analysis. Cambridge University Press.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]

= z-Score =
A [[z-Score]] is a [[score]] that is the (signed) number of [[standard deviation]]s an observation or [[data|datum]] is ''above'' the [[mean]].<br />
* <B>AKA:</B> [[z-Score|Standard Score]].<br />
* <B>Context:</B><br />
** It can (typically) be a member of a [[Z-Space]].<br />
* <B>Example(s):</B><br />
** ...<br />
* <B>Counter-Example(s):</B><br />
** [[Raw Score]].<br />
* <B>See:</B> [[z-Distribution]], [[Dimensionless Number]], [[Population Mean]], [[Statistical Population]], [[Normalization (Statistics)]], [[Normal Distribution]], [[Standard Normal Deviate]], [[Student's t-Statistic]], [[z-Factor]], [[Normalizing Constant]].<br />
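A minimal sketch of the conversion in Python, assuming known population parameters (the numbers are illustrative):<br />
<pre>
import numpy as np

def z_scores(x, mu, sigma):
    """Signed number of standard deviations each raw score lies above the mean."""
    return (np.asarray(x, dtype=float) - mu) / sigma

# Raw scores from a population with known mean 100 and standard deviation 15.
print(z_scores([85, 100, 130], mu=100, sigma=15))   # -> [-1.  0.  2.]
</pre>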
----<br />
----<br />
== References ==<br />
<br />
=== 2014 ===<br />
* (Wikipedia, 2014) &rArr; http://en.wikipedia.org/wiki/Standard_score Retrieved:2014-9-20.<br />
** In [[statistics]], the '''standard score''' is the (signed) number of [[standard deviation]]s an observation or [[data|datum]] is ''above'' the [[mean]]. Thus, a positive standard score indicates a datum above the mean, while a negative standard score indicates a datum below the mean. It is a [[dimensionless number|dimensionless quantity]] obtained by subtracting the [[population mean]] from an individual [[raw score]] and then dividing the difference by the [[statistical population|population]] [[standard deviation]]. This conversion process is called '''standardizing''' or '''normalizing''' (however, "normalizing" can refer to many types of ratios; see [[normalization (statistics)]] for more). <P> Standard scores are also called '''z-values''', '''''z''-scores''', '''normal scores''', and '''standardized variables'''; the use of "Z" is because the [[normal distribution]] is also known as the "Z distribution". They are most frequently used to compare a sample to a [[standard normal deviate]] (standard normal distribution, with ''μ''&nbsp;=&nbsp;0 and ''σ''&nbsp;=&nbsp;1), though they can be defined without assumptions of normality. <P> The z-score is ''only'' defined if one knows the population parameters; if one only has a sample set, then the analogous computation with [[sample mean]] and sample standard deviation yields the [[Student's t-statistic]]. <P> The standard score is not the same as the [[z-factor]] used in the analysis of [[high-throughput screening]] data though the two are often conflated.<br />
<br />
----<br />
[[Category:Concept]]<br />
__NOTOC__

= xml.etree.ElementTree Module =
An [[xml.etree.ElementTree Module]] is a [[Python library|Python standard library]] module within the [[xml.etree]] package.<br />
* <B>See:</B> [[MediaWiki XML Snapshot File Parser]].<br />
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 2018 ===<br />
* https://docs.python.org/3/library/xml.etree.elementtree.html<br />
** QUOTE: The [[xml.etree.ElementTree Module|xml.etree.ElementTree module]] implements a simple and efficient API for parsing and creating [[XML data]]. ... <P> This is a short tutorial for using [[xml.etree.ElementTree Module|xml.etree.ElementTree (ET in short)]]. The goal is to demonstrate some of the building blocks and basic concepts of the module. ... <P> [[XML]] is an inherently [[hierarchical data format]], and the most natural way to represent it is with a [[tree data structure|tree]]. ET has two classes for this purpose - [[xml.etree.ElementTree Module|ElementTree]] represents the whole XML document as a tree, and Element represents a single node in this tree. Interactions with the whole document (reading and writing to/from files) are usually done on the ElementTree level. Interactions with a single XML element and its sub-elements are done on the Element level.<br />
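A minimal usage sketch of the two levels of interaction described above, assuming an illustrative in-memory XML snippet:<br />
<pre>
import xml.etree.ElementTree as ET

xml_text = """<pages>
  <page title="Alpha"><revision id="1"/></page>
  <page title="Beta"><revision id="2"/></page>
</pages>"""

root = ET.fromstring(xml_text)          # Element representing the whole document
for page in root.findall("page"):       # element-level interaction
    revision = page.find("revision")
    print(page.get("title"), revision.get("id"))
</pre>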
<br />
=== 2017 ===<br />
* ([[Heaton, 2017]]) ⇒ [[Jeff Heaton]]. ([[2017]]). “[http://heatonresearch.com/2017/03/03/python-basic-wikipedia-parsing.html Reading Wikipedia XML Dumps with Python]." Blog post<br />
** QUOTE: … The code below shows you the beginning of this file. As you can see the file is made up of page tags that contain revision tags. … To read this file it is important that the XML is streamed and not read directly into memory as a DOM parser might do. The [[xml.etree.ElementTree Module|xml.etree.ElementTree class]] can be used to do this. The following imports are needed for this example. For the complete source code see the following [https://raw.githubusercontent.com/jeffheaton/article-code/master/python/wikipedia/wiki-basic-stream.py GitHub link]. ...<br />
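A minimal sketch of the streaming read described above, using the module's <code>iterparse</code> function (the in-memory snippet stands in for a real dump file, and the tag names are illustrative rather than the exact MediaWiki schema):<br />
<pre>
import io
import xml.etree.ElementTree as ET

# In practice `source` would be a multi-gigabyte dump file opened in binary mode;
# this in-memory snippet stands in for it.
source = io.BytesIO(b"<mediawiki>"
                    b"<page><title>A</title></page>"
                    b"<page><title>B</title></page>"
                    b"</mediawiki>")

# iterparse streams the document, yielding elements as their end tags are read.
for event, elem in ET.iterparse(source, events=("end",)):
    if elem.tag == "page":
        print(elem.findtext("title"))
        elem.clear()    # discard processed children so memory stays bounded
</pre>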
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]

= xMatters Platform =
An [[xMatters Platform]] is an [[alerting platform]].<br />
<br />
* <B>See:</B> [[Time-Sensitive Event]], [[Communications Platform]].<br />
----<br />
----<br />
<br />
== References ==<br />
<br />
* http://www.xmatters.com<br />
<br />
=== 2019a ===<br />
* https://www.linkedin.com/company/xmatters-inc/about/<br />
** QUOTE: ... [[xMatters Platform|xMatters]] is an integration-driven collaboration platform that relays data between systems while engaging the right people to resolve incidents. [[xMatters Platform|xMatters]] automates and brings structure to communication so you can proactively prevent outages, manage incidents, and keep the right people informed. Any time there’s a hiccup in your business, resolution processes need to coordinate communication across tools, people, and teams—and to top it off, resolution processes need to operate at the same pace (very fast) as the rest of your business. Whether you're managing DevOps Processes, ITSM, major incidents management, or business continuity, your teams need to move data between tools and systems and deliver information in context to the people who take action. Use [[xMatters Platform|xMatters]] to connect your existing tools and optimize your business processes. ...<br />
<br />
=== 2019b ===<br />
* https://www.crunchbase.com/organization/xmatters<br />
** QUOTE: ... [[xMatters Platform|xMatters]] is an intelligent communications platform that connects insights from any system to the people that matter. <P> The platform automates, operationalizes and contextualizes communications within key DevOps processes, fundamentally altering the way business units work together. [[xMatters Platform|xMatters]] also supports enterprises with major incident and change management, alerting the right people on the right channels to time-sensitive events and problems like network outages, supply-chain disruptions, natural disasters and medical emergencies. ...<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]

= x.ai Corporation =
An [[x.ai Corporation]] is an [[American AI corporation]].<br />
* <B>See:</B> [[Scheduling Assistant]], [[Nuance Communications, Inc.]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2017 ===<br />
* https://x.ai/about/<br />
** QUOTE: Founded in 2014, [[x.ai Corporation|x.ai]] makes an artificial intelligence personal assistant who schedules meetings for you. We’re backed by blue chip investors, including IA Ventures, Firstmark, and Two Sigma Ventures, and located in New York City. <P> We’re a hardcore technology company, developing invisible software. We build our business sustainably through passionate and loyal customers — and every single team member, scientist or not, has a mission of delivering exceptional customer service at all times.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]

= word2vec Model Instance =
A [[word2vec Model Instance]] is a [[continuous dense distributional word vector space model]] produced by a [[word2vec system]].<br />
* <B>AKA:</B> [[word2vec Model Instance|word2vec Word Vector Space Model]], [[word2vec Model Instance|word2vec WVSM]].<br />
* <B>Context:</B><br />
** It can (typically) include a [[word2vec Word Vectorizing Function]] (to create [[word2vec vector]]s).<br />
* <B>Example(s):</B><br />
** the model created by v1 code on the [[20 Newsgroups Corpus]] using settings <code>-cbow 1 -negative 25 -hs 0 -sample 1e-4 -threads 40 -binary 1 -iter 15 -window 8 -size 200</code>.<br />
* <B>Counter-Example(s):</B><br />
** a [[Dependency Parse-based WVSM]].<br />
** a [[Window Baseline-based WVSM]].<br />
* <B>See:</B> [[word2vec Distance Function]], [[word2vec Analogy Function]].<br />
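A minimal sketch of producing a comparable model instance with [[Gensim]]'s implementation, assuming an illustrative corpus file and a rough mapping of the CLI settings above onto Gensim 4.x parameters:<br />
<pre>
from gensim.models import Word2Vec

# One whitespace-tokenized sentence per line; the path is illustrative.
with open("20news_plain.txt", encoding="utf8") as f:
    sentences = [line.split() for line in f]

# Rough Gensim 4.x equivalents of:
#   -cbow 1 -negative 25 -hs 0 -sample 1e-4 -window 8 -size 200 -iter 15
model = Word2Vec(sentences, sg=0, negative=25, hs=0, sample=1e-4,
                 window=8, vector_size=200, epochs=15, workers=4)

model.save("20news.w2v")          # the persisted word2vec model instance
print(model.wv["computer"][:5])   # one learned word2vec vector
</pre>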
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 2017 ===<br />
* ([[2017_CompressingWordEmbeddingsviaDee|Shu & Nakayama, 2017]]) ⇒ [[Raphael Shu]], and [[Hideki Nakayama]]. ([[2017]]). “[https://arxiv.org/pdf/1711.01068 Compressing Word Embeddings via Deep Compositional Code Learning].&rdquo; In: Proceedings of 5th [[International Conference on Learning Representations]] ([[ICLR-2017]]). <br />
<br />
=== 2014 ===<br />
* ([[2014_LookingforHyponymsinVectorSpace|Rei & Briscoe, 2014]]) ⇒ [[Marek Rei]], and [[Ted Briscoe]]. ([[2014]]). “[http://www.aclweb.org/anthology/W14-1608 Looking for Hyponyms in Vector Space].” In: Proceedings of CoNLL-2014. <br />
** QUOTE: The [[Window Baseline-based WVSM|window-based]], [[Dependency Parse-based WVSM|dependency-based]] and [[word2vec Model Instance|word2vec vector set]]s were all trained on 112M words from the [[British National Corpus]], with preprocessing steps for [[lower-casing]] and [[lemmatising]]. </s> Any numbers were grouped and substituted by more generic tokens. </s><br />
<br />
=== 2013 ===<br />
* https://code.google.com/p/word2vec/<br />
** QUOTE: The [[word2vec System|word2vec tool]] takes a [[text corpus]] as input and produces the [[word vectors]] as output. It first constructs a [[vocabulary]] from the [[training text data]] and then [[learns vector representation of words]]. The resulting [[word2vec Model Instance|word vector file]] can be used as features in many [[NLP application|natural language processing]] and [[machine learning application]]s. <P> <br />
<br />
----<br />
__NOTOC__<br />
<br />
[[Category:Concept]]

= word2vec-like System =
A [[word2vec-like System]] is a [[distributional word embedding training system]] that applies a [[word2vec algorithm]] (based on work by [[Tomáš Mikolov]], [[Kai Chen]], [[Greg Corrado]], [[Jeffrey Dean]], et al. [https://code.google.com/p/word2vec/people/list]).<br />
* <B>Context:</B><br />
** It can train a [[word2vec Model Instance]] (that defines a [[word2vec model space]]).<br />
** It can require billions of words to train a good [[Word Embedding]].<br />
** It can have source code available at https://code.google.com/p/word2vec/source/checkout<br />
* <B>Example(s):</B><br />
** the original release http://code.google.com/p/word2vec/<br />
** [[Gensim]]'s https://radimrehurek.com/gensim/models/word2vec.html<br />
* <B>Counter-Example(s):</B><br />
** [[rnnlm System]].<br />
** [[SemanticVectors System]].<br />
** [[GloVe-based System]] (using the [[GloVe algorithm]]).<br />
** [[word2phrase]].<br />
* <B>See:</B> [[Bag-of-Words Representation]], [[Word Context Vector]]s.<br />
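A minimal sketch of querying a trained model in the way the bundled <code>distance</code> tool and the analogy examples quoted below do, here via [[Gensim]]'s <code>KeyedVectors</code> loader (the vector file path is illustrative):<br />
<pre>
from gensim.models import KeyedVectors

# Load vectors written by the original C tool with -binary 1; the path is illustrative.
wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# Nearest neighbours, as the bundled `distance` tool reports them.
print(wv.most_similar("france", topn=5))

# Linear analogy: vec("king") - vec("man") + vec("woman") is close to vec("queen").
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
</pre>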
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 2015 ===<br />
* ([[2015_AutoExtendExtendingWordEmbeddin|Rothe & Schütze, 2015]]) ⇒ [[Sascha Rothe]], and [[Hinrich Schütze]]. ([[2015]]). “[http://arxiv.org/pdf/1507.01127.pdf AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes].” In: arXiv preprint arXiv:1507.01127. <br />
** QUOTE: ... [[Unsupervised methods for word embeddings]] (also called “[[distributed word representation]]s”) have become popular in [[natural language processing (NLP)]]. </s> [[These method]]s only need [[very large corpora]] as input to create [[sparse representation]]s (e.g., based on [[local collocation]]s) and project them into a [[lower dimensional dense vector space]]. </s> Examples for [[word embedding]]s are [[SENNA]] ([[Collobert and Weston, 2008]]), the [[hierarchical log-bilinear model]] ([[Mnih and Hinton, 2009]]), [[word2vec-like System|word2vec]] ([[Mikolov et al., 2013c]]) and [[GloVe]] ([[Pennington et al., 2014]]). </s><br />
<br />
=== 2014a ===<br />
* Dec-23-2014 http://radimrehurek.com/2014/12/making-sense-of-word2vec/<br />
** QUOTE: [[Tomáš Mikolov]] (together with his colleagues at Google) ... releasing [[word2vec-like System|word2vec]], an [[unsupervised algorithm]] for [[learning the meaning behind words]]. ... <P> ... Using [[large amounts of unannotated plain text]], [[word2vec-like System|word2vec]] [[learns relationships between words automatically]]. The output are [[vector]]s, one [[Word Vector|vector per word]], with remarkable [[linear relationship]]s that allow us to do things like vec(“king”) – vec(“man”) + vec(“woman”) =~ vec(“queen”), or vec(“Montreal Canadiens”) – vec(“Montreal”) + vec(“Toronto”) [[resemble]]s the [[word vector|vector]] for “Toronto Maple Leafs”. ... <P> ... Basically, where [[GloVe]] precomputes the large [[word x word co-occurrence matrix]] in memory and then quickly factorizes it, [[word2vec-like System|word2vec]] sweeps through the sentences in an online fashion, handling each co-occurrence separately. So, there is a tradeoff between taking more memory ([[GloVe]]) vs. taking longer to train ([[word2vec-like System|word2vec]]). Also, once computed, [[GloVe]] can re-use the [[word-word co-occurrence matrix|co-occurrence matrix]] to quickly factorize with any dimensionality, whereas [[word2vec-like System|word2vec]] has to be trained from scratch after changing its [[embedding dimensionality]]. <br />
<br />
=== 2014b ===<br />
* ([[Rei & Briscoe, 2014]]) ⇒ [[Marek Rei]], and [[Ted Briscoe]]. ([[2014]]). “[http://www.aclweb.org/anthology/W14-1608 Looking for Hyponyms in Vector Space].” In: Proceedings of CoNLL-2014. <br />
** QUOTE: [[word2vec-like System|Word2vec]]: [[We]] created word representations using the [[word2vec-like System|word2vec toolkit]]<ref>https://code.google.com/p/word2vec/</ref>. </s> The tool is based on a [[feedforward neural network language model]], with modifications to make [[representation learning]] more efficient ([[Mikolov et al., 2013a]]). </s> [[We]] make use of the [[skip-gram model]], which takes each [[word in a sequence]] as an input to a [[log-linear classifier]] with a [[continuous projection layer]], and [[predicts word]]s within a [[text window|certain range before and after the input word]]. </s> The [[text window size|window size]] was set to 5 and [[vector]]s were trained with both [[100]] and 500 dimensions. </s><br />
<br />
=== 2013 ===<br />
* https://code.google.com/p/word2vec/<br />
** [[word2vec-like System|This tool]] provides an efficient implementation of the [[continuous bag-of-words]] and [[skip-gram architecture]]s for computing [[vector representations of words]]. These representations can be subsequently used in many [[natural language processing application]]s and for further research. <P> ... <P> The [[word2vec-like System|word2vec tool]] takes a [[text corpus]] as input and produces the [[word vectors]] as output. It first constructs a [[vocabulary]] from the [[training text data]] and then [[learns vector representation of words]]. The resulting [[word2vec Model|word vector file]] can be used as features in many [[NLP application|natural language processing]] and [[machine learning application]]s. <P> A simple way to investigate the learned representations is to find the closest [[word]]s for a [[user-specified]] [[word]]. The [[word2vec distance|distance tool]] serves that purpose. For example, if you enter 'france', distance will display the most similar words and their distances to 'france', which should look like ... <br />
<references/><br />
<br />
=== 2013b ===<br />
* ([[Mikolov et al., 2013a]]) ⇒ [[Tomáš Mikolov]], Kai Chen, Greg Corrado, and [[Jeffrey Dean]]. ([[2013]]). “[http://arxiv.org/pdf/1301.3781 Efficient Estimation of Word Representations in Vector Space].” In: Proceedings of International Conference of Learning Representations Workshop.<br />
<br />
=== 2013a ===<br />
* ([[2013_LinguisticRegularitiesinContinu|Mikolov et al., 2013b]]) ⇒ [[Tomáš Mikolov]], [[Wen-tau Yih]], and [[Geoffrey Zweig]]. ([[2013]]). “[http://www.aclweb.org/anthology/N13-1#page=784 Linguistic Regularities in Continuous Space Word Representations.].” In: HLT-NAACL.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]

= wh-Question =
A [[wh-Question]] is a [[question]] that uses an [[interrogative word]] to specify the [[desired information]].<br />
* <B>AKA:</B> [[wh-Question|Non-Polar Question]].<br />
* <B>See:</B> [[Information]], [[Answer]], [[Rhetorical Question]], [[Grammar]], [[Sentence (Linguistics)]], [[Inversion (Grammar)]], [[Interrogative]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2014 ===<br />
* (Wikipedia, 2014) &rArr; http://en.wikipedia.org/wiki/Question#wh Retrieved:2014-7-11.<br />
** ... The other main type of question (other than yes–no questions) is those called '''''wh''-questions''' (or ''non-polar questions''). These use [[interrogative word]]s (''wh''-words) such as ''when'', ''which'', ''who'', ''how'', etc. to specify the information that is desired. (In some languages the formation of such questions may involve [[wh-movement|''wh''-movement]] – see the section below for grammatical description.) The name derives from the fact that most of the English interrogative words (with the exception of ''how'') begin with the letters ''wh''. These are the types of question sometimes referred to in journalism and other investigative contexts as the [[Five Ws]].<br />
<br />
----<br />
[[Category:Concept]]<br />
__NOTOC__

= von Neumann-Morgenstern Utility Function =
A [[von Neumann-Morgenstern Utility Function]] is a [[decision utility function]] that satisfies the four von Neumann–Morgenstern rationality axioms (completeness, transitivity, continuity, and independence).<br />
* <B>AKA:</B> [[von Neumann-Morgenstern Utility Function|vNM Utility]].<br />
* <B>See:</B> [[Hedonic Utility Function]], [[Exponential Utility Function]], [[Learning Cost Function]], [[von Neumann–Morgenstern Utility Theorem]].<br />
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 1979 ===<br />
* ([[Morgenstern, 1979]]) ⇒ [[Oskar Morgenstern]]. (1979). “Some Reflections on Utility.” In: Theory and Decision Library, 21.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]

= vim Text Editor =
A [[vim Text Editor]] is a [[charityware]] [[CLI-based]] [[text file editor]] that extends the [[vi Editor]].<br />
* <B>AKA:</B> [[vim Text Editor|Vi Improved Editor]].<br />
* <B>Context:</B><br />
** It can be configured with [[Vim Setting]]s (that can be stored in a [[.vimrc file]]).<br />
* <B>Example(s):</B><br />
** [[vim 7.3]].<br />
* <B>Counter-Example(s):</B><br />
** [[Emacs]].<br />
** [[pico]].<br />
** [[UltraEdit Text Editor]].<br />
* <B>See:</B> [[Text File Processor]].<br />
----<br />
----<br />
== References ==<br />
* http://www.vim.org/<br />
<br />
=== 2013 ===<br />
* http://en.wikipedia.org/wiki/Vim_%28text_editor%29<br />
** '''Vim''' is a [[text editor]] written by [[Bram Moolenaar]] and first released publicly in 1991. Based on the [[vi]] editor common to [[Unix-like]] systems, Vim is designed for use both from a [[command line interface]] and as a standalone application in a [[graphical user interface]]. Vim is [[free and open source software]] and is released under a license that includes some [[charityware]] clauses, encouraging users who enjoy the software to consider donating to children in [[Uganda]].<ref>[http://vimdoc.sourceforge.net/htmldoc/uganda.html#license Vim documentation: uganda]</ref> The license is compatible with the [[GNU General Public License]]. <P> Although Vim was originally released for the [[Amiga]], Vim has since been developed to be [[cross-platform]], supporting [[#Availability|many other platforms]]. In 2006, it was voted the most popular editor amongst [[Linux Journal]] readers.<ref>{{cite web|url=http://www.linuxjournal.com/article/7029#N0x850ca10.0x85cf4c4|title=Linux Journal: 2003 Readers' Choice Awards|accessdate=2006-05-24|date=2003-11-01}}; {{cite web|url=http://www.linuxjournal.com/article/7724#N0x850cd80.0x85d3e3c|title=Linux Journal: 2004 Readers' Choice Awards|accessdate=2006-05-24|date=2004-11-01}}; {{cite web|url=http://www.linuxjournal.com/article/8520#N0x850cd80.0x87983bc|title=Linux Journal: 2005 Readers' Choice Awards|accessdate=2006-05-24|date=2005-09-28}}</ref><br />
<references/><br />
<BR><br />
* http://www.vim.org/about.php<br />
** [[vim Text Editor|Vim]] is an advanced [[text editor]] that seeks to provide the power of the de-facto [[Unix editor]] '[[Vi]]', with a more complete feature set. It's useful whether you're already using vi or using a different editor. Users of [[Vim 5]] and [[Vim 6|6]] should consider upgrading to [[Vim 7]].<br />
** [[vim Text Editor|Vim]] isn't an editor designed to hold its users' hands. It is a tool, the use of which must be learned. <P> Vim isn't a [[word processor]]. Although it can display text with various forms of [[highlighting]] and [[formatting]], it isn't there to provide [[WYSIWYG editing]] of [[typeset document]]s. (It is great for editing [[TeX Document|TeX]], though.) <br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]

= tf-idf Vector Distance Function =
A [[tf-idf Vector Distance Function]] is a [[cosine distance function]] between [[TF-IDF vector]]s (based on [[relative term frequency]] and [[inverse document frequency]]).<br />
* <B>Context</B>:<br />
** <B>[[Function Domain|domain]]:</B> 2 [[tf-idf Vector]]s; and an [[IDF Model]] (from the same [[multiset set]]).<br />
** <B>[[Function Range|range]]:</B> a [[Distance Score]].<br />
** It can be calculated as the [[cosine distance]] between two vectors whose components are weighted as <math>\text{tf-idf}(t,d,D) = \text{tf}(t,d) \times \text{idf}(t,D)</math> (see the worked sketch below).<br />
** It can (often) be used as:<br />
*** a [[String Distance Function]], by representing each [[string]] and the underlying [[Base Corpus]] as [[Multiset]]s (however, it cannot handle the [[Word Semantic Challenge]]).<br />
*** a [[Document Distance Function]], by mapping each [[Document]] and the underlying [[Base Corpus]] as [[Multiset]]s.<br />
*** an [[Information Retrieval Ranking Function]] to compare [[Document]] similarity and distance to a [[Keyword Query]].<br />
*** a [[TF-IDF Ranking Function]].<br />
* <B>Example(s):</B> <br />
** [[tf-idf Distance]]({a,b},{b,a},C) = 0<br />
** [[tf-idf Distance]]({a,b},{c,d},C) = 1<br />
** IF [[TF]](a)=0.5, THEN [[tf-idf Vector Distance Function|TFIDF Distance]]({a,a,b},{a,b,b})= ???, because [[IDF]](a)= ???<br />
* <B>Counter-Example(s):</B> <br />
** [[Jaccard Distance Function]].<br />
** [[MI-based Distance Function]].<br />
* <B>See:</B> [[Term Vector Space Model]]; [[Stop-Words]].<br />
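A minimal sketch of the function over a tiny corpus (the un-smoothed <math>\mathrm{idf}</math> and raw term counts are illustrative choices; implementations vary, as the 2010 reference below notes):<br />
<pre>
import math
from collections import Counter

def tfidf_vector(doc_tokens, idf):
    """Map a token multiset to a tf-idf weighted sparse vector."""
    tf = Counter(doc_tokens)
    return {t: tf[t] * idf.get(t, 0.0) for t in tf}

def tfidf_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return 1.0 - dot / (norm_u * norm_v) if norm_u and norm_v else 1.0

# Tiny illustrative corpus C; document frequencies define the IDF model.
corpus = [["a", "b"], ["b", "a"], ["c", "d"], ["a", "c"]]
n_docs = len(corpus)
df = Counter(t for doc in corpus for t in set(doc))
idf = {t: math.log(n_docs / df[t]) for t in df}

vecs = [tfidf_vector(doc, idf) for doc in corpus]
print(tfidf_distance(vecs[0], vecs[1]))   # {a,b} vs {b,a}: ~0.0
print(tfidf_distance(vecs[0], vecs[2]))   # {a,b} vs {c,d}: 1.0
</pre>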
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 2015 ===<br />
* (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/Tf–idf Retrieved:2015-2-21.<br />
** '''tf–idf''', short for '''term frequency–inverse document frequency''', is a numerical statistic that is intended to reflect how important a word is to a [[document]] in a collection or [[Text corpus|corpus]]. It is often used as a weighting factor in [[information retrieval]] and [[text mining]]. <P> The tf-idf value increases [[Proportionality (mathematics)|proportionally]] to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to adjust for the fact that some words appear more frequently in general. <P> Variations of the tf–idf weighting scheme are often used by [[search engine]]s as a central tool in scoring and ranking a document's [[Relevance (information retrieval)|relevance]] given a user [[Information retrieval|query]]. tf–idf can be successfully used for [[stop-words]] filtering in various subject fields including [[automatic summarization|text summarization]] and classification. <P> One of the simplest [[ranking function]]s is computed by summing the tf–idf for each query term; many more sophisticated ranking functions are variants of this simple model.<br />
<br />
=== 2012 ===<br />
* http://en.wikipedia.org/wiki/Tf*idf<br />
** QUOTE: The '''tf*idf''' weight ([[tf-idf Vector Distance Function|term frequency–inverse document frequency]]) is a numerical statistic which reflects how important a word is to a [[document]] in a collection or [[Text corpus|corpus]]. It is often used as a weighting factor in [[information retrieval]] and [[text mining]]. The tf-idf value increases [[Proportionality (mathematics)|proportionally]] to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to control for the fact that some words are generally more common than others. <P> Variations of the [[tf*idf weighting scheme]] are often used by [[search engine]]s as a central tool in scoring and ranking a document's [[Relevance (information retrieval)|relevance]] given a user [[Information retrieval|query]]. tf*idf can be successfully used for [[stop-words]] filtering in various subject fields including [[automatic summarization|text summarization]] and classification.<ref>[http://vetsky.narod2.ru/catalog/tfidf_ranker/ TF*IDF Ranker]</ref> <P> One of the simplest [[ranking function]]s is computed by summing the tf*idf for each query term; many more sophisticated ranking functions are variants of this simple model.<br />
<references/><br />
<br />
=== 2011 ===<br />
* ([[Sammut & Webb, 2011]]) ⇒ [[Claude Sammut]], and [[Geoffrey I. Webb]]. ([[2011]]). “TF-IDF.” In: ([[Sammut & Webb, 2011]]) p.986<br />
<br />
=== 2010 ===<br />
* http://alias-i.com/lingpipe/docs/api/com/aliasi/spell/TfIdfDistance.html<br />
** QUOTE: Note that there are a range of different distances called "TF/IDF" distance. The one in this class is defined to be symmetric, unlike typical TF/IDF distances defined for information retrieval. It scales inverse-document frequencies by logs, and both inverse-document frequencies and term frequencies by square roots. This causes the influence of IDF to grow logarithmically, and term frequency comparison to grow linearly. <P> Suppose we have a collection <code>docs</code> of <code>n</code> strings, which we will call documents in keeping with tradition. Further let <code>df(t,docs)</code> be the document frequency of token <code>t</code>, that is, the number of documents in which the token <code>t</code> appears. Then the inverse document frequency (IDF) of <code>t</code> is defined by: <code>idf(t,docs) = sqrt(log(n/df(t,docs)))</code>. <P> If the document frequency <code>df(t,docs)</code> of a term is zero, then <code>idf(t,docs)</code> is set to zero. As a result, only terms that appeared in at least one training document are used during comparison. <p>The term vector for a string is then defined by its term frequencies. If <code>count(t,cs)</code> is the count of term <code>t</code> in character sequence <code>cs</code>, then the [[term frequency (TF)]] is defined by: <code>tf(t,cs) = sqrt(count(t,cs)) </code>. The term-frequency/inverse-document frequency (TF/IDF) vector <code>tfIdf(cs,docs)</code> for a character sequence <code>cs</code> over a [[collection of documents]] <code>ds</code> has a value <code>tfIdf(cs,docs)(t)</code> for term <code>t</code> defined by: <code>tfIdf(cs,docs)(t) = tf(t,cs) * idf(t,docs)</code> <p>The proximity between character sequences <code>cs1</code> and <code>cs2</code> is defined as the cosine of their TF/IDF vectors: <blockquote><code> dist(cs1,cs2) = 1 - cosine(tfIdf(cs1,docs),tfIdf(cs2,docs)) </code></blockquote> <p>Recall that the cosine of two [[vector]]s is the [[dot product]] of the [[vector]]s divided by their lengths: <blockquote><code> cos(x,y) = x <sup>.</sup> y / (|x| * |y| ) </code></blockquote> where dot products are defined by: <blockquote><code> x <sup>.</sup> y = <big>Σ</big><sub>i</sub> x[i] * y[i] </code></blockquote> and length is defined by: <blockquote><code> |x| = sqrt(x <sup>.</sup> x) </code></blockquote> <p>Distance is then just 1 minus the proximity value. <P> <blockquote>distance(cs1,cs2) = 1 - proximity(cs1,cs2)</blockquote><br />
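As a follow-up to the quoted documentation, here is a hedged Python sketch of that dampened TF/IDF proximity (square-root term frequencies, square-root-log IDF, cosine proximity, distance = 1 - proximity). The function and variable names are illustrative and are not LingPipe's actual API.<br />
<pre>
import math
from collections import Counter

def dampened_tfidf_distance(cs1, cs2, docs):
    # docs: list of token lists used to estimate document frequencies
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.sqrt(math.log(n / df[t])) for t in df}  # idf(t,docs) = sqrt(log(n/df))

    def vector(tokens):
        # tf(t,cs) = sqrt(count); terms with zero document frequency get weight 0
        return {t: math.sqrt(c) * idf.get(t, 0.0) for t, c in Counter(tokens).items()}

    v1, v2 = vector(cs1), vector(cs2)
    dot = sum(v1[t] * v2[t] for t in v1.keys() & v2.keys())
    len1 = math.sqrt(sum(w * w for w in v1.values()))
    len2 = math.sqrt(sum(w * w for w in v2.values()))
    proximity = dot / (len1 * len2) if len1 and len2 else 0.0
    return 1.0 - proximity
</pre>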
<br />
=== 2009 ===<br />
* http://alias-i.com/lingpipe/demos/tutorial/stringCompare/read-me.html<br />
** [[TF/IDF Distance]] LingPipe implements a second kind of token-based distance in the class spell.TfIdfDistance. By varying tokenizers, different behaviors may be had with the same underlying implementation. [[tf-idf Vector Distance Function|TF/IDF distance]] is based on vector similarity (using the cosine measure of angular similarity) over dampened and discriminatively weighted term frequencies. The basic idea is that two strings are more similar if they contain many of the same tokens with the same relative number of occurrences of each. Tokens are weighted more heavily if they occur in few documents. See the class documentation for a full definition of [[tf-idf Vector Distance Function|TF/IDF distance]]. <br />
<br />
=== 2003 ===<br />
* ([[2003_AComparisonOfStringEditDistMetrics|Cohen et al., 2003]]) ⇒ [[William W. Cohen]], Pradeep Ravikumar, and [[Stephen E. Fienberg]]. ([[2003]]). “[http://secondstring.sourceforge.net/doc/iiweb03.pdf A Comparison of String Distance Metrics for Name-Matching Tasks].” In: Workshop on Information Integration on the Web (IIWeb-03).<br />
** Two [[symbol string|string]]s <math>s</math> and <math>t</math> can also be considered as [[Symbol Multiset|multisets (or bags) of words (or tokens)]]. [[We]] also considered several [[token-based distance metric]]s. The [[Jaccard similarity]] between the word sets S and T is simply <math>\frac{|S \cap T|}{|S \cup T|}</math>. [[tf-idf Vector Distance Function|TFIDF or cosine similarity]], which is widely used in the information retrieval community [...] [[Affine edit-distance function]]s assign a relatively lower cost to a sequence of insertions or deletions.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=tf-idf_Vector_Distance_Function&diff=548275tf-idf Vector Distance Function2019-12-23T20:45:30Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[tf-idf Vector Distance Function]] is a [[cosine distance function]] between [[TF-IDF vector]]s (based on [[relative term frequency]] and [[inverse document frequency]]).<br />
* <B>Context</B>:<br />
** <B>[[Function Domain|domain]]:</B> 2 [[tf-idf Vector]]s; and an [[IDF Model]] (from the same [[multiset set]]).<br />
** <B>[[Function Range|range]]:</B> a [[Distance Score]].<br />
** It can be computed as the [[cosine distance function|cosine distance]] between vectors whose components are tf-idf weights, <math>\operatorname{tf-idf}(t,d,D) = \operatorname{tf}(t,d) \times \operatorname{idf}(t,D)</math>.<br />
** It can (often) be used as:<br />
*** a [[String Distance Function]], by mapping each [[string]] and the underlying [[Base Corpus]] to [[Multiset]]s (however, it cannot handle the [[Word Semantic Challenge]]).<br />
*** a [[Document Distance Function]], by mapping each [[Document]] and the underlying [[Base Corpus]] to [[Multiset]]s.<br />
*** an [[Information Retrieval Ranking Function]], to compare [[Document]] similarity and distance to a [[Keyword Query]].<br />
*** a [[TF-IDF Ranking Function]].<br />
* <B>Example(s):</B> <br />
** [[tf-idf Distance]]({a,b},{b,a},C) = 0<br />
** [[tf-idf Distance]]({a,b},{c,d},C) = 1<br />
** IF [[TF]](a)=0.5, THEN [[tf-idf Vector Distance Function|TFIDF Distance]]({a,a,b},{a,b,b})= ???, because [[IDF]](a)= ???<br />
* <B>Counter-Example(s):</B> <br />
** [[Jaccard Distance Function]].<br />
** [[MI-based Distance Function]].<br />
* <B>See:</B> [[Term Vector Space Model]]; [[Stop-Words]].<br />
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 2015 ===<br />
* (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/Tf–idf Retrieved:2015-2-21.<br />
** '''tf–idf''', short for '''term frequency–inverse document frequency''', is a numerical statistic that is intended to reflect how important a word is to a [[document]] in a collection or [[Text corpus|corpus]]. It is often used as a weighting factor in [[information retrieval]] and [[text mining]]. <P> The tf-idf value increases [[Proportionality (mathematics)|proportionally]] to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to adjust for the fact that some words appear more frequently in general. <P> Variations of the tf–idf weighting scheme are often used by [[search engine]]s as a central tool in scoring and ranking a document's [[Relevance (information retrieval)|relevance]] given a user [[Information retrieval|query]]. tf–idf can be successfully used for [[stop-words]] filtering in various subject fields including [[automatic summarization|text summarization]] and classification. <P> One of the simplest [[ranking function]]s is computed by summing the tf–idf for each query term; many more sophisticated ranking functions are variants of this simple model.<br />
<br />
=== 2012 ===<br />
* http://en.wikipedia.org/wiki/Tf*idf<br />
** QUOTE: The '''tf*idf''' weight ([[term frequency–inverse document frequency]]) is a numerical statistic which reflects how important a word is to a [[document]] in a collection or [[Text corpus|corpus]]. It is often used as a weighting factor in [[information retrieval]] and [[text mining]]. The tf-idf value increases [[Proportionality (mathematics)|proportionally]] to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to control for the fact that some words are generally more common than others. <P> Variations of the [[tf*idf weighting scheme]] are often used by [[search engine]]s as a central tool in scoring and ranking a document's [[Relevance (information retrieval)|relevance]] given a user [[Information retrieval|query]]. tf*idf can be successfully used for [[stop-words]] filtering in various subject fields including [[automatic summarization|text summarization]] and classification.<ref>[http://vetsky.narod2.ru/catalog/tfidf_ranker/ TF*IDF Ranker]</ref> <P> One of the simplest [[ranking function]]s is computed by summing the tf*idf for each query term; many more sophisticated ranking functions are variants of this simple model.<br />
<references/><br />
<br />
=== 2011 ===<br />
* ([[Sammut & Webb, 2011]]) ⇒ [[Claude Sammut]], and [[Geoffrey I. Webb]]. ([[2011]]). “TF-IDF.” In: ([[Sammut & Webb, 2011]]) p.986<br />
<br />
=== 2010 ===<br />
* http://alias-i.com/lingpipe/docs/api/com/aliasi/spell/TfIdfDistance.html<br />
** QUOTE: Note that there are a range of different distances called "TF/IDF" distance. The one in this class is defined to be symmetric, unlike typical TF/IDF distances defined for information retrieval. It scales inverse-document frequencies by logs, and both inverse-document frequencies and term frequencies by square roots. This causes the influence of IDF to grow logarithmically, and term frequency comparison to grow linearly. <P> Suppose we have a collection <code>docs</code> of <code>n</code> strings, which we will call documents in keeping with tradition. Further let <code>df(t,docs)</code> be the document frequency of token <code>t</code>, that is, the number of documents in which the token <code>t</code> appears. Then the inverse document frequency (IDF) of <code>t</code> is defined by: <code>idf(t,docs) = sqrt(log(n/df(t,docs)))</code>. <P> If the document frequency <code>df(t,docs)</code> of a term is zero, then <code>idf(t,docs)</code> is set to zero. As a result, only terms that appeared in at least one training document are used during comparison. <p>The term vector for a string is then defined by its term frequencies. If <code>count(t,cs)</code> is the count of term <code>t</code> in character sequence <code>cs</code>, then the [[term frequency (TF)]] is defined by: <code>tf(t,cs) = sqrt(count(t,cs)) </code>. The term-frequency/inverse-document frequency (TF/IDF) vector <code>tfIdf(cs,docs)</code> for a character sequence <code>cs</code> over a [[collection of documents]] <code>ds</code> has a value <code>tfIdf(cs,docs)(t)</code> for term <code>t</code> defined by: <code>tfIdf(cs,docs)(t) = tf(t,cs) * idf(t,docs)</code> <p>The proximity between character sequences <code>cs1</code> and <code>cs2</code> is defined as the cosine of their TF/IDF vectors: <blockquote><code> dist(cs1,cs2) = 1 - cosine(tfIdf(cs1,docs),tfIdf(cs2,docs)) </code></blockquote> <p>Recall that the cosine of two [[vector]]s is the [[dot product]] of the [[vector]]s divided by their lengths: <blockquote><code> cos(x,y) = x <sup>.</sup> y / (|x| * |y| ) </code></blockquote> where dot products are defined by: <blockquote><code> x <sup>.</sup> y = <big>Σ</big><sub>i</sub> x[i] * y[i] </code></blockquote> and length is defined by: <blockquote><code> |x| = sqrt(x <sup>.</sup> x) </code></blockquote> <p>Distance is then just 1 minus the proximity value. <P> <blockquote>distance(cs1,cs2) = 1 - proximity(cs1,cs2)</blockquote><br />
<br />
=== 2009 ===<br />
* http://alias-i.com/lingpipe/demos/tutorial/stringCompare/read-me.html<br />
** [[TF/IDF Distance]] LingPipe implements a second kind of token-based distance in the class spell.TfIdfDistance. By varying tokenizers, different behaviors may be had with the same underlying implementation. [[tf-idf Vector Distance Function|TF/IDF distance]] is based on vector similarity (using the cosine measure of angular similarity) over dampened and discriminatively weighted term frequencies. The basic idea is that two strings are more similar if they contain many of the same tokens with the same relative number of occurrences of each. Tokens are weighted more heavily if they occur in few documents. See the class documentation for a full definition of [[tf-idf Vector Distance Function|TF/IDF distance]]. <br />
<br />
=== 2003 ===<br />
* ([[2003_AComparisonOfStringEditDistMetrics|Cohen et al., 2003]]) ⇒ [[William W. Cohen]], Pradeep Ravikumar, and [[Stephen E. Fienberg]]. ([[2003]]). “[http://secondstring.sourceforge.net/doc/iiweb03.pdf A Comparison of String Distance Metrics for Name-Matching Tasks].” In: Workshop on Information Integration on the Web (IIWeb-03).<br />
** Two [[symbol string|string]]s <math>s</math> and <math>t</math> can also be considered as [[Symbol Multiset|multisets (or bags) of words (or tokens)]]. [[We]] also considered several [[token-based distance metric]]s. The [[Jaccard similarity]] between the word sets S and T is simply <math>\frac{|S \cap T|}{|S \cup T|}</math>. [[tf-idf Vector Distance Function|TFIDF or cosine similarity]], which is widely used in the information retrieval community [...] [[Affine edit-distance function]]s assign a relatively lower cost to a sequence of insertions or deletions.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=tf-idf_Vector_Distance_Function&diff=548274tf-idf Vector Distance Function2019-12-23T20:45:30Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[tf-idf Vector Distance Function]] is a [[cosine distance function]] between [[TF-IDF vector]]s (based on [[relative term frequency]] and [[inverse document frequency]]).<br />
* <B>Context</B>:<br />
** <B>[[Function Domain|domain]]:</B> 2 [[tf-idf Vector]]s; and an [[IDF Model]] (from the same [[multiset set]]).<br />
** <B>[[Function Range|range]]:</B> a [[Distance Score]].<br />
** It can be computed as the [[cosine distance function|cosine distance]] between vectors whose components are tf-idf weights, <math>\operatorname{tf-idf}(t,d,D) = \operatorname{tf}(t,d) \times \operatorname{idf}(t,D)</math>.<br />
** It can (often) be used as:<br />
*** a [[String Distance Function]], by mapping each [[string]] and the underlying [[Base Corpus]] to [[Multiset]]s (however, it cannot handle the [[Word Semantic Challenge]]).<br />
*** a [[Document Distance Function]], by mapping each [[Document]] and the underlying [[Base Corpus]] to [[Multiset]]s.<br />
*** an [[Information Retrieval Ranking Function]], to compare [[Document]] similarity and distance to a [[Keyword Query]].<br />
*** a [[TF-IDF Ranking Function]].<br />
* <B>Example(s):</B> <br />
** [[tf-idf Distance]]({a,b},{b,a},C) = 0<br />
** [[tf-idf Distance]]({a,b},{c,d},C) = 1<br />
** IF [[TF]](a)=0.5, THEN [[TFIDF Distance]]({a,a,b},{a,b,b})= ???, because [[IDF]](a)= ???<br />
* <B>Counter-Example(s):</B> <br />
** [[Jaccard Distance Function]].<br />
** [[MI-based Distance Function]].<br />
* <B>See:</B> [[Term Vector Space Model]]; [[Stop-Words]].<br />
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 2015 ===<br />
* (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/Tf–idf Retrieved:2015-2-21.<br />
** '''tf–idf''', short for '''term frequency–inverse document frequency''', is a numerical statistic that is intended to reflect how important a word is to a [[document]] in a collection or [[Text corpus|corpus]]. It is often used as a weighting factor in [[information retrieval]] and [[text mining]]. <P> The tf-idf value increases [[Proportionality (mathematics)|proportionally]] to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to adjust for the fact that some words appear more frequently in general. <P> Variations of the tf–idf weighting scheme are often used by [[search engine]]s as a central tool in scoring and ranking a document's [[Relevance (information retrieval)|relevance]] given a user [[Information retrieval|query]]. tf–idf can be successfully used for [[stop-words]] filtering in various subject fields including [[automatic summarization|text summarization]] and classification. <P> One of the simplest [[ranking function]]s is computed by summing the tf–idf for each query term; many more sophisticated ranking functions are variants of this simple model.<br />
<br />
=== 2012 ===<br />
* http://en.wikipedia.org/wiki/Tf*idf<br />
** QUOTE: The '''tf*idf''' weight ([[term frequency–inverse document frequency]]) is a numerical statistic which reflects how important a word is to a [[document]] in a collection or [[Text corpus|corpus]]. It is often used as a weighting factor in [[information retrieval]] and [[text mining]]. The tf-idf value increases [[Proportionality (mathematics)|proportionally]] to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to control for the fact that some words are generally more common than others. <P> Variations of the [[tf*idf weighting scheme]] are often used by [[search engine]]s as a central tool in scoring and ranking a document's [[Relevance (information retrieval)|relevance]] given a user [[Information retrieval|query]]. tf*idf can be successfully used for [[stop-words]] filtering in various subject fields including [[automatic summarization|text summarization]] and classification.<ref>[http://vetsky.narod2.ru/catalog/tfidf_ranker/ TF*IDF Ranker]</ref> <P> One of the simplest [[ranking function]]s is computed by summing the tf*idf for each query term; many more sophisticated ranking functions are variants of this simple model.<br />
<references/><br />
<br />
=== 2011 ===<br />
* ([[Sammut & Webb, 2011]]) ⇒ [[Claude Sammut]], and [[Geoffrey I. Webb]]. ([[2011]]). “TF-IDF.” In: ([[Sammut & Webb, 2011]]) p.986<br />
<br />
=== 2010 ===<br />
* http://alias-i.com/lingpipe/docs/api/com/aliasi/spell/TfIdfDistance.html<br />
** QUOTE: Note that there are a range of different distances called "TF/IDF" distance. The one in this class is defined to be symmetric, unlike typical TF/IDF distances defined for information retrieval. It scales inverse-document frequencies by logs, and both inverse-document frequencies and term frequencies by square roots. This causes the influence of IDF to grow logarithmically, and term frequency comparison to grow linearly. <P> Suppose we have a collection <code>docs</code> of <code>n</code> strings, which we will call documents in keeping with tradition. Further let <code>df(t,docs)</code> be the document frequency of token <code>t</code>, that is, the number of documents in which the token <code>t</code> appears. Then the inverse document frequency (IDF) of <code>t</code> is defined by: <code>idf(t,docs) = sqrt(log(n/df(t,docs)))</code>. <P> If the document frequency <code>df(t,docs)</code> of a term is zero, then <code>idf(t,docs)</code> is set to zero. As a result, only terms that appeared in at least one training document are used during comparison. <p>The term vector for a string is then defined by its term frequencies. If <code>count(t,cs)</code> is the count of term <code>t</code> in character sequence <code>cs</code>, then the [[term frequency (TF)]] is defined by: <code>tf(t,cs) = sqrt(count(t,cs)) </code>. The term-frequency/inverse-document frequency (TF/IDF) vector <code>tfIdf(cs,docs)</code> for a character sequence <code>cs</code> over a [[collection of documents]] <code>ds</code> has a value <code>tfIdf(cs,docs)(t)</code> for term <code>t</code> defined by: <code>tfIdf(cs,docs)(t) = tf(t,cs) * idf(t,docs)</code> <p>The proximity between character sequences <code>cs1</code> and <code>cs2</code> is defined as the cosine of their TF/IDF vectors: <blockquote><code> dist(cs1,cs2) = 1 - cosine(tfIdf(cs1,docs),tfIdf(cs2,docs)) </code></blockquote> <p>Recall that the cosine of two [[vector]]s is the [[dot product]] of the [[vector]]s divided by their lengths: <blockquote><code> cos(x,y) = x <sup>.</sup> y / (|x| * |y| ) </code></blockquote> where dot products are defined by: <blockquote><code> x <sup>.</sup> y = <big>Σ</big><sub>i</sub> x[i] * y[i] </code></blockquote> and length is defined by: <blockquote><code> |x| = sqrt(x <sup>.</sup> x) </code></blockquote> <p>Distance is then just 1 minus the proximity value. <P> <blockquote>distance(cs1,cs2) = 1 - proximity(cs1,cs2)</blockquote><br />
<br />
=== 2009 ===<br />
* http://alias-i.com/lingpipe/demos/tutorial/stringCompare/read-me.html<br />
** [[TF/IDF Distance]] LingPipe implements a second kind of token-based distance in the class spell.TfIdfDistance. By varying tokenizers, different behaviors may be had with the same underlying implementation. [[tf-idf Vector Distance Function|TF/IDF distance]] is based on vector similarity (using the cosine measure of angular similarity) over dampened and discriminatively weighted term frequencies. The basic idea is that two strings are more similar if they contain many of the same tokens with the same relative number of occurrences of each. Tokens are weighted more heavily if they occur in few documents. See the class documentation for a full definition of [[tf-idf Vector Distance Function|TF/IDF distance]]. <br />
<br />
=== 2003 ===<br />
* ([[2003_AComparisonOfStringEditDistMetrics|Cohen et al., 2003]]) ⇒ [[William W. Cohen]], Pradeep Ravikumar, and [[Stephen E. Fienberg]]. ([[2003]]). “[http://secondstring.sourceforge.net/doc/iiweb03.pdf A Comparison of String Distance Metrics for Name-Matching Tasks].” In: Workshop on Information Integration on the Web (IIWeb-03).<br />
** Two [[symbol string|string]]s <math>s</math> and <math>t</math> can also be considered as [[Symbol Multiset|multisets (or bags) of words (or tokens)]]. [[We]] also considered several [[token-based distance metric]]s. The [[Jaccard similarity]] between the word sets S and T is simply <math>\frac{|S \cap T|}{|S \cup T|}</math>. [[tf-idf Vector Distance Function|TFIDF or cosine similarity]], which is widely used in the information retrieval community [...] [[Affine edit-distance function]]s assign a relatively lower cost to a sequence of insertions or deletions.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=tf-idf_Scoring_Function&diff=548273tf-idf Scoring Function2019-12-23T20:45:30Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[tf-idf Scoring Function]] is a [[scoring function for a vocabulary member relative to a multiset]] based on the [[multiplication]] of a [[tf measure]] and an [[idf measure]], <math>\operatorname{tf}() \times \operatorname{idf}()</math>.<br />
* <B>Context:</B><br />
** [[Function Input|inputs]] (<math>t,D,\mathbf{C}</math>):<br />
*** a [[Multiset Member]], <math>t</math> (e.g. a [[vocabulary member]]).<br />
*** a [[Multiset]], <math>D</math> (e.g. a [[document bag-of-words]]).<br />
*** a [[Multiset Set]], <math>\mathbf{C}</math> (e.g. a [[corpus]]).<br />
** [[Function Output|output(s)]]:<br />
*** [[tf-idf Score]].<br />
** [[Function Definition|definition]]:<br />
*** <math>\operatorname{tf-idf}(t,D,\mathbf{C}) = \operatorname{tf}(t,D) \times \operatorname{idf}(t,\mathbf{C})</math> (an illustrative sketch follows this definition).<br />
* <B>Counter-Example(s):</B><br />
** [[tf-idf Ranking Function]].<br />
** [[F Measure]].<br />
** [[PMI Measure]].<br />
* <B>See:</B> [[tf-idf Vector]], [[Text Corpus]].<br />
----<br />
----<br />
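As a minimal illustration (an assumption-laden Python sketch rather than a canonical implementation), the scoring function referenced in the definition above can be computed from raw counts using a relative term frequency and a logarithmic inverse document frequency:<br />
<pre>
import math
from collections import Counter

def tf(t, D):
    # relative frequency of member t in multiset D (a list of tokens)
    return Counter(D)[t] / len(D)

def idf(t, C):
    # log inverse document frequency of t over the multiset set C
    n_containing = sum(1 for D in C if t in D)
    return math.log(len(C) / n_containing) if n_containing else 0.0

def tf_idf(t, D, C):
    return tf(t, D) * idf(t, C)

C = [["a", "b", "b"], ["a", "c"], ["c", "d"]]
print(tf_idf("b", ["a", "b", "b"], C))  # frequent in D, rare in C -> larger score (~0.73)
print(tf_idf("a", ["a", "b", "b"], C))  # appears in 2 of 3 multisets -> smaller score (~0.14)
</pre>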
== References ==<br />
<br />
=== 2015 ===<br />
* (Wikipedia, 2015) &rArr; http://en.wikipedia.org/wiki/Tf–idf Retrieved:2015-2-22.<br />
** '''tf–idf''', short for '''term frequency–inverse document frequency''', is a numerical statistic that is intended to reflect how important a word is to a [[document]] in a collection or [[Text corpus|corpus]]. It is often used as a weighting factor in [[information retrieval]] and [[text mining]]. <P> The tf-idf value increases [[Proportionality (mathematics)|proportionally]] to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to adjust for the fact that some words appear more frequently in general. <P> Variations of the tf–idf weighting scheme are often used by [[search engine]]s as a central tool in scoring and ranking a document's [[Relevance (information retrieval)|relevance]] given a user [[Information retrieval|query]]. tf–idf can be successfully used for [[stop-words]] filtering in various subject fields including [[automatic summarization|text summarization]] and classification. <P> One of the simplest [[ranking function]]s is computed by summing the tf–idf for each query term; many more sophisticated ranking functions are variants of this simple model.<br />
<br />
=== 2007 ===<br />
* ([[Pazzani & Billsus, 2007]]) &rArr; [[Michael J. Pazzani]], and Daniel Billsus. "Content-based recommendation systems." In The adaptive web, pp. 325-341. Springer Berlin Heidelberg, 2007.<br />
** QUOTE: ... associated with a [[text term|term]] is a [[real number score|real number]] that represents the [[importance or relevance]]. This value is called the [[tf*idf weight]] ([[tf-idf Scoring Function|term-frequency times inverse document frequency]]). The [[tf*idf weight]], w(t,d), of a term t in a document d is a function of the frequency of t in the document (tf<sub>t,d</sub>), the number of documents that contain the term (df<sub>t</sub>) and the number of documents in the collection (N)<ref>Note that in the description of [[tf*idf weight]]s, the word “document” is traditionally used since the original motivation was to [[retrieve documents]]. While the chapter will stick with the original terminology, in a recommendation system, the documents correspond to a text description of an item to be recommended. Note that the equations here are representative of the class of formulae called [[tf-idf Scoring Function|tf*idf]]. In general, [[tf*idf system]]s have weights that increase monotonically with [[term frequency]] and decrease monotonically with [[document frequency]].</ref><br />
<references/><br />
<br />
----<br />
[[Category:Concept]]<br />
__NOTOC__</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=tf-idf_Scoring_Function&diff=548272tf-idf Scoring Function2019-12-23T20:45:30Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[tf-idf Scoring Function]] is a [[scoring function for a vocabulary member relative to a multiset]] based on the [[multiplication]] of a [[tf measure]] and an [[idf measure]], <math>\operatorname{tf}() \times \operatorname{idf}()</math>.<br />
* <B>Context:</B><br />
** [[Function Input|inputs]] (<math>t,D,\mathbf{C}</math>):<br />
*** a [[Multiset Member]], <math>t</math> (e.g. a [[vocabulary member]]).<br />
*** a [[Multiset]], <math>D</math> (e.g. a [[document bag-of-words]]).<br />
*** a [[Multiset Set]], <math>\mathbf{C}</math> (e.g. a [[corpus]]).<br />
** [[Function Output|output(s)]]:<br />
*** [[tf-idf Score]].<br />
** [[Function Definition|definition]]:<br />
*** <math>\operatorname{tf-idf}(t,D,\mathbf{C}) = \operatorname{tf}(t,D) \times \operatorname{idf}(t,\mathbf{C})</math>.<br />
* <B>Counter-Example(s):</B><br />
** [[tf-idf Ranking Function]].<br />
** [[F Measure]].<br />
** [[PMI Measure]].<br />
* <B>See:</B> [[tf-idf Vector]], [[Text Corpus]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2015 ===<br />
* (Wikipedia, 2015) &rArr; http://en.wikipedia.org/wiki/Tf–idf Retrieved:2015-2-22.<br />
** '''tf–idf''', short for '''term frequency–inverse document frequency''', is a numerical statistic that is intended to reflect how important a word is to a [[document]] in a collection or [[Text corpus|corpus]]. It is often used as a weighting factor in [[information retrieval]] and [[text mining]]. <P> The tf-idf value increases [[Proportionality (mathematics)|proportionally]] to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to adjust for the fact that some words appear more frequently in general. <P> Variations of the tf–idf weighting scheme are often used by [[search engine]]s as a central tool in scoring and ranking a document's [[Relevance (information retrieval)|relevance]] given a user [[Information retrieval|query]]. tf–idf can be successfully used for [[stop-words]] filtering in various subject fields including [[automatic summarization|text summarization]] and classification. <P> One of the simplest [[ranking function]]s is computed by summing the tf–idf for each query term; many more sophisticated ranking functions are variants of this simple model.<br />
<br />
=== 2007 ===<br />
* ([[Pazzani & Billsus, 2007]]) &rArr; [[Michael J. Pazzani]], and Daniel Billsus. "Content-based recommendation systems." In The adaptive web, pp. 325-341. Springer Berlin Heidelberg, 2007.<br />
** QUOTE: ... associated with a [[text term|term]] is a [[real number score|real number]] that represents the [[importance or relevance]]. This value is called the [[tf*idf weight]] ([[term-frequency times inverse document frequency]]). The [[tf*idf weight]], w(t,d), of a term t in a document d is a function of the frequency of t in the document (tf<sub>t,d</sub>), the number of documents that contain the term (df<sub>t</sub>) and the number of documents in the collection (N)<ref>Note that in the description of [[tf*idf weight]]s, the word “document” is traditionally used since the original motivation was to [[retrieve documents]]. While the chapter will stick with the original terminology, in a recommendation system, the documents correspond to a text description of an item to be recommended. Note that the equations here are representative of the class of formulae called [[tf-idf Scoring Function|tf*idf]]. In general, [[tf*idf system]]s have weights that increase monotonically with [[term frequency]] and decrease monotonically with [[document frequency]].</ref><br />
<references/><br />
<br />
----<br />
[[Category:Concept]]<br />
__NOTOC__</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=tf-idf_Score&diff=548271tf-idf Score2019-12-23T20:45:29Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[tf-idf Score]] is a [[non-negative real number score]] from a [[tf-idf function]] (for a [[vocabulary member]] relative to a [[multiset set member]]).<br />
* <B>Context:</B><br />
** It can (typically) increase with respect to [[Set Member Frequency]] (frequent vocab members within a single multiset/document are more informative than rare items).<br />
** It can (typically) increase with respect to [[IDF Score]] (frequent vocab members over an entire multiset/corpus are less informative than rare terms).<br />
** It can be a member of a [[tf-idf Vector]].<br />
* <B>Example(s):</B> <br />
** <math>0</math>, when every multiset contains the member.<br />
** <math>\approx 0.0046</math> (using a base-10 logarithm; <math>\approx 0.0106</math> with a natural logarithm) for <math>\operatorname{tf-idf}(``\text{quaint}'',\text{doc}_{184}, \text{Newsgroups 20 corpus})</math>, i.e. <math>\frac{4}{2,000} \times \log_{10}\left(\frac{8,000}{40}\right) = \frac{\log_{10}(200)}{500}</math>, if the word <i>quaint</i> is present 4 times in document <math>\text{doc}_{184}</math> with 2,000 words, and is contained in 40 documents from a corpus with 8,000 documents (see the short check after this definition).<br />
* <B>Counter-Example(s):</B><br />
** a [[PMI Score]].<br />
* <B>See:</B> [[TF-IDF Ranking Function]].<br />
----<br />
----<br />
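The “quaint” example above can be checked with a few lines of Python; the logarithm base is an assumption (the figure quoted above uses base 10):<br />
<pre>
import math

tf = 4 / 2000                # "quaint" occurs 4 times in a 2,000-word document
idf = math.log10(8000 / 40)  # "quaint" appears in 40 of the 8,000 corpus documents
print(tf * idf)              # ~0.0046 (a natural logarithm would give ~0.0106)
</pre>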
<br />
== References ==<br />
<br />
=== 2009 ===<br />
* http://en.wikipedia.org/wiki/Tf%E2%80%93idf<br />
** The '''tf–idf''' weight (term frequency–inverse document frequency) is a weight often used in [[Information Retrieval|information retrieval]] and [[Text Mining|text mining]]. This weight is a statistical measure used to [[evaluate]] how important a word is to a [[Document|document]] in a collection or [[Text Corpus|corpus]]. The importance increases [[Proportionality (Mathematics)|proportionally]] to the number of times a word appears in the document but is offset by the frequency of the word in the corpus. Variations of the tf–idf weighting scheme are often used by [[Search Engine|search engine]]s as a central tool in scoring and ranking a document's [[Relevance (Information Retrieval)|relevance]] given a user [[Query|query]].<br />
** One of the simplest [[Ranking Function|ranking function]]s is computed by summing the tf-idf for each query term; many more sophisticated ranking functions are variants of this simple model.<br />
<BR><br />
* [http://en.wikipedia.org/wiki/Tf%E2%80%93idf#Mathematical_details http://en.wikipedia.org/wiki/Tf%E2%80%93idf#Mathematical_details]<br />
** A high weight in tf–idf is reached by a high term [[Frequency (Statistics)|frequency]] (in the given document) and a low document frequency of the term in the whole [[collection of documents]]; the weights hence tend to filter out common terms. The tf-idf value for a term will always be greater than or equal to zero.<br />
<br />
=== 2007 ===<br />
* ([[Pazzani & Billsus, 2007]]) ⇒ [[Michael J. Pazzani]], and [[Daniel Billsus]]. ([[2007]]). “Content-based Recommendation Systems.” In: The adaptive web. Springer Berlin Heidelberg, 2007.<br />
** QUOTE: ... associated with a [[text term|term]] is a [[real number score|real number]] that represents the [[importance or relevance]]. This value is called the [[tf-idf Score|tf*idf weight]] ([[term-frequency times inverse document frequency]]). The [[tf-idf Score|tf*idf weight]], w(t,d), of a term t in a document d is a function of the frequency of t in the document (tf<sub>t,d</sub>), the number of documents that contain the term (df<sub>t</sub>) and the number of documents in the collection (N)<ref>Note that in the description of [[tf-idf Score|tf*idf weight]]s, the word “document” is traditionally used since the original motivation was to [[retrieve documents]]. While the chapter will stick with the original terminology, in a recommendation system, the documents correspond to a text description of an item to be recommended. Note that the equations here are representative of the class of formulae called [[tf*idf]]. In general, [[tf*idf system]]s have weights that increase monotonically with [[term frequency]] and decrease monotonically with [[document frequency]].</ref><br />
<references/><br />
<br />
----<br />
<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=t-Student_Density_Function&diff=548270t-Student Density Function2019-12-23T20:45:29Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[t-Student Density Function]] is a [[Probability Density Function]] that describes the probability density of a [[Student's t-Distribution|Student's t-distributed]] [[random variable]] (the standard form of the density is given below).<br />
* <B>AKA:</B> [[t-Student Density Function|t-Student Distribution]].<br />
* <B>Counter-Example(s):</B><br />
** a [[Gaussian Density Function]].<br />
* <B>See:</B> [[Probability Function]].<br />
----<br />
----<br />
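For reference (since the definition above is left incomplete), the standard form of this density for <math>\nu</math> degrees of freedom is:<br />
:<math>f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1+\frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}</math><br />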
== References ==<br />
<br />
=== 2006 ===<br />
* ([[2006_DubnickaStat510|Dubnicka, 2006h]]) &rArr; [[Suzanne R. Dubnicka]]. ([[2006]]). “[http://www.k-state.edu/stats/tch/dubnicka/stat510/handout8.pdf The Normal Distribution and Related Distributions - Handout 8.]'' Kansas State University, Introduction to Probability and Statistics I, STAT 510 - Fall 2006.<br />
** TERMINOLOGY: Suppose that <math>Z \sim N(0, 1)</math> and <math>V \sim \chi^2(\nu)</math> are [[independent random variable]]s. Let T = .... Then T is said to have a t distribution with <math>\nu</math> degrees of freedom, denoted <math>T \sim t(\nu)</math>.<br />
** PROBABILITY DENSITY FUNCTION: If <math>T \sim t(\nu)</math>, then the pdf of T is given by ...<br />
*** FACTS ABOUT THE t DISTRIBUTION:<br />
*** 1. The pdf is symmetric about 0.<br />
*** 2. The shape of the pdf is similar to that of the standard normal pdf except it is a bit flatter so that its tails are a bit thicker.<br />
*** 3. As <math>\nu \to \infty</math>, the pdf of the t distribution tends to the pdf of the [[standard normal distribution]].<br />
*** 4. If <math>T \sim t(\nu)</math>, then E(T) = 0 provided <math>\nu > 1</math> and Var(T) = ...<br />
<br />
----<br />
<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=t-Statistic&diff=548269t-Statistic2019-12-23T20:45:29Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[t-Statistic]] is a [[test statistic]] used in a [[Student's t-Test]]. <br />
* <B>AKA:</B> [[t-Score]], [[t-Statistic|t-test statistic]].<br />
* <B>Context:</B><br />
** It can be defined by <math>t= \frac{X_s - m}{S} </math>, where <math>X_s</math> is a [[point estimate]] of the [[population parameter]] <math>m</math>, and <math>S</math> is a [[scaling parameter]]. Note that <math>X_s</math> depends only on the [[sample data]], <math>m</math> is the value defined under the [[null hypothesis]], and <math>S</math> is a value that characterizes both the [[sampling distribution]] and the [[population distribution]], as well as the [[sample size]] and [[population size]]. <br />
** It ranges from being a [[one-sample t-statistic]] to being a [[two-sample t-statistic]].<br />
** It ranges from being an [[independent-measure t-statistic]] to being a [[matched-pair t-statistic]].<br />
** It can be used in [[Comparison of Means Test]]s, [[Correlational Hypothesis Test]]s, [[Group Differences Hypothesis Test]]s and [[Regression Test]].<br />
** It assumes that the underlying [[population]]s are [[Normal Distribution|normally distributed]] and that their [[variance]]s are equal (see [[Welch's t-test]] when the variances differ).<br />
* <B>Example(s)</B><br />
** [[One-Sample t-Statistic]], with <math>X_s=\overline{x}</math> ([[sample mean]]), <math>m=\mu</math> ([[population mean]]), and <math>S = s/\sqrt{n}</math> (where <math>s</math> is the [[sample standard deviation]] and <math>n</math> the [[sample size]]); a short numerical sketch is given after this definition. <br />
** [[Two-Sample t-Statistic]]<br />
** [[Matched-Pair t-Statistic]]<br />
** [[Student's t-Test for Correlation]]<br />
** [[Regression t-Test]]<br />
* <B>Counter-Example(s):</B><br />
** [[Welch's t-test]] is used when [[sample variance]]s are not equal.<br />
** a [[z-Statistic]], which can produce a [[z-score]], it is used when the [[standard deviation]] is known.<br />
** an [[F-Statistic]], which can produce an [[F-statistic score]].<br />
** a [[Chi-Squared Statistic]].<br />
* <B>See:</B> [[t-Test]], [[Bootstrapping (Statistics)]], [[Standard Error (Statistics)]], [[Statistical Hypothesis Testing]], [[Augmented Dickey–Fuller Test]], [[Sampling Distribution]], [[t-Distribution]].<br />
----<br />
----<br />
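A minimal Python sketch of the one-sample case defined above (the data values are illustrative; <code>scipy.stats.ttest_1samp</code> is used only as a cross-check):<br />
<pre>
import math
import statistics
from scipy import stats

sample = [5.1, 4.9, 5.4, 5.0, 5.3, 5.2]  # illustrative sample data
mu0 = 5.0                                 # population mean under the null hypothesis

x_bar = statistics.mean(sample)           # point estimate X_s (the sample mean)
s = statistics.stdev(sample)              # sample standard deviation (n - 1 denominator)
n = len(sample)
t = (x_bar - mu0) / (s / math.sqrt(n))    # t = (X_s - m) / S with S = s / sqrt(n)

print(t)
print(stats.ttest_1samp(sample, mu0).statistic)  # should match the value above
</pre>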
== References ==<br />
<br />
=== 2016 ===<br />
* (Wikipedia, 2016) &rArr; https://en.wikipedia.org/wiki/t-statistic Retrieved:2016-7-12.<br />
** In [[statistics]], the '''''t''-statistic''' is a ratio of the departure of an estimated parameter from its notional value and its [[Standard error (statistics)|standard error]]. It is used in [[statistical hypothesis testing|hypothesis testing]], for example in the [[Student's t-test|Student’s ''t''-test]], in the [[augmented Dickey–Fuller test]], and in [[Bootstrapping (statistics)|bootstrapping]].<br />
<BR><br />
* (Changing Minds, 2016) &rArr; http://changingminds.org/explanations/research/analysis/t-test.htm Retrieved 2016-10-16<br />
** QUOTE: The t-test (or student's t-test) gives an indication of the separateness of two sets of measurements, and is thus used to check whether two sets of measures are essentially different (and usually that an experimental effect has been demonstrated). The typical way of doing this is with the null hypothesis that means of the two sets of measures are equal.<br />
::The t-test assumes:<br />
:::*A normal distribution (parametric data)<br />
:::*Underlying variances are equal (if not, use Welch's test)<br />
::It is used when there is random assignment and only two sets of measurement to compare.<br />
::There are two main types of t-test:<br />
:::*Independent-measures t-test: when samples are not matched.<br />
:::*Matched-pair t-test: When samples appear in pairs (eg. before-and-after).<br />
::A single-sample t-test compares a sample against a known figure, for example where measures of a manufactured item are compared against the required standard.<br />
<br />
=== 2014 ===<br />
* http://stattrek.com/statistics/dictionary.aspx?definition=t_statistic<br />
** QUOTE: The [[t-Statistic|t statistic]] is defined by: <math>t = \frac{x - \mu}{\frac{s}{\sqrt{n}}}</math><br />
::where x is the [[population sample|sample]] [[mean]], μ is the [[population]] [[mean]], s is the [[standard deviation]] of the [[population sample|sample]], and n is the sample size.<br />
=== 2007 ===<br />
* (Goldsman, 2007) &rArr; [[David Goldsman]] ([[2007]]). Chapter 6 - Sampling Distributions, Course Notes: "ISyE 3770 - Probability and Statistics" [http://www.coursehero.com/file/11659797/Chap06/], [http://aimm02.cse.ttu.edu.tw/class/92-1/Statistics/Chap06.pdf PDF file]<br />
** QUOTE: Let <math>X_1, \dots, X_n</math> be a random sample from <math>N(\mu, \sigma^2)</math>. Then <math>\bar{X} \sim N(\mu, \sigma^2/n) </math> or, equivalently, <math>Z =(\bar{X} - \mu)/(\sigma/\sqrt{n}) \sim N(0, 1) </math>. In most cases, the value of <math>\sigma^2</math> is not available. Thus, we will use <math>S^2</math> to estimate <math>\sigma^2</math>. The [[t-distribution]] deals with the distribution about the [[t-Statistic|statistic T]] defined by<br />
:<math> T =\frac{\bar{X}-\mu}{S/\sqrt{n}}</math><br />
::[...] Let <math>Z \sim N(0, 1)</math> and <math>W \sim \chi^2_\nu</math> be two independent random variables. The random variable <math>T =Z/\sqrt{W/\nu}</math> is said to possess a [[t-distribution]] with <math>\nu</math> degrees of freedom and is denoted by <math>T \sim t_\nu</math><br />
<br />
----<br />
[[Category:Concept]]<br />
__NOTOC__</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=t-Statistic&diff=548268t-Statistic2019-12-23T20:45:29Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[t-Statistic]] is a [[test statistic]] used in a [[Student's t-Test]]. <br />
* <B>AKA:</B> [[t-Score]], [[t-Statistic|t-test statistic]].<br />
* <B>Context:</B><br />
** It can be defined by <math>t= \frac{X_s - m}{S} </math>, where <math>X_s</math> is a [[point estimate]] of the [[population parameter]] <math>m</math>, and <math>S</math> is a [[scaling parameter]]. Note that <math>X_s</math> depends only on the [[sample data]], <math>m</math> is the value defined under the [[null hypothesis]], and <math>S</math> is a value that characterizes both the [[sampling distribution]] and the [[population distribution]], as well as the [[sample size]] and [[population size]]. <br />
** It ranges from being a [[one-sample t-statistic]] to being a [[two-sample t-statistic]].<br />
** It ranges from being an [[independent-measure t-statistic]] to being a [[matched-pair t-statistic]].<br />
** It can be used in [[Comparison of Means Test]]s, [[Correlational Hypothesis Test]]s, [[Group Differences Hypothesis Test]]s and [[Regression Test]].<br />
** It assumes that the underlying [[population]]s are [[Normal Distribution|normally distributed]] and that their [[variance]]s are equal (see [[Welch's t-test]] when the variances differ).<br />
* <B>Example(s)</B><br />
** [[One-Sample t-Statistic]], with <math>X_s=\overline{x}</math> ([[sample mean]]), <math>m=\mu</math> ([[population mean]]), and <math>S = s/\sqrt{n}</math> (where <math>s</math> is the [[sample standard deviation]] and <math>n</math> the [[sample size]]) <br />
** [[Two-Sample t-Statistic]]<br />
** [[Matched-Pair t-Statistic]]<br />
** [[Student's t-Test for Correlation]]<br />
** [[Regression t-Test]]<br />
* <B>Counter-Example(s):</B><br />
** [[Welch's t-test]] is used when [[sample variance]]s are not equal.<br />
** a [[z-Statistic]], which can produce a [[z-score]], it is used when the [[standard deviation]] is known.<br />
** an [[F-Statistic]], which can produce an [[F-statistic score]].<br />
** a [[Chi-Squared Statistic]].<br />
* <B>See:</B> [[t-Test]], [[Bootstrapping (Statistics)]], [[Standard Error (Statistics)]], [[Statistical Hypothesis Testing]], [[Augmented Dickey–Fuller Test]], [[Sampling Distribution]], [[t-Distribution]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2016 ===<br />
* (Wikipedia, 2016) &rArr; https://en.wikipedia.org/wiki/t-statistic Retrieved:2016-7-12.<br />
** In [[statistics]], the '''''t''-statistic''' is a ratio of the departure of an estimated parameter from its notional value and its [[Standard error (statistics)|standard error]]. It is used in [[statistical hypothesis testing|hypothesis testing]], for example in the [[Student's t-test|Student’s ''t''-test]], in the [[augmented Dickey–Fuller test]], and in [[Bootstrapping (statistics)|bootstrapping]].<br />
<BR><br />
* (Changing Minds, 2016) &rArr; http://changingminds.org/explanations/research/analysis/t-test.htm Retrieved 2016-10-16<br />
** QUOTE: The t-test (or student's t-test) gives an indication of the separateness of two sets of measurements, and is thus used to check whether two sets of measures are essentially different (and usually that an experimental effect has been demonstrated). The typical way of doing this is with the null hypothesis that means of the two sets of measures are equal.<br />
::The t-test assumes:<br />
:::*A normal distribution (parametric data)<br />
:::*Underlying variances are equal (if not, use Welch's test)<br />
::It is used when there is random assignment and only two sets of measurement to compare.<br />
::There are two main types of t-test:<br />
:::*Independent-measures t-test: when samples are not matched.<br />
:::*Matched-pair t-test: When samples appear in pairs (eg. before-and-after).<br />
::A single-sample t-test compares a sample against a known figure, for example where measures of a manufactured item are compared against the required standard.<br />
<br />
=== 2014 ===<br />
* http://stattrek.com/statistics/dictionary.aspx?definition=t_statistic<br />
** QUOTE: The [[t statistic]] is defined by: <math>t = \frac{x - \mu}{\frac{s}{\sqrt{n}}}</math><br />
::where x is the [[population sample|sample]] [[mean]], μ is the [[population]] [[mean]], s is the [[standard deviation]] of the [[population sample|sample]], and n is the sample size.<br />
=== 2007 ===<br />
* (Goldsman, 2007) &rArr; [[David Goldsman]] ([[2007]]). Chapter 6 - Sampling Distributions, Course Notes: "ISyE 3770 - Probability and Statistics" [http://www.coursehero.com/file/11659797/Chap06/], [http://aimm02.cse.ttu.edu.tw/class/92-1/Statistics/Chap06.pdf PDF file]<br />
** QUOTE: Let <math>X_1, \dots, X_n</math> be a random sample from <math>N(\mu, \sigma^2)</math>. Then <math>\bar{X} \sim N(\mu, \sigma^2/n) </math> or, equivalently, <math>Z =(\bar{X} - \mu)/(\sigma/\sqrt{n}) \sim N(0, 1) </math>. In most cases, the value of <math>\sigma^2</math> is not available. Thus, we will use <math>S^2</math> to estimate <math>\sigma^2</math>. The [[t-distribution]] deals with the distribution about the [[t-Statistic|statistic T]] defined by<br />
:<math> T =\frac{\bar{X}-\mu}{S/\sqrt{n}}</math><br />
::[...] Let <math>Z \sim N(0, 1)</math> and <math>W \sim \chi^2_\nu</math> be two independent random variables. The random variable <math>T =Z/\sqrt{W/\nu}</math> is said to possess a [[t-distribution]] with <math>\nu</math> degrees of freedom and is denoted by <math>T \sim t_\nu</math><br />
<br />
----<br />
[[Category:Concept]]<br />
__NOTOC__</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=t-Statistic&diff=548267t-Statistic2019-12-23T20:45:29Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[t-Statistic]] is a [[test statistic]] used in a [[Student's t-Test]]. <br />
* <B>AKA:</B> [[t-Score]], [[t-test statistic]].<br />
* <B>Context:</B><br />
** It can be defined by <math>t= \frac{X_s - m}{S} </math>, where <math>X_s</math> is a [[point estimate]] of the [[population parameter]] <math>m</math>, and <math>S</math> is a [[scaling parameter]]. Note that <math>X_s</math> depends only on the [[sample data]], <math>m</math> is the value defined under the [[null hypothesis]], and <math>S</math> is a value that characterizes both the [[sampling distribution]] and the [[population distribution]], as well as the [[sample size]] and [[population size]]. <br />
** It ranges from being a [[one-sample t-statistic]] to being a [[two-sample t-statistic]].<br />
** It ranges from being an [[independent-measure t-statistic]] to being a [[matched-pair t-statistic]].<br />
** It can be used in [[Comparison of Means Test]]s, [[Correlational Hypothesis Test]]s, [[Group Differences Hypothesis Test]]s and [[Regression Test]].<br />
** It assumes that the underlying [[population]]s are [[Normal Distribution|normally distributed]] and that their [[variance]]s are equal (see [[Welch's t-test]] when the variances differ).<br />
* <B>Example(s)</B><br />
** [[One-Sample t-Statistic]], with <math>X_s=\overline{x}</math> ([[sample mean]]), <math>m=\mu</math> ([[population mean]]), and <math>S = s/\sqrt{n}</math> (where <math>s</math> is the [[sample standard deviation]] and <math>n</math> the [[sample size]]) <br />
** [[Two-Sample t-Statistic]]<br />
** [[Matched-Pair t-Statistic]]<br />
** [[Student's t-Test for Correlation]]<br />
** [[Regression t-Test]]<br />
* <B>Counter-Example(s):</B><br />
** [[Welch's t-test]] is used when [[sample variance]]s are not equal.<br />
** a [[z-Statistic]], which can produce a [[z-score]], it is used when the [[standard deviation]] is known.<br />
** an [[F-Statistic]], which can produce an [[F-statistic score]].<br />
** a [[Chi-Squared Statistic]].<br />
* <B>See:</B> [[t-Test]], [[Bootstrapping (Statistics)]], [[Standard Error (Statistics)]], [[Statistical Hypothesis Testing]], [[Augmented Dickey–Fuller Test]], [[Sampling Distribution]], [[t-Distribution]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2016 ===<br />
* (Wikipedia, 2016) &rArr; https://en.wikipedia.org/wiki/t-statistic Retrieved:2016-7-12.<br />
** In [[statistics]], the '''''t''-statistic''' is a ratio of the departure of an estimated parameter from its notional value and its [[Standard error (statistics)|standard error]]. It is used in [[statistical hypothesis testing|hypothesis testing]], for example in the [[Student's t-test|Student’s ''t''-test]], in the [[augmented Dickey–Fuller test]], and in [[Bootstrapping (statistics)|bootstrapping]].<br />
<BR><br />
* (Changing Minds, 2016) &rArr; http://changingminds.org/explanations/research/analysis/t-test.htm Retrieved 2016-10-16<br />
** QUOTE: The t-test (or student's t-test) gives an indication of the separateness of two sets of measurements, and is thus used to check whether two sets of measures are essentially different (and usually that an experimental effect has been demonstrated). The typical way of doing this is with the null hypothesis that means of the two sets of measures are equal.<br />
::The t-test assumes:<br />
:::*A normal distribution (parametric data)<br />
:::*Underlying variances are equal (if not, use Welch's test)<br />
::It is used when there is random assignment and only two sets of measurement to compare.<br />
::There are two main types of t-test:<br />
:::*Independent-measures t-test: when samples are not matched.<br />
:::*Matched-pair t-test: When samples appear in pairs (eg. before-and-after).<br />
::A single-sample t-test compares a sample against a known figure, for example where measures of a manufactured item are compared against the required standard.<br />
<br />
=== 2014 ===<br />
* http://stattrek.com/statistics/dictionary.aspx?definition=t_statistic<br />
** QUOTE: The [[t statistic]] is defined by: <math>t = \frac{x - \mu}{\frac{s}{\sqrt{n}}}</math><br />
::where x is the [[population sample|sample]] [[mean]], μ is the [[population]] [[mean]], s is the [[standard deviation]] of the [[population sample|sample]], and n is the sample size.<br />
=== 2007 ===<br />
* (Goldsman, 2007) &rArr; [[David Goldsman]] ([[2007]]). Chapter 6 - Sampling Distributions, Course Notes: "ISyE 3770 - Probability and Statistics" [http://www.coursehero.com/file/11659797/Chap06/], [http://aimm02.cse.ttu.edu.tw/class/92-1/Statistics/Chap06.pdf PDF file]<br />
** QUOTE: Let <math>X_1, \dots, X_n</math> be a random sample from <math>N(\mu, \sigma^2)</math>. Then <math>\bar{X} \sim N(\mu, \sigma^2/n) </math> or, equivalently, <math>Z =(\bar{X} - \mu)/(\sigma/\sqrt{n}) \sim N(0, 1) </math>. In most cases, the value of <math>\sigma^2</math> is not available. Thus, we will use <math>S^2</math> to estimate <math>\sigma^2</math>. The [[t-distribution]] deals with the distribution about the [[t-Statistic|statistic T]] defined by<br />
:<math> T =\frac{\bar{X}-\mu}{S/\sqrt{n}}</math><br />
::[...] Let <math>Z \sim N(0, 1)</math> and <math>W \sim \chi^2_\nu</math> be two independent random variables. The random variable <math>T =Z/\sqrt{W/\nu}</math> is said to possess a [[t-distribution]] with <math>\nu</math> degrees of freedom and is denoted by <math>T \sim t_\nu</math><br />
<br />
----<br />
[[Category:Concept]]<br />
__NOTOC__</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=t-Distribution_Table&diff=548266t-Distribution Table2019-12-23T20:45:29Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[t-Distribution Table]] is a [[probability distribution table]] that includes [[critical value]]s of the [[t-distribution]] calculated using a [[cumulative distribution function]]. <br />
* <B>Context:</B><br />
** It can be used for [[One-Tailed Hypothesis Test|one-tailed]] and [[Two-Tailed Hypothesis Test|two-tailed test]]s by using the appropriate value of [[significance level]] or upper and lower limits of the [[region of acceptance]].<br />
** It can be referenced by a [[t-Distribution Calculating System]].<br />
* <B>Example(s):</B><br />
** Table A.2 in http://home.ubalt.edu/ntsbarsh/Business-stat/StatistialTables.pdf<br />
** Table 2 and 3 in http://www.stat.ufl.edu/~athienit/Tables/tables<br />
* <B>Counter-Example(s):</B><br />
** [[F-table]]<br />
** [[P-Value]]<br />
** [[Significance Level]]<br />
* <B>See:</B> [[Student's t-Distribution]], [[Statistical Distribution Table]].<br />
----<br />
----<br />
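Critical values like those tabulated can be reproduced from the t-distribution's inverse cumulative distribution function. A short sketch (assuming <code>scipy.stats</code> is available; the 2.132 value matches the worked example in the Wikipedia reference below):<br />
<pre>
from scipy import stats

# one-sided 95% critical value for nu = 4 degrees of freedom
print(stats.t.ppf(0.95, df=4))    # ~2.132

# two-sided test at alpha = 0.05 with 10 degrees of freedom: use the 1 - alpha/2 quantile
print(stats.t.ppf(0.975, df=10))  # ~2.228
</pre>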
== References ==<br />
<br />
=== 2017 ===<br />
* ([[Wikipedia, 2016]]) &rArr; http://en.wikipedia.org/wiki/Student's_t-distribution#Table_of_selected_values<br />
** Most statistical textbooks list ''t''-distribution tables. Nowadays, the better way to a fully precise critical ''t'' value or a cumulative probability is the statistical function implemented in [[spreadsheets]], or an interactive calculating web page. The relevant spreadsheet functions are <code>TDIST</code> and <code>TINV</code>, while online calculating pages save troubles like positions of parameters or names of functions. <P> The following table lists a few selected values for ''t''-distributions with ν degrees of freedom for a range of ''one-sided'' or ''two-sided'' critical regions. For an example of how to read this table, take the fourth row, which begins with 4; that means ν, the number of degrees of freedom, is 4 (and if we are dealing, as above, with ''n'' values with a fixed sum, ''n'' = 5). Take the fifth entry, in the column headed 95% for ''one-sided'' (90% for ''two-sided''). The value of that entry is 2.132. Then the probability that ''T'' is less than 2.132 is 95% or Pr(−∞ < ''T'' < 2.132) = 0.95; this also means that Pr(−2.132 < ''T'' < 2.132) = 0.9. (...)<br />
<br />
=== 2013 ===<br />
* (NIST/SEMATECH, 2013) &rArr; Retrieved on 2017-03-12 from NIST/SEMATECH e-Handbook of Statistical Methods "1.3.6.7.2.-Critical Values of the Student's t Distribution" http://www.itl.nist.gov/div898/handbook/eda/section3/eda3672.htm <br />
** QUOTE: This table contains critical values of the Student's t distribution computed using the cumulative distribution function. The t distribution is symmetric so that<br />
::: <math>t_{1-\alpha,\nu} = -t_{\alpha,\nu}</math><br />
:: The [[t-Distribution Table|t-table]] can be used for both one-sided (lower and upper) and two-sided tests using the appropriate value of &alpha;.<br />
:: The [[significance level]], &alpha;, is demonstrated in the graph below, which displays a [[t distribution]] with 10 degrees of freedom. The most commonly used significance level is <math>\alpha = 0.05</math>. For a two-sided test, we compute <math>1 - \alpha/2</math>, or <math>1 - 0.05/2 = 0.975</math> when <math>\alpha = 0.05</math>. If the absolute value of the test statistic is greater than the critical value (0.975), then we reject the null hypothesis. Due to the symmetry of the t distribution, we only tabulate the positive critical values in the table below.<br />
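:: A minimal [[scipy.stats]] sketch (not from the quoted source) that reproduces two of the tabulated critical values mentioned above; the printed values are approximate:<br />
<pre>
from scipy.stats import t

# one-sided 95% critical value for nu = 4 degrees of freedom (tabulated as 2.132)
print(t.ppf(0.95, df=4))             # ~2.1318

# two-sided test at alpha = 0.05 with nu = 10, i.e. the 1 - alpha/2 = 0.975 quantile
print(t.ppf(1 - 0.05 / 2, df=10))    # ~2.2281
</pre><br />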
<br />
=== 2002a ===<br />
* (Dougherty, 2002) &rArr; Statistical Tables in Dougherty (2002) "Introduction to Econometrics" (second edition 2002, Oxford University Press, Oxford) http://home.ubalt.edu/ntsbarsh/Business-stat/StatistialTables.pdf<br />
<br />
=== 2002b ===<br />
* (Hildebrand, 2002) &rArr; http://www.stat.ufl.edu/~athienit/Tables/tables<br />
<br />
----<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=t-Distribution_Calculating_System&diff=548265t-Distribution Calculating System2019-12-23T20:45:29Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[t-Distribution Calculating System]] is a [[probability distribution calculation system]] that can solve a [[t-distribution calculation task]] (for a [[student's t-Distribution]]).<br />
* <B>AKA:</B> [[t-Distribution Calculating System|Student's t-Distribution Calculator]], [[t-Distribution Calculating System|T-distribution Algorithm]], [[t-Distribution Calculating System|T-calculator]]. <br />
* <B>Context:</B><br />
** It can be based on a [[t-table]].<br />
* <B>Example(s):</B><br />
** An [[interactive online t-calculator]] such as: <br />
*** http://stattrek.com/online-calculator/t-distribution.aspx<br />
*** http://surfstat.anu.edu.au/surfstat-home/tables/t.php <br />
** One based on the [[scipy.stats]] functions [http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html#scipy.stats.t]: <code>scipy.stats.t.pdf</code>, <code>scipy.stats.t.cdf</code> and <code>scipy.stats.t.interval</code>.<br />
** [[Excel spreadsheet function]]s such as <code>TDIST</code> and <code>TINV</code>.<br />
* <B>Counter-Example(s):</B><br />
** [[P-Value]]<br />
** [[Significance Level]]<br />
* <B>See:</B> [[Student's t-Distribution]], [[One-Sample t-Test System]], [[t-Table]].<br />
----<br />
----<br />
== References ==<br />
=== 2017a ===<br />
* (Stattrek, 2017) &rArr; http://stattrek.com/online-calculator/t-distribution.aspx<br />
** The t-distribution calculator makes it easy to compute cumulative probabilities, based on t statistics; or to compute t statistics, based on cumulative probabilities. <br />
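:: A minimal sketch of the same two-way computation using <code>scipy.stats.t</code>; the degrees of freedom and confidence level are illustrative:<br />
<pre>
from scipy.stats import t

df = 10  # degrees of freedom (illustrative)

# cumulative probability from a t statistic
print(t.cdf(2.228, df))        # ~0.975

# t statistic (critical value) from a cumulative probability
print(t.ppf(0.975, df))        # ~2.228

# central interval containing 95% of the distribution's mass
print(t.interval(0.95, df))    # ~(-2.228, 2.228)
</pre><br />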
=== 2017b ===<br />
* ([[Wikipedia, 2016]]) &rArr; http://en.wikipedia.org/wiki/Student's_t-distribution#Table_of_selected_values<br />
** Most statistical textbooks list ''t''-distribution tables. Nowadays, the better way to a fully precise critical ''t'' value or a cumulative probability is the statistical function implemented in [[spreadsheets]], or an interactive calculating web page. The relevant spreadsheet functions are <code>TDIST</code> and <code>TINV</code>, while online calculating pages save troubles like positions of parameters or names of functions.<br />
<br />
----<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=t-Distribution_Calculating_System&diff=548264t-Distribution Calculating System2019-12-23T20:45:29Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[t-Distribution Calculating System]] is a [[probability distribution calculation system]] that can solve a [[t-distribution calculation task]] (for a [[student's t-Distribution]]).<br />
* <B>AKA:</B> [[t-Distribution Calculating System|Student's t-Distribution Calculator]], [[t-Distribution Calculating System|T-distribution Algorithm]], [[T-calculator]]. <br />
* <B>Context:</B><br />
** It can be based on a [[t-table]].<br />
* <B>Example(s):</B><br />
** An [[interactive online t-calculator]] such as: <br />
*** http://stattrek.com/online-calculator/t-distribution.aspx<br />
*** http://surfstat.anu.edu.au/surfstat-home/tables/t.php <br />
** One based on the [[scipy.stats]] functions [http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html#scipy.stats.t]: <code>scipy.stats.t.pdf</code>, <code>scipy.stats.t.cdf</code> and <code>scipy.stats.t.interval</code>.<br />
** [[Excel spreadsheet function]]s such as <code>TDIST</code> and <code>TINV</code>.<br />
* <B>Counter-Example(s):</B><br />
** [[P-Value]]<br />
** [[Significance Level]]<br />
* <B>See:</B> [[Student's t-Distribution]], [[One-Sample t-Test System]], [[t-Table]].<br />
----<br />
----<br />
== References ==<br />
=== 2017a ===<br />
* (Stattrek, 2017) &rArr; http://stattrek.com/online-calculator/t-distribution.aspx<br />
** The t-distribution calculator makes it easy to compute cumulative probabilities, based on t statistics; or to compute t statistics, based on cumulative probabilities. <br />
=== 2017b ===<br />
* ([[Wikipedia, 2016]]) &rArr; http://en.wikipedia.org/wiki/Student's_t-distribution#Table_of_selected_values<br />
** Most statistical textbooks list ''t''-distribution tables. Nowadays, the better way to a fully precise critical ''t'' value or a cumulative probability is the statistical function implemented in [[spreadsheets]], or an interactive calculating web page. The relevant spreadsheet functions are <code>TDIST</code> and <code>TINV</code>, while online calculating pages save troubles like positions of parameters or names of functions.<br />
<br />
----<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=t-Distribution_Calculating_System&diff=548263t-Distribution Calculating System2019-12-23T20:45:28Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[t-Distribution Calculating System]] is a [[probability distribution calculation system]] that can solve a [[t-distribution calculation task]] (for a [[student's t-Distribution]]).<br />
* <B>AKA:</B> [[t-Distribution Calculating System|Student's t-Distribution Calculator]], [[T-distribution Algorithm]], [[T-calculator]]. <br />
* <B>Context:</B><br />
** It can be based on a [[t-table]].<br />
* <B>Example(s):</B><br />
** An [[interactive online t-calculator]] such as: <br />
*** http://stattrek.com/online-calculator/t-distribution.aspx<br />
*** http://surfstat.anu.edu.au/surfstat-home/tables/t.php <br />
** One based on the [[scipy.stats]] functions [http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html#scipy.stats.t]: <code>scipy.stats.t.pdf</code>, <code>scipy.stats.t.cdf</code> and <code>scipy.stats.t.interval</code>.<br />
** [[Excel spreadsheet function]]s such as <code>TDIST</code> and <code>TINV</code>.<br />
* <B>Counter-Example(s):</B><br />
** [[P-Value]]<br />
** [[Significance Level]]<br />
* <B>See:</B> [[Student's t-Distribution]], [[One-Sample t-Test System]], [[t-Table]].<br />
----<br />
----<br />
== References ==<br />
=== 2017a ===<br />
* (Stattrek, 2017) &rArr; http://stattrek.com/online-calculator/t-distribution.aspx<br />
** The t-distribution calculator makes it easy to compute cumulative probabilities, based on t statistics; or to compute t statistics, based on cumulative probabilities. <br />
=== 2017b ===<br />
* ([[Wikipedia, 2016]]) &rArr; http://en.wikipedia.org/wiki/Student's_t-distribution#Table_of_selected_values<br />
** Most statistical textbooks list ''t''-distribution tables. Nowadays, the better way to a fully precise critical ''t'' value or a cumulative probability is the statistical function implemented in [[spreadsheets]], or an interactive calculating web page. The relevant spreadsheet functions are <code>TDIST</code> and <code>TINV</code>, while online calculating pages save troubles like positions of parameters or names of functions.<br />
<br />
----<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=statsmodels.tsa_System&diff=548262statsmodels.tsa System2019-12-23T20:45:28Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[statsmodels.tsa System]] is a [[Univariate Timeseries Modeling System]] within [[statsmodels]].<br />
* <B>Context:</B><br />
** ...<br />
* <B>Example(s):</B><br />
** v??<br />
* <B>See:</B> [[Multivariate Timeseries Analysis]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2016 ===<br />
* http://www.statsmodels.org/dev/tsa.html<br />
** QUOTE: [[statsmodels.tsa System|statsmodels.tsa]] contains model classes and functions that are useful for time series analysis. Basic models include univariate autoregressive models (AR), vector autoregressive models (VAR) and univariate autoregressive moving average models (ARMA). Non-linear models include Markov switching dynamic regression and autoregression. It also includes descriptive statistics for time series, for example autocorrelation, partial autocorrelation function and periodogram, as well as the corresponding theoretical properties of ARMA or related processes. It also includes methods to work with autoregressive and moving average lag-polynomials. Additionally, related statistical tests and some useful helper functions are available. <P> Estimation is either done by exact or conditional Maximum Likelihood or conditional least-squares, either using Kalman Filter or direct filters. <P> Currently, functions and classes have to be imported from the corresponding module, but the main classes will be made available in the statsmodels.tsa namespace. The module structure within statsmodels.tsa is:<br />
*** <code>stattools</code>: empirical properties and tests, acf, pacf, granger-causality, adf unit root test, kpss test, bds test, ljung-box test and others.<br />
*** <code>ar_model</code>: univariate autoregressive process, estimation with conditional and exact maximum likelihood and conditional least-squares<br />
*** <code>arima_model</code>: univariate ARMA process, estimation with conditional and exact maximum likelihood and conditional least-squares<br />
*** <code>vector_ar, var</code>: vector autoregressive process (VAR) estimation models, impulse response analysis, forecast error variance decompositions, and data visualization tools<br />
*** <code>kalmanf</code>: estimation classes for ARMA and other models with exact MLE using Kalman Filter<br />
*** <code>arma_process</code>: properties of arma processes with given parameters, this includes tools to convert between ARMA, MA and AR representation as well as acf, pacf, spectral density, impulse response function and similar<br />
*** <code>sandbox.tsa.fftarma</code>: similar to arma_process but working in frequency domain<br />
*** <code>tsatools</code>: additional helper functions, to create arrays of lagged variables, construct regressors for trend, detrend and similar.<br />
*** <code>filters</code>: helper function for filtering time series<br />
*** <code>regime_switching</code>: Markov switching dynamic regression and autoregression models<br />
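:: A minimal sketch (not from the quoted documentation) using only the <code>stattools</code> functions listed above on a simulated series; the AR(1) coefficient and series length are illustrative:<br />
<pre>
import numpy as np
from statsmodels.tsa.stattools import acf, pacf, adfuller

# simulate an AR(1) series y_t = 0.6 * y_{t-1} + e_t (illustrative data)
rng = np.random.RandomState(0)
y = np.zeros(500)
for i in range(1, 500):
    y[i] = 0.6 * y[i - 1] + rng.normal()

print(acf(y, nlags=5))     # empirical autocorrelation function
print(pacf(y, nlags=5))    # partial autocorrelation function
adf_stat, p_value = adfuller(y)[:2]   # augmented Dickey-Fuller unit-root test
print(adf_stat, p_value)
</pre><br />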
<br />
----<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=ssh_Client_Program&diff=548261ssh Client Program2019-12-23T20:45:28Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>An [[ssh Client Program]] is a [[client software]] that facilitates access to a [[remote computer]] that accepts [[ssh connection]]s (using the [[secure shell protocol]]).<br />
* <B>Context:</B><br />
** It can be used to create an [[ssh Tunnel]].<br />
* <B>Example(s):</B><br />
** [[PuTTY]].<br />
** <code>[[ssh Client Program|ssh]] -vT -i id_xxx.pem myhost.com</code><br />
** <code> ssh -i ~/EMR.pem -ND 8157 hadoop@54.204.218.87 # setup an [[ssh tunnel]]</code><br />
* <B>Counter-Example(s):</B><br />
** an [[SSH Daemon]].<br />
** an [[SSH Server Software]].<br />
** [[FTP Client]].<br />
** [[Telnet Client]].<br />
* <B>See:</B> [[SSH]], [[SSH Private Key]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2013 ===<br />
* http://en.wikipedia.org/wiki/Comparison_of_SSH_clients<br />
** An '''SSH client''' is a software program which uses the [[secure shell]] protocol to connect to a [[Server (computing)|remote computer]].<br />
* http://www.putty.org/<br />
** PuTTY is an [[ssh Client Program|SSH]] and [[telnet client]], developed originally by Simon Tatham for the [[Windows platform]]. PuTTY is [[open source software]] that is available with [[source code]] and is developed and supported by a [[group of volunteers]].<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=ssh_Client_Program&diff=548260ssh Client Program2019-12-23T20:45:28Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>An [[ssh Client Program]] is a [[client software]] that facilitates access to a [[remote computer]] that accepts [[ssh connection]]s (using the [[secure shell protocol]]).<br />
* <B>Context:</B><br />
** It can be used to create an [[ssh Tunnel]].<br />
* <B>Example(s):</B><br />
** [[PuTTY]].<br />
** <code>[[ssh]] -vT -i id_xxx.pem myhost.com</code><br />
** <code> ssh -i ~/EMR.pem -ND 8157 hadoop@54.204.218.87 # setup an [[ssh tunnel]]</code><br />
* <B>Counter-Example(s):</B><br />
** an [[SSH Daemon]].<br />
** an [[SSH Server Software]].<br />
** [[FTP Client]].<br />
** [[Telnet Client]].<br />
* <B>See:</B> [[SSH]], [[SSH Private Key]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2013 ===<br />
* http://en.wikipedia.org/wiki/Comparison_of_SSH_clients<br />
** An '''SSH client''' is a software program which uses the [[secure shell]] protocol to connect to a [[Server (computing)|remote computer]].<br />
* http://www.putty.org/<br />
** PuTTY is an [[ssh Client Program|SSH]] and [[telnet client]], developed originally by Simon Tatham for the [[Windows platform]]. PuTTY is [[open source software]] that is available with [[source code]] and is developed and supported by a [[group of volunteers]].<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=spray_REST_Framework&diff=548259spray REST Framework2019-12-23T20:45:28Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[spray REST Framework]] is a [[REST framework]] for [[Akka Web Application Framework]] (for [[Scala program]]s).<br />
* <B>Counter-Example(s):</B><br />
** [[Django REST Framework]], for [[Django framework]] (for [[Python program]]s).<br />
** [[Zend REST Framework]], for [[Php program]]s.<br />
* <B>See:</B> [[Play Scala Framework]].<br />
----<br />
----<br />
== References ==<br />
* http://spray.io<br />
<br />
=== 2013 ===<br />
* http://spray.io/<br />
** [[spray REST Framework|spray]] is an [[open-source toolkit]] for building [[REST/HTTP-based integration layer]]s on top of [[Scala]] and [[Akka]]. Being [[asynchronous]], [[actor-based]], [[fast]], [[lightweight]], [[modular]] and [[testable]] it's a great way to connect your [[Scala application]]s to the world. <br />
<BR><br />
* https://github.com/spray/spray<br />
** A suite of scala libraries for building and consuming RESTful web services on top of Akka: lightweight, asynchronous, non-blocking, actor-based, testable<br />
<BR><br />
* http://spray.io/introduction/what-is-spray/<br />
** [[spray REST Framework|spray]] is a suite of [[lightweight]] [[Scala librari]]es providing [[client-side]] and [[server-side]] [[REST/HTTP support]] on top of [[Akka framework|Akka]]. <P> We believe that, having chosen [[Scala]] (and possibly [[Akka]]) as primary tools for building software, you’ll want to rely on their power not only in your application layer but throughout the [[full (JVM-level) network stack]]. [[spray REST Framework|spray]] provides just that: a set of integrated components for all your [[REST/HTTP]] needs that let you work with idiomatic [[Scala API|Scala]] (and [[Akka API|Akka]]) [[API]]s at the stack level of your choice, all implemented without any wrapping layers around “legacy” [[Java librari]]es.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=spaCy_NLP_System&diff=548258spaCy NLP System2019-12-23T20:45:27Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[spaCy NLP System]] is a [[Python-based|Python]]/[[Cython-based]] [[natural language processing library]].<br />
* <B>Example(s):</B><br />
** v2.0.11 (2018-04-04).<br />
** ...<br />
** v1.x (~2015).<br />
* <B>Counter-Example(s):</B><br />
** [[Spark NLP]], [[AllenNLP]], [[Gensim]], [[NLTK]], [[Lingpipe]], ...<br />
* <B>See:</B> [[NER System]], [[MIT License]], [[Syntactic Parsing System]], [[Natural Language Toolkit]].<br />
----<br />
----<br />
<br />
== References ==<br />
* https://spacy.io/<br />
<br />
=== 2018a ===<br />
* https://github.com/explosion/spaCy<br />
** QUOTE: [[spaCy NLP System|spaCy]] is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. [[spaCy NLP System|spaCy]] comes with [[pre-trained statistical model]]s and [[word vector]]s, and currently supports [[tokenization]] for 20+ languages. It features the fastest syntactic parser in the world, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license.<br />
<br />
=== 2018 ===<br />
* (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/SpaCy Retrieved:2018-5-23.<br />
** '''spaCy''' (pronounced "spay-SEE") is an [[Open-source software|open-source]] software library for advanced [[Natural language processing|Natural Language Processing]], written in the programming languages [[Python (programming language)|Python]] and [[Cython]]. It offers the fastest [[Statistical parsing|syntactic parser]] in the world (Choi et al., 2015, [https://aclweb.org/anthology/P/P15/P15-1038.pdf It Depends: Dependency Parser Comparison Using A Web-based Evaluation Tool]; [https://www.washingtonpost.com/news/wonk/wp/2016/05/18/googles-new-artificial-intelligence-cant-understand-these-sentences-can-you/ Washington Post, 2016]; [https://spacy.io/usage/facts-figures spaCy Facts & Figures]). The library is published under the [[MIT License|MIT license]] and currently offers statistical [[Artificial neural network|neural network]] models for English, German, Spanish, Portuguese, French, Italian, Dutch and multi-language [[Named-entity recognition|NER]], as well as [[Tokenizer|tokenization]] for various other languages ([https://spacy.io/usage/models#languages Models & Languages]). <P> Unlike [[Natural Language Toolkit|NLTK]], which is widely used for teaching and research, [[spaCy NLP System|spaCy]] focuses on providing software for production usage. As of version 1.0, [[spaCy NLP System|spaCy]] also supports [[deep learning]] workflows that allow connecting statistical models trained by popular [[machine learning]] libraries like [[TensorFlow]], [[Keras]], [[Scikit-learn]] or [[PyTorch]]. [[spaCy NLP System|spaCy]]'s [[machine learning]] library, Thinc, is also available as a separate [[Open-source software|open-source]] [[Python (programming language)|Python]] library. On November 7, 2017, version 2.0 was released. It features [[convolutional neural network]] models for [[part-of-speech tagging]], dependency parsing and [[Named-entity recognition|named entity recognition]], as well as API improvements around training and updating models, and constructing custom processing pipelines.<br />
<br />
=== 2018b ===<br />
* (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/SpaCy#Main_features Retrieved:2018-5-23.<br />
** Non-destructive [[Tokenization (lexical analysis)|tokenization]]<br />
** [[Named-entity recognition|Named entity recognition]]<br />
** Support for over 25 languages<br />
** [[Statistical model|Statistical models]] for 8 languages<br />
** Pre-trained [[Word embedding|word vectors]]<br />
** [[Part-of-speech tagging]]<br />
** Labelled [[Dependency grammar|dependency]] parsing<br />
** Syntax-driven [[Sentence boundary disambiguation|sentence segmentation]]<br />
** [[Document classification|Text classification]]<br />
** Built-in visualizers for [[syntax]] and [[Named-entity recognition|named entities]]<br />
** [[Deep learning]] integration<br />
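:: A minimal usage sketch covering tokenization, part-of-speech tags and named entities; it assumes spaCy 2.x and that the <code>en_core_web_sm</code> model has already been downloaded (e.g. via <code>python -m spacy download en_core_web_sm</code>), and the example sentence is illustrative:<br />
<pre>
import spacy

nlp = spacy.load("en_core_web_sm")   # load the small English pipeline
doc = nlp("Apple is looking at buying U.K. startup for $1 billion.")

for token in doc:                    # tokenization + POS tagging + dependencies
    print(token.text, token.pos_, token.dep_)

for ent in doc.ents:                 # named entity recognition
    print(ent.text, ent.label_)
</pre><br />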
<br />
----<br />
[[Category:Concept]]<br />
__NOTOC__</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.tree_Module&diff=548257sklearn.tree Module2019-12-23T20:45:26Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>An [[sklearn.tree Module]] is an [[sklearn module]] of [[decision-tree learning system]]s.<br />
* <B>AKA:</B> [[sklearn.tree Module|Scikit-Learn Decision-Tree Class]].<br />
* <B>Context:</B><br />
** It requires calling/selecting a [[Decision Tree Learning System]]:<br />
*** <code>[[sklearn.tree Module|sklearn.tree]].<span style="font-weight:italic; color:Green">Model_Name(self, arguments)</span></code> or simply <code>[[sklearn.tree Module|sklearn.tree]].<span style="font-weight:italic; color:Green">Model_Name()</span></code> <P>where <i>Model_Name</i> is the name of the selected [[decision-tree learning system]].<br />
* <B>Example(s)</B><br />
** <code>[[sklearn.tree.DecisionTreeClassifier]]()</code>, a [[Classification Tree Learning System]].<br />
** <code>[[sklearn.tree.DecisionTreeRegressor]]()</code>, a [[Regression Tree Learning System]]. <br />
** <code>[[sklearn.tree.ExtraTreeClassifier]]()</code>, a [[Classification Extra Trees Learning System]].<br />
** <code>[[sklearn.tree.ExtraTreeRegressor]]()</code>, a [[Regression Extra Trees Learning System]].<br />
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.manifold]]</code>, a collection of [[Manifold Learning System]]s.<br />
** <code>[[sklearn.ensemble]]</code>, a collection of [[Decision Tree Ensemble Learning System]]s.<br />
** <code>[[sklearn.metrics]]</code>, a collection of [[Metric]]s [[Subroutine]]s.<br />
** <code>[[sklearn.covariance]]</code>, a collection of [[Covariance Estimator]]s.<br />
** <code>[[sklearn.cluster.bicluster]]</code>, a collection of [[Spectral Biclustering Algorithm]]s.<br />
** <code>[[sklearn.linear_model]]</code>, a collection of [[Linear Model Regression System]]s.<br />
** <code>[[sklearn.neighbors]]</code>, a collection of [[K Nearest Neighbors Algorithm]]s.<br />
** <code>[[sklearn.neural_network]]</code>, a collection of [[Neural Network System]]s.<br />
* <B>See:</B> [[DTree System]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2017 ===<br />
* http://scikit-learn.org/stable/modules/classes.html#module-sklearn.tree<br />
** QUOTE: The sklearn.tree module includes decision tree-based models for classification and regression.<br />
:: User guide: See the Decision Trees section for further details.<br />
::* <code>tree.DecisionTreeClassifier([criterion, …])</code>: A decision tree classifier.<br />
::* <code>tree.DecisionTreeRegressor([criterion, …])</code>: A decision tree regressor.<br />
::* <code>tree.ExtraTreeClassifier([criterion, …])</code>: An extremely randomized tree classifier.<br />
::* <code>tree.ExtraTreeRegressor([criterion, …])</code>: An extremely randomized tree regressor.<br />
::* <code>tree.export_graphviz(decision_tree[, …])</code>: Export a decision tree in DOT format.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.tree_Module&diff=548256sklearn.tree Module2019-12-23T20:45:26Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>An [[sklearn.tree Module]] is an [[sklearn module]] of [[decision-tree learning system]]s.<br />
* <B>AKA:</B> [[Scikit-Learn Decision-Tree Class]].<br />
* <B>Context:</B><br />
** It requires calling/selecting a [[Decision Tree Learning System]]:<br />
*** <code>[[sklearn.tree Module|sklearn.tree]].<span style="font-weight:italic; color:Green">Model_Name(self, arguments)</span></code> or simply <code>[[sklearn.tree Module|sklearn.tree]].<span style="font-weight:italic; color:Green">Model_Name()</span></code> <P>where <i>Model_Name</i> is the name of the selected [[decision-tree learning system]].<br />
* <B>Example(s)</B><br />
** <code>[[sklearn.tree.DecisionTreeClassifier]]()</code>, a [[Classification Tree Learning System]].<br />
** <code>[[sklearn.tree.DecisionTreeRegressor]]()</code>, a [[Regression Tree Learning System]]. <br />
** <code>[[sklearn.tree.ExtraTreeClassifier]]()</code>, a [[Classification Extra Trees Learning System]].<br />
** <code>[[sklearn.tree.ExtraTreeRegressor]]()</code>, a [[Regression Extra Trees Learning System]].<br />
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.manifold]]</code>, a collection of [[Manifold Learning System]]s.<br />
** <code>[[sklearn.ensemble]]</code>, a collection of [[Decision Tree Ensemble Learning System]]s.<br />
** <code>[[sklearn.metrics]]</code>, a collection of [[Metric]]s [[Subroutine]]s.<br />
** <code>[[sklearn.covariance]]</code>, a collection of [[Covariance Estimator]]s.<br />
** <code>[[sklearn.cluster.bicluster]]</code>, a collection of [[Spectral Biclustering Algorithm]]s.<br />
** <code>[[sklearn.linear_model]]</code>, a collection of [[Linear Model Regression System]]s.<br />
** <code>[[sklearn.neighbors]]</code>, a collection of [[K Nearest Neighbors Algorithm]]s.<br />
** <code>[[sklearn.neural_network]]</code>, a collection of [[Neural Network System]]s.<br />
* <B>See:</B> [[DTree System]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2017 ===<br />
* http://scikit-learn.org/stable/modules/classes.html#module-sklearn.tree<br />
** QUOTE: The sklearn.tree module includes decision tree-based models for classification and regression.<br />
:: User guide: See the Decision Trees section for further details.<br />
::* <code>tree.DecisionTreeClassifier([criterion, …])</code>: A decision tree classifier.<br />
::* <code>tree.DecisionTreeRegressor([criterion, …])</code>: A decision tree regressor.<br />
::* <code>tree.ExtraTreeClassifier([criterion, …])</code>: An extremely randomized tree classifier.<br />
::* <code>tree.ExtraTreeRegressor([criterion, …])</code>: An extremely randomized tree regressor.<br />
::* <code>tree.export_graphviz(decision_tree[, …])</code>: Export a decision tree in DOT format.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.tree.DecisionTreeRegressor&diff=548255sklearn.tree.DecisionTreeRegressor2019-12-23T20:45:26Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[sklearn.tree.DecisionTreeRegressor]] is a [[regression tree learning system]] within <code>[[sklearn.tree]]</code>.<br />
* <B>AKA:</B> [[sklearn.tree.DecisionTreeRegressor|tree.DecisionTreeRegressor]], [[sklearn.tree.DecisionTreeRegressor|DecisionTreeRegressor]]<br />
* <B>Context</B><br />
** Usage: <br />
::: 1) Import [[Regression Tree Learning System]] from [[scikit-learn]] : <code>from [[sklearn.tree]] import [[sklearn.tree.DecisionTreeRegressor|DecisionTreeRegressor]]</code><br />
::: 2) Create [[design matrix]] <code>X</code> and [[response vector]] <code>Y</code> <br />
::: 3) Create [[Decision Tree Regressor]] object: <code>DTreg=[[sklearn.tree.DecisionTreeRegressor|DecisionTreeRegressor]](criterion=’mse’, splitter=’best’[, max_depth=None, min_samples_split=2, min_samples_leaf=1,...])</code><br />
::: 4) Choose method(s):<br />
::::* <code>DTreg</code>.<code>apply(X[, check_input])</code>, returns the leaf index for each sample predictor.<br />
::::* <code>DTreg</code>.<code>decision_path(X[, check_input])</code>, returns the decision path in the tree.<br />
::::* <code>DTreg</code>.<code>fit(X, y[, sample_weight, check_input,...])</code> builds a [[decision tree regressor]] from the [[training set]] (X, y).<br />
::::* <code>DTreg</code>.<code>get_params([deep])</code> returns parameters for this estimator.<br />
::::* <code>DTreg</code>.<code>predict(X[, check_input])</code>, predicts [[regression value]] for X.<br />
::::* <code>DTreg</code>.<code>score(X, y[, sample_weight])</code>, returns the [[coefficient of determination]] R^2 of the [[prediction]].<br />
::::* <code>DTreg</code>.<code>set_params(**params)</code>, sets the parameters of this estimator.<br />
* <B>Example(s):</B><br />
** [http://scikit-learn.org/stable/auto_examples/tree/plot_tree_regression.html#sphx-glr-auto-examples-tree-plot-tree-regression-py Decision Tree Regression]<br />
** [http://scikit-learn.org/stable/auto_examples/tree/plot_tree_regression_multioutput.html#sphx-glr-auto-examples-tree-plot-tree-regression-multioutput-py Multi-output Decision Tree Regression]. <br />
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.tree.DecisionTreeClassifier]]</code><br />
** <code>[[sklearn.tree.ExtraTreeRegressor]]</code><br />
** <code>[[sklearn.tree.ExtraTreeClassifier]]</code><br />
** [[rpart System]].<br />
* <B>See:</B> [[Decision Tree]], [[Regression System]], [[Regularization Task]], [[Ridge Regression Task]], [[Kernel-based Classification Algorithm]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2017 ===<br />
* (Scikit-Learn, 2017) &rArr; http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html Retrieved:2017-10-22<br />
** QUOTE: <code>class sklearn.tree.DecisionTreeRegressor(criterion=’mse’, splitter=’best’, max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, presort=False)</code> <P>A [[decision tree regressor]]. <P> Read more in the [http://scikit-learn.org/stable/modules/tree.html#regression User Guide].<br />
=== 2015 ===<br />
* http://scikit-learn.org/stable/modules/tree.html#regression<br />
** [[decision tree training system|Decision trees]] can also be applied to [[regression problem]]s, using the [[sklearn.tree.DecisionTreeRegressor|DecisionTreeRegressor class]]. <P> As in the [[supervised classification task|classification setting]], the fit method will take as argument arrays X and y, only that in this case y is expected to have [[floating point value]]s instead of [[integer value]]s:<br />
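:: A minimal sketch of the fit/predict cycle described above; the toy arrays follow the pattern of the scikit-learn user guide and are illustrative:<br />
<pre>
from sklearn.tree import DecisionTreeRegressor

X = [[0, 0], [2, 2]]   # design matrix
y = [0.5, 2.5]         # floating point targets

regressor = DecisionTreeRegressor()
regressor.fit(X, y)
print(regressor.predict([[1, 1]]))   # -> [0.5]
</pre><br />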
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.tree.DecisionTreeRegressor&diff=548254sklearn.tree.DecisionTreeRegressor2019-12-23T20:45:26Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[sklearn.tree.DecisionTreeRegressor]] is a [[regression tree learning system]] within <code>[[sklearn.tree]]</code>.<br />
* <B>AKA:</B> [[tree.DecisionTreeRegressor]], [[sklearn.tree.DecisionTreeRegressor|DecisionTreeRegressor]]<br />
* <B>Context</B><br />
** Usage: <br />
::: 1) Import [[Regression Tree Learning System]] from [[scikit-learn]] : <code>from [[sklearn.tree]] import [[sklearn.tree.DecisionTreeRegressor|DecisionTreeRegressor]]</code><br />
::: 2) Create [[design matrix]] <code>X</code> and [[response vector]] <code>Y</code> <br />
::: 3) Create [[Decision Tree Regressor]] object: <code>DTreg=[[sklearn.tree.DecisionTreeRegressor|DecisionTreeRegressor]](criterion=’mse’, splitter=’best’[, max_depth=None, min_samples_split=2, min_samples_leaf=1,...])</code><br />
::: 4) Choose method(s):<br />
::::* <code>DTreg</code>.<code>apply(X[, check_input])</code>, returns the leaf index for each sample predictor.<br />
::::* <code>DTreg</code>.<code>decision_path(X[, check_input])</code>, returns the decision path in the tree.<br />
::::* <code>DTreg</code>.<code>fit(X, y[, sample_weight, check_input,...])</code> builds a [[decision tree regressor]] from the [[training set]] (X, y).<br />
::::* <code>DTreg</code>.<code>get_params([deep])</code> returns parameters for this estimator.<br />
::::* <code>DTreg</code>.<code>predict(X[, check_input])</code>, predicts [[regression value]] for X.<br />
::::* <code>DTreg</code>.<code>score(X, y[, sample_weight])</code>, returns the [[coefficient of determination]] R^2 of the [[prediction]].<br />
::::* <code>DTreg</code>.<code>set_params(**params)</code>, sets the parameters of this estimator.<br />
* <B>Example(s):</B><br />
** [http://scikit-learn.org/stable/auto_examples/tree/plot_tree_regression.html#sphx-glr-auto-examples-tree-plot-tree-regression-py Decision Tree Regression]<br />
** [http://scikit-learn.org/stable/auto_examples/tree/plot_tree_regression_multioutput.html#sphx-glr-auto-examples-tree-plot-tree-regression-multioutput-py Multi-output Decision Tree Regression]. <br />
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.tree.DecisionTreeClassifier]]</code><br />
** <code>[[sklearn.tree.ExtraTreeRegressor]]</code><br />
** <code>[[sklearn.tree.ExtraTreeClassifier]]</code><br />
** [[rpart System]].<br />
* <B>See:</B> [[Decision Tree]], [[Regression System]], [[Regularization Task]], [[Ridge Regression Task]], [[Kernel-based Classification Algorithm]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2017 ===<br />
* (Scikit-Learn, 2017) &rArr; http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html Retrieved:2017-10-22<br />
** QUOTE: <code>class sklearn.tree.DecisionTreeRegressor(criterion=’mse’, splitter=’best’, max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, presort=False)</code> <P>A [[decision tree regressor]]. <P> Read more in the [http://scikit-learn.org/stable/modules/tree.html#regression User Guide].<br />
=== 2015 ===<br />
* http://scikit-learn.org/stable/modules/tree.html#regression<br />
** [[decision tree training system|Decision trees]] can also be applied to [[regression problem]]s, using the [[sklearn.tree.DecisionTreeRegressor|DecisionTreeRegressor class]]. <P> As in the [[supervised classification task|classification setting]], the fit method will take as argument arrays X and y, only that in this case y is expected to have [[floating point value]]s instead of [[integer value]]s:<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.tree.DecisionTreeRegressor&diff=548253sklearn.tree.DecisionTreeRegressor2019-12-23T20:45:26Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[sklearn.tree.DecisionTreeRegressor]] is a [[regression tree learning system]] within <code>[[sklearn.tree]]</code>.<br />
* <B>AKA:</B> [[tree.DecisionTreeRegressor]], [[DecisionTreeRegressor]]<br />
* <B>Context</B><br />
** Usage: <br />
::: 1) Import [[Regression Tree Learning System]] from [[scikit-learn]] : <code>from [[sklearn.tree]] import [[DecisionTreeRegressor]]</code><br />
::: 2) Create [[design matrix]] <code>X</code> and [[response vector]] <code>Y</code> <br />
::: 3) Create [[Decision Tree Regressor]] object: <code>DTreg=[[DecisionTreeRegressor]](criterion=’mse’, splitter=’best’[, max_depth=None, min_samples_split=2, min_samples_leaf=1,...])</code><br />
::: 4) Choose method(s):<br />
::::* <code>DTreg</code>.<code>apply(X[, check_input])</code>, returns the leaf index for each sample predictor.<br />
::::* <code>DTreg</code>.<code>decision_path(X[, check_input])</code>, returns the decision path in the tree.<br />
::::* <code>DTreg</code>.<code>fit(X, y[, sample_weight, check_input,...])</code> builds a [[decision tree regressor]] from the [[training set]] (X, y).<br />
::::* <code>DTreg</code>.<code>get_params([deep])</code> returns parameters for this estimator.<br />
::::* <code>DTreg</code>.<code>predict(X[, check_input])</code>, predicts [[regression value]] for X.<br />
::::* <code>DTreg</code>.<code>score(X, y[, sample_weight])</code>, returns the [[coefficient of determination]] R^2 of the [[prediction]].<br />
::::* <code>DTreg</code>.<code>set_params(**params)</code>, sets the parameters of this estimator.<br />
* <B>Example(s):</B><br />
** [http://scikit-learn.org/stable/auto_examples/tree/plot_tree_regression.html#sphx-glr-auto-examples-tree-plot-tree-regression-py Decision Tree Regression]<br />
** [http://scikit-learn.org/stable/auto_examples/tree/plot_tree_regression_multioutput.html#sphx-glr-auto-examples-tree-plot-tree-regression-multioutput-py Multi-output Decision Tree Regression]. <br />
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.tree.DecisionTreeClassifier]]</code><br />
** <code>[[sklearn.tree.ExtraTreeRegressor]]</code><br />
** <code>[[sklearn.tree.ExtraTreeClassifier]]</code><br />
** [[rpart System]].<br />
* <B>See:</B> [[Decision Tree]], [[Regression System]], [[Regularization Task]], [[Ridge Regression Task]], [[Kernel-based Classification Algorithm]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2017 ===<br />
* (Scikit-Learn, 2017) &rArr; http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html Retrieved:2017-10-22<br />
** QUOTE: <code>class sklearn.tree.DecisionTreeRegressor(criterion=’mse’, splitter=’best’, max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, presort=False)</code> <P>A [[decision tree regressor]]. <P> Read more in the [http://scikit-learn.org/stable/modules/tree.html#regression User Guide].<br />
=== 2015 ===<br />
* http://scikit-learn.org/stable/modules/tree.html#regression<br />
** [[decision tree training system|Decision trees]] can also be applied to [[regression problem]]s, using the [[sklearn.tree.DecisionTreeRegressor|DecisionTreeRegressor class]]. <P> As in the [[supervised classification task|classification setting]], the fit method will take as argument arrays X and y, only that in this case y is expected to have [[floating point value]]s instead of [[integer value]]s:<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.tree.DecisionTreeClassifier&diff=548252sklearn.tree.DecisionTreeClassifier2019-12-23T20:45:26Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[sklearn.tree.DecisionTreeClassifier]] is a [[classification tree learning system]] within <code>[[sklearn.tree]]</code>.<br />
* <B>Context</B><br />
** Usage: <br />
::: 1) Import [[Classification Tree Learning System]] from [[scikit-learn]] : <code>from [[sklearn.tree]] import [[sklearn.tree.DecisionTreeClassifier|DecisionTreeClassifier]]</code><br />
::: 2) Create [[design matrix]] <code>X</code> and [[response vector]] <code>Y</code> <br />
::: 3) Create [[Decision Tree Classifier]] object: <code>DTclf=[[sklearn.tree.DecisionTreeClassifier|DecisionTreeClassifier]](criterion=’gini’, splitter=’best’[, max_depth=None, min_samples_split=2, min_samples_leaf=1,...])</code><br />
::: 4) Choose method(s):<br />
::::* <code>DTclf</code>.<code>apply(X[, check_input])</code>, returns the leaf index for each sample predictor.<br />
::::* <code>DTclf</code>.<code>decision_path(X[, check_input])</code>, returns the decision path in the tree.<br />
::::* <code>DTclf</code>.<code>fit(X, y[, sample_weight, check_input,...])</code> builds a [[decision tree classifier]] from the [[training set]] (X, y).<br />
::::* <code>DTclf</code>.<code>get_params([deep])</code> returns parameters for this estimator.<br />
::::* <code>DTclf</code>.<code>predict(X[, check_input])</code>, predicts [[class]] for X.<br />
::::* <code>DTclf</code>.<code>predict_log_proba(X)</code>, predicts class log-probabilities of the input samples X.<br />
::::* <code>DTclf</code>.<code>predict_proba(X[, check_input])</code>, predicts class probabilities of the input samples X.<br />
::::* <code>DTclf</code>.<code>score(X, y[, sample_weight])</code>, returns the mean accuracy on the given test data and labels.<br />
::::* <code>DTclf</code>.<code>set_params(**params)</code>, sets the parameters of this estimator.<br />
* <B>Example(s):</B><br />
** [http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html#sphx-glr-auto-examples-tree-plot-iris-py Plot the decision surface of a decision tree on the iris dataset]<br />
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.tree.DecisionTreeRegressor]]</code><br />
** <code>[[sklearn.tree.ExtraTreeRegressor]]</code><br />
** <code>[[sklearn.tree.ExtraTreeClassifier]]</code><br />
* <B>See:</B> [[Decision Tree]], [[Classification System]], [[Regularization Task]], [[Ridge Regression Task]], [[Kernel-based Classification Algorithm]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2017a ===<br />
* (Scikit-Learn, 2017A) &rArr; http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html Retrieved:2017-10-22<br />
** QUOTE: <code>class sklearn.tree.DecisionTreeClassifier(criterion=’gini’, splitter=’best’, max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, class_weight=None, presort=False)</code><P>A decision tree classifier.<P>Read more in the [http://scikit-learn.org/stable/modules/tree.html#classification User Guide].<br />
=== 2017b ===<br />
* (Scikit-Learn, 2017B) &rArr; http://scikit-learn.org/stable/modules/tree.html#classification<br />
** QUOTE: [[sklearn.tree.DecisionTreeClassifier|DecisionTreeClassifier]] is a [[class]] capable of performing [[multi-class classification]] on a [[dataset]]. <P> As with other [[classifier]]s, [[sklearn.tree.DecisionTreeClassifier|DecisionTreeClassifier]] takes as input two arrays: an array X, sparse or dense, of size <code>[n_samples, n_features]</code> holding the [[training sample]]s, and an array Y of integer values, size <code>[n_samples]</code>, holding the class labels for the training samples:<br />
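:: A minimal sketch of the fit/predict cycle described above; the toy arrays follow the scikit-learn user guide pattern and are illustrative:<br />
<pre>
from sklearn.tree import DecisionTreeClassifier

X = [[0, 0], [1, 1]]   # design matrix, shape [n_samples, n_features]
Y = [0, 1]             # integer class labels, shape [n_samples]

clf = DecisionTreeClassifier()
clf.fit(X, Y)
print(clf.predict([[2., 2.]]))         # -> [1]
print(clf.predict_proba([[2., 2.]]))   # class probabilities
</pre><br />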
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.svm_Module&diff=548251sklearn.svm Module2019-12-23T20:45:25Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>An [[sklearn.svm Module]] is an [[sklearn module]] of [[Support Vector Machine System]]s.<br />
* <B>AKA:</B> [[Scikit-Learn Support Vector Machines Class]].<br />
* <B>Context:</B><br />
** It requires calling/selecting a [[Support Vector Machine System]]:<br />
*** <code>[[sklearn.svm Module|sklearn.svm]].<span style="font-weight:italic; color:Green">SVM_ModelName(self, arguments)</span></code> or simply <code>[[sklearn.svm Module|sklearn.svm]].<span style="font-weight:italic; color:Green">SVM_ModelName()</span></code> <P>where <i>SVM_ModelName</i> is the name of the selected [[support vector machine system]].<br />
* <B>Example(s)</B><br />
** <code>[[sklearn.svm.LinearSVC]]</code>, a [[Linear Support Vector Classification System]];<br />
** <code>[[sklearn.svm.LinearSVR]]</code>, a [[Linear Support Vector Regression System]];<br />
** <code>[[sklearn.svm.NuSVC]]</code>, a [[Nu-Support Vector Classification System]];<br />
** <code>[[sklearn.svm.NuSVR]]</code>, a [[Nu Support Vector Regression System]];<br />
** <code>[[sklearn.svm.OneClassSVM]]</code>, an [[Unsupervised Outlier Detection System]];<br />
** <code>[[sklearn.svm.SVC]]</code>, a [[C-Support Vector Classification System]];<br />
** <code>[[sklearn.svm.SVR]]</code>, an [[Epsilon-Support Vector Regression System]].<br />
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.tree]]</code>, a collection of [[Decision-tree Learning system]]s.<br />
** <code>[[sklearn.manifold]]</code>, a collection of [[Manifold Learning System]]s.<br />
** <code>[[sklearn.ensemble]]</code>, a collection of [[Decision Tree Ensemble Learning System]]s.<br />
** <code>[[sklearn.metrics]]</code>, a collection of [[Metric]]s [[Subroutine]]s.<br />
** <code>[[sklearn.covariance]]</code>, a collection of [[Covariance Estimator]]s.<br />
** <code>[[sklearn.cluster.bicluster]]</code>, a collection of [[Spectral Biclustering Algorithm]]s.<br />
** <code>[[sklearn.linear_model]]</code>, a collection of [[Linear Model Regression System]]s.<br />
** <code>[[sklearn.neighbors]]</code>, a collection of [[K Nearest Neighbors Algorithm]]s.<br />
** <code>[[sklearn.neural_network]]</code>, a collection of [[Neural Network System]]s.<br />
* <B>See:</B> [[Restricted Boltzmann Machines]], [[Artificial Neural Network]], [[Classification System]], [[Regression System]], [[Unsupervised Learning System]], [[Supervised Learning System]].<br />
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 2018 ===<br />
* (Scikit-learn, 2018) ⇒ http://scikit-learn.org/stable/modules/svm.html#svm Retrieved: 2018-03-11<br />
** QUOTE: [[Support vector machines (SVMs)]] are a set of [[supervised learning method]]s used for [[Classification Task|classification]], [[Regression Analysis Task|regression]] and [[outliers detection]]. <P> The advantages of [[support vector machine]]s are:<br />
*** Effective in [[high dimensional space]]s.<br />
*** Still effective in cases where number of [[dimension]]s is greater than the number of [[sample]]s.<br />
*** Uses a subset of [[training point]]s in the [[decision function]] (called [[support vector]]s), so it is also memory efficient.<br />
*** Versatile: different [[Kernel function]]s can be specified for the [[decision function]]. Common [[kernel]]s are provided, but it is also possible to specify custom [[kernels]].<br />
:: The disadvantages of [[support vector machines]] include:<br />
::* If the number of [[feature]]s is much greater than the number of [[sample]]s, avoiding [[over-fitting]] when choosing [[Kernel function]]s and the [[regularization term]] is crucial.<br />
::* [[SVM]]s do not directly provide [[probability estimate]]s, these are calculated using an expensive five-fold [[cross-validation]] (see Scores and probabilities, below).<br />
:: The [[support vector machine]]s in [[scikit-learn]] support both dense (<code>numpy.ndarray</code> and convertible to that by <code>numpy.asarray</code>) and sparse (any <code>scipy.sparse</code>) [[sample]] [[vector]]s as [[input]]. However, to use an [[SVM]] to make [[prediction]]s for [[sparse data]], it must have been fit on such [[data]]. For optimal performance, use C-ordered <code>numpy.ndarray</code> (dense) or <code>scipy.sparse.csr_matrix</code> (sparse) with <code>dtype=float64</code>.<br />
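:: A minimal sketch (not from the quoted documentation) using a dense, C-ordered float64 array as recommended above; the toy data are illustrative:<br />
<pre>
import numpy as np
from sklearn.svm import SVC

X = np.array([[0., 0.], [1., 1.]], dtype=np.float64)  # dense training vectors
y = np.array([0, 1])

clf = SVC(kernel='rbf', C=1.0)     # C-Support Vector Classification
clf.fit(X, y)
print(clf.predict([[2., 2.]]))     # -> [1]
print(clf.support_vectors_)        # the training points used as support vectors
</pre><br />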
----<br />
__NOTOC__<br />
[[Category:Concept]] [[Category:Machine Learning]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.neural_network_Module&diff=548250sklearn.neural network Module2019-12-23T20:45:25Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>An [[sklearn.neural network Module]] is a [[neural network platform]] that is an [[sklearn module]] (which contains a collection of [[neural network algorithm implementation]]s).<br />
* <B>Context:</B><br />
** It can (often) reference a [[sklearn.neural network Module|sklearn.neural_network]] system, such as by: <code>[[sklearn.neural network Module|sklearn.neural_network]].<span style="font-weight:italic; color:green">Model_Name(self, arguments)</span></code> or simply <code>[[sklearn.neural network Module|sklearn.neural_network]].<span style="font-weight:italic; color:green">Model_Name()</span></code> (where <code>Model_Name</code> is a [[neural network algorithm]]).<br />
* <B>Example(s)</B><br />
** <code>[[sklearn.neural_network.BernoulliRBM]]</code>, a [[Bernoulli Restricted Boltzmann Machine (RBM) System]]. <br />
** <code>[[sklearn.neural_network.MLPClassifier]]</code>, a [[Multi-layer Perceptron Classifier]].<br />
** <code>[[sklearn.neural_network.MLPRegressor]]</code>, a [[Multi-layer Perceptron Regressor]].<br />
* <B>Counter-Example(s):</B><br />
** [[PyTorch]].<br />
** [[Keras]].<br />
** [[TensorFlow]].<br />
** [[MXNet]].<br />
** <code>[[sklearn.tree]]</code>, a collection of [[Decision Tree Learning System]]s.<br />
** <code>[[sklearn.svm]]</code>, a collection of [[Support Vector Machine]] algorithms.<br />
* <B>See:</B> [[Multilayer Perceptron (MLP) Training]], <code>[[sklearn.metrics]]</code>, <code>[[sklearn.covariance]]</code>, <code>[[sklearn.cluster.bicluster]]</code>, <code>[[sklearn.linear_model]]</code>, <code>[[sklearn.manifold]]</code>.<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2017a ===<br />
* (Scikit Learn, 2017) &rArr; http://scikit-learn.org/stable/modules/classes.html#module-sklearn.neural_network Retrieved: 2017-12-17<br />
** QUOTE: The [[sklearn.neural_network module]] includes models based on [[neural network]]s. <P> User guide: See the [http://scikit-learn.org/stable/modules/neural_networks_supervised.html#neural-networks-supervised Neural network models (supervised)] and [http://scikit-learn.org/stable/modules/neural_networks_unsupervised.html#neural-networks-unsupervised Neural network models (unsupervised)] sections for further details.<br />
*** <code>neural_network.BernoulliRBM([n_components, …])</code> [[Bernoulli Restricted Boltzmann Machine (RBM)]].<br />
*** <code>neural_network.MLPClassifier([…])</code> [[Multi-layer Perceptron classifier]].<br />
*** <code>neural_network.MLPRegressor([…])</code> [[Multi-layer Perceptron regressor]].<br />
<br />
=== 2017b ===<br />
* (sklearn,2017) &rArr; http://scikit-learn.org/stable/modules/neural_networks_supervised.html#multi-layer-perceptron Retrieved:2017-12-3.<br />
** QUOTE: [[Multi-layer Perceptron (MLP)]] is a [[supervised learning algorithm]] that learns a function <math>f(\cdot): R^m \rightarrow R^o</math> by [[training]] on a [[dataset]], where <math>m</math> is the number of [[dimension]]s for [[input]] and <math>o</math> is the number of [[dimension]]s for [[output]]. Given a [[set of features]] <math>X = {x_1, x_2, \cdots, x_m}</math> and a target <math>y</math>, it can learn a [[non-linear function approximator]] for either [[classification]] or [[regression]]. It is different from [[logistic regression]], in that between the [[input]] and the [[output]] [[layer]], there can be one or more [[non-linear layer]]s, called [[hidden layer]]s. [http://scikit-learn.org/stable/_images/multilayerperceptron_network.png Figure 1] shows a one [[hidden layer]] [[MLP]] with [[scalar output]]. <P> The leftmost layer, known as the [[input layer]], consists of a set of [[neuron]]s <math>\{x_i | x_1, x_2, \cdots, x_m\}</math> representing the input [[feature]]s. Each [[neuron]] in the [[hidden layer]] transforms the values from the [[previous layer]] with a [[weighted linear summation]] <math>w_1x_1 + w_2x_2 + \cdots + w_mx_m</math>, followed by a [[non-linear activation function]] <math>g(\cdot):R \rightarrow R -</math> like the [[hyperbolic tan function]]. The [[output layer]] receives the values from the last [[hidden layer]] and transforms them into [[output value]]s. <P> The [[module]] contains the public attributes <code>coefs_</code> and <code>intercepts_</code>. <code>coefs_</code> is a list of [[weight matrice]]s, where [[weight matrix]] at index <math>i</math> represents the [[weight]]s between [[layer]] <math>i</math> and layer <math>i+1</math>. <code>intercepts_</code> is a list of [[bias vector]]s, where the vector at index <math>i</math> represents the [[bias]] values added to [[layer]] <math>i+1</math>. <P> The advantages of [[Multi-layer Perceptron]] are:<br />
*** Capability to learn [[non-linear model]]s.<br />
***Capability to learn models in [[real-time]] ([[on-line learning]]) using <code>partial_fit</code>. <br />
:: The disadvantages of [[Multi-layer Perceptron (MLP)]] include:<br />
::* [[MLP]] with hidden layers have a [[non-convex loss function]] where there exists more than one [[local minimum]]. Therefore different [[random weight initialization]]s can lead to different [[validation accuracy]].<br />
::* [[MLP]] requires [[tuning]] a number of [[hyperparameter]]s such as the number of [[hidden neuron]]s, [[layer]]s, and [[iteration]]s.<br />
::* [[MLP]] is sensitive to [[feature scaling]].<br />
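:: Because MLPs are sensitive to [[feature scaling]] (last point above), a minimal sketch pairing <code>MLPClassifier</code> with a scaler in a [[scikit-learn]] pipeline; the dataset choice and hyperparameters are illustrative, not prescribed by the quoted guide:<br />
<pre>
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# scale features, then train a one-hidden-layer MLP
model = make_pipeline(StandardScaler(),
                      MLPClassifier(hidden_layer_sizes=(50,), max_iter=1000,
                                    random_state=0))
model.fit(X_train, y_train)
print(model.score(X_test, y_test))   # mean accuracy on the held-out split
</pre><br />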
<br />
=== 2017c ===<br />
* (sklearn,2017) &rArr; http://scikit-learn.org/stable/modules/neural_networks_unsupervised.html#neural-networks-unsupervised Retrieved:2017-12-17<br />
** QUOTE: [[Restricted Boltzmann machines (RBM)]] are [[unsupervised nonlinear feature learner]]s based on a [[probabilistic model]]. The [[feature]]s extracted by an [[RBM]] or a hierarchy of [[RBM]]s often give good results when fed into a [[linear classifier]] such as a [[linear SVM]] or a [[perceptron]]. <P> The model makes assumptions regarding the [[distribution]] of [[input]]s. At the moment, [[scikit-learn]] only provides [[BernoulliRBM]], which assumes the inputs are either [[binary value]]s or values between 0 and 1, each encoding the [[probability]] that the specific [[feature]] would be turned on. <P> The [[RBM]] tries to [[maximize]] the [[likelihood]] of the [[data]] using a particular [[graphical model]]. The [[parameter]] [[learning algorithm]] used ([[Stochastic Maximum Likelihood]]) prevents the [[representation]]s from straying far from the [[input data]], which makes them capture interesting regularities, but makes the model less useful for [[small dataset]]s, and usually not useful for [[density estimation]]. <P> The method gained popularity for initializing [[deep neural network]]s with the weights of independent [[RBM]]s. This method is known as [[unsupervised pre-training]].<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.neural_network.MLPRegressor&diff=548249sklearn.neural network.MLPRegressor2019-12-23T20:45:25Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[sklearn.neural_network.MLPRegressor]] is a [[multi-layer perceptron regression system]] within <code>[[sklearn.neural_network]]</code>.<br />
* <B>Context</B><br />
** Usage: <br />
::: 1) Import [[MLP Regression System]] from [[scikit-learn]] : <code>from [[sklearn.neural_network]] import [[sklearn.neural network.MLPRegressor|MLPRegressor]]</code><br />
::: 2) Create [[design matrix]] <code>X</code> and [[response vector]] <code>Y</code> <br />
::: 3) Create [[Regressor]] object: <code>regressor_model=MLPRegressor([hidden_layer_sizes=(100, ), activation=’relu’, solver=’adam’, alpha=0.0001, batch_size=’auto’, learning_rate=’constant’, learning_rate_init=0.001, ...])</code><br />
::: 4) Choose method(s):<br />
::::* <code>fit(X, y)</code> Fits the [[regression model]] to [[data matrix]] X and [[target]](s) y.<br />
::::* <code>get_params([deep])</code> Gets [[parameter]]s for this [[estimator]].<br />
::::* <code>predict(X)</code> Predicts using the [[multi-layer perceptron]] model.<br />
::::* <code>score(X, y[, sample_weight])</code> Returns the [[coefficient of determination]] R^2 of the prediction.<br />
::::* <code>set_params(**params)</code> Set the [[parameter]]s of this [[estimator]].<br />
* <B>Example(s):</B><br />
** [https://github.com/omoreira/GM-Python-Workbook/blob/master/NN_examples/MLP_regression_10foldcv_boston.py MLP_regression_10foldcv_boston.py] ([[Boston Dataset-based Regression]])<br />
{| class="wikitable" style="margin-left: 100px;border:0px;background:white"<br />
|[[File:MLP_ReLU.png|300px]]<br />
|[[File:MLP_Logistic.png|300px]]<br />
|[[File:MLP_TanH.png|300px]]<br />
|-<br />
|Method: MLP using [[ReLU]] <P> RMSE on the data: 5.3323 <P> RMSE on 10-fold CV: 6.7892<br />
|Method: MLP using Logistic Neurons <P>RMSE on the data: 7.3161 <P> RMSE on 10-fold CV: 8.0986<br />
|Method: MLP using TanH Neurons <P> RMSE on the data: 6.3860 <P> RMSE on 10-fold CV: 8.0147<br />
|}<br />
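** A minimal, self-contained usage sketch (synthetic data; parameter values are illustrative, not recommendations) following steps 1) to 4) above:<br />
{| class="wikitable" style="margin-left: 50px;border:0px;background:white"<br />
|style="font-family:monospace; font-size:10pt;text-align:top;"| <br />
:from sklearn.neural_network import MLPRegressor<br />
:from sklearn.datasets import make_regression<br />
:from sklearn.model_selection import train_test_split<br />
:<span style="font-weight:italic; color:gray;"># 1) and 2): illustrative design matrix X and response vector Y</span><br />
:X, Y = make_regression(n_samples=300, n_features=10, noise=5.0, random_state=0)<br />
:X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)<br />
:<span style="font-weight:italic; color:gray;"># 3): create the Regressor object</span><br />
:regressor_model = MLPRegressor(hidden_layer_sizes=(100,), activation='relu', solver='adam', alpha=0.0001, max_iter=1000, random_state=0)<br />
:<span style="font-weight:italic; color:gray;"># 4): fit, predict, and score (coefficient of determination R^2)</span><br />
:regressor_model.fit(X_train, Y_train)<br />
:Y_pred = regressor_model.predict(X_test)<br />
:print(regressor_model.score(X_test, Y_test))<br />
|}<br />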
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.neural network.MLPClassifier]]</code><br />
** <code>[[sklearn.neural_network.BernoulliRBM]]</code><br />
** <code>[[sklearn.svm.LinearSVC]]</code><br />
** <code>[[sklearn.svm.LinearSVR]]</code><br />
* <B>See:</B> [[Artificial Neural Network]], [[Supervised Learning System]], [[Regression Task]], [[Feedforward Neural Network]], [[Restricted Boltzmann Machines]], [[Support Vector Machines]].<br />
----<br />
----<br />
== References ==<br />
<br />
=== 2017a ===<br />
* (Scikit-Learn, 2017A) &rArr; http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html Retrieved:2017-12-17<br />
** QUOTE: <code>class sklearn.neural_network.MLPRegressor(hidden_layer_sizes=(100, ), activation=’relu’, solver=’adam’, alpha=0.0001, batch_size=’auto’, learning_rate=’constant’, learning_rate_init=0.001, power_t=0.5, max_iter=200, shuffle=True, random_state=None, tol=0.0001, verbose=False, warm_start=False, momentum=0.9, nesterovs_momentum=True, early_stopping=False, validation_fraction=0.1, beta_1=0.9, beta_2=0.999, epsilon=1e-08)</code><P>[[Multi-layer Perceptron regressor]]. This model optimizes the [[squared-loss]] using [[LBFGS]] or [[stochastic gradient descent]]. <P> (...)<P><B>Notes</B><P>MLPRegressor trains iteratively since at each time step the partial derivatives of the loss function with respect to the model parameters are computed to update the parameters. It can also have a regularization term added to the loss function that shrinks model parameters to prevent overfitting. This implementation works with data represented as dense and sparse numpy arrays of floating point values.<br />
=== 2017b ===<br />
* (sklearn, 2017) &rArr; http://scikit-learn.org/stable/modules/neural_networks_supervised.html#regression Retrieved:2017-12-3.<br />
** QUOTE: Class <code>[[sklearn.neural network.MLPRegressor|MLPRegressor]]</code> implements a [[multi-layer perceptron (MLP)]] that [[train]]s using [[backpropagation]] with no [[activation function]] in the [[output layer]], which can also be seen as using the [[identity function]] as [[activation function]]. Therefore, it uses the [[square error]] as the [[loss function]], and the [[output]] is a set of [[continuous value]]s. <P> <code>MLPRegressor</code> also supports [[multi-output regression]], in which a [[sample]] can have more than one [[target]].<br />
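A short sketch (synthetic data; sizes and parameter values are illustrative) of the [[multi-output regression]] support mentioned in the quote, where each [[sample]] has two [[target]]s:<br />
{| class="wikitable" style="margin-left: 50px;border:0px;background:white"<br />
|style="font-family:monospace; font-size:10pt;text-align:top;"| <br />
:from sklearn.neural_network import MLPRegressor<br />
:from sklearn.datasets import make_regression<br />
:<span style="font-weight:italic; color:gray;"># Synthetic data with two target columns per sample</span><br />
:X, Y = make_regression(n_samples=200, n_features=8, n_targets=2, random_state=0)<br />
:mlp = MLPRegressor(hidden_layer_sizes=(50,), max_iter=1000, random_state=0)<br />
:mlp.fit(X, Y) <span style="font-weight:italic; color:gray;"># Y has shape (n_samples, 2)</span><br />
:print(mlp.predict(X[:3]).shape) <span style="font-weight:italic; color:gray;"># (3, 2): one prediction per target</span><br />
|}<br />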
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.neighbors_Module&diff=548248sklearn.neighbors Module2019-12-23T20:45:24Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>An [[sklearn.neighbors Module]] is a [[nearest neighbors system]] within [[sklearn]].<br />
* <B>Context:</B><br />
** It requires calling/selecting a [[K-Nearest Neighbor System]] (a minimal usage sketch appears below, after the See: line):<br />
*** <code>[[sklearn.neighbors Module|sklearn.neighbors]].<span style="font-weight:italic; color:green">Model_Name(self, arguments)</span></code> or simply <code>[[sklearn.neighbors Module|sklearn.neighbors]].<span style="font-weight:italic; color:green">Model_Name()</span></code> <P>where <i>Model_Name</i> is the name of the selected [[K-Nearest Neighbor System]].<br />
** It can contain [[Unsupervised kNN Learning System]]s, [[Supervised kNN Classification System]]s and [[Supervised kNN Regression System]]s.<br />
* <B>Example(s)</B>: <br />
** [[Unsupervised kNN Learning System]]s: <br />
*** <code>[[sklearn.neighbors.BallTree]]</code>, for solving a [[Fast Generalized N-points Task]].<br />
*** <code>[[sklearn.neighbors.KDTree]]</code>, for solving a [[Fast Generalized N-points Task]].<br />
*** <code>[[sklearn.neighbors.DistanceMetric]]</code>, a class providing [[Distance Metric]]s.<br />
*** <code>[[sklearn.neighbors.KernelDensity]]</code>, for solving a [[Kernel Density Estimation Task]].<br />
*** <code>[[sklearn.neighbors.LocalOutlierFactor]]</code>, an [[Unsupervised Outlier Detection System]] that uses the [[Local Outlier Factor (LOF) Algorithm]].<br />
*** <code>[[sklearn.neighbors.NearestNeighbors]]</code>, an [[Unsupervised Learning System]] for implementing neighbor searches.<br />
** [[Supervised kNN Classification System]]s: <br />
*** <code>[[sklearn.neighbors.KNeighborsClassifier]]</code>, a [[Classification System]] that implements a [[K-Nearest Neighbors Voting Algorithm]].<br />
*** <code>[[sklearn.neighbors.RadiusNeighborsClassifier]]</code>, a [[Classification System]] that implements a vote among neighbors within a given radius.<br />
*** <code>[[sklearn.neighbors.NearestCentroid]]</code>, a [[Nearest Centroid Classification System]].<br />
** [[Supervised kNN Regression System]]s: <br />
*** <code>[[sklearn.neighbors.KNeighborsRegressor]]</code>, a [[Regression System]] based on [[K-Nearest Neighbors Algorithm]]. <br />
*** <code>[[sklearn.neighbors.RadiusNeighborsRegressor]]</code>, a [[Regression System]] based on neighbors within a fixed radius.<br />
** [[Weighted Graph]]s:<br />
*** <code>[[sklearn.neighbors.kneighbors_graph]]</code>, a [[Weighted graph of k-Neighbor]]s for points in X.<br />
*** <code>[[sklearn.neighbors.radius_neighbors_graph]] </code>, a [[Weighted graph of Neighbor]]s for points in X.<br />
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.manifold]]</code>, a collection of [[Manifold Learning System]]s.<br />
** <code>[[sklearn.tree]]</code>, a collection of [[Decision Tree Learning System]]s.<br />
** <code>[[sklearn.ensemble]]</code>, a collection of [[Decision Tree Ensemble Learning System]]s.<br />
** <code>[[sklearn.metrics]]</code>, a collection of [[Metric]]s [[Subroutine]]s.<br />
** <code>[[sklearn.covariance]]</code>, a collection of [[Covariance Estimator]]s.<br />
** <code>[[sklearn.cluster.bicluster]]</code>, a collection of [[Spectral Biclustering Algorithm]]s.<br />
** <code>[[sklearn.linear_model]]</code>, a collection of [[Linear Model Regression System]]s.<br />
* <B>See:</B> [[kNN System]].<br />
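A minimal sketch of the calling pattern above (the chosen model, dataset, and <code>n_neighbors</code> value are illustrative):<br />
{| class="wikitable" style="margin-left: 50px;border:0px;background:white"<br />
|style="font-family:monospace; font-size:10pt;text-align:top;"| <br />
:from sklearn.neighbors import KNeighborsClassifier<br />
:from sklearn.datasets import load_iris<br />
:from sklearn.model_selection import train_test_split<br />
:<span style="font-weight:italic; color:gray;"># Illustrative use of the sklearn.neighbors.Model_Name(...) pattern</span><br />
:X, y = load_iris(return_X_y=True)<br />
:X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)<br />
:knn = KNeighborsClassifier(n_neighbors=5)<br />
:knn.fit(X_train, y_train)<br />
:print(knn.score(X_test, y_test)) <span style="font-weight:italic; color:gray;"># mean accuracy on the held-out split</span><br />
|}<br />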
----<br />
----<br />
== References ==<br />
=== 2017 ===<br />
* (Scikit-Learn, 2017) &rArr; "sklearn.neighbors: Nearest Neighbors" http://scikit-learn.org/stable/modules/classes.html#module-sklearn.neighbors Retrieved: 2017-11-12<br />
** QUOTE: The [[sklearn.neighbors module]] implements the [[k-nearest neighbors algorithm]].<P> User guide: See the [http://scikit-learn.org/stable/modules/neighbors.html Nearest Neighbors] section for further details.<br />
*** <code>neighbors.BallTree</code> [[BallTree]] for fast generalized N-point problems<br />
*** <code>neighbors.DistanceMetric</code> DistanceMetric class<br />
*** <code>neighbors.KDTree</code> KDTree for fast generalized N-point problems<br />
*** <code>neighbors.KernelDensity([bandwidth, …])</code> Kernel Density Estimation<br />
*** <code>neighbors.KNeighborsClassifier([…])</code> Classifier implementing the k-nearest neighbors vote.<br />
*** <code>neighbors.KNeighborsRegressor([n_neighbors, …])</code> Regression based on k-nearest neighbors.<br />
*** <code>neighbors.LocalOutlierFactor([n_neighbors, …])</code> Unsupervised Outlier Detection using Local Outlier Factor (LOF)<br />
*** <code>neighbors.RadiusNeighborsClassifier([…])</code> Classifier implementing a vote among neighbors within a given radius<br />
*** <code>neighbors.RadiusNeighborsRegressor([radius, …])</code> Regression based on neighbors within a fixed radius.<br />
*** <code>neighbors.NearestCentroid([metric, …])</code> Nearest centroid classifier.<br />
*** <code>neighbors.NearestNeighbors([n_neighbors, …])</code> Unsupervised learner for implementing neighbor searches.<br />
*** <code>neighbors.kneighbors_graph(X, n_neighbors[, …])</code> Computes the (weighted) graph of k-Neighbors for points in X<br />
*** <code>neighbors.radius_neighbors_graph(X, radius)</code> Computes the (weighted) graph of Neighbors for points in X<br />
=== 2016 ===<br />
* (Scikit-Learn, 2016) &rArr; "1.6. Nearest Neighbors" http://scikit-learn.org/stable/modules/neighbors.html<br />
** QUOTE: [[sklearn.neighbors Module|sklearn.neighbors]] provides functionality for [[unsupervised neighbors-based learning method|unsupervised]] and [[supervised neighbors-based learning method]]s. </s> ... <P> ... The classes in [[sklearn.neighbors Module|sklearn.neighbors]] can handle either [[Numpy array]]s or [[scipy.sparse matrice]]s as input. </s> For [[dense matrice]]s, a large number of possible [[distance metric]]s are supported. </s> For [[sparse matrice]]s, [[arbitrary Minkowski metric]]s are supported for [[search]]es. </s> There are many [[learning routine]]s which rely on [[nearest neighbor]]s at their core. </s> One example is [[kernel density estimation]], discussed in the [[density estimation]] section. </s><br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.metrics_Module&diff=548247sklearn.metrics Module2019-12-23T20:45:24Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>An [[sklearn.metrics Module]] is an [[sklearn]] [[Python module|module]] that contains a collection of [[Metric]]s [[subroutine]]s.<br />
* <B>Context:</B><br />
** It can (often) be invoked through a [[sklearn.metrics Module|sklearn.metrics]] [[subroutine]] call (a minimal usage sketch appears below, after the See: line):<br />
*** <code>[[sklearn.metrics Module|sklearn.metrics]].<span style="font-weight:italic; color:green">Metric_Name(self, arguments)</span></code> or simply <code>[[sklearn.metrics Module|sklearn.metrics]].<span style="font-weight:italic; color:green">Metric_Name()</span></code> <P>where <i>Metric_Name</i> is the name of the selected [[Metric]] [[subroutine]].<br />
** It can range from being [[Classification Metric]]s, to being [[Regression Metric]]s, to being [[Multilabel Ranking Metric]]s, to being [[Clustering Metric]]s, to being [[Biclustering Metric]]s, to being [[Pairwise Metric]]s.<br />
* <B>Example(s)</B><br />
** [[sklearn.metrics.pairwise Submodule]].<br />
** [[sklearn.metrics Classification Submodule]].<br />
** [[sklearn.metrics Regression Submodule]].<br />
** [[sklearn.metrics Multilabel Ranking Submodule]].<br />
** [[sklearn.metrics Biclustering Submodule]].<br />
** [[sklearn.metrics.cluster Submodule]]<br />
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.manifold]]</code>, a collection of [[Manifold Learning System]]s.<br />
** <code>[[sklearn.tree]]</code>, a collection of [[Decision Tree Learning System]]s.<br />
** <code>[[sklearn.ensemble]]</code>, a collection of [[Decision Tree Ensemble Learning System]]s.<br />
** <code>[[sklearn.covariance]]</code>, a collection of [[Covariance Estimator]]s.<br />
** <code>[[sklearn.cluster.bicluster]]</code>, a collection of [[Spectral Biclustering Algorithm]]s.<br />
** <code>[[sklearn.linear_model]]</code>, a collection of [[Linear Model Regression System]]s.<br />
** <code>[[sklearn.neighbors]]</code>, a collection of [[K Nearest Neighbors Algorithm]]s.<br />
** <code>[[sklearn.neural_network]]</code>, a collection of [[Neural Network System]]s.<br />
* <B>See:</B> [[Metric]], [[Pairwise Distance]], [[Clustering]], [[Regression Analysis Task]], [[Classification Task]], [[MSE]], [[RMSE]].<br />
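A minimal sketch of calling a few [[sklearn.metrics Module|sklearn.metrics]] [[subroutine]]s (the label, prediction, and point values are illustrative):<br />
{| class="wikitable" style="margin-left: 50px;border:0px;background:white"<br />
|style="font-family:monospace; font-size:10pt;text-align:top;"| <br />
:import numpy as np<br />
:from sklearn.metrics import accuracy_score, mean_squared_error, pairwise_distances<br />
:<span style="font-weight:italic; color:gray;"># Classification metric: fraction of exactly matching labels</span><br />
:print(accuracy_score([0, 1, 1, 0], [0, 1, 0, 0])) <span style="font-weight:italic; color:gray;"># 0.75</span><br />
:<span style="font-weight:italic; color:gray;"># Regression metric: mean squared error between targets and predictions</span><br />
:print(mean_squared_error([3.0, -0.5, 2.0], [2.5, 0.0, 2.0]))<br />
:<span style="font-weight:italic; color:gray;"># Pairwise metric: Euclidean distance matrix between a small set of points</span><br />
:X = np.array([(0.0, 0.0), (1.0, 1.0)])<br />
:print(pairwise_distances(X, metric="euclidean"))<br />
|}<br />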
----<br />
----<br />
== References ==<br />
=== 2017A ===<br />
* (Scikit Learn, 2017) &rArr; http://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics Retrieved:2017-11-12<br />
** QUOTE: See the Model evaluation: quantifying the quality of predictions section and the Pairwise metrics, Affinities and Kernels section of the user guide for further details. <P> The sklearn.metrics module includes score functions, performance metrics and pairwise metrics and distance computations.<br />
<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.manifold_Module&diff=548246sklearn.manifold Module2019-12-23T20:45:23Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>An [[sklearn.manifold Module]] is an [[sklearn]] [[Python module|module]] that contains a collection of [[Manifold Learning System]]s that implement [[Data Embedding Algorithm]]s.<br />
* <B>Context:</B><br />
** It can (often) be invoked through a [[sklearn.manifold Module|sklearn.manifold]] system call (a minimal usage sketch appears below, after the See: line):<br />
*** <code>[[sklearn.manifold Module|sklearn.manifold]].<span style="font-weight:italic; color:green">Model_Name(self, arguments)</span></code> or simply <code>[[sklearn.manifold Module|sklearn.manifold]].<span style="font-weight:italic; color:green">Model_Name()</span></code> <P>where <i>Model_Name</i> is the name of the selected [[Data Embedding Algorithm]].<br />
* <B>Example(s)</B><br />
** <code>[[sklearn.manifold.Isomap]]</code>, a [[Manifold Learning System]] that implements an [[Isometric Mapping Algorithm]]. <br />
** <code>[[sklearn.manifold.LocallyLinearEmbedding]]</code>, a [[Manifold Learning System]] that implements a [[Locally-Linear Embedding Algorithm]].<br />
** <code>[[sklearn.manifold.MDS]]</code>, a [[Manifold Learning System]] that implements [[Multidimensional Scaling Algorithm]].<br />
** <code>[[sklearn.manifold.SpectralEmbedding]]</code>, a [[Manifold Learning System]] that implements [[Spectral Embedding Algorithm]] to solve a [[Nonlinear Dimensionality Reduction Task]].<br />
** <code>[[sklearn.manifold.TSNE]]</code>, a [[Manifold Learning System]] that implements a [[t-distributed Stochastic Neighbor Embedding Algorithm]].<br />
** <code>[[sklearn.manifold.locally_linear_embedding]]</code>, A [[Locally-Linear Embedding System]].<br />
** <code>[[sklearn.manifold.smacof]]</code>, a [[Manifold Learning System]] that solves a [[Multidimensional Scaling Task]] using a [[SMACOF Algorithm]].<br />
** <code>[[sklearn.manifold.spectral_embedding]]</code>, a [[Spectral Embedding System]].<br />
* <B>Counter-Example(s):</B><br />
** <code>[[sklearn.tree]]</code>, a collection of [[Decision Tree Learning System]]s.<br />
** <code>[[sklearn.ensemble]]</code>, a collection of [[Decision Tree Ensemble Learning System]]s.<br />
** <code>[[sklearn.metrics]]</code>, a collection of [[Metric]]s [[Subroutine]]s.<br />
** <code>[[sklearn.covariance]]</code>, a collection of [[Covariance Estimator]]s.<br />
** <code>[[sklearn.cluster.bicluster]]</code>, a collection of [[Spectral Biclustering Algorithm]]s.<br />
** <code>[[sklearn.linear_model]]</code>, a collection of [[Linear Model Regression System]]s.<br />
** <code>[[sklearn.neighbors]]</code>, a collection of [[K Nearest Neighbors Algorithm]]s .<br />
** <code>[[sklearn.neural_network]]</code>, a collection of [[Neural Network System]]s.<br />
* <B>See:</B> [[Mapping Task]], [[Scaling Task]].<br />
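A minimal sketch of the calling pattern above (the chosen embedding model and its parameters are illustrative):<br />
{| class="wikitable" style="margin-left: 50px;border:0px;background:white"<br />
|style="font-family:monospace; font-size:10pt;text-align:top;"| <br />
:from sklearn.manifold import Isomap<br />
:from sklearn.datasets import load_digits<br />
:<span style="font-weight:italic; color:gray;"># Embed the 64-dimensional digits data into 2 dimensions</span><br />
:X, _ = load_digits(return_X_y=True)<br />
:embedding = Isomap(n_components=2, n_neighbors=5)<br />
:X_2d = embedding.fit_transform(X)<br />
:print(X_2d.shape) <span style="font-weight:italic; color:gray;"># (n_samples, 2)</span><br />
|}<br />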
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 2017A ===<br />
* (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/classes.html#module-sklearn.manifold Retrieved:2017-11-12<br />
** QUOTE: The [[sklearn.manifold module]] implements [[data embedding technique]]s. <P> <B>User guide:</B> See the [http://scikit-learn.org/stable/modules/manifold.html Manifold learning] section for further details.<br />
*** <code>manifold.Isomap([n_neighbors, n_components, …])</code>, [[Isomap Embedding]]<br />
*** <code>manifold.LocallyLinearEmbedding([…])</code>, [[Locally Linear Embedding]]<br />
*** <code>manifold.MDS([n_components, metric, n_init, …])</code>, [[Multidimensional scaling]]<br />
*** <code>manifold.SpectralEmbedding([n_components, …])</code>, [[Spectral embedding]] for [[non-linear dimensionality reduction]].<br />
*** <code>manifold.TSNE([n_components, perplexity, …])</code>, [[t-distributed Stochastic Neighbor Embedding]].<br />
*** <code>manifold.locally_linear_embedding(X, …[, …])</code>, Perform a [[Locally Linear Embedding analysis]] on the data.<br />
*** <code>manifold.smacof(dissimilarities[, metric, …])</code>, Computes [[multidimensional scaling]] using the [[SMACOF algorithm]].<br />
*** <code>manifold.spectral_embedding(adjacency[, …])</code>, Project the [[sample]] on the first [[eigenvector]]s of the [[graph Laplacian]].<br />
<br />
=== 2017B ===<br />
* (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/manifold.html Retrieved:2017-11-12<br />
** QUOTE: [[Manifold learning]] is an approach to [[non-linear dimensionality reduction]]. [[Algorithm]]s for [[this task]] are based on the idea that the [[dimensionality]] of many [[data set]]s is only artificially high. <P> (...)<P>[[High-dimensional dataset]]s can be very difficult to visualize. While [[data]] in two or [[three dimension]]s can be plotted to show the [[inherent structure]] of the [[data]], equivalent [[high-dimensional]] [[plot]]s are much less intuitive. To aid visualization of the structure of a [[dataset]], the dimension must be reduced in some way.<P> The simplest way to accomplish this [[dimensionality reduction]] is by taking a [[random projection]] of the [[data]]. Though this allows some degree of visualization of the [[data structure]], the [[randomness]] of the choice leaves much to be desired. In a [[random projection]], it is likely that the more interesting structure within the data will be lost.<P>To address this concern, a number of [[supervised]] and [[unsupervised linear dimensionality reduction framework]]s have been designed, such as [[Principal Component Analysis (PCA)]], [[Independent Component Analysis]], [[Linear Discriminant Analysis]], and others. These [[algorithm]]s define specific rubrics to choose an “interesting” [[linear projection]] of the [[data]]. These methods can be powerful, but often miss important [[non-linear structure]] in the [[data]].<P>[[Manifold Learning]] can be thought of as an attempt to generalize [[linear framework]]s like [[PCA]] to be sensitive to [[non-linear structure]] in [[data]]. Though [[supervised variant]]s exist, the typical [[manifold learning problem]] is [[unsupervised]]: it learns the [[high-dimensional structure]] of the [[data]] from the [[data]] itself, without the use of predetermined [[classification]]s.<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bothttps://www.gabormelli.com/RKB/index.php?title=sklearn.linear_model.LinearRegression&diff=548245sklearn.linear model.LinearRegression2019-12-23T20:45:23Z<p>Replacement Bot: Remove links to pages that are actually redirects to this page.</p>
<hr />
<div>A [[sklearn.linear model.LinearRegression]] is a [[linear least-squares regression system]] within the <code>[[sklearn.linear_model]]</code> [[Python module|module]].<br />
* <B>AKA:</B> [[sklearn.linear model.LinearRegression|LinearRegression]], [[sklearn.linear model.LinearRegression|linear model.LinearRegression]]<br />
* <B>Context</B><br />
** Usage: <br />
::: 1) Import [[Linear Regression]] model from [[scikit-learn]] : <code>from [[sklearn.linear_model]] import [[sklearn.linear model.LinearRegression|LinearRegression]]</code><br />
::: 2) Create a [[design matrix]] <code>X</code> and [[response vector]] <code>Y</code> <br />
::: 3) Create a [[Linear Regression]] object: <code>model=[[sklearn.linear model.LinearRegression|LinearRegression]]([fit_intercept=True, normalize=False, copy_X=True, n_jobs=1])</code><br />
::: 4) Choose method(s):<br />
::::* Fit the [[linear model]] to the [[training data]] by [[ordinary least squares]]: <code>model.fit(X, Y[, sample_weight])</code><br />
::::* Predict Y using the [[linear model]] with estimated coefficients: <code>Y_pred = model.predict(X)</code><br />
::::* Return [[coefficient of determination (R^2)]] of the prediction: <code>model.score(X,Y[, sample_weight=w])</code><br />
::::* Get [[estimator parameter]]s: <code>model.get_params([deep])</code><br />
::::* Set [[estimator parameter]]s: <code>model.set_params(**params)</code> <br />
* <B>Example(s):</B><br />
** [[10-fold]] [[sklearn Boston data evaluation]] [https://github.com/omoreira/GM-Python-Workbook/blob/master/statistical_test/ridge_boston10foldcv.py]<br />
{| class="wikitable" style="margin-left: 50px;border:0px;background:white"<br />
|<B>Input:</B><br />
|<B>Output:</B><br />
|-<br />
|style="font-family:monospace; font-size:10pt;font-weight=bold;text-align:top;width:600px;"| <br />
:from [[sklearn.linear_model]] import [[sklearn.linear model.LinearRegression|LinearRegression]]<br />
:from [[sklearn.model_selection]] import [[cross_val_predict]]<br />
:from [[sklearn.datasets]] import [[load_boston]]<br />
:from [[sklearn.metrics]] import [[explained_variance_score]], [[mean_squared_error]]<br />
:import [[numpy]] as np<br />
:import [[pylab]] as pl<br />
:boston = [[load_boston()]] <span style="font-weight:italic; color:gray;"># Loading the Boston dataset</span><br />
:x = boston.data <span style="font-weight:italic; color:gray;"># Creating the regression design matrix</span><br />
:y = boston.target <span style="font-weight:italic; color:gray;"># Creating the target dataset</span><br />
:linreg = [[sklearn.linear model.LinearRegression|LinearRegression()]] <span style="font-weight:italic; color:gray;"># Create linear regression object</span><br />
:linreg.fit(x,y) <span style="font-weight:italic; color:gray;"># Fit the linear regression</span><br />
:yp = linreg.predict(x) <span style="font-weight:italic; color:gray;"># Predicted values</span><br />
:yp_cv = [[cross_val_predict]](linreg, x, y, cv=10) <span style="font-weight:italic; color:gray;"># Calculating 10-fold CV predictions</span><br />
|[[File:linear_boston10fold.png|400px|]]<br />
:(blue dots correspond to 10-Fold CV) <br />
|-<br />
|style="font-family:monospace; font-size:10pt;font-weight=bold;text-align:top;"| <br />
<span style="font-weight:italic; color:gray;>#Calculaton of RMSE and Explained Variances</span><br />
:Evariance=[[explained_variance_score]](y,yp)<br />
:Evariance_cv=[[explained_variance_score]](y,yp_cv)<br />
:RMSE =np.sqrt([[mean_squared_error]](y,yp))<br />
:RMSECV =sqrt([[mean_squared_error]](y,yp_cv)_<br />
|<br />
:Method: <B>[[Linear Regression]]</B><br />
:[[RMSE]] on the [[dataset]]: <B>4.6795</B><br />
:[[RMSE]] on 10-fold CV: <B>5.8819</B><br />
:[[Explained Variance Regression Score]] on the [[dataset]] : <B>0.7406</B><br />
:[[Explained Variance Regression Score]] on 10-fold CV: <B>0.5902</B><br />
<br />
|}<br />
* <B>Counter-Example(s)</B><br />
** [[sklearn.linear_model.Lasso]] <br />
** [[sklearn.linear_model.Ridge]]<br />
** [[sklearn.linear_model.ARDRegression]]<br />
** [[sklearn.linear_model.BayesianRidge]]<br />
** [[sklearn.linear_model.ElasticNet]]<br />
** [[sklearn.linear_model.HuberRegressor]]<br />
** [[sklearn.linear_model.Lars]]<br />
* <B>See:</B> [[Linear Regression Task]], [[Ordinary Least Squares Linear Regression System]], [[Estimation Task]], [[Coordinate Descent Algorithm]].<br />
----<br />
----<br />
<br />
== References ==<br />
<br />
=== 2017a ===<br />
* http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html<br />
** QUOTE: [[Ordinary least squares Linear Regression]]. <P> <code>class sklearn.linear_model.LinearRegression(fit_intercept=True, normalize=False, copy_X=True, n_jobs=1)[source]</code><br />
<br />
=== 2017b ===<br />
* http://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html<br />
** QUOTE: <br />
# Split the targets into training/testing sets <br />
diabetes_y_train = diabetes.target[:-20]<br />
diabetes_y_test = diabetes.target[-20:]<br />
<BR><br />
# Create linear regression object<br />
regr = [[sklearn.linear model.LinearRegression|linear_model.LinearRegression]]()<br />
<BR><br />
# Train the model using the training sets<br />
regr.fit(diabetes_X_train, diabetes_y_train)<br />
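The quoted excerpt omits the dataset loading and feature selection; a self-contained variant (a sketch using the bundled diabetes dataset and a single, arbitrarily chosen feature column) might look like:<br />
{| class="wikitable" style="margin-left: 50px;border:0px;background:white"<br />
|style="font-family:monospace; font-size:10pt;text-align:top;"| <br />
:import numpy as np<br />
:from sklearn import datasets, linear_model<br />
:<span style="font-weight:italic; color:gray;"># Load the diabetes dataset and keep a single feature column (index is illustrative)</span><br />
:diabetes = datasets.load_diabetes()<br />
:diabetes_X = diabetes.data[:, np.newaxis, 2]<br />
:<span style="font-weight:italic; color:gray;"># Split the data and the targets into training/testing sets</span><br />
:diabetes_X_train = diabetes_X[:-20]<br />
:diabetes_X_test = diabetes_X[-20:]<br />
:diabetes_y_train = diabetes.target[:-20]<br />
:diabetes_y_test = diabetes.target[-20:]<br />
:<span style="font-weight:italic; color:gray;"># Create and train the linear regression object</span><br />
:regr = linear_model.LinearRegression()<br />
:regr.fit(diabetes_X_train, diabetes_y_train)<br />
:print(regr.predict(diabetes_X_test)) <span style="font-weight:italic; color:gray;"># predictions for the 20 held-out samples</span><br />
|}<br />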
<br />
=== 2017D ===<br />
* (Mobasher,2017) ⇒ Bamshad Mobasher ([[2017]]). [http://facweb.cs.depaul.edu/mobasher/classes/CSC478/Notes/IPython%20Notebook%20-%20Regression.html Example of Regression Analysis Using the Boston Housing Data Set]. Retrieved 2017-10-01<br />
<br />
=== 2017e ===<br />
* (Scipy Lectures, 2017) ⇒ http://www.scipy-lectures.org/packages/scikit-learn/#supervised-learning-regression-of-housing-data<br />
** QUOTE: <B>3.6.4.2. Predicting Home Prices: a Simple Linear Regression</B><P> Now we'll use [[scikit-learn]] to perform a [[simple linear regression]] on the [[housing data]]. There are many possibilities of [[regressor]]s to use. A particularly simple one is [[sklearn.linear model.LinearRegression|LinearRegression]]: this is basically a wrapper around an [[ordinary least squares]] calculation.<br />
{| class="wikitable" style="margin-left: 50px;border:0px;font-family:monospace; font-size:10pt;font-weight=bold;text-align:top; width:90%;"<br />
| from sklearn.model_selection import train_test_split<P><br />
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target)<P><br />
from sklearn.linear_model import LinearRegression<P><br />
clf = LinearRegression()<P><br />
clf.fit(X_train, y_train)<P><br />
predicted = clf.predict(X_test)<P><br />
expected = y_test<P><br />
print("RMS: %s" % np.sqrt(np.mean((predicted - expected) ** 2))) <br />
|}<br />
:: We can plot the error: expected as a function of predicted:<br />
:: <code>plt.scatter(expected, predicted)</code> <br />
<br />
=== 2016 ===<br />
* (Thiebaut) ⇒ D. Thiebaut ([[2016]]). [http://www.science.smith.edu/dftwiki/images/c/c7/SKLearnLinearRegression_BostonData.pdf SKLearn Tutorial: Linear Regression on Boston Data]<br />
----<br />
__NOTOC__<br />
[[Category:Concept]]</div>Replacement Bot