Random Sample-based Statistical Measure

A Random Sample-based Statistical Measure is a statistical measure that operates on random samples from statistical populations to produce sample statistic values for statistical inference.

AKA: Sample Estimator, Test Statistic, Summary Statistic, Random Sample-based Statistic Function, Sample-Based Statistical Function, Empirical Statistic.
Context:
- It can typically transform Random Samples into sample statistic values that estimate population parameters.
- It can typically provide Point Estimates of population characteristics through sample calculations.
- It can typically generate Sampling Distributions when applied repeatedly to random samples from the same statistical population.
- It can typically serve as Test Statistics in hypothesis testing procedures with known null distributions.
- It can typically enable Statistical Inference about population properties from sample data.
- It can often exhibit Sampling Variability that decreases with increasing sample size.
- It can often satisfy Statistical Properties such as consistency, sufficiency, and completeness.
- It can often be used to construct Confidence Intervals for population parameters.
- It can often serve multiple purposes including data description, parameter estimation, and hypothesis testing.
- It can often incorporate Finite Sample Corrections when sampling from finite populations.
- It can range from being a Biased Sample Statistic to being an Unbiased Sample Statistic, depending on its expectation property.
- It can range from being a Consistent Sample Statistic to being an Inconsistent Sample Statistic, depending on its asymptotic behavior.
- It can range from being an Efficient Sample Statistic to being an Inefficient Sample Statistic, depending on its variance property.
- It can range from being a Sufficient Sample Statistic to being an Insufficient Sample Statistic, depending on its information content.
- It can range from being a Robust Sample Statistic to being a Non-Robust Sample Statistic, depending on its outlier sensitivity.
- It can range from being a Parametric Sample Statistic to being a Non-Parametric Sample Statistic, depending on its distributional assumptions.
- Function Domain: a Random Sample from a Statistical Population.
- Function Range: a Sample Statistic Value (typically a real number, vector, or matrix).
- It can be associated with a Point Estimator of a population parameter.
- It can have Sampling Error representing deviation from the true parameter value.
- It can follow specific Probability Distributions under sampling assumptions.
- It can be standardized to create Pivotal Quantities for statistical inference.
- It can be combined with other sample statistics to form composite estimators.
- It can be computed using statistical software systems and programming languages.
- ...
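The biased-versus-unbiased distinction in the list above can be illustrated with a small simulation. This is a minimal sketch, not a definitive procedure: the standard normal population, the sample size, the repetition count, and the helper names `biased_variance` and `average_estimate` are all assumptions chosen for illustration.

```python
import random
import statistics

def biased_variance(sample):
    """Variance with divisor n (hypothetical helper; the plug-in estimator)."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def average_estimate(estimator, n, repetitions=10000, seed=0):
    """Average an estimator over many random samples of size n.

    Assumption: the population is standard normal, so its true variance is 1.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repetitions):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += estimator(sample)
    return total / repetitions

n = 5
avg_biased = average_estimate(biased_variance, n)        # expectation is (n-1)/n = 0.8
avg_unbiased = average_estimate(statistics.variance, n)  # expectation is 1.0
print(avg_biased, avg_unbiased)
```

Under these assumptions the divisor-n estimator should average close to 0.8 and the divisor-(n-1) estimator close to 1.0, which is the sense in which the latter is unbiased for the population variance.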
Example(s):
- Central Tendency Sample Statistics, such as:
  - Sample Mean Function computing Σxᵢ/n for sample mean values.
  - Sample Median Function finding middle values for sample median values.
  - Sample Mode Function identifying the most frequent value for sample mode values.
  - Trimmed Sample Mean Function excluding extremes for robust estimation.
  - Winsorized Sample Mean Function limiting extremes for outlier adjustment.
- Dispersion Sample Statistics, such as:
  - Sample Variance Function computing Σ(xᵢ-x̄)²/(n-1) for sample variance values.
  - Sample Standard Deviation Function taking square root of variance for sample standard deviation values.
  - Sample Range Function calculating max-min for sample range values.
  - Sample Interquartile Range Function finding Q3-Q1 for robust spread measures.
  - Sample Mean Absolute Deviation Function averaging |xᵢ-x̄| for deviation measures.
- Order Sample Statistics, such as:
  - Sample Maximum Function selecting the largest observation for maximum values.
  - Sample Minimum Function selecting the smallest observation for minimum values.
  - Sample Percentile Function finding ranked values for percentile values.
  - Sample Quantile Function generalizing percentiles for quantile values.
- Association Sample Statistics, such as:
  - Sample Correlation Coefficient Function measuring linear relationships.
  - Sample Covariance Function computing joint variability.
  - Sample Partial Correlation Function controlling for confounders.
  - Sample Rank Correlation Function assessing monotonic relationships.
- Test Sample Statistics, such as:
  - Sample t-Statistic Function standardizing means with unknown variance.
  - Sample Chi-Square Statistic Function testing categorical independence.
  - Sample F-Statistic Function comparing variance ratios.
  - Sample Z-Statistic Function standardizing with known variance.
  - Sample Likelihood Ratio Statistic comparing model fits.
- Moment Sample Statistics, such as:
  - Sample Skewness Function measuring asymmetry.
  - Sample Kurtosis Function quantifying tail heaviness.
  - Sample Central Moment Function generalizing distribution shape.
  - Sample Raw Moment Function computing power averages.
- Non-Parametric Sample Statistics, such as:
  - Sample Rank Sum Statistic for distribution comparisons.
  - Sample Sign Test Statistic for median testing.
  - Sample Runs Test Statistic for randomness assessment.
  - Sample Kolmogorov-Smirnov Statistic for distribution testing.
- Resampling Sample Statistics, such as:
  - Bootstrap Sample Mean for distribution estimation.
  - Jackknife Sample Variance for bias reduction.
  - Permutation Test Statistic for exact inference.
- Bayesian Sample Statistics, such as:
  - Posterior Mean Estimate incorporating prior information.
  - Maximum A Posteriori Estimate finding modal values.
  - Credible Interval Estimate quantifying uncertainty.
- Time Series Sample Statistics, such as:
  - Sample Autocorrelation Function measuring serial dependence.
  - Sample Partial Autocorrelation Function isolating direct effects.
  - Sample Cross-Correlation Function relating multiple series.
- ...
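A few of the central tendency and dispersion statistics listed above can be computed directly from their formulas. The sketch below assumes a small made-up sample (the values are hypothetical) and uses only the Python standard library.

```python
import statistics

# Hypothetical sample of n = 7 observations.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0]
n = len(sample)

# Sample mean: sum of x_i divided by n.
mean = sum(sample) / n

# Sample variance with the n - 1 divisor, and its square root.
variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
std_dev = variance ** 0.5

# Sample median, mode, and range.
median = statistics.median(sample)
mode = statistics.mode(sample)
sample_range = max(sample) - min(sample)

# Mean absolute deviation around the sample mean.
mean_abs_dev = sum(abs(x - mean) for x in sample) / n

print(mean, variance, median, mode, sample_range)
```

The hand-written variance agrees with `statistics.variance`, which also uses the n - 1 divisor.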
Counter-Example(s):
- Population Parameter, which represents true values rather than sample estimates.
- Population Statistic Function, such as:
  - Population Mean Function requiring complete population data.
  - Population Variance Function using all population values.
  - Population Median Function based on entire population.
- Theoretical Distribution Function, which describes probability structures rather than empirical estimates.
- Prior Distribution, which represents beliefs before observing sample data.
- Loss Function, which measures prediction errors rather than sample characteristics.
- Probability Function, which assigns probability values rather than computing sample summaries.
- Random Variable, which represents stochastic outcomes rather than calculated statistics.
- Data Generating Process, which produces data rather than summarizing it.
See: Statistical Measure, Population Parameter, Point Estimator, Statistical Inference, Sampling Distribution, Hypothesis Testing, Confidence Interval, Statistical Property, Sampling Theory, Estimation Theory, Test Statistic, Sufficient Statistic, Order Statistic, U-Statistic, M-Estimator, L-Estimator, Maximum Likelihood Estimator, Method of Moments Estimator, Empirical Distribution Function.

References

2016

(Wikipedia, 2016) ⇒ http://wikipedia.org/wiki/summary_statistics Retrieved:2016-4-12.
- In descriptive statistics, summary statistics are used to summarize a set of observations, in order to communicate the largest amount of information as simply as possible. Statisticians commonly try to describe the observations in
  - a measure of location, or central tendency, such as the arithmetic mean.
  - a measure of statistical dispersion like the standard deviation.
  - a measure of the shape of the distribution like skewness or kurtosis.
  - if more than one variable is measured, a measure of statistical dependence such as a correlation coefficient.
- A common collection of order statistics used as summary statistics are the five-number summary, sometimes extended to a seven-number summary, and the associated box plot.
  Entries in an analysis of variance table can also be regarded as summary statistics. ^[1]

↑ Upton, G., Cook, I. (2006). Oxford Dictionary of Statistics, OUP. ISBN 978-0-19-954145-4

2015

(Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/statistic Retrieved:2015-2-23.
- A statistic (singular) is a single measure of some attribute of a sample (e.g., its arithmetic mean value). It is calculated by applying a function (statistical algorithm) to the values of the items of the sample, which are known together as a set of data.
  More formally, statistical theory defines a statistic as a function of a sample where the function itself is independent of the sample's distribution; that is, the function can be stated before realization of the data. The term statistic is used both for the function and for the value of the function on a given sample.
  A statistic is distinct from a statistical parameter, which is not computable because often the population is much too large to examine and measure all its items. However, a statistic, when used to estimate a population parameter, is called an estimator. For instance, the sample mean is a statistic that estimates the population mean, which is a parameter.
  When a statistic (a function) is being used for a specific purpose, it may be referred to by a name indicating its purpose: in descriptive statistics, a descriptive statistic is used to describe the data; in estimation theory, an estimator is used to estimate a parameter of the distribution (population); in statistical hypothesis testing, a test statistic is used to test a hypothesis. However, a single statistic can be used for multiple purposes – for example the sample mean can be used to describe a data set, to estimate the population mean, or to test a hypothesis.

(Leek & Peng, 2015) ⇒ Jeffrey T. Leek, and Roger D. Peng. (2015). “Statistics: P values are just the tip of the iceberg.” In: Nature, 520(7549).
- QUOTE: There is no statistic more maligned than the P value. Hundreds of papers and blogposts have been written about what some statisticians deride as 'null hypothesis significance testing' (NHST; see, for example, http://go.nature.com/pfvgqe). NHST deems whether the results of a data analysis are important on the basis of whether a summary statistic (such as a P value) has crossed a threshold.

2013

(Wikipedia, 2013) ⇒ http://en.wikipedia.org/wiki/Statistic
- A statistic (singular) is a single measure of some attribute of a sample (e.g., its arithmetic mean value). It is calculated by applying a function (statistical algorithm) to the values of the items of the sample, which are known together as a set of data.
  More formally, statistical theory defines a statistic as a function of a sample where the function itself is independent of the sample's distribution; that is, the function can be stated before realization of the data. The term statistic is used both for the function and for the value of the function on a given sample.
  A statistic is distinct from a statistical parameter, which is not computable because often the population is much too large to examine and measure all its items. However, a statistic, when used to estimate a population parameter, is called an estimator. For instance, the sample mean is a statistic that estimates the population mean, which is a parameter.
  When a statistic (a function) is being used for a specific purpose, it may be referred to by a name indicating its purpose: in descriptive statistics, a descriptive statistic is used to describe the data; in estimation theory, an estimator is used to estimate a parameter of the distribution (population); in statistical hypothesis testing, a test statistic is used to test a hypothesis. However, a single statistic can be used for multiple purposes – for example the sample mean can be used to describe a data set, to estimate the population mean, or to test a hypothesis.

2011

http://en.wikipedia.org/wiki/Statistic
- … The term statistic is used both for the function and for the value of the function on a given sample.

2009

http://planetmath.org/encyclopedia/SampleVariance.html
- QUOTE: A statistic, or sample statistic, $S$ is simply a function, usually real-valued, of a set of (sample) data or observations $X=(X_1,X_2,\ldots,X_n)$. $S = S(X)$. More formally, let $O$ be the sample space of the data $X$, then $S$ is a function from $O$ to some set $T$, usually a subset of $\mathbb{R}^k$. The data $X$ is usually considered as a vector of iid random variables $X_i$.
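
The view of a statistic as a function from the sample space to a subset of R^k can be sketched in code. The type alias `Statistic` and the function `mean_and_variance` are hypothetical names introduced here for illustration; the example takes k = 2.

```python
import statistics
from typing import Callable, Sequence, Tuple

# A statistic maps observed data X = (X_1, ..., X_n) to a point in R^k.
Statistic = Callable[[Sequence[float]], Tuple[float, ...]]

def mean_and_variance(x: Sequence[float]) -> Tuple[float, float]:
    """A vector-valued statistic with k = 2: (sample mean, sample variance)."""
    return (statistics.mean(x), statistics.variance(x))

S: Statistic = mean_and_variance

# The function is fixed before any data are seen; only its value changes
# from one realized sample to the next.
value_a = S([1.0, 2.0, 3.0])
value_b = S([10.0, 10.0, 10.0, 14.0])
print(value_a, value_b)
```

The same name "statistic" is commonly used both for `S` and for values such as `value_a`.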

2006

(Dubnicka, 2006k) ⇒ Suzanne R. Dubnicka. (2006). “Introduction to Statistics - Handout 11.” Kansas State University, Introduction to Probability and Statistics I, STAT 510 - Fall 2006.
- TERMINOLOGY : Once a random sample has been selected, one typically measures or records the value of one or more variables of interest for each subject in the sample. Collectively, these observations/measurements are the data. Once the data have been collected, one typically summarizes the data in one or more different ways. In general, the types of summaries used are (1) graphical displays and (2) statistics. A statistic is simply a numerical summary measure of a sample. Therefore, a statistic is a property of some sort of the sample. Ideally, if our sample is representative of the population, we will use this statistic as our best guess of the corresponding parameter value. A parameter is a numerical summary measure of a population; that is, a parameter is a property of a population.
- TERMINOLOGY : Recall that a statistic is a summary measure of the sample which is a set of random variables. Thus, a statistic is also a random variable. This is why it makes sense to compute the expectation and variance of a statistic. Also, as a statistic is a random variable, it also has a distribution associated with it. The distribution of a statistic is called the sampling distribution of the statistic.
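
The idea that a statistic is itself a random variable with a sampling distribution can be approximated by simulation. The sketch below is illustrative only: it assumes a Uniform(0, 1) population (mean 1/2, variance 1/12), and the helper names `sample_mean` and `sampling_distribution` are hypothetical.

```python
import random
import statistics

def sample_mean(x):
    """Sample mean: sum of the observations divided by n."""
    return sum(x) / len(x)

def sampling_distribution(statistic, n, repetitions=4000, seed=1):
    """Apply a statistic to many independent random samples of size n.

    Assumption: the population is Uniform(0, 1).
    """
    rng = random.Random(seed)
    return [statistic([rng.random() for _ in range(n)])
            for _ in range(repetitions)]

means_n25 = sampling_distribution(sample_mean, n=25)
means_n100 = sampling_distribution(sample_mean, n=100)

# Empirical spread of the statistic versus the theoretical standard error
# sqrt(sigma^2 / n) with sigma^2 = 1/12.
spread_n25 = statistics.stdev(means_n25)
spread_n100 = statistics.stdev(means_n100)
theory_n25 = (1 / 12 / 25) ** 0.5
theory_n100 = (1 / 12 / 100) ** 0.5
print(spread_n25, theory_n25, spread_n100, theory_n100)
```

Under these assumptions the simulated sample means center near 1/2, and their spread shrinks as the sample size grows from 25 to 100.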

2003

http://www.nature.com/nrg/journal/v4/n9/glossary/nrg1155_glossary.html
- QUOTE: TEST STATISTIC A statistic is any function of a random sample — in particular, of the observations in an experiment. A test statistic is a statistic that is used in a statistical test to discriminate between two competing hypotheses, the so-called null and alternative hypotheses.
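
As one concrete instance of a test statistic, a one-sample t-statistic can be computed as follows. This is a minimal sketch: the data, the null value `mu0`, and the function name `one_sample_t` are assumptions made for illustration, and no p-value is computed.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)), with s using the n - 1 divisor."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)
    return (mean - mu0) / (s / math.sqrt(n))

# Hypothetical data and null hypothesis value.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
mu0 = 2.0

t_value = one_sample_t(sample, mu0)
print(t_value)
```

If the observations are iid normal and the null hypothesis holds, this statistic follows a t distribution with n - 1 degrees of freedom, which is what allows it to discriminate between the null and alternative hypotheses.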
