Simple Linear Regression Task

A Simple Linear Regression Task is a linear regression task with a single explanatory variable and a single response variable.

[math]\displaystyle{ y_i=\beta_0+\beta_1x_i+\varepsilon_i\quad }[/math] for [math]\displaystyle{ \quad i=1,\cdots,n \; }[/math]
In matrix form, this is:

[math]\displaystyle{ \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots\\ 1 & x_n \\ \end{pmatrix}\begin{pmatrix} \beta_0 \\ \beta_1 \\ \end{pmatrix}+\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix} }[/math]
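
For concreteness, a minimal sketch (assuming NumPy and small made-up data, both of which are illustrative assumptions rather than part of the task definition) that assembles this design matrix and solves the resulting least-squares system:

```python
import numpy as np

# Small made-up data set, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Design matrix: a column of ones (for beta_0) next to the x values.
X = np.column_stack([np.ones_like(x), x])

# Solve y = X beta + eps for beta = (beta_0, beta_1) in the least-squares sense.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # approximately [0.05, 1.99]
```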

[math]\displaystyle{ \beta_0 }[/math] is called the intercept and [math]\displaystyle{ \beta_1 }[/math] the slope. The task is solved by estimating the best-fitting [math]\displaystyle{ \beta }[/math] parameters, i.e. those that optimize an objective function of the form:

[math]\displaystyle{ E(f)=\sum _{i=1}^{n}L(y_{i},\beta_0+\beta_1 x_i) }[/math]

[math]\displaystyle{ L(\cdot) }[/math] is an error function that may be defined as a loss function (e.g., squared error) or derived from a likelihood function.
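
As one illustration, a hedged sketch (assuming SciPy, with the same made-up data as above) that minimizes this objective numerically for a squared-error loss; the minimizer should coincide with the closed-form least-squares fit:

```python
import numpy as np
from scipy.optimize import minimize

# Same made-up data as in the sketch above.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def objective(beta):
    """E(f) = sum_i L(y_i, beta_0 + beta_1 * x_i) with squared-error loss."""
    residuals = y - (beta[0] + beta[1] * x)
    return np.sum(residuals ** 2)

result = minimize(objective, x0=np.zeros(2))  # generic numerical optimizer
print(result.x)  # approximately [0.05, 1.99], matching the closed-form fit
```

Other choices of [math]\displaystyle{ L(\cdot) }[/math], such as absolute error, can be swapped into the same objective.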



References

2017a

  • (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Simple_linear_regression Retrieved:2017-8-6.
    • In statistics, simple linear regression is a linear regression model with a single explanatory variable. That is, it concerns two-dimensional sample points with one independent variable and one dependent variable (conventionally, the x and y coordinates in a Cartesian coordinate system) and finds a linear function (a non-vertical straight line) that, as accurately as possible, predicts the dependent variable values as a function of the independent variables.

      The adjective simple refers to the fact that the outcome variable is related to a single predictor.

      It is common to make the additional hypothesis that the ordinary least squares method should be used to minimize the residuals (vertical distances between the points of the data set and the fitted line). Under this hypothesis, the accuracy of a line through the sample points is measured by the sum of squared residuals, and the goal is to make this sum as small as possible. Other regression methods that can be used in place of ordinary least squares include least absolute deviations (minimizing the sum of absolute values of residuals) and the Theil–Sen estimator (which chooses a line whose slope is the median of the slopes determined by pairs of sample points). Deming regression (total least squares) also finds a line that fits a set of two-dimensional sample points, but (unlike ordinary least squares, least absolute deviations, and median slope regression) it is not really an instance of simple linear regression, because it does not separate the coordinates into one dependent and one independent variable and could potentially return a vertical line as its fit.
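
To illustrate the comparison in the quoted passage, a hedged sketch (assuming SciPy and NumPy, with made-up data containing one outlier) of ordinary least squares next to the Theil–Sen estimator:

```python
import numpy as np
from scipy.stats import theilslopes

# Made-up data whose last point is an outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 30.0])

# Ordinary least squares: minimizes the sum of squared vertical residuals.
ols_slope, ols_intercept = np.polyfit(x, y, deg=1)

# Theil-Sen: slope is the median of the pairwise slopes, so the single
# outlier pulls it far less than it pulls the least-squares fit.
ts_slope, ts_intercept, *_ = theilslopes(y, x)

print(f"OLS:       slope={ols_slope:.2f}, intercept={ols_intercept:.2f}")
print(f"Theil-Sen: slope={ts_slope:.2f}, intercept={ts_intercept:.2f}")
```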

      The remainder of the article assumes an ordinary least squares regression.

      In this case, the slope of the fitted line is equal to the correlation between [math]\displaystyle{ y }[/math] and [math]\displaystyle{ x }[/math] corrected by the ratio of standard deviations of these variables. The intercept of the fitted line is such that it passes through the center of mass [math]\displaystyle{ (\bar{x}, \bar{y}) }[/math] of the data points.
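
A small check of these closed-form relations (NumPy, with the same illustrative data as earlier): the slope as the correlation scaled by the ratio of standard deviations, and the intercept chosen so the line passes through the center of mass:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

r = np.corrcoef(x, y)[0, 1]              # sample correlation of x and y
slope = r * np.std(y) / np.std(x)        # beta_1 = r * (s_y / s_x)
intercept = y.mean() - slope * x.mean()  # forces the line through (x-bar, y-bar)

print(slope, intercept)      # approximately 1.99 and 0.05
print(np.polyfit(x, y, 1))   # direct least-squares fit agrees
```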
