Linear Algebra Test

  • Ordinary Least Squares (OLS)
    • Least Squares Criterion
    • For \(\beta = (\beta_1, \beta_2, \ldots, \beta_p)^T\), define
      • Matrix Notation

        \begin{equation}
          y = \begin{pmatrix}y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
          X = \begin{bmatrix}
            x_{1,1} & x_{1,2} & \cdots & x_{1,p} \\
            x_{2,1} & x_{2,2} & \cdots & x_{2,p} \\
            \vdots & \vdots & \ddots & \vdots \\
            x_{n,1} & x_{n,2} & \cdots & x_{n,p}
          \end{bmatrix}, \quad \beta =
          \begin{pmatrix}\beta_1 \\ \beta_2 \\ \vdots \\ \beta_p
          \end{pmatrix}
        \end{equation}
        

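        As a minimal numpy sketch of this notation (purely illustrative synthetic data; the variable
        names and values are assumptions, not part of the original note):

          import numpy as np

          # Hypothetical example with n = 5 observations and p = 3 predictors.
          rng = np.random.default_rng(0)
          n, p = 5, 3
          X = rng.normal(size=(n, p))                   # design matrix; row i is (x_{i,1}, ..., x_{i,p})
          beta = np.array([1.0, -2.0, 0.5])             # coefficient vector (beta_1, ..., beta_p)
          y = X @ beta + rng.normal(scale=0.1, size=n)  # response vector (y_1, ..., y_n)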

    • Solving for the OLS Estimate \(\hat{\beta}\): with fitted values
      \(\hat{y} = \begin{pmatrix}\hat{y_1} \\ \hat{y_2} \\ \vdots \\ \hat{y_n}\end{pmatrix} = X\beta\)
      and criterion \(Q(\beta) = \sum_{i=1}^{n}(y_i - \hat{y_i})^2 = (y-\hat{y})^T(y-\hat{y})\),
      the OLS estimate \(\hat{\beta}\) solves \(\frac{\partial Q(\beta)}{\partial\beta_j} = 0,\; j = 1, 2, \ldots, p\)

      \begin{equation}
        \begin{aligned}
          \frac{\partial Q(\beta)}{\partial\beta_j} &= \frac{\partial}{\partial\beta_j}\sum_{i=1}^{n} [y_i - (x_{i,1}\beta_1 + x_{i,2}\beta_2 + \cdots + x_{i,p}\beta_p)]^2 \\
          &= \sum_{i=1}^n 2(-x_{i,j})[y_i - (x_{i,1}\beta_1 + x_{i,2}\beta_2 + \cdots + x_{i,p}\beta_p)] \\
          &= -2(X_{[j]})^T(y - X\beta)
        \end{aligned}
      \end{equation}
      


      where \(X_{[j]}\) is the \(j^{th}\) column of \(X\)
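
      A quick finite-difference sketch of this partial derivative (synthetic data assumed, as in the
      sketch above):

        import numpy as np

        rng = np.random.default_rng(0)
        n, p = 5, 3
        X = rng.normal(size=(n, p))
        y = rng.normal(size=n)
        beta = rng.normal(size=p)

        Q = lambda b: np.sum((y - X @ b) ** 2)                    # least squares criterion
        j, h = 1, 1e-6
        e_j = np.eye(p)[j]
        fd = (Q(beta + h * e_j) - Q(beta - h * e_j)) / (2 * h)    # central difference in beta_j
        analytic = -2 * X[:, j] @ (y - X @ beta)                  # -2 X_{[j]}^T (y - X beta)
        print(np.isclose(fd, analytic, rtol=1e-4))                # True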

      \begin{equation}
        \frac{\partial Q}{\partial\beta} =
        \begin{bmatrix}
          \frac{\partial Q}{\partial\beta_1} \\
          \frac{\partial Q}{\partial\beta_2} \\
          \vdots \\
          \frac{\partial Q}{\partial\beta_p}
        \end{bmatrix} = -2 \begin{bmatrix}
          X_{[1]}^T(y - X\beta) \\
          X_{[2]}^T(y - X\beta) \\
          \vdots \\
          X_{[p]}^T(y - X\beta)
        \end{bmatrix} = -2X^T(y - X\beta)
      \end{equation}
      

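      Stacking the \(p\) partial derivatives reproduces the matrix form; a small numerical check
      (same kind of synthetic setup as above):

        import numpy as np

        rng = np.random.default_rng(0)
        n, p = 5, 3
        X = rng.normal(size=(n, p))
        y = rng.normal(size=n)
        beta = rng.normal(size=p)

        grad_matrix = -2 * X.T @ (y - X @ beta)                               # -2 X^T (y - X beta)
        grad_stacked = np.array([-2 * X[:, j] @ (y - X @ beta) for j in range(p)])
        print(np.allclose(grad_matrix, grad_stacked))                         # True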

    • So the OLS Estimate \(\hat{\beta}\) solves the "Normal Equations" \(X^T(y - X\beta) = 0\):

      \begin{equation}
        \begin{aligned}
          X^T(y - X\hat{\beta}) &= 0 \Longleftrightarrow \\
          X^TX\hat{\beta} &= X^Ty \Longleftrightarrow \\
          \hat{\beta} &= (X^TX)^{-1}X^Ty
        \end{aligned}
      \end{equation}
      
      


      N.B. For \(\hat{\beta}\) to exist uniquely, \(X^TX\) must be invertible, which is equivalent to \(X\) having Full Column Rank.
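
      A minimal numpy sketch of solving the Normal Equations (synthetic full-column-rank design
      assumed; np.linalg.lstsq is shown only as a numerically safer cross-check):

        import numpy as np

        rng = np.random.default_rng(0)
        n, p = 50, 3
        X = rng.normal(size=(n, p))                        # full column rank (with probability 1)
        y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=n)

        # Solve X^T X beta = X^T y directly, as in the derivation above ...
        beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
        # ... and compare with the library least-squares routine.
        beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
        print(np.allclose(beta_hat, beta_lstsq))           # True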

      OLS Estimate:

      \begin{equation} \tag{1}
        \hat{\beta} =
        \begin{pmatrix}
          \hat{\beta_1} \\ \hat{\beta_2} \\ \vdots \\ \hat{\beta_p}
        \end{pmatrix} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}
      \end{equation}

      Fitted Values:

      \begin{equation} \tag{2}
        \hat{\mathbf{y}} =
        \begin{pmatrix}
          \hat{y_1} \\ \hat{y_2} \\ \vdots \\ \hat{y_n}
        \end{pmatrix} = \begin{pmatrix}
          x_{1,1}\hat{\beta_1} + \cdots + x_{1,p}\hat{\beta_p} \\
          x_{2,1}\hat{\beta_1} + \cdots + x_{2,p}\hat{\beta_p} \\
          \vdots \\
          x_{n,1}\hat{\beta_1} + \cdots + x_{n,p}\hat{\beta_p}
        \end{pmatrix} =
        \mathbf{X}\hat{\beta} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y} = \mathbf{Hy}
      \end{equation}

      where \(H = X(X^TX)^{-1}X^T\) is the \(n \times n\) "Hat Matrix", or orthogonal projection matrix.
      
      
      
      

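      A short numpy check of the Hat Matrix properties (same kind of synthetic data as above; forming
      \(H\) explicitly is only sensible for small \(n\)):

        import numpy as np

        rng = np.random.default_rng(0)
        n, p = 50, 3
        X = rng.normal(size=(n, p))
        y = rng.normal(size=n)

        H = X @ np.linalg.inv(X.T @ X) @ X.T               # n x n Hat Matrix
        beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

        print(np.allclose(H @ y, X @ beta_hat))            # H y gives the fitted values
        print(np.allclose(H @ H, H))                       # idempotent: projecting twice changes nothing
        print(np.allclose(H, H.T))                         # symmetric, as an orthogonal projection should be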

      The Hat Matrix \(H\) projects \(\mathbb{R}^n\) onto the column space of \(X\).

      Residuals:

      \begin{equation} \tag{3}
        \hat{\epsilon} = \begin{pmatrix}
          \hat{\epsilon_1} \\ \hat{\epsilon_2} \\ \vdots \\ \hat{\epsilon_n}
        \end{pmatrix} = \mathbf{y} - \mathbf{\hat{y}} = (\mathbf{I_n} - \mathbf{H})\mathbf{y}
      \end{equation}
      
      \begin{equation} \tag{4}
        \text{Normal Equations:} \quad \mathbf{X}^T(\mathbf{y} - \mathbf{X}\hat{\beta}) = \mathbf{X}^T\hat{\epsilon} = \mathbf{0}_p =
        \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}
      \end{equation}
      


      N.B. The Least-Squares Residual vector \(\hat{\epsilon}\) is orthogonal to the column space of \(X\).
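
      A one-line numerical check of this orthogonality (synthetic data assumed, as in the sketches above):

        import numpy as np

        rng = np.random.default_rng(0)
        n, p = 50, 3
        X = rng.normal(size=(n, p))
        y = rng.normal(size=n)

        beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
        resid = y - X @ beta_hat                           # residual vector epsilon-hat
        print(np.allclose(X.T @ resid, np.zeros(p)))       # X^T epsilon-hat = 0 (up to rounding)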
