Tests: The Simple Linear Regression Model

# The Simple Linear Regression Model: The Population Regression Model Example {#simple-population-example}

The jeksterslabRlinreg package has functions to derive expectations for regression models.

jeksterslabRlinreg::mutheta() returns the expected values, that is, the model-implied mean vector.
jeksterslabRlinreg::Sigmatheta() returns the covariance expectations, that is, the model-implied variance-covariance matrix.

In this hypothetical example, we assume that we have population data and we are interested in the association between wages and education. The regressor variable is years of education. The regressand variable is hourly wage in US dollars. The intercept is the predicted wage of an employee with 0 years of education. The slope is the increase in hourly wage in US dollars for one year increase in education.

The the following vectors represent the parameters of the simple linear regression model.

\[\begin{equation} \boldsymbol{\theta}_{\text{mean structure}} = \begin{bmatrix} \beta_1 \\ \mu_x \end{bmatrix} \end{equation}\]

\[\begin{equation} \boldsymbol{\theta}_{\text{covariance structure}} = \begin{bmatrix} \beta_2 \\ \sigma_{\varepsilon}^{2} \\ \sigma_{x}^{2} \end{bmatrix} \end{equation}\]

\(\boldsymbol{\theta}_{\text{mean structure}}\)
Variable	Description	Notation	Value
`beta1`	Intercept	\(\beta_1\)	-4.119577
`mux`	Mean of \(x\)	\(\mu_x\)	13.137516

\(\boldsymbol{\theta}_{\text{covariance structure}}\)
Variable	Description	Notation	Value
`beta2`	Slope	\(\beta_2\)	1.249177
`sigma2epsilon`	Variance of \(\varepsilon\)	\(\sigma_{\varepsilon}^{2}\)	50.442669
`sigma2x`	Variance of \(x\)	\(\sigma_{x}^{2}\)	8.185725

There are times when the intercept does not lend to a meaningful interpretation. A negative wage corresponding to zero years of education does not really make a lot of sense. The value that is interesting in this case is the slope. The slope represents 1.2491767 US dollar increase in hourly wages given 1 additional year of education.

The expected values are derived using jeksterslabRlinreg::mutheta().

result_mutheta <- jeksterslabRlinreg::mutheta(
  beta = beta,
  muX = mux
)

Expected Values
	\(\mu\)
y	12.29150
x	13.13752

The covariance expectations are derive using jeksterslabRlinreg::Sigmatheta().

result_Sigmatheta <- jeksterslabRlinreg::Sigmatheta(
  slopes = beta2,
  sigma2epsilon = sigma2epsilon,
  SigmaX = sigma2x
)

Covariance Expectations
	y	x
y	63.21602	10.225417
x	10.22542	8.185725

The inverse, that is, deriving the parameters from the expectations can also be done using functions from the jeksterslabRlinreg package.

The regression coefficients can be derived using jeksterslabRlinreg::intercept(), and jeksterslabRlinreg::slopes() from the means and the covariances.

result_slopes <- jeksterslabRlinreg::.slopes(
  SigmaX = sigma2x,
  sigmayX = sigmayx
)
result_slopes
#>        slopes
#> [1,] 1.249177
result_intercept <- jeksterslabRlinreg::.intercept(
  slopes = result_slopes,
  muy = muy,
  muX = mux
)
result_intercept
#> [1] -4.119577

From raw data

result_slopesfromrawdata <- jeksterslabRlinreg::slopes(
  X = X,
  y = y
)
result_slopesfromrawdata
#>             slopes
#> education 1.249177
result_interceptfromrawdata <- jeksterslabRlinreg::intercept(
  X = X,
  y = y
)
result_interceptfromrawdata
#>     wages 
#> -4.119577

Standardized slopes can also be obtained using jeksterslabRlinreg::slopesprime()

result_slopesprime <- jeksterslabRlinreg::.slopesprime(
  RX = sqrt(sigma2x),
  ryX = ryx
)
result_slopesprime
#>      std.slopes
#> [1,]  0.1571123

From raw data

result_slopesprimefromrawdata <- jeksterslabRlinreg::slopesprime(
  X = X,
  y = y
)
result_slopesprimefromrawdata
#>           std.slopes
#> education  0.4495095

The error variance can also be derived with the available information using jeksterslabRlinreg::sigma2() .

result_sigma2epsilon <- sigma2epsilon(
  slopes = beta2,
  sigma2y = sigma2y,
  sigmayX = sigmayx,
  SigmaX = sigma2x
)
result_sigma2epsilon
#>          
#> 50.44267

context("Test simple-regression-ram")
test_that("result_mutheta", {
  for (i in 1:nrow(result_mutheta)) {
    expect_equivalent(
      result_mutheta[i, 1],
      mutheta[i, 1]
    )
  }
})
#> Test passed 🥳
test_that("result_Sigmatheta", {
  for (i in 1:nrow(result_Sigmatheta)) {
    for (j in 1:ncol(result_Sigmatheta)) {
      expect_equivalent(
        result_Sigmatheta[i, j],
        Sigmatheta[i, j]
      )
    }
  }
})
#> Test passed 🌈
test_that("results_beta", {
  results_beta <- c(
    result_intercept,
    result_slopes
  )
  for (i in 1:length(results_beta)) {
    expect_equivalent(
      results_beta[i],
      beta[i]
    )
  }
})
#> Test passed 🥇
test_that("results_betafromrawdata", {
  results_betafromrawdata <- c(
    result_interceptfromrawdata,
    result_slopesfromrawdata
  )
  for (i in 1:length(results_betafromrawdata)) {
    expect_equivalent(
      results_betafromrawdata[i],
      beta[i]
    )
  }
})
#> Test passed 😀

Ivan Jacob Agaloos Pesigan

2020-12-31