Descriptive Statistics — descriptives • jeksterslabRlinreg

Descriptive Statistics

descriptives(
  X,
  y,
  varnamesX = NULL,
  varnamey = NULL,
  plot = TRUE,
  moments = TRUE,
  cor = TRUE,
  mardia = TRUE
)

Arguments

X	`n` by `k` numeric matrix. The data matrix \(\mathbf{X}\) (also known as the design matrix, model matrix, or regressor matrix) is an \(n \times k\) matrix of \(n\) observations on \(k\) regressors, including a constant regressor equal to 1 for every observation in the first column.
y	Numeric vector of length `n` or `n` by `1` matrix. The vector \(\mathbf{y}\) is an \(n \times 1\) vector of observations on the regressand variable.
varnamesX	Optional. Character vector of length `k`. Variable names for matrix `X`.
varnamey	Optional. Character string. Variable name for vector `y`.
plot	Logical. Display scatter plot matrix.
moments	Logical. Print central moments (means, standard deviations, skewness, and kurtosis).
cor	Logical. Print correlations.
mardia	Logical. Estimate Mardia's multivariate skewness and kurtosis.
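As a minimal sketch of inputs in the shape described above, the following builds `X` and `y` from simulated data. The variable names (`educ`, `exper`, `wage`) and the data-generating values are made up for illustration; the only requirement taken from the argument descriptions is that the first column of `X` is a column of ones. The call to `descriptives()` is guarded because it assumes the package is installed.

```r
set.seed(42)
n <- 100

# Hypothetical regressors and regressand (simulated, not from the package data)
educ <- rnorm(n, mean = 13, sd = 3)
exper <- rnorm(n, mean = 19, sd = 11)
wage <- 2 + 0.8 * educ + 0.1 * exper + rnorm(n, sd = 5)

# n by k matrix with the constant regressor in the first column
X <- cbind(constant = 1, educ = educ, exper = exper)
# Numeric vector of length n
y <- wage

# Basic shape checks before passing the inputs on
stopifnot(all(X[, 1] == 1), nrow(X) == length(y))

# Only run if the package is available; plot = FALSE skips the scatter plot matrix
if (requireNamespace("jeksterslabRlinreg", quietly = TRUE)) {
  out <- jeksterslabRlinreg::descriptives(
    X = X,
    y = y,
    varnamesX = colnames(X),
    varnamey = "wage",
    plot = FALSE
  )
}
```

Here `k` is 3 (the constant plus two regressors), so `varnamesX` has length 3 as the argument description requires.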

Value

Returns a list with the following elements:

X: \(n \times k\) matrix of \(n\) observations on \(k\) regressors, including a constant regressor equal to 1 for every observation in the first column.
y: \(n \times 1\) matrix of observations on the regressand variable.
data: \(n \times k\) matrix with the following columns \(y, X_2, X_3, \cdots, X_k\).
n: Sample size.
k: Number of regressors, including the constant regressor (equal to 1 for every observation) in the first column.
p: Number of regressors excluding the constant, that is, the number of partial regression slopes (\(p = k - 1\)).
df1: Degrees of freedom 1 (\(p\)).
df2: Degrees of freedom 2 (\(n - k\)).
muhatX: Vector of length \(p\) of estimated means of \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\mu}}_{\mathbf{X}} = \left\{ \hat{\mu}_{X_2}, \hat{\mu}_{X_3}, \cdots, \hat{\mu}_{X_k} \right\} \right)\).
muhaty: Estimated mean of the regressand variable \(\left( \hat{\mu}_y \right)\).
muhat: Vector of length \(k\) of estimated means of the regressand variable \(y\) and \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\mu}} = \left\{ \hat{\mu}_{y}, \hat{\mu}_{X_2}, \hat{\mu}_{X_3}, \cdots, \hat{\mu}_{X_k} \right\} \right)\).
Rhat: \(k \times k\) matrix of estimated correlations \(\left( \boldsymbol{\hat{R}}_{y, X_{2, 3, \cdots, k}} \right)\).
Rhat.p: \(k \times k\) \(p\)-values associated with the estimated correlation matrix.
RXhat: \(p \times p\) matrix of estimated correlations between regressor variables \(\left( \boldsymbol{\hat{R}}_{X_{2, 3, \cdots, k}} \right)\).
ryXhat: Vector of length \(p\) of estimated correlations between the regressand variable and the regressor variables \(\left( \boldsymbol{\hat{r}}_{y, X_{2, 3, \cdots, k}} = \left\{ \hat{r}_{y, X_2}, \hat{r}_{y, X_3}, \cdots, \hat{r}_{y, X_k} \right\} \right)\).
Sigmahat: \(k \times k\) matrix of estimated covariances \(\left( \boldsymbol{\hat{\Sigma}}_{y, X_{2, 3, \cdots, k}} \right)\).
SigmaXhat: \(p \times p\) matrix of estimated covariances between regressor variables \(\left( \boldsymbol{\hat{\Sigma}}_{X_{2, 3, \cdots, k}} \right)\).
sigmayXhat: Vector of length \(p\) of estimated covariances between the regressand variable and the regressor variables \(\left( \boldsymbol{\hat{\sigma}}_{y, X_{2, 3, \cdots, k}} = \left\{ \hat{\sigma}_{y, X_2}, \hat{\sigma}_{y, X_3}, \cdots, \hat{\sigma}_{y, X_k} \right\} \right)\).
sigma2Xhat: Vector of length \(p\) of estimated variances of \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\sigma}}_{X_{2, 3, \cdots, k}}^{2} = \left\{ \hat{\sigma}_{X_2}^{2}, \hat{\sigma}_{X_3}^{2}, \cdots, \hat{\sigma}_{X_k}^{2} \right\} \right)\).
sigma2yhat: Estimated variance of \(y\) \(\left( \hat{\sigma}_{y}^{2} \right)\).
sigmaXhat: Vector of length \(p\) of estimated standard deviations of \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\sigma}}_{X_{2, 3, \cdots, k}} = \left\{ \hat{\sigma}_{X_2}, \hat{\sigma}_{X_3}, \cdots, \hat{\sigma}_{X_k} \right\} \right)\).
sigmayhat: Estimated standard deviation of \(y\) \(\left( \hat{\sigma}_{y} \right)\).
sigma2hat: Vector of length \(k\) of estimated variances of the regressand variable \(y\) and \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\sigma}}_{y, X_{2, 3, \cdots, k}}^{2} = \left\{ \hat{\sigma}_{y}^{2}, \hat{\sigma}_{X_2}^{2}, \hat{\sigma}_{X_3}^{2}, \cdots, \hat{\sigma}_{X_k}^{2} \right\} \right)\).
sigmahat: Vector of length \(k\) of estimated standard deviations of the regressand variable \(y\) and \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\sigma}}_{y, X_{2, 3, \cdots, k}} = \left\{ \hat{\sigma}_{y}, \hat{\sigma}_{X_2}, \hat{\sigma}_{X_3}, \cdots, \hat{\sigma}_{X_k} \right\} \right)\).
skewhat: Vector of length \(k\) of estimated skewness of the regressand variable \(y\) and \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\gamma}}_{1} = \left\{ \hat{\gamma}_{1y}, \hat{\gamma}_{1X_{2}}, \hat{\gamma}_{1X_{3}}, \cdots, \hat{\gamma}_{1X_{k}} \right\} \right)\).
kurthat: Vector of length \(k\) of estimated excess kurtosis of the regressand variable \(y\) and \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\gamma}}_{2} = \left\{ \hat{\gamma}_{2y}, \hat{\gamma}_{2X_{2}}, \hat{\gamma}_{2X_{3}}, \cdots, \hat{\gamma}_{2X_{k}} \right\} \right)\).
mardiahat: Vector of estimates of Mardia's multivariate skewness and kurtosis, with their associated test statistics and \(p\)-values.
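The elements listed above are related to one another in simple ways. The following base R sketch reproduces those relationships on simulated data; it is not the package's implementation. The data and the names `x2` and `x3` are made up, and the use of `cov()` (which divides by \(n - 1\)) is an assumption about the estimator, since the divisor is not stated here.

```r
set.seed(1)
n <- 50

# Hypothetical inputs: constant in the first column, two regressors
x2 <- rnorm(n)
x3 <- rnorm(n)
X <- cbind(constant = 1, x2 = x2, x3 = x3)
y <- 1 + 0.5 * x2 - 0.3 * x3 + rnorm(n)

# Counts and degrees of freedom
k <- ncol(X)   # regressors, including the constant
p <- k - 1     # regressors, excluding the constant
df1 <- p
df2 <- n - k

# n by k data matrix with columns y, X_2, ..., X_k (constant dropped)
data <- cbind(y = y, X[, -1, drop = FALSE])

# Means, covariances, and correlations of y and the regressors
muhat <- colMeans(data)
Sigmahat <- cov(data)            # assumption: divisor n - 1
Rhat <- cov2cor(Sigmahat)
sigma2hat <- diag(Sigmahat)
sigmahat <- sqrt(sigma2hat)

# Partitions: the first row and column belong to y
SigmaXhat <- Sigmahat[-1, -1, drop = FALSE]
sigmayXhat <- Sigmahat[-1, 1]
RXhat <- Rhat[-1, -1, drop = FALSE]
ryXhat <- Rhat[-1, 1]
```

Because `y` is placed first in `data`, dropping the first row and column of `Sigmahat` or `Rhat` gives the \(p \times p\) regressor blocks, and the remaining entries of the first column give the length-\(p\) vectors involving `y`.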

Author

Ivan Jacob Agaloos Pesigan

Examples

# Simple regression------------------------------------------------
X <- jeksterslabRdatarepo::wages.matrix[["X"]]
X <- X[, c(1, ncol(X))]
y <- jeksterslabRdatarepo::wages.matrix[["y"]]
out <- descriptives(X = X, y = y)
#> 
#> Central Moments:
#>           Mean       SD  Skewness   Kurtosis
#> wages 12.36585  7.89635 1.8502679  4.8600481
#> age   37.93483 11.49428 0.2698983 -0.7668824
#> 
#> Mardia's Estimate of Multivariate Skewness and Kurtosis:
#>                 b1           b1.chisq      b1.correction b1.chisq.corrected 
#>       3.875204e+00       8.325230e+02       1.002329e+00       8.344616e+02 
#>              b1.df               b1.p     b1.p.corrected                 b2 
#>       4.000000e+00      6.923720e-179      2.632596e-179       1.264530e+01 
#>               b2.z               b2.p 
#>       2.084732e+01       1.611861e-96 
#> 
#> Correlations:
#>           wages       age
#> wages 1.0000000 0.2874694
#> age   0.2874694 1.0000000
str(out)
#> List of 27
#>  $ X         : num [1:1289, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:2] "constant" "age"
#>  $ y         : num [1:1289, 1] 11.6 5 12 7 21.1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr "wages"
#>  $ data      : num [1:1289, 1:2] 11.6 5 12 7 21.1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:2] "wages" "age"
#>  $ n         : int 1289
#>  $ k         : int 2
#>  $ p         : num 1
#>  $ df1       : num 1
#>  $ df2       : int 1287
#>  $ muhatX    : Named num 37.9
#>   ..- attr(*, "names")= chr "age"
#>  $ muhaty    : Named num 12.4
#>   ..- attr(*, "names")= chr "wages"
#>  $ muhat     : Named num [1:2] 12.4 37.9
#>   ..- attr(*, "names")= chr [1:2] "wages" "age"
#>  $ Rhat      : num [1:2, 1:2] 1 0.287 0.287 1
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:2] "wages" "age"
#>   .. ..$ : chr [1:2] "wages" "age"
#>  $ Rhat.p    : num [1:2, 1:2] NA 6.02e-26 6.02e-26 NA
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:2] "wages" "age"
#>   .. ..$ : chr [1:2] "wages" "age"
#>  $ RXhat     : num [1, 1] 1
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr "age"
#>   .. ..$ : chr "age"
#>  $ ryXhat    : Named num 0.287
#>   ..- attr(*, "names")= chr "age"
#>  $ Sigmahat  : num [1:2, 1:2] 62.4 26.1 26.1 132.1
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:2] "wages" "age"
#>   .. ..$ : chr [1:2] "wages" "age"
#>  $ SigmaXhat : num [1, 1] 132
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr "age"
#>   .. ..$ : chr "age"
#>  $ sigmayXhat: Named num 26.1
#>   ..- attr(*, "names")= chr "age"
#>  $ sigma2Xhat: num [1, 1] 132
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr "age"
#>   .. ..$ : chr "age"
#>  $ sigma2yhat: Named num 62.4
#>   ..- attr(*, "names")= chr "wages"
#>  $ sigmaXhat : Named num 11.5
#>   ..- attr(*, "names")= chr "age"
#>  $ sigmayhat : Named num 7.9
#>   ..- attr(*, "names")= chr "wages"
#>  $ sigma2hat : Named num [1:2] 62.4 132.1
#>   ..- attr(*, "names")= chr [1:2] "wages" "age"
#>  $ sigmahat  : Named num [1:2] 7.9 11.5
#>   ..- attr(*, "names")= chr [1:2] "wages" "age"
#>  $ skewhat   : Named num [1:2] 1.85 0.27
#>   ..- attr(*, "names")= chr [1:2] "wages" "age"
#>  $ kurthat   : Named num [1:2] 4.86 -0.767
#>   ..- attr(*, "names")= chr [1:2] "wages" "age"
#>  $ mardiahat : Named num [1:10] 3.88 832.52 1 834.46 4 ...
#>   ..- attr(*, "names")= chr [1:10] "b1" "b1.chisq" "b1.correction" "b1.chisq.corrected" ...

# Multiple regression----------------------------------------------
X <- jeksterslabRdatarepo::wages.matrix[["X"]]
# age is removed
X <- X[, -ncol(X)]
out <- descriptives(X = X, y = y)
#> 
#> Central Moments:
#>                  Mean         SD    Skewness   Kurtosis
#> wages      12.3658495  7.8963503  1.85026794  4.8600481
#> gender      0.4972847  0.5001867  0.01087395 -2.0029920
#> race        0.1528317  0.3599648  1.93189913  1.7349237
#> union       0.1590380  0.3658535  1.86682302  1.4873335
#> education  13.1450737  2.8138234 -0.29071984  2.9937154
#> experience 18.7897595 11.6628366  0.37610718 -0.6699994
#> 
#> Mardia's Estimate of Multivariate Skewness and Kurtosis:
#>                 b1           b1.chisq      b1.correction b1.chisq.corrected 
#>       1.253254e+01       2.692407e+03       1.002328e+00       2.698674e+03 
#>              b1.df               b1.p     b1.p.corrected                 b2 
#>       5.600000e+01       0.000000e+00       0.000000e+00       5.848385e+01 
#>               b2.z               b2.p 
#>       1.920797e+01       3.174381e-82 
#> 
#> Correlations:
#>                 wages      gender        race        union    education
#> wages       1.0000000 -0.22330183 -0.12783381  0.102246656  0.456517980
#> gender     -0.2233018  1.00000000  0.04327185 -0.088856935 -0.031439159
#> race       -0.1278338  0.04327185  1.00000000  0.080587911 -0.087061729
#> union       0.1022467 -0.08885694  0.08058791  1.000000000  0.003966952
#> education   0.4565180 -0.03143916 -0.08706173  0.003966952  1.000000000
#> experience  0.1731733 -0.02265681 -0.03912910  0.154319024 -0.180103012
#>             experience
#> wages       0.17317330
#> gender     -0.02265681
#> race       -0.03912910
#> union       0.15431902
#> education  -0.18010301
#> experience  1.00000000
str(out)
#> List of 27
#>  $ X         : num [1:1289, 1:6] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:6] "constant" "gender" "race" "union" ...
#>  $ y         : num [1:1289, 1] 11.6 5 12 7 21.1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr "wages"
#>  $ data      : num [1:1289, 1:6] 11.6 5 12 7 21.1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>  $ n         : int 1289
#>  $ k         : int 6
#>  $ p         : num 5
#>  $ df1       : num 5
#>  $ df2       : int 1283
#>  $ muhatX    : Named num [1:5] 0.497 0.153 0.159 13.145 18.79
#>   ..- attr(*, "names")= chr [1:5] "gender" "race" "union" "education" ...
#>  $ muhaty    : Named num 12.4
#>   ..- attr(*, "names")= chr "wages"
#>  $ muhat     : Named num [1:6] 12.366 0.497 0.153 0.159 13.145 ...
#>   ..- attr(*, "names")= chr [1:6] "wages" "gender" "race" "union" ...
#>  $ Rhat      : num [1:6, 1:6] 1 -0.223 -0.128 0.102 0.457 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>  $ Rhat.p    : num [1:6, 1:6] NA 4.98e-16 4.14e-06 2.36e-04 2.35e-67 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>  $ RXhat     : num [1:5, 1:5] 1 0.0433 -0.0889 -0.0314 -0.0227 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:5] "gender" "race" "union" "education" ...
#>   .. ..$ : chr [1:5] "gender" "race" "union" "education" ...
#>  $ ryXhat    : Named num [1:5] -0.223 -0.128 0.102 0.457 0.173
#>   ..- attr(*, "names")= chr [1:5] "gender" "race" "union" "education" ...
#>  $ Sigmahat  : num [1:6, 1:6] 62.352 -0.882 -0.363 0.295 10.143 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>  $ SigmaXhat : num [1:5, 1:5] 0.25019 0.00779 -0.01626 -0.04425 -0.13217 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:5] "gender" "race" "union" "education" ...
#>   .. ..$ : chr [1:5] "gender" "race" "union" "education" ...
#>  $ sigmayXhat: Named num [1:5] -0.882 -0.363 0.295 10.143 15.948
#>   ..- attr(*, "names")= chr [1:5] "gender" "race" "union" "education" ...
#>  $ sigma2Xhat: Named num [1:5] 0.25 0.13 0.134 7.918 136.022
#>   ..- attr(*, "names")= chr [1:5] "gender" "race" "union" "education" ...
#>  $ sigma2yhat: Named num 62.4
#>   ..- attr(*, "names")= chr "wages"
#>  $ sigmaXhat : Named num [1:5] 0.5 0.36 0.366 2.814 11.663
#>   ..- attr(*, "names")= chr [1:5] "gender" "race" "union" "education" ...
#>  $ sigmayhat : Named num 7.9
#>   ..- attr(*, "names")= chr "wages"
#>  $ sigma2hat : Named num [1:6] 62.352 0.25 0.13 0.134 7.918 ...
#>   ..- attr(*, "names")= chr [1:6] "wages" "gender" "race" "union" ...
#>  $ sigmahat  : Named num [1:6] 7.896 0.5 0.36 0.366 2.814 ...
#>   ..- attr(*, "names")= chr [1:6] "wages" "gender" "race" "union" ...
#>  $ skewhat   : Named num [1:6] 1.8503 0.0109 1.9319 1.8668 -0.2907 ...
#>   ..- attr(*, "names")= chr [1:6] "wages" "gender" "race" "union" ...
#>  $ kurthat   : Named num [1:6] 4.86 -2 1.73 1.49 2.99 ...
#>   ..- attr(*, "names")= chr [1:6] "wages" "gender" "race" "union" ...
#>  $ mardiahat : Named num [1:10] 12.5 2692.4 1 2698.7 56 ...
#>   ..- attr(*, "names")= chr [1:10] "b1" "b1.chisq" "b1.correction" "b1.chisq.corrected" ...
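
In the printed Mardia output above, `b1.chisq` equals \(n \, b_1 / 6\) and `b1.df` equals \(d (d + 1) (d + 2) / 6\), where \(d\) is the number of columns of `data`. Both agree with the textbook formulas for Mardia's tests. The sketch below implements those textbook formulas in base R. It is not the package's code: the function name `mardia_sketch` is hypothetical, the covariance divisor \(n\) is an assumption, and the small-sample correction (`b1.correction`) is omitted.

```r
# Textbook Mardia multivariate skewness (b1) and kurtosis (b2).
# Assumption: covariance matrix uses divisor n (maximum likelihood form).
mardia_sketch <- function(data) {
  data <- as.matrix(data)
  n <- nrow(data)
  d <- ncol(data)
  centered <- scale(data, center = TRUE, scale = FALSE)
  S <- crossprod(centered) / n
  # n by n matrix of cross products in the metric of the inverse covariance
  D <- centered %*% solve(S) %*% t(centered)
  b1 <- sum(D^3) / n^2        # multivariate skewness
  b2 <- sum(diag(D)^2) / n    # multivariate kurtosis
  b1.chisq <- n * b1 / 6
  b1.df <- d * (d + 1) * (d + 2) / 6
  b2.z <- (b2 - d * (d + 2)) / sqrt(8 * d * (d + 2) / n)
  c(
    b1 = b1,
    b1.chisq = b1.chisq,
    b1.df = b1.df,
    b1.p = pchisq(b1.chisq, df = b1.df, lower.tail = FALSE),
    b2 = b2,
    b2.z = b2.z,
    b2.p = 2 * pnorm(abs(b2.z), lower.tail = FALSE)
  )
}

# Usage on simulated bivariate normal data (made up for illustration)
set.seed(123)
Z <- matrix(rnorm(400), ncol = 2)
m <- mardia_sketch(Z)
m
```

The function builds an \(n \times n\) matrix, so it is meant for illustration on small samples rather than as an efficient implementation.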