Descriptive Statistics

descriptives(
  X,
  y,
  varnamesX = NULL,
  varnamey = NULL,
  plot = TRUE,
  moments = TRUE,
  cor = TRUE,
  mardia = TRUE
)

Arguments

X

n by k numeric matrix. The data matrix \(\mathbf{X}\) (also known as design matrix, model matrix or regressor matrix) is an \(n \times k\) matrix of \(n\) observations of \(k\) regressors, which includes a regressor whose value is 1 for each observation in the first column.

y

Numeric vector of length n or n by 1 matrix. The vector \(\mathbf{y}\) is an \(n \times 1\) vector of observations on the regressand variable.

varnamesX

Optional. Character vector of length k. Variable names for matrix X.

varnamey

Optional. Character string. Variable name for vector y.

plot

Logical. Display scatter plot matrix.

moments

Logical. Print central moments (means, standard deviations, skewness, and excess kurtosis).

cor

Logical. Print correlations.

mardia

Logical. Estimate Mardia's multivariate skewness and kurtosis.
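
The arguments above expect X to carry a constant in its first column and y to hold one observation per row of X. The following is a minimal sketch of inputs in that shape, using simulated data; the variable names, ranges, and coefficients are hypothetical and only serve as an illustration.

```r
set.seed(1)
n <- 50

# Hypothetical regressors (illustrative names and ranges only).
age <- round(runif(n, min = 18, max = 65))
education <- round(runif(n, min = 8, max = 20))

# n by k design matrix with the constant in the first column.
X <- cbind(constant = 1, age = age, education = education)

# Regressand as an n by 1 matrix with a column name.
y <- matrix(
  5 + 0.1 * age + 0.8 * education + rnorm(n),
  ncol = 1,
  dimnames = list(NULL, "wages")
)

# Shape checks that mirror the argument descriptions.
stopifnot(
  is.matrix(X), is.numeric(X),
  all(X[, 1] == 1),
  nrow(X) == length(y)
)
```

Column names set this way could stand in for varnamesX and varnamey, which are optional.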

Value

Returns a list with the following elements:

X

\(n \times k\) matrix of \(n\) observations of \(k\) regressors, which includes a regressor whose value is 1 for each observation in the first column.

y

\(n \times 1\) matrix of observations on the regressand variable.

data

\(n \times k\) matrix with the following columns \(y, X_2, X_3, \cdots, X_k\).

n

Sample size.

k

Number of regressors, including the regressor whose value is 1 for each observation in the first column.

p

Number of partial regression coefficients that are slopes, that is, the number of regressors excluding the constant (\(p = k - 1\)).

df1

Degrees of freedom 1 (\(df_1 = p\)).

df2

Degrees of freedom 2 (\(df_2 = n - k\)).

muhatX

Vector of length \(p\) of estimated means of \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\mu}}_{\mathbf{X}} = \left\{ \hat{\mu}_{X_2}, \hat{\mu}_{X_3}, \cdots, \hat{\mu}_{X_k} \right\} \right)\).

muhaty

Estimated mean of the regressand variable \(\left( \hat{\mu}_y \right)\).

muhat

Vector of length \(k\) of estimated means of the regressand variable \(y\) and \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\mu}} = \left\{ \hat{\mu}_{y}, \hat{\mu}_{X_2}, \hat{\mu}_{X_3}, \cdots, \hat{\mu}_{X_k} \right\} \right)\).

Rhat

\(k \times k\) matrix of estimated correlations \(\left( \boldsymbol{\hat{R}}_{y, X_{2, 3, \cdots, k}} \right)\).

Rhat.p

\(k \times k\) matrix of \(p\)-values associated with the estimated correlation matrix.

RXhat

\(p \times p\) matrix of estimated correlations between regressor variables \(\left( \boldsymbol{\hat{R}}_{X_{2, 3, \cdots, k}} \right)\).

ryXhat

Vector of length \(p\) of estimated correlations between the regressand variable and the regressor variables \(\left( \boldsymbol{\hat{r}}_{y, X_{2, 3, \cdots, k}} = \left\{ \hat{r}_{y, X_2}, \hat{r}_{y, X_3}, \cdots, \hat{r}_{y, X_k} \right\} \right)\).

Sigmahat

\(k \times k\) matrix of estimated covariances \(\left( \boldsymbol{\hat{\Sigma}}_{y, X_{2, 3, \cdots, k}} \right)\).

SigmaXhat

\(p \times p\) matrix of estimated covariances between regressor variables \(\left( \boldsymbol{\hat{\Sigma}}_{X_{2, 3, \cdots, k}} \right)\).

sigmayXhat

Vector of length \(p\) of estimated covariances between the regressand variable and the regressor variables \(\left( \boldsymbol{\hat{\sigma}}_{y, X_{2, 3, \cdots, k}} = \left\{ \hat{\sigma}_{y, X_2}, \hat{\sigma}_{y, X_3}, \cdots, \hat{\sigma}_{y, X_k} \right\} \right)\).

sigma2Xhat

Vector of length \(p\) of estimated variances of \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\sigma}}_{X_{2, 3, \cdots, k}}^{2} = \left\{ \hat{\sigma}_{X_2}^{2}, \hat{\sigma}_{X_3}^{2}, \cdots, \hat{\sigma}_{X_k}^{2} \right\} \right)\).

sigma2yhat

Estimated variance of \(y\) \(\left( \hat{\sigma}_{y}^{2} \right)\).

sigmaXhat

Vector of length \(p\) of estimated standard deviations of \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\sigma}}_{X_{2, 3, \cdots, k}} = \left\{ \hat{\sigma}_{X_2}, \hat{\sigma}_{X_3}, \cdots, \hat{\sigma}_{X_k} \right\} \right)\).

sigmayhat

Estimated standard deviation of \(y\) \(\left( \hat{\sigma}_{y} \right)\).

sigma2hat

Vector of length \(k\) of estimated variances of the regressand variable \(y\) and \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\sigma}}_{y, X_{2, 3, \cdots, k}}^{2} = \left\{ \hat{\sigma}_{y}^{2}, \hat{\sigma}_{X_2}^{2}, \hat{\sigma}_{X_3}^{2}, \cdots, \hat{\sigma}_{X_k}^{2} \right\} \right)\).

sigmahat

Vector of length \(k\) of estimated standard deviations of the regressand variable \(y\) and \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\sigma}}_{y, X_{2, 3, \cdots, k}} = \left\{ \hat{\sigma}_{y}, \hat{\sigma}_{X_2}, \hat{\sigma}_{X_3}, \cdots, \hat{\sigma}_{X_k} \right\} \right)\).

skewhat

Vector of length \(k\) of estimated skewness of the regressand variable \(y\) and \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\gamma}}_{1} = \left\{ \hat{\gamma}_{1y}, \hat{\gamma}_{1X_{2}}, \hat{\gamma}_{1X_{3}}, \cdots, \hat{\gamma}_{1X_{k}} \right\} \right)\).

kurthat

Vector of length \(k\) of estimated excess kurtosis of the regressand variable \(y\) and \(X_2, X_3, \cdots, X_k\) \(\left( \boldsymbol{\hat{\gamma}}_{2} = \left\{ \hat{\gamma}_{2y}, \hat{\gamma}_{2X_{2}}, \hat{\gamma}_{2X_{3}}, \cdots, \hat{\gamma}_{2X_{k}} \right\} \right)\).

mardiahat

Vector of estimates of Mardia's multivariate skewness and kurtosis and their associated test statistics and \(p\)-values.
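
The elements listed above are partitions of a few joint quantities: counts, means, a covariance matrix, a correlation matrix, and univariate shape measures. The following base R sketch shows how such a partition could be assembled. It is not the function's own code. The data are simulated, skew_g1 and kurt_g2 are hypothetical helper names, and the n - 1 divisor and the adjusted skewness and excess kurtosis formulas are assumptions. Those assumptions do reproduce the gender row of the multiple regression output in the Examples (a binary variable with 641 ones among 1289 observations).

```r
set.seed(42)
n <- 200

# Simulated inputs (hypothetical): constant, a skewed and a binary regressor.
x2 <- rexp(n)
x3 <- rbinom(n, size = 1, prob = 0.5)
X <- cbind(constant = 1, x2 = x2, x3 = x3)
y <- matrix(
  1 + 0.5 * x2 - 0.25 * x3 + rnorm(n),
  ncol = 1, dimnames = list(NULL, "y")
)

# Counts and degrees of freedom.
k <- ncol(X) # regressors, constant included
p <- k - 1 # slopes
df1 <- p
df2 <- n - k

# y followed by X_2, ..., X_k (constant dropped).
data <- cbind(y, X[, -1, drop = FALSE])

# Joint moments; cov() and cor() use the n - 1 divisor (assumption).
muhat <- colMeans(data)
Sigmahat <- cov(data)
Rhat <- cor(data)

# Partition into regressand and regressor pieces.
muhaty <- muhat[1]
muhatX <- muhat[-1]
SigmaXhat <- Sigmahat[-1, -1, drop = FALSE]
sigmayXhat <- Sigmahat[1, -1]
RXhat <- Rhat[-1, -1, drop = FALSE]
ryXhat <- Rhat[1, -1]
sigma2hat <- diag(Sigmahat)
sigmahat <- sqrt(sigma2hat)

# Adjusted sample skewness (G1); hypothetical helper.
skew_g1 <- function(x) {
  n <- length(x)
  m2 <- mean((x - mean(x))^2)
  m3 <- mean((x - mean(x))^3)
  (m3 / m2^1.5) * sqrt(n * (n - 1)) / (n - 2)
}

# Adjusted sample excess kurtosis (G2); hypothetical helper.
kurt_g2 <- function(x) {
  n <- length(x)
  m2 <- mean((x - mean(x))^2)
  m4 <- mean((x - mean(x))^4)
  g2 <- m4 / m2^2 - 3
  (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
}

skewhat <- apply(data, 2, skew_g1)
kurthat <- apply(data, 2, kurt_g2)

# Check against the documented gender row (641 ones, 648 zeros).
gender <- rep(c(1, 0), times = c(641, 648))
round(c(sd = sd(gender), skew = skew_g1(gender), kurt = kurt_g2(gender)), 7)
```

The last line is a quick way to compare a candidate estimator with printed output when the estimator used is not stated.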

Author

Ivan Jacob Agaloos Pesigan

Examples

# Simple regression------------------------------------------------
X <- jeksterslabRdatarepo::wages.matrix[["X"]]
X <- X[, c(1, ncol(X))]
y <- jeksterslabRdatarepo::wages.matrix[["y"]]
out <- descriptives(X = X, y = y)
#>
#> Central Moments:
#>           Mean       SD  Skewness   Kurtosis
#> wages 12.36585  7.89635 1.8502679  4.8600481
#> age   37.93483 11.49428 0.2698983 -0.7668824
#>
#> Mardia's Estimate of Multivariate Skewness and Kurtosis:
#>                 b1           b1.chisq      b1.correction b1.chisq.corrected
#>       3.875204e+00       8.325230e+02       1.002329e+00       8.344616e+02
#>              b1.df               b1.p     b1.p.corrected                 b2
#>       4.000000e+00      6.923720e-179      2.632596e-179       1.264530e+01
#>               b2.z               b2.p
#>       2.084732e+01       1.611861e-96
#>
#> Correlations:
#>           wages       age
#> wages 1.0000000 0.2874694
#> age   0.2874694 1.0000000
str(out)
#> List of 27
#>  $ X         : num [1:1289, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:2] "constant" "age"
#>  $ y         : num [1:1289, 1] 11.6 5 12 7 21.1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr "wages"
#>  $ data      : num [1:1289, 1:2] 11.6 5 12 7 21.1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:2] "wages" "age"
#>  $ n         : int 1289
#>  $ k         : int 2
#>  $ p         : num 1
#>  $ df1       : num 1
#>  $ df2       : int 1287
#>  $ muhatX    : Named num 37.9
#>   ..- attr(*, "names")= chr "age"
#>  $ muhaty    : Named num 12.4
#>   ..- attr(*, "names")= chr "wages"
#>  $ muhat     : Named num [1:2] 12.4 37.9
#>   ..- attr(*, "names")= chr [1:2] "wages" "age"
#>  $ Rhat      : num [1:2, 1:2] 1 0.287 0.287 1
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:2] "wages" "age"
#>   .. ..$ : chr [1:2] "wages" "age"
#>  $ Rhat.p    : num [1:2, 1:2] NA 6.02e-26 6.02e-26 NA
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:2] "wages" "age"
#>   .. ..$ : chr [1:2] "wages" "age"
#>  $ RXhat     : num [1, 1] 1
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr "age"
#>   .. ..$ : chr "age"
#>  $ ryXhat    : Named num 0.287
#>   ..- attr(*, "names")= chr "age"
#>  $ Sigmahat  : num [1:2, 1:2] 62.4 26.1 26.1 132.1
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:2] "wages" "age"
#>   .. ..$ : chr [1:2] "wages" "age"
#>  $ SigmaXhat : num [1, 1] 132
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr "age"
#>   .. ..$ : chr "age"
#>  $ sigmayXhat: Named num 26.1
#>   ..- attr(*, "names")= chr "age"
#>  $ sigma2Xhat: num [1, 1] 132
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr "age"
#>   .. ..$ : chr "age"
#>  $ sigma2yhat: Named num 62.4
#>   ..- attr(*, "names")= chr "wages"
#>  $ sigmaXhat : Named num 11.5
#>   ..- attr(*, "names")= chr "age"
#>  $ sigmayhat : Named num 7.9
#>   ..- attr(*, "names")= chr "wages"
#>  $ sigma2hat : Named num [1:2] 62.4 132.1
#>   ..- attr(*, "names")= chr [1:2] "wages" "age"
#>  $ sigmahat  : Named num [1:2] 7.9 11.5
#>   ..- attr(*, "names")= chr [1:2] "wages" "age"
#>  $ skewhat   : Named num [1:2] 1.85 0.27
#>   ..- attr(*, "names")= chr [1:2] "wages" "age"
#>  $ kurthat   : Named num [1:2] 4.86 -0.767
#>   ..- attr(*, "names")= chr [1:2] "wages" "age"
#>  $ mardiahat : Named num [1:10] 3.88 832.52 1 834.46 4 ...
#>   ..- attr(*, "names")= chr [1:10] "b1" "b1.chisq" "b1.correction" "b1.chisq.corrected" ...
# Multiple regression----------------------------------------------
X <- jeksterslabRdatarepo::wages.matrix[["X"]]
# age is removed
X <- X[, -ncol(X)]
out <- descriptives(X = X, y = y)
#>
#> Central Moments:
#>                  Mean         SD    Skewness   Kurtosis
#> wages      12.3658495  7.8963503  1.85026794  4.8600481
#> gender      0.4972847  0.5001867  0.01087395 -2.0029920
#> race        0.1528317  0.3599648  1.93189913  1.7349237
#> union       0.1590380  0.3658535  1.86682302  1.4873335
#> education  13.1450737  2.8138234 -0.29071984  2.9937154
#> experience 18.7897595 11.6628366  0.37610718 -0.6699994
#>
#> Mardia's Estimate of Multivariate Skewness and Kurtosis:
#>                 b1           b1.chisq      b1.correction b1.chisq.corrected
#>       1.253254e+01       2.692407e+03       1.002328e+00       2.698674e+03
#>              b1.df               b1.p     b1.p.corrected                 b2
#>       5.600000e+01       0.000000e+00       0.000000e+00       5.848385e+01
#>               b2.z               b2.p
#>       1.920797e+01       3.174381e-82
#>
#> Correlations:
#>                 wages      gender        race        union    education
#> wages       1.0000000 -0.22330183 -0.12783381  0.102246656  0.456517980
#> gender     -0.2233018  1.00000000  0.04327185 -0.088856935 -0.031439159
#> race       -0.1278338  0.04327185  1.00000000  0.080587911 -0.087061729
#> union       0.1022467 -0.08885694  0.08058791  1.000000000  0.003966952
#> education   0.4565180 -0.03143916 -0.08706173  0.003966952  1.000000000
#> experience  0.1731733 -0.02265681 -0.03912910  0.154319024 -0.180103012
#>             experience
#> wages       0.17317330
#> gender     -0.02265681
#> race       -0.03912910
#> union       0.15431902
#> education  -0.18010301
#> experience  1.00000000
str(out)
#> List of 27
#>  $ X         : num [1:1289, 1:6] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:6] "constant" "gender" "race" "union" ...
#>  $ y         : num [1:1289, 1] 11.6 5 12 7 21.1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr "wages"
#>  $ data      : num [1:1289, 1:6] 11.6 5 12 7 21.1 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>  $ n         : int 1289
#>  $ k         : int 6
#>  $ p         : num 5
#>  $ df1       : num 5
#>  $ df2       : int 1283
#>  $ muhatX    : Named num [1:5] 0.497 0.153 0.159 13.145 18.79
#>   ..- attr(*, "names")= chr [1:5] "gender" "race" "union" "education" ...
#>  $ muhaty    : Named num 12.4
#>   ..- attr(*, "names")= chr "wages"
#>  $ muhat     : Named num [1:6] 12.366 0.497 0.153 0.159 13.145 ...
#>   ..- attr(*, "names")= chr [1:6] "wages" "gender" "race" "union" ...
#>  $ Rhat      : num [1:6, 1:6] 1 -0.223 -0.128 0.102 0.457 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>  $ Rhat.p    : num [1:6, 1:6] NA 4.98e-16 4.14e-06 2.36e-04 2.35e-67 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>  $ RXhat     : num [1:5, 1:5] 1 0.0433 -0.0889 -0.0314 -0.0227 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:5] "gender" "race" "union" "education" ...
#>   .. ..$ : chr [1:5] "gender" "race" "union" "education" ...
#>  $ ryXhat    : Named num [1:5] -0.223 -0.128 0.102 0.457 0.173
#>   ..- attr(*, "names")= chr [1:5] "gender" "race" "union" "education" ...
#>  $ Sigmahat  : num [1:6, 1:6] 62.352 -0.882 -0.363 0.295 10.143 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>   .. ..$ : chr [1:6] "wages" "gender" "race" "union" ...
#>  $ SigmaXhat : num [1:5, 1:5] 0.25019 0.00779 -0.01626 -0.04425 -0.13217 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:5] "gender" "race" "union" "education" ...
#>   .. ..$ : chr [1:5] "gender" "race" "union" "education" ...
#>  $ sigmayXhat: Named num [1:5] -0.882 -0.363 0.295 10.143 15.948
#>   ..- attr(*, "names")= chr [1:5] "gender" "race" "union" "education" ...
#>  $ sigma2Xhat: Named num [1:5] 0.25 0.13 0.134 7.918 136.022
#>   ..- attr(*, "names")= chr [1:5] "gender" "race" "union" "education" ...
#>  $ sigma2yhat: Named num 62.4
#>   ..- attr(*, "names")= chr "wages"
#>  $ sigmaXhat : Named num [1:5] 0.5 0.36 0.366 2.814 11.663
#>   ..- attr(*, "names")= chr [1:5] "gender" "race" "union" "education" ...
#>  $ sigmayhat : Named num 7.9
#>   ..- attr(*, "names")= chr "wages"
#>  $ sigma2hat : Named num [1:6] 62.352 0.25 0.13 0.134 7.918 ...
#>   ..- attr(*, "names")= chr [1:6] "wages" "gender" "race" "union" ...
#>  $ sigmahat  : Named num [1:6] 7.896 0.5 0.36 0.366 2.814 ...
#>   ..- attr(*, "names")= chr [1:6] "wages" "gender" "race" "union" ...
#>  $ skewhat   : Named num [1:6] 1.8503 0.0109 1.9319 1.8668 -0.2907 ...
#>   ..- attr(*, "names")= chr [1:6] "wages" "gender" "race" "union" ...
#>  $ kurthat   : Named num [1:6] 4.86 -2 1.73 1.49 2.99 ...
#>   ..- attr(*, "names")= chr [1:6] "wages" "gender" "race" "union" ...
#>  $ mardiahat : Named num [1:10] 12.5 2692.4 1 2698.7 56 ...
#>   ..- attr(*, "names")= chr [1:10] "b1" "b1.chisq" "b1.correction" "b1.chisq.corrected" ...
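
The Mardia output printed above reports b1.chisq, b1.df, and b2.z alongside b1 and b2. The following is a minimal base R sketch of the textbook statistics, assuming the maximum likelihood covariance matrix (divisor n). It is not the function's own code: mardia_sketch is a hypothetical helper, the data are simulated, and the small-sample correction (b1.correction) is not covered. The relations n * b1 / 6 and (b2 - d(d + 2)) / sqrt(8d(d + 2) / n), with d the number of variables, reproduce the printed b1.chisq and b2.z from the printed b1 and b2, and d(d + 1)(d + 2) / 6 gives the printed b1.df (4 for two variables, 56 for six).

```r
# Hypothetical helper: Mardia's multivariate skewness (b1) and kurtosis (b2).
mardia_sketch <- function(data) {
  data <- as.matrix(data)
  n <- nrow(data)
  d <- ncol(data)
  centered <- scale(data, center = TRUE, scale = FALSE)
  # Maximum likelihood covariance (divisor n); an assumption of this sketch.
  S <- crossprod(centered) / n
  # n by n matrix with entries (x_i - xbar)' S^{-1} (x_j - xbar).
  D <- centered %*% solve(S) %*% t(centered)
  b1 <- sum(D^3) / n^2
  b2 <- sum(diag(D)^2) / n
  # Asymptotic test statistics.
  b1_chisq <- n * b1 / 6
  b1_df <- d * (d + 1) * (d + 2) / 6
  b2_z <- (b2 - d * (d + 2)) / sqrt(8 * d * (d + 2) / n)
  c(
    b1 = b1,
    b1.chisq = b1_chisq,
    b1.df = b1_df,
    b1.p = pchisq(b1_chisq, df = b1_df, lower.tail = FALSE),
    b2 = b2,
    b2.z = b2_z,
    b2.p = 2 * pnorm(abs(b2_z), lower.tail = FALSE)
  )
}

# Simulated two-variable data (hypothetical): one normal, one skewed column.
set.seed(123)
sim <- cbind(a = rnorm(200), b = rexp(200))
out_sketch <- mardia_sketch(sim)

# Arithmetic check against the printed simple regression output
# (n = 1289, d = 2, b1 = 3.875204, b2 = 12.64530).
n_doc <- 1289
d_doc <- 2
b1_chisq_doc <- n_doc * 3.875204 / 6
b2_z_doc <- (12.64530 - d_doc * (d_doc + 2)) /
  sqrt(8 * d_doc * (d_doc + 2) / n_doc)
round(c(b1.chisq = b1_chisq_doc, b2.z = b2_z_doc), 3)
```

Forming the full n by n matrix D keeps the sketch short but needs memory on the order of n squared, so a blockwise computation may be preferable for large samples.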