Thursday, May 24, 2018

Linear regression: Adjusted variables plot (AVPLOT)

The dependent variable is Y (weight), and the explanatory variables are Xi (height) and Xj (age).
First, regress the dependent variable Y on all explanatory variables except Xj. Here, that means regressing weight on height (model I).
Call:
lm(formula = wt ~ ht, data = d)

Residuals:
     Min       1Q   Median       3Q      Max
-2.44736 -0.55708  0.01925  0.49941  2.73594

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -8.694768   0.427398  -20.34   <2e-16 ***
ht           0.235050   0.005257   44.71   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9017 on 183 degrees of freedom
Multiple R-squared:  0.9161,    Adjusted R-squared:  0.9157
F-statistic:  1999 on 1 and 183 DF,  p-value: < 2.2e-16

Second, regress the explanatory variable Xj on all other explanatory variables. Here, that means regressing age on height (model II).
Call:
lm(formula = age ~ ht, data = d)

Residuals:
    Min      1Q  Median      3Q     Max
-18.722  -4.168  -1.222   3.467  20.625

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -74.01432    3.11317  -23.77   <2e-16 ***
ht            1.29736    0.03829   33.88   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.568 on 183 degrees of freedom
Multiple R-squared:  0.8625,    Adjusted R-squared:  0.8618
F-statistic:  1148 on 1 and 183 DF,  p-value: < 2.2e-16

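Finally, regress the residuals of model I on the residuals of model II (model III). Plotting these two sets of residuals against each other gives the AVPLOT, and the fitted line below is the line drawn through it.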
Call:
lm(formula = residual_WtHt ~ residual_ageHt)

Residuals:
     Min       1Q   Median       3Q      Max
-2.48498 -0.53548  0.01508  0.51986  2.77917

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)    2.261e-15  6.624e-02   0.000    1.000
residual_ageHt 5.368e-03  1.014e-02   0.529    0.597

Residual standard error: 0.901 on 183 degrees of freedom
Multiple R-squared:  0.001529,    Adjusted R-squared:  -0.003927
F-statistic: 0.2802 on 1 and 183 DF,  p-value: 0.5972
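
The three steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data are synthetic stand-ins generated here (the original data frame `d` is not available), and `simple_ols` is a hypothetical helper written for this sketch, not a library function.

```python
import random

def simple_ols(x, y):
    """Least-squares fit of y = a + b*x. Returns (a, b, residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, residuals

# Synthetic stand-in for `d` (assumption: NOT the data used in the post).
rng = random.Random(0)
ht = [rng.uniform(50, 110) for _ in range(185)]
age = [-74 + 1.3 * h + rng.gauss(0, 6.5) for h in ht]
wt = [-8.3 + 0.23 * h + 0.005 * a + rng.gauss(0, 0.9)
      for h, a in zip(ht, age)]

# Model I: weight on height; keep the residuals.
_, _, residual_wt_ht = simple_ols(ht, wt)

# Model II: age on height; keep the residuals.
_, _, residual_age_ht = simple_ols(ht, age)

# Model III: residuals of model I on residuals of model II.
av_intercept, av_slope, _ = simple_ols(residual_age_ht, residual_wt_ht)

print("AV intercept:", av_intercept)
print("AV slope:    ", av_slope)
```

Because both sets of residuals have mean zero, the intercept of model III is zero up to rounding error, which matches the 2.261e-15 reported above.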

The same slope can also be obtained directly with multiple linear regression (MLR) of weight on age and height.
Call:
lm(formula = wt ~ age + ht, data = d)

Residuals:
     Min       1Q   Median       3Q      Max
-2.48498 -0.53548  0.01508  0.51986  2.77917

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -8.297442   0.865929  -9.582   <2e-16 ***
age          0.005368   0.010169   0.528    0.598
ht           0.228086   0.014205  16.057   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9035 on 182 degrees of freedom
Multiple R-squared:  0.9163,    Adjusted R-squared:  0.9154
F-statistic: 995.8 on 2 and 182 DF,  p-value: < 2.2e-16

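Both the slope from the AVPLOT regression (model III) and the age coefficient from MLR are 5.368e-03, and the two fits have identical residuals. The standard errors differ slightly (1.014e-02 versus 0.010169) because model III divides the same residual sum of squares by 183 degrees of freedom, while MLR uses 182.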
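
The equality of the two slopes is not specific to this data set. A minimal check in plain Python is sketched below, assuming synthetic data (not the original `d`) and hypothetical helper names (`center`, `dot`, `residualize`). The MLR coefficients are obtained from the 2x2 normal equations on centered variables, solved with Cramer's rule.

```python
import random

def center(v):
    """Subtract the mean, so the intercept drops out of the algebra."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residualize(y, x):
    """Residuals of centered y after simple regression on centered x."""
    slope = dot(x, y) / dot(x, x)
    return [yi - slope * xi for xi, yi in zip(x, y)]

# Synthetic stand-in data (assumption: NOT the data used in the post).
rng = random.Random(1)
ht_raw = [rng.uniform(50, 110) for _ in range(185)]
age_raw = [-74 + 1.3 * h + rng.gauss(0, 6.5) for h in ht_raw]
wt_raw = [-8.3 + 0.23 * h + 0.005 * a + rng.gauss(0, 0.9)
          for h, a in zip(ht_raw, age_raw)]
ht, age, wt = center(ht_raw), center(age_raw), center(wt_raw)

# Route 1: added-variable regression (residual on residual).
# The residuals have mean zero, so the slope is a plain ratio of sums.
r_wt = residualize(wt, ht)
r_age = residualize(age, ht)
av_slope = dot(r_age, r_wt) / dot(r_age, r_age)

# Route 2: MLR of wt on age and ht via the centered normal equations.
s_aa, s_hh, s_ah = dot(age, age), dot(ht, ht), dot(age, ht)
s_ay, s_hy = dot(age, wt), dot(ht, wt)
det = s_aa * s_hh - s_ah ** 2
mlr_age = (s_ay * s_hh - s_ah * s_hy) / det
mlr_ht = (s_aa * s_hy - s_ah * s_ay) / det

print("AV slope:       ", av_slope)
print("MLR age coef:   ", mlr_age)
print("MLR height coef:", mlr_ht)
```

The two printed age slopes agree up to floating-point rounding, which is the same agreement seen between model III and the MLR fit above.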


