Multiple linear regression: introduction through modeling
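Consider a multiple linear regression model (MvLR) with response Y and predictors X1, ..., Xp:
Y = β0 + β1X1 + β2X2 + ... + βpXp + ε
or, for the two predictors used in the case study below,
wt = β0 + β1·age + β2·ht + ε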
Here is a case study of multiple linear regression, in which weight (wt) is regressed on age and height (ht):
> lm2<-lm(wt~age+ht, data=d)
> summary(lm2)
Call:
lm(formula = wt ~ age + ht, data = d)
Residuals:
     Min       1Q   Median       3Q      Max
-2.48498 -0.53548  0.01508  0.51986  2.77917
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -8.297442   0.865929  -9.582   <2e-16 ***
age          0.005368   0.010169   0.528    0.598
ht           0.228086   0.014205  16.057   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9035 on 182 degrees of freedom
Multiple R-squared: 0.9163, Adjusted R-squared: 0.9154
F-statistic: 995.8 on 2 and 182 DF, p-value: < 2.2e-16
The best-fit value of β1 (the slope for age) was 0.005368. This means that, on average and holding height constant, a one-unit increase in age is associated with an increase in weight of 0.005368 g. The 95% CI ranges from -0.01469629 to 0.02543229 g (182 degrees of freedom; critical value t(0.975, 182) = 1.973). Because the 95% CI includes zero, which agrees with the p-value of 0.598, the observed data do not provide evidence that age is associated with weight once height is accounted for.
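The interval for age can be reproduced from the printed estimate and standard error alone. The following is a minimal sketch in base R; the variable names (est_age, se_age, df_resid, t_crit, ci_age) are chosen here for illustration, and the numbers are copied from the summary output above.

```r
# Values copied from the summary(lm2) output
est_age  <- 0.005368   # estimate for age
se_age   <- 0.010169   # standard error of that estimate
df_resid <- 182        # residual degrees of freedom

# Two-sided 95% critical value of the t distribution
t_crit <- qt(0.975, df = df_resid)

# CI = estimate -/+ critical value * standard error
ci_age <- est_age + c(-1, 1) * t_crit * se_age

print(round(t_crit, 3))
print(ci_age)
```

If the fitted object lm2 is still in the workspace, confint(lm2, level = 0.95) should give the intervals for all coefficients at once.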
The best-fit value of β2 (the slope for height) was 0.228086. This means that, on average and holding age constant, a one-unit increase in height is associated with an increase in weight of 0.228086 g. The 95% CI ranges from 0.2000583 to 0.2561137 g (182 degrees of freedom; critical value t(0.975, 182) = 1.973). Because the 95% CI lies entirely above zero, the positive association between height and weight is statistically significant at the 5% level.
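The confidence intervals and the t tests in the summary table tell the same story: a 95% CI excludes zero exactly when the absolute t value exceeds the critical value. A small sketch, assuming only the printed estimates and standard errors (the vector names est, se, t_val, p_val and excludes_zero are illustrative):

```r
# Estimates and standard errors copied from the summary(lm2) output
est <- c(age = 0.005368, ht = 0.228086)
se  <- c(age = 0.010169, ht = 0.014205)
df_resid <- 182

# t statistic: how many standard errors the estimate lies from zero
t_val <- est / se

# Two-sided p-value from the t distribution
p_val <- 2 * pt(-abs(t_val), df = df_resid)

# The 95% CI excludes zero exactly when |t| exceeds the critical value
excludes_zero <- abs(t_val) > qt(0.975, df = df_resid)

print(round(t_val, 3))
print(signif(p_val, 3))
print(excludes_zero)
```

Because the inputs are rounded as printed, the recomputed t and p values may differ from the table in the last digit.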
The best-fit value of β0 (the intercept) was -8.297442. This is the predicted average value of Y when all the X values are zero. Here it is negative, which is impossible for a real weight. This suggests that the regression model should not be relied on outside the range of the observed data.
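To see why the intercept should not be read literally, the fitted equation can be evaluated by hand. The sketch below uses only the printed coefficients; predict_wt and ht_zero are hypothetical names introduced for illustration, not part of the original analysis.

```r
# Coefficients copied from the summary(lm2) output
b0 <- -8.297442   # intercept
b1 <-  0.005368   # slope for age
b2 <-  0.228086   # slope for ht

# Fitted mean weight for a given age and height (hypothetical helper)
predict_wt <- function(age, ht) b0 + b1 * age + b2 * ht

# At age = 0 and ht = 0 the prediction is just the intercept
print(predict_wt(age = 0, ht = 0))

# Height at which the fitted weight crosses zero when age = 0
ht_zero <- -b0 / b2
print(ht_zero)
```

With age set to zero, any height below ht_zero (roughly 36 units) gives a negative fitted weight, which illustrates how the fitted line can stop being meaningful away from the observed data.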