BA 303 - BUSINESS STATISTICS – Weeks 6 & 7: Learning Unit 1: Discussion 1: Chapter 8, Problem 14
Using the Excel file Weddings, apply the Excel Regression tool using the wedding cost as the dependent variable and the couple’s income as the independent variable, only for those weddings paid for by the bride and groom. Interpret all key regression results, hypothesis tests, and confidence intervals in the output.
The estimated regression model is:
Wedding cost = 480.4165 + 0.3734*Couple's Income
The regression output shows that the model (and therefore its single independent variable) is not significant at the 0.05 significance level, because the p-value of 0.0684 is greater than 0.05. The independent variable explains 39.82% of the variation in the dependent variable (R-squared = 0.3982).
Because this is a simple regression model with only one independent variable, the test of the overall model is equivalent to the test of the slope. The couple's income is therefore not a significant predictor of wedding cost at the 0.05 level.
The slope estimate is 0.3734, which implies that each $1 increase in the couple's income is associated with an average increase of about $0.37 in wedding cost. The 95% confidence interval for the slope is (-0.0369, 0.7836), so we can be 95% confident that the true slope falls within this interval. Because the interval contains 0, a population slope of 0 is plausible, which again indicates that the independent variable is not significant at the 0.05 level.
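The interval reasoning can be checked with a few lines of Python. This is only a sketch using the numbers reported in the output; the helper name `contains_zero` is my own and is not part of the Excel Regression tool.

```python
# Values copied from the regression output quoted in this post.
slope = 0.3734
ci_lower, ci_upper = -0.0369, 0.7836


def contains_zero(lower, upper):
    """Return True when the interval [lower, upper] includes 0."""
    return lower <= 0 <= upper


# A t-based confidence interval is centered on the point estimate,
# so the midpoint should reproduce the slope (up to rounding).
midpoint = (ci_lower + ci_upper) / 2
half_width = (ci_upper - ci_lower) / 2

print("midpoint close to slope:", abs(midpoint - slope) < 0.001)
print("interval contains 0:", contains_zero(ci_lower, ci_upper))
```

When the interval contains 0, the two-sided test of the slope at the matching significance level (0.05 here) fails to reject a slope of 0, which agrees with the p-value of 0.0684.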
The residual plots are given below.
The assumptions of linearity and homoscedasticity are not valid.
The scatterplot of wedding cost against the couple's income shows a random pattern rather than a linear trend, so the assumption of linearity is considered invalid. In the plot of the standardized residuals against wedding cost, the points are not concentrated near the zero line but are scattered irregularly, so the assumption of homoscedasticity is also considered invalid.
The assumption of normality is not valid.
If the probability plot of a variable is close to a straight line and shows no other pattern, the variable is approximately normally distributed. However, the probability plot of the residuals departs from a straight line, so the normality assumption is not supported.
A standardized residual is considered an outlier if it is less than -2 or greater than 2, that is, more than 2 standard deviations from zero (standardized residuals have a standard deviation of 1). All of the standardized residuals fall within ±2, so there are no outliers in the data.
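The ±2 rule can be written as a small Python function. This is a minimal sketch: the function name `flag_outliers` and the sample values are hypothetical and are not the residuals from the Weddings file.

```python
def flag_outliers(standard_residuals, threshold=2.0):
    """Return the residuals whose absolute value exceeds the threshold."""
    return [r for r in standard_residuals if abs(r) > threshold]


# Made-up values for illustration only; these are NOT the Weddings residuals.
example = [-1.4, 0.3, 1.9, -0.7, 2.6]
print(flag_outliers(example))  # [2.6]
```

An empty list from this check corresponds to the conclusion above that no outliers are present.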
If a couple makes $80,000 together, their predicted budget is:
Wedding cost = 480.4165 + 0.3734*80000 = $30,352.42
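The same arithmetic can be reproduced in Python. This is a sketch that plugs the reported coefficients into the fitted equation; the function name `predict_wedding_cost` is my own.

```python
# Coefficients from the estimated regression model in this post.
INTERCEPT = 480.4165
SLOPE = 0.3734


def predict_wedding_cost(income):
    """Predicted wedding cost (in dollars) for a given couple's income."""
    return INTERCEPT + SLOPE * income


print(f"${predict_wedding_cost(80_000):,.2f}")  # $30,352.42
```

Because the slope is not significant at the 0.05 level, this prediction should be read as a rough point estimate.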