Chapter 4 Regression Models
Quantitative Analysis for Management, 11e (Render)
1) In regression, an independent variable is sometimes called a response variable.
Answer: FALSE
Diff: 2
Topic: INTRODUCTION
2) One purpose of regression is to understand the relationship between variables.
Answer: TRUE
Diff: 1
Topic: INTRODUCTION
3) One purpose of regression is to predict the value of one variable based on the other variable.
Answer: TRUE
Diff: 1
Topic: INTRODUCTION
4) The variable to be predicted is the dependent variable.
Answer: TRUE
Diff: 1
Topic: INTRODUCTION
5) The dependent variable is also called the response variable.
Answer: TRUE
Diff: 2
Topic: INTRODUCTION
6) A scatter diagram is a graphical depiction of the relationship between the dependent and independent variables.
Answer: TRUE
Diff: 1
Topic: SCATTER DIAGRAMS
7) In a scatter diagram, the dependent variable is typically plotted on the horizontal axis.
Answer: FALSE
Diff: 2
Topic: SCATTER DIAGRAMS
8) There is no relationship between variables unless the data points lie in a straight line.
Answer: FALSE
Diff: 2
Topic: SCATTER DIAGRAMS
9) In any regression model, there is an implicit assumption that a relationship exists between the variables.
Answer: TRUE
Diff: 2
Topic: SIMPLE LINEAR REGRESSION
10) In regression, there is random error that can be predicted.
Answer: FALSE
Diff: 2
Topic: SIMPLE LINEAR REGRESSION
11) Estimates of the slope, intercept, and error of a regression model are found from sample data.
Answer: TRUE
Diff: 2
Topic: SIMPLE LINEAR REGRESSION
12) Error is the difference between the actual value and the predicted value.
Answer: TRUE
Diff: 1
Topic: SIMPLE LINEAR REGRESSION
13) The regression line minimizes the sum of the squared errors.
Answer: TRUE
Diff: 2
Topic: SIMPLE LINEAR REGRESSION
14) In regression, a dependent variable is sometimes called a predictor variable.
Answer: FALSE
Diff: 2
Topic: INTRODUCTION
15) Summing the error values in a regression model is misleading because negative errors cancel out positive errors.
Answer: TRUE
Diff: 2
Topic: SIMPLE LINEAR REGRESSION
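Illustration for items 13 and 15: the following is a minimal Python sketch, not part of the original test bank. It fits a least squares line to a small made-up data set (the x and y values are hypothetical) and shows that the raw errors sum to essentially zero while the sum of squared errors does not.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Error = actual value - predicted value.
errors = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sum_errors = sum(errors)            # positive and negative errors cancel
sse = sum(e ** 2 for e in errors)   # squared errors cannot cancel

print(f"sum of errors = {sum_errors:.10f}")
print(f"SSE           = {sse:.4f}")
```

Because the plain sum cancels, it says nothing about fit; the squared errors are what the regression line minimizes.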
16) The SST measures the total variability in the dependent variable about the regression line.
Answer: FALSE
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
17) The SSE measures the total variability in the independent variable about the regression line.
Answer: FALSE
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
18) The SSR indicates how much of the total variability in the dependent variable is explained by the regression model.
Answer: TRUE
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
19) The coefficient of determination takes on values between −1 and +1.
Answer: FALSE
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
20) The coefficient of determination gives the proportion of the variability in the dependent variable that is explained by the regression equation.
Answer: TRUE
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
21) The correlation coefficient has values between −1 and +1.
Answer: TRUE
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
22) Errors are also called residuals.
Answer: TRUE
Diff: 2
Topic: USING COMPUTER SOFTWARE FOR REGRESSION
23) The regression model assumes the error terms are dependent.
Answer: FALSE
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
24) The regression model assumes the errors are normally distributed.
Answer: TRUE
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
25) The errors in a regression model are assumed to have an increasing mean.
Answer: FALSE
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
26) The errors in a regression model are assumed to have zero variance.
Answer: FALSE
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
27) If the assumptions of regression have been met, errors plotted against the independent variable will typically show patterns.
Answer: FALSE
Diff: 1
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
28) Often, a plot of the residuals will highlight any glaring violations of the assumptions.
Answer: TRUE
Diff: 1
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
29) The error standard deviation is estimated by MSE.
Answer: FALSE
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
30) The standard error of the estimate is also called the variance of the regression.
Answer: FALSE
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
31) An F-test is used to determine if there is a relationship between the dependent and independent variables.
Answer: TRUE
Diff: 2
Topic: TESTING THE MODEL FOR SIGNIFICANCE
32) The null hypothesis in the F-test is that there is a linear relationship between the X and Y variables.
Answer: FALSE
Diff: 2
Topic: TESTING THE MODEL FOR SIGNIFICANCE
33) If the significance level for the F-test is high enough, there is a relationship between the dependent and independent variables.
Answer: FALSE
Diff: 2
Topic: TESTING THE MODEL FOR SIGNIFICANCE
34) When the significance level is small enough in the F-test, we can reject the null hypothesis that there is no linear relationship.
Answer: TRUE
Diff: 2
Topic: TESTING THE MODEL FOR SIGNIFICANCE
35) The coefficients of each independent variable in a multiple regression model represent slopes.
Answer: TRUE
Diff: 2
Topic: MULTIPLE REGRESSION ANALYSIS
36) For statistical tests of significance about the coefficients, the null hypothesis is that the slope is 1.
Answer: FALSE
Diff: 2
Topic: MULTIPLE REGRESSION ANALYSIS
37) Both the p-value for the F-test and r2 can be interpreted the same with multiple regression models as they are with simple linear models.
Answer: TRUE
Diff: 2
Topic: MULTIPLE REGRESSION ANALYSIS
38) The multiple regression model includes several dependent variables.
Answer: FALSE
Diff: 1
Topic: MULTIPLE REGRESSION ANALYSIS
39) In regression, a binary variable is also called an indicator variable.
Answer: TRUE
Diff: 2
Topic: BINARY OR DUMMY VARIABLES
40) Another name for a dummy variable is a binary variable.
Answer: TRUE
Diff: 2
Topic: BINARY OR DUMMY VARIABLES
41) The best model is a statistically significant model with a high r-square and few variables.
Answer: TRUE
Diff: 2
Topic: MODEL BUILDING
42) The adjusted r2 will always increase as additional variables are added to the model.
Answer: FALSE
Diff: 2
Topic: MODEL BUILDING
43) The value of r2 can never decrease when more variables are added to the model.
Answer: TRUE
Diff: 2
Topic: MODEL BUILDING
44) A variable should be added to the model regardless of the impact (increase or decrease) on the adjusted r2 value.
Answer: FALSE
Diff: 2
Topic: MODEL BUILDING
45) Multicollinearity exists when a variable is correlated to other variables.
Answer: TRUE
Diff: 1
Topic: MODEL BUILDING
46) If multicollinearity exists, then individual interpretation of the variables is questionable, but the overall model is still good for prediction purposes.
Answer: TRUE
Diff: 2
Topic: MODEL BUILDING
47) Transformations may be used when nonlinear relationships exist between variables.
Answer: TRUE
Diff: 2
Topic: NONLINEAR REGRESSION
48) A high correlation always implies that one variable is causing a change in the other variable.
Answer: FALSE
Diff: 2
Topic: CAUTIONS AND PITFALLS IN REGRESSION ANALYSIS
49) A dummy variable can be assigned up to three values.
Answer: FALSE
Diff: 2
Topic: BINARY OR DUMMY VARIABLES
50) Which of the following statements is true regarding a scatter diagram?
- A) It provides very little information about the relationship between the regression variables.
- B) It is a plot of the independent and dependent variables.
- C) It is a line chart of the independent and dependent variables.
- D) It has a value between -1 and +1.
- E) It gives the percent of variation in the dependent variable that is explained by the independent variable.
Answer: B
Diff: 2
Topic: SCATTER DIAGRAMS
51) The random error in a regression equation
- A) is the predicted error.
- B) includes both positive and negative terms.
- C) will sum to a large positive number.
- D) is used to estimate the accuracy of the slope.
- E) is maximized in a least squares regression model.
Answer: B
Diff: 2
Topic: SIMPLE LINEAR REGRESSION
52) Which of the following statements is/are not true about regression models?
- A) Estimates of the slope are found from sample data.
- B) The regression line minimizes the sum of the squared errors.
- C) The error is found by subtracting the actual data value from the predicted data value.
- D) The dependent variable is the explanatory variable.
- E) The intercept coefficient is not typically interpreted.
Answer: C, D
Diff: 2
Topic: SIMPLE LINEAR REGRESSION
53) Which of the following equalities is correct?
- A) SST = SSR + SSE
- B) SSR = SST + SSE
- C) SSE = SSR + SST
- D) SST = SSC + SSR
- E) SSE = Actual Value – Predicted Value
Answer: A
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
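Illustration for item 53: a short Python sketch, not part of the original test bank, that checks SST = SSR + SSE numerically for a least squares line. The data values are hypothetical and serve only to exercise the identity.

```python
# Hypothetical data, chosen only for illustration.
x = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
y = [5.0, 7.5, 9.0, 13.5, 14.0, 18.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variability about the mean
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # variability about the regression line
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # variability explained by the model

r2 = ssr / sst  # coefficient of determination, equal to 1 - SSE/SST

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}, r2 = {r2:.4f}")
```

The identity holds for a least squares fit with an intercept; it is not guaranteed for an arbitrary line through the data.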
54) The sum of squared error (SSE) is
- A) a measure of the total variation in Y about the mean.
- B) a measure of the total variation in X about the mean.
- C) a measure of the variation in Y about the regression line.
- D) a measure of the variation in X about the regression line.
- E) None of the above
Answer: C
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
55) If computing a causal linear regression model of Y = a + bX and the resultant r2 is very near zero, then one would be able to conclude that
- A) Y = a + bX is a good forecasting method.
- B) Y = a + bX is not a good forecasting method.
- C) a multiple linear regression model is a good forecasting method for the data.
- D) a multiple linear regression model is not a good forecasting method for the data.
- E) None of the above
Answer: B
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
56) Which of the following statements is true about r2?
- A) It is also called the coefficient of correlation.
- B) It is also called the coefficient of determination.
- C) It represents the percent of variation in X that is explained by Y.
- D) It represents the percent of variation in the error that is explained by Y.
- E) It ranges in value from -1 to +1.
Answer: B
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
57) The coefficient of determination resulting from a particular regression analysis was 0.85. What was the slope of the regression line?
- A) 0.85
- B) -0.85
- C) 0.922
- D) There is insufficient information to answer the question.
- E) None of the above
Answer: D
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
58) The diagram below illustrates data with a
- A) negative correlation coefficient.
- B) zero correlation coefficient.
- C) positive correlation coefficient.
- D) correlation coefficient equal to +1.
- E) None of the above
Answer: C
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
59) The correlation coefficient resulting from a particular regression analysis was 0.25. What was the coefficient of determination?
- A) 0.5
- B) -0.5
- C) 0.0625
- D) There is insufficient information to answer the question.
- E) None of the above
Answer: C
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
AACSB: Analytic Skills
60) The coefficient of determination resulting from a particular regression analysis was 0.85. What was the correlation coefficient, assuming a positive linear relationship?
- A) 0.5
- B) -0.5
- C) 0.922
- D) There is insufficient information to answer the question.
- E) None of the above
Answer: C
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
AACSB: Analytic Skills
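Illustration for items 59 and 60: a small Python sketch, not part of the original test bank, of the conversions between the correlation coefficient r and the coefficient of determination r2 in simple linear regression. The function names are hypothetical. Squaring r discards its sign, so going back from r2 to r requires the sign of the slope.

```python
import math


def r_squared_from_r(r):
    """Coefficient of determination from the correlation coefficient."""
    return r ** 2


def r_from_r_squared(r2, slope):
    """Correlation coefficient from r2; the sign is taken from the slope."""
    return math.copysign(math.sqrt(r2), slope)


print(r_squared_from_r(0.25))                     # item 59
print(round(r_from_r_squared(0.85, 1.0), 3))      # item 60, positive relationship
print(round(r_from_r_squared(0.657, -14.6), 3))   # item 88, slope of -14.6
```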
61) Which of the following is an assumption of the regression model?
- A) The errors are independent.
- B) The errors are not normally distributed.
- C) The errors have a standard deviation of zero.
- D) The errors have an irregular variance.
- E) The errors follow a cone pattern.
Answer: A
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
62) Which of the following is not an assumption of the regression model?
- A) The errors are independent.
- B) The errors are normally distributed.
- C) The errors have constant variance.
- D) The mean of the errors is zero.
- E) The errors should have a standard deviation equal to one.
Answer: E
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
63) In a good regression model the residual plot shows
- A) a cone pattern.
- B) an arched pattern.
- C) a random pattern.
- D) an increasing pattern.
- E) a decreasing pattern.
Answer: C
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
64) The problem of nonconstant error variance is detected in residual analysis by which of the following?
- A) a cone pattern
- B) an arched pattern
- C) a random pattern
- D) an increasing pattern
- E) a decreasing pattern
Answer: A
Diff: 3
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
65) The problem of a nonlinear relationship is detected in residual analysis by which of the following?
- A) a cone pattern
- B) an arched pattern
- C) a random pattern
- D) an increasing pattern
- E) a decreasing pattern
Answer: B
Diff: 3
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
66) The mean square error (MSE) is
- A) denoted by s.
- B) denoted by k.
- C) the SSE divided by the number of observations.
- D) the SSE divided by the degrees of freedom.
- E) None of the above
Answer: D
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
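Illustration for items 29, 30, and 66: a minimal Python sketch, not part of the original test bank, of how the mean squared error and the standard error of the estimate relate. It assumes the usual degrees of freedom n - k - 1, where n is the number of observations and k the number of independent variables. The SSE, n, and k values are hypothetical.

```python
import math


def mean_squared_error(sse, n, k):
    """MSE = SSE divided by the error degrees of freedom, n - k - 1."""
    df = n - k - 1
    if df <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return sse / df


# Hypothetical values: SSE of 6.875 from 6 observations and 1 independent variable.
mse = mean_squared_error(sse=6.875, n=6, k=1)
s = math.sqrt(mse)  # standard error of the estimate (an estimate of the error standard deviation)

print(f"MSE = {mse:.5f}")
print(f"s   = {s:.5f}")
```

MSE estimates the error variance; its square root s estimates the error standard deviation.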
67) Which of the following represents the underlying linear model for hypothesis testing?
- A) Y = b0 + b1X + ε
- B) Y = b0 + b1X
- C) Y = β0 + β1X + ε
- D) Y = β0 + β1X
- E) None of the above
Answer: C
Diff: 2
Topic: TESTING THE MODEL FOR SIGNIFICANCE
68) Which of the following statements is false concerning the hypothesis testing procedure for a regression model?
- A) The F-test statistic is used.
- B) The null hypothesis is that the true slope coefficient is equal to zero.
- C) The null hypothesis is rejected if the adjusted r2 is above the critical value.
- D) An α level must be selected.
- E) The alternative hypothesis is that the true slope coefficient is not equal to zero.
Answer: C
Diff: 2
Topic: TESTING THE MODEL FOR SIGNIFICANCE
69) Suppose that you believe that a cubic relationship exists between the independent variable (of time) and the dependent variable Y. Which of the following would represent a valid linear regression model?
- A) Y = b0 + b1X, where X = time^3
- B) Y = b0 + b1X^3, where X = time
- C) Y = b0 + 3b1X, where X = time^3
- D) Y = b0 + 3b1X, where X = time
- E) Y = b0 + b1X, where X = time^(1/3)
Answer: A
Diff: 3
Topic: NONLINEAR REGRESSION
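Illustration for item 69: a Python sketch, not part of the original test bank, of the transformation idea. Time is cubed first, and an ordinary straight-line regression is then fitted to the transformed variable. The data are hypothetical and noise-free (generated from Y = 2 + 0.5·time^3), so the fit should recover those two coefficients; `fit_line` is a helper name introduced here.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least squares line ys = b0 + b1 * xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


time = [1, 2, 3, 4, 5, 6]
y = [2.0 + 0.5 * t ** 3 for t in time]  # hypothetical cubic relationship

# Transform the independent variable, then fit a model that is linear in X.
x_cubed = [t ** 3 for t in time]
b0, b1 = fit_line(x_cubed, y)

print(f"Y = {b0:.2f} + {b1:.2f} X, where X = time^3")
```

The model stays linear in its coefficients, which is why it counts as a valid linear regression.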
70) A prediction equation for starting salaries (in $1,000s) and SAT scores was performed using simple linear regression. In the regression printout shown below, what can be said about the level of significance for the overall model?
- A) SAT is not a good predictor for starting salary.
- B) The significance level for the intercept indicates the model is not valid.
- C) The significance level for SAT indicates the slope is equal to zero.
- D) The significance level for SAT indicates the slope is not equal to zero.
- E) None of the above
Answer: D
Diff: 2
Topic: TESTING THE MODEL FOR SIGNIFICANCE
71) A prediction equation for sales and payroll was performed using simple linear regression. In the regression printout shown below, which of the following statements is/are not true?
- A) Payroll is a good predictor of Sales based on α = 0.05.
- B) There is evidence of a positive linear relationship between Sales and Payroll based on α = 0.05.
- C) Payroll is not a good predictor of Sales based on α = 0.01.
- D) The coefficient of determination is equal to 0.833333.
- E) Payroll is the independent variable.
Answer: D
Diff: 2
Topic: TESTING THE MODEL FOR SIGNIFICANCE
72) A healthcare executive is using regression to predict total revenues. She has decided to include both patient length of stay and insurance type in her model. Insurance type can be grouped into the following categories: Medicare, Medicaid, Managed Care, Self-Pay, and Charity. Which of the following is true?
- A) Insurance type will be represented in the regression model by five binary variables.
- B) Insurance type will be represented in the regression model by six dummy variables.
- C) Insurance type will be represented in the regression model by five dummy variables.
- D) Insurance type will be represented in the regression model by four binary variables.
- E) Neither binary nor dummy variables are necessary for the regression model.
Answer: D
Diff: 2
Topic: BINARY OR DUMMY VARIABLES
73) A healthcare executive is using regression to predict total revenues. She has decided to include both patient length of stay and insurance type in her model. Insurance type can be grouped into three categories: Government-Funded, Private-Pay, and Other. Her model is
- A) Y = b0.
- B) Y = b0 + b1X1.
- C) Y = b0 + b1X1 + b2X2.
- D) Y = b0 + b1X1 + b2X2 + b3X3.
- E) Y = b0 + b1X1 + b2X2 + b3X3 + b4X4.
Answer: D
Diff: 3
Topic: BINARY OR DUMMY VARIABLES
AACSB: Analytic Skills
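Illustration for items 72 and 73: a Python sketch, not part of the original test bank, of why a categorical variable with c categories needs c - 1 binary variables. One category is chosen as the baseline and is represented by all zeros. The function name and the choice of baseline are assumptions made for this example.

```python
def dummy_code(value, categories, baseline):
    """Encode one categorical value as len(categories) - 1 binary (0/1) variables."""
    if value not in categories:
        raise ValueError(f"unknown category: {value}")
    return [1 if value == c else 0 for c in categories if c != baseline]


three = ["Government-Funded", "Private-Pay", "Other"]
five = ["Medicare", "Medicaid", "Managed Care", "Self-Pay", "Charity"]

print(dummy_code("Government-Funded", three, baseline="Other"))  # [1, 0]
print(dummy_code("Private-Pay", three, baseline="Other"))        # [0, 1]
print(dummy_code("Other", three, baseline="Other"))              # [0, 0]
print(len(dummy_code("Medicare", five, baseline="Charity")))     # 4
```

With three insurance categories, the two dummies plus length of stay give the three independent variables in the model for item 73.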
74) A healthcare executive is using regression to predict total revenues. She is deciding whether or not to include both patient length of stay and insurance type in her model. Her first regression model only included patient length of stay. The resulting r2 was .83, with an adjusted r2 of .82 and her level of significance was .003. In the second model, she included both patient length of stay and insurance type. The r2 was .84 and the adjusted r2 was .80 for the second model and the level of significance did not change. Which of the following statements is true?
- A) The second model is a better model.
- B) The first model is a better model.
- C) The r2 increased when additional variables were added because these variables significantly contribute to the prediction of total revenues.
- D) The adjusted r2 always increases when additional variables are added to the model.
- E) None of the above statements are true.
Answer: B
Diff: 2
Topic: MODEL BUILDING
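Illustration for item 74: a Python sketch, not part of the original test bank, of the adjusted r2 formula, adjusted r2 = 1 - (1 - r2)(n - 1)/(n - k - 1). The item does not state the sample size or the number of variables in the second model, so n = 20 and k = 4 below are assumptions chosen only to show how r2 can rise while adjusted r2 falls.

```python
def adjusted_r2(r2, n, k):
    """Adjusted r2 for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


n = 20  # assumed sample size (not given in the item)

first = adjusted_r2(0.83, n, k=1)   # length of stay only
second = adjusted_r2(0.84, n, k=4)  # assumed: length of stay plus three more variables

print(f"first model:  r2 = 0.83, adjusted r2 = {first:.3f}")
print(f"second model: r2 = 0.84, adjusted r2 = {second:.3f}")
```

The small gain in r2 does not offset the penalty for the extra variables, so the adjusted value drops.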
75) The sum of the squares total (SST)
- A) measures the total variability in Y about the mean.
- B) measures the total variability in X about the mean.
- C) measures the variability in Y about the regression line.
- D) measures the variability in X about the regression line.
- E) indicates how much of the total variability in Y is explained by the regression model.
Answer: A
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
76) Which of the following statements provides the best guidance for model building?
- A) If the value of r2 increases as more variables are added to the model, the variables should remain in the model, regardless of the magnitude of increase.
- B) If the value of the adjusted r2 increases as more variables are added to the model, the variables should remain in the model.
- C) If the value of r2 increases as more variables are added to the model, the variables should not remain in the model, regardless of the magnitude of the increase.
- D) If the value of the adjusted r2 increases as more variables are added to the model, the variables should not remain in the model.
- E) None of the statements provide accurate guidance.
Answer: B
Diff: 2
Topic: MODEL BUILDING
77) An automated process to systematically add or delete independent variables from a regression model is known as
- A) nonlinear transformations.
- B) multicollinearity.
- C) multiple regression.
- D) least squares method.
- E) None of the above
Answer: E
Diff: 2
Topic: MODEL BUILDING
78) Which of the following is not a common pitfall of regression?
- A) If the assumptions are not met, the statistical tests may not be valid.
- B) Nonlinear relationships cannot be incorporated.
- C) Two variables may be highly correlated to one another but one is not causing the other to change.
- D) Concluding that a statistically significant relationship implies practical value.
- E) Using a regression equation beyond the range of X is very questionable.
Answer: B
Diff: 2
Topic: CAUTIONS AND PITFALLS IN REGRESSION ANALYSIS
79) The condition of an independent variable being correlated to one or more other independent variables is referred to as
- A) multicollinearity.
- B) statistical significance.
- C) linearity.
- D) nonlinearity.
- E) The significance level for the F-test is not valid.
Answer: A
Diff: 2
Topic: MODEL BUILDING
80) Which of the following is true regarding a regression model with multicollinearity, a high r2 value, and a low F-test significance level?
- A) The model is not a good prediction model.
- B) The high value of r2 is due to the multicollinearity.
- C) The interpretation of the coefficients is valuable.
- D) The significance level tests for the coefficients are not valid.
- E) The significance level for the F-test is not valid.
Answer: D
Diff: 2
Topic: MODEL BUILDING
81) An air conditioning and heating repair firm conducted a study to determine if the average outside temperature could be used to predict the cost of an electric bill for homes during the winter months in Houston, Texas. The resulting regression equation was:
Y = 227.19 – 1.45X, where Y = monthly cost, X = average outside air temperature
(a) If the temperature averaged 48 degrees during December, what is the forecasted cost of December’s electric bill?
(b) If the temperature averaged 38 degrees during January, what is the forecasted cost of January’s electric bill?
Answer:
(a) $227.19 – $1.45(48) = $157.59
(b) $227.19 – $1.45(38) = $172.09
Diff: 2
Topic: SIMPLE LINEAR REGRESSION
AACSB: Analytic Skills
82) A large school district is reevaluating its teachers’ salaries. They have decided to use regression analysis to predict mean teachers’ salaries at each elementary school. The researcher uses years of experience to predict salary. The resulting equation was:
Y = 23,313.22 + 1,210.89X, where Y = salary and X = years of experience
(a) If a teacher has 10 years of experience, what is the forecasted salary?
(b) If a teacher has 5 years of experience, what is the forecasted salary?
(c) Based on this equation, for every additional year of service, a teacher could expect his or her salary to increase by how much?
Answer:
(a) $23,313.22 + $1,210.89(10) = $35,422.12
(b) $23,313.22 + $1,210.89(5) = $29,367.67
(c) $1,210.89
Diff: 2
Topic: SIMPLE LINEAR REGRESSION
AACSB: Analytic Skills
83) An air conditioning and heating repair firm conducted a study to determine if the average outside temperature, thickness of the insulation, and age of the heating equipment could be used to predict the electric bill for a home during the winter months in Houston, Texas. The resulting regression equation was:
Y = 256.89 – 1.45X1 – 11.26X2 + 6.10X3, where Y = monthly cost, X1 = average temperature, X2 = insulation thickness, and X3 = age of heating equipment
(a) If December has an average temperature of 45 degrees and the heater is 2 years old with insulation that is 6 inches thick, what is the forecasted monthly electric bill?
(b) If January has an average temperature of 40 degrees and the heating equipment is 12 years old with insulation that is 2 inches thick, what is the forecasted monthly electric bill?
Answer:
(a) $256.89 – 1.45(45) – 11.26(6) + 6.10(2) = $136.28
(b) $256.89 – 1.45(40) – 11.26(2) + 6.10(12) = $249.57
Diff: 2
Topic: MULTIPLE REGRESSION ANALYSIS
AACSB: Analytic Skills
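Illustration for item 83: a Python sketch, not part of the original test bank, that evaluates the multiple regression equation for both scenarios. The function and argument names are hypothetical; the coefficients are those given in the item.

```python
def forecast_bill(avg_temp, insulation_inches, heater_age_years):
    """Monthly cost from Y = 256.89 - 1.45*X1 - 11.26*X2 + 6.10*X3."""
    return (
        256.89
        - 1.45 * avg_temp
        - 11.26 * insulation_inches
        + 6.10 * heater_age_years
    )


december = forecast_bill(avg_temp=45, insulation_inches=6, heater_age_years=2)
january = forecast_bill(avg_temp=40, insulation_inches=2, heater_age_years=12)

print(f"(a) December: ${december:.2f}")
print(f"(b) January:  ${january:.2f}")
```

Keyword arguments guard against swapping insulation thickness (X2) and equipment age (X3), which appear in a different order in the question text than in the equation.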
84) A large school district is reevaluating its teachers’ salaries. They have decided to use regression analysis to predict mean teacher salaries at each elementary school. The researcher uses years of experience to predict salary. The raw data is given in the table below. The resulting equation was:
Y = 19389.21 + 1330.12X, where Y = salary and X = years of experience
Salary | Yrs Exp |
$24,265.00 | 8 |
$27,140.00 | 5 |
$22,195.00 | 2 |
$37,950.00 | 15 |
$32,890.00 | 11 |
$40,250.00 | 14 |
$36,800.00 | 9 |
$30,820.00 | 6 |
$44,390.00 | 21 |
$24,955.00 | 2 |
$18,055.00 | 1 |
$23,690.00 | 7 |
$48,070.00 | 20 |
$42,205.00 | 16 |
(a) Develop a scatter diagram.
(b) What is the correlation coefficient?
(c) What is the coefficient of determination?
Answer:
(a)
(b) .936
(c) .877
Diff: 2
Topic: VARIOUS
AACSB: Analytic Skills
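Illustration for item 84: a Python sketch, not part of the original test bank, that computes the slope, intercept, correlation coefficient, and coefficient of determination from the table using the standard sums-of-squares formulas. The variable names are assumptions.

```python
import math

years = [8, 5, 2, 15, 11, 14, 9, 6, 21, 2, 1, 7, 20, 16]
salary = [24265, 27140, 22195, 37950, 32890, 40250, 36800,
          30820, 44390, 24955, 18055, 23690, 48070, 42205]

n = len(years)
x_bar = sum(years) / n
y_bar = sum(salary) / n

sxx = sum((x - x_bar) ** 2 for x in years)
syy = sum((y - y_bar) ** 2 for y in salary)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, salary))

slope = sxy / sxx
intercept = y_bar - slope * x_bar
r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
r_squared = r ** 2              # coefficient of determination

print(f"Y = {intercept:.2f} + {slope:.2f}X")
print(f"r = {r:.3f}, r2 = {r_squared:.3f}")
```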
85) A large international sales organization has collected data on the number of employees and the annual gross sales during the last 7 years.
# of employees | sales (in $000s) |
1975 | 100 |
2010 | 110 |
2005 | 122 |
2020 | 130 |
2030 | 139 |
2031 | 152 |
2050 | 164 |
2100 | ? |
(a) Develop a scatter diagram.
(b) Determine the correlation coefficient.
(c) Determine the coefficient of determination.
(d) Determine the least squares trend line.
(e) Determine the predicted value of sales for 2100 employees.
Answer:
(a)
(b) .937
(c) .878
(d) Y = -1663.03 + .889X1
(e) 203.87 or $203,870
Diff: 2
Topic: VARIOUS
AACSB: Analytic Skills
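Illustration for item 85: a Python sketch, not part of the original test bank, that fits the least squares line and predicts sales for 2100 employees. The answer key's 203.87 comes from plugging the rounded coefficients into the equation. Carrying full precision gives a somewhat higher figure, so both are computed. Variable names are assumptions.

```python
employees = [1975, 2010, 2005, 2020, 2030, 2031, 2050]
sales = [100, 110, 122, 130, 139, 152, 164]  # in $000s

n = len(employees)
x_bar = sum(employees) / n
y_bar = sum(sales) / n

sxx = sum((x - x_bar) ** 2 for x in employees)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(employees, sales))

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Prediction using coefficients rounded as in the answer key.
pred_rounded = round(b0, 2) + round(b1, 3) * 2100
# Prediction using unrounded coefficients.
pred_full = b0 + b1 * 2100

print(f"Y = {b0:.2f} + {b1:.3f}X")
print(f"rounded coefficients:   {pred_rounded:.2f}")
print(f"unrounded coefficients: {pred_full:.2f}")
```

Note that 2100 employees lies outside the observed range (1975 to 2050), so this prediction is an extrapolation and should be treated with caution.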
86) A large department store has collected the following monthly data on lost sales revenue due to theft and the number of security guard hours on duty:
Lost Sales Revenue ($000s) | Total Security Guard hours |
1.0 | 600 |
1.4 | 630 |
1.9 | 1000 |
2.0 | 1200 |
1.8 | 950 |
2.1 | 1300 |
2.3 | 1350 |
(a) Determine the least squares regression equation.
(b) Using the results of part (a), find the estimated lost sales revenues if the total number of security guard hours is 800.
(c) Calculate the coefficient of correlation.
(d) Calculate the coefficient of determination.
Answer:
(a) least squares equation: Y = .3780 + .0014X
(b) Y = .3780 + 0.0014(800) = 1.498
(c) coefficient of correlation = 0.955
(d) coefficient of determination = 0.9121
Diff: 2
Topic: VARIOUS
AACSB: Analytic Skills
87) Bob White is conducting research on monthly expenses for medical care, including over-the-counter medicine. His dependent variable is monthly expenses for medical care while his independent variable is number of family members. Below is his Excel output.
(a) What is the prediction equation?
(b) Based on his model, each additional family member increases the predicted costs by how much?
(c) Based on the significance F-test, is this model a good prediction equation?
(d) What percent of the variation in medical expenses is explained by the size of the family?
(e) Can the null hypothesis that the slope is zero be rejected? Why or why not?
(f) What is the value of the correlation coefficient?
Answer:
(a) Y = 110.47 + 16.83X
(b) $16.83
(c) Yes, because the p-value for the F-test is low.
(d) 48.3% of the variation in medical expenses is explained by family size.
(e) The null hypothesis can be rejected. Based on the low p-value, the slope is not equal to zero.
(f) 0.695
Diff: 2
Topic: USING COMPUTER SOFTWARE FOR REGRESSION
AACSB: Analytic Skills
88) Consider the regression model Y = 389.10 – 14.6X. If the r2 value is 0.657, what is the correlation coefficient?
Answer: -(0.657)^(1/2) = -0.811
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
AACSB: Analytic Skills
89) Bob White is conducting research on monthly expenses for medical care, including over-the-counter medicine. His dependent variable is monthly expenses for medical care, while his independent variables are number of family members and insurance type (government funded, private insurance, and other). He has coded insurance type as follows:
X2 = 1 if government funded, X3 = 1 if private insurance
Below is his Excel output.
(a) What is the prediction equation?
(b) Based on the significance F-test, is this model a good prediction equation?
(c) What percent of the variation in medical expenses is explained by the independent variables?
(d) Based on his model, what are the predicted monthly expenses for a family of four with private insurance?
(e) Based on his model, what are the predicted monthly expenses for a family of two with government funded insurance?
(f) Based on his model, what are the predicted monthly expenses for a family of five with no insurance?
Answer:
(a) Y = 144.91 + 11.63X1 – 13.70 X2 – 9.11X3
(b) The model is a good prediction equation because the significance level for the F-test is low.
(c) 73.79 percent of the variation in medical expenses is explained by family size and insurance type.
(d) $182.32
(e) $154.47
(f) $203.06
Diff: 3
Topic: VARIOUS
AACSB: Analytic Skills
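Illustration for item 89: a Python sketch, not part of the original test bank, that evaluates the prediction equation from answer (a) with the dummy coding stated in the item. The function name and the category labels are hypothetical. A family with no insurance is treated as the baseline "other" category (X2 = X3 = 0), which is what the answer key's figure for part (f) implies.

```python
def predict_expenses(family_size, insurance):
    """Y = 144.91 + 11.63*X1 - 13.70*X2 - 9.11*X3 with dummy-coded insurance type."""
    x2 = 1 if insurance == "government" else 0  # X2 = 1 if government funded
    x3 = 1 if insurance == "private" else 0     # X3 = 1 if private insurance
    return 144.91 + 11.63 * family_size - 13.70 * x2 - 9.11 * x3


d = predict_expenses(4, "private")
e = predict_expenses(2, "government")
f = predict_expenses(5, "other")

print(f"(d) ${d:.2f}")
print(f"(e) ${e:.2f}")
print(f"(f) ${f:.2f}")
```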
90) A large school district is reevaluating its teachers’ salaries. They have decided to use regression analysis to predict mean teacher salaries at each elementary school. The researcher would like to examine the significance of the following quadratic model for predicting salary based on years of experience.
Y = β0 + β1X1 + β2X2 + ε, where X1 = Yrs Exp and X2 = (Yrs Exp)^2
Salary | Yrs Exp |
$24,265.00 | 8 |
$27,140.00 | 5 |
$22,195.00 | 2 |
$37,950.00 | 15 |
$32,890.00 | 11 |
$40,250.00 | 14 |
$36,800.00 | 9 |
$30,820.00 | 6 |
$44,390.00 | 21 |
$24,955.00 | 2 |
$18,055.00 | 1 |
$23,690.00 | 7 |
$48,070.00 | 20 |
$42,205.00 | 16 |
(a) What is the adjusted r2?
(b) What is the prediction equation?
Answer:
(a) 0.855
(b) Y = 19015.29 + 1437.49X1 – 4.98 X2
Diff: 3
Topic: NONLINEAR REGRESSION
AACSB: Analytic Skills
91) An air conditioning and heating repair firm conducted a study to determine if the average outside temperature could be used to predict the cost of an electric bill for a home during the winter months in Richmond, VA. It was determined that a quadratic model could be used and the following prediction equation was established:
Y = $1557.76 – 55.05X1 + 0.56X2
where Y = monthly cost, X1 = average temperature, and X2 = (average temperature)^2
(a) If December has an average temperature of 43 degrees what is the forecasted monthly electric bill?
(b) If January has an average temperature of 40 degrees what is the forecasted monthly electric bill?
Answer:
(a) $1557.76 – 55.05(43) + 0.56(43^2) = $226.05
(b) $1557.76 – 55.05(40) + 0.56(40^2) = $251.76
Diff: 3
Topic: NONLINEAR REGRESSION
AACSB: Analytic Skills
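Illustration for item 91: a Python sketch, not part of the original test bank, that evaluates the quadratic prediction equation. The second variable is simply the square of the first, so a single temperature input is enough. The function name is hypothetical.

```python
def forecast_bill(avg_temp):
    """Monthly cost from Y = 1557.76 - 55.05*X1 + 0.56*X2, with X2 = X1 squared."""
    x1 = avg_temp
    x2 = avg_temp ** 2
    return 1557.76 - 55.05 * x1 + 0.56 * x2


december = forecast_bill(43)
january = forecast_bill(40)

print(f"(a) December: ${december:.2f}")
print(f"(b) January:  ${january:.2f}")
```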
92) A large school district is reevaluating its teachers’ salaries. They have decided to use regression analysis to predict mean teacher salaries at each elementary school. The researcher has come up with the following prediction equation:
Y = $18012.24 + 1432.37X1 – 4.07X2, where X1 = Yrs Exp and X2 = (Yrs Exp)^2
(a) If a teacher has 7 years of experience, what is the expected salary?
(b) If a teacher has 10 years of experience, what is the expected salary?
Answer:
(a) $18012.24 + 1432.37(7) – 4.07(7^2) = $27,839.40
(b) $18012.24 + 1432.37(10) – 4.07(10^2) = $31,928.94
Diff: 3
Topic: NONLINEAR REGRESSION
AACSB: Analytic Skills
93) In regression, the variable to be predicted is called the ________ variable.
Answer: dependent or response
Diff: 1
Topic: INTRODUCTION
94) Explain the purposes of regression models.
Answer: to understand the relationship between variables and to predict the value of one variable using the value of another variable
Diff: 2
Topic: INTRODUCTION
95) Describe the purpose and structure of a scatter diagram.
Answer: A scatter diagram is a graphical method used to investigate the relationship between variables. Normally, the independent variable is plotted on the horizontal axis, and the dependent variable is plotted on the vertical axis.
Diff: 2
Topic: SCATTER DIAGRAMS
96) In regression, the X variable is known as the ________ variable.
Answer: independent or explanatory or predictor.
Diff: 2
Topic: INTRODUCTION and SIMPLE LINEAR REGRESSION
97) What is the formula for r2?
Answer: SSR / SST or 1 – SSE/SST
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
98) If every point lies on the regression line, r2 = ________.
Answer: 1
Diff: 1
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
99) The regression line minimizes the sum of the ________.
Answer: squared errors
Diff: 2
Topic: SIMPLE LINEAR REGRESSION
100) The ________ measures the total variability in Y about the mean.
Answer: SST or sum of squares total
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
101) The ________ measures the variability in Y about the regression line.
Answer: SSE or sum of squared error
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
102) The ________ indicates how much total variability in Y is explained by the regression model.
Answer: SSR or sum of squares due to regression
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
103) SST = SSR + ________.
Answer: SSE or sum of squared errors
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
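The three sums of squares and their relationship to r² can be illustrated with a minimal sketch. The five data points below are made up for illustration, and the helper names `fit_line` and `sums_of_squares` are assumptions, not part of the text.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sums_of_squares(x, y):
    """Return (SST, SSR, SSE) for the least-squares line through (x, y)."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variability about the mean
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # variability explained by the line
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # variability about the line
    return sst, ssr, sse


# Illustrative data only (not from the text).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

sst, ssr, sse = sums_of_squares(x, y)
r_squared = ssr / sst  # equivalently 1 - sse / sst
print(f"SST={sst:.2f}  SSR={ssr:.2f}  SSE={sse:.2f}  r^2={r_squared:.2f}")
```

For this data the pieces add up as SST = SSR + SSE, and both forms of the r² formula give the same value.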
104) Explain what r² is.
Answer: It is a value between 0 and +1 and measures the proportion of variability in Y that is explained by the regression equation.
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
105) What can be said about an r² value of 0.96?
Answer: This indicates that 96% of the variation in the dependent variable is being explained by the regression equation and there is a strong correlation between the variables.
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
106) Explain what the correlation coefficient is.
Answer: It is a value between -1 and +1 that measures the strength of the linear relationship between the X and Y variables.
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
107) What can be said about a correlation coefficient of -1?
Answer: This is a perfect negative correlation where all of the values lie in a straight line. The negative value indicates that as X increases in value, Y decreases in value.
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
108) What can be said about a correlation coefficient of +1?
Answer: This is a perfect positive correlation where all of the values lie in a straight line. The positive value indicates that as X increases in value, so does Y.
Diff: 2
Topic: MEASURING THE FIT OF THE REGRESSION MODEL
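The two extreme cases of the correlation coefficient can be reproduced with a small sketch. The data sets are made up so that every point lies exactly on a straight line, and the function name `correlation` is an illustrative choice.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Illustrative data only: both Y series are exact linear functions of X.
x = [1, 2, 3, 4]
y_falling = [10, 8, 6, 4]  # Y decreases as X increases
y_rising = [3, 5, 7, 9]    # Y increases as X increases

r_falling = correlation(x, y_falling)
r_rising = correlation(x, y_rising)
print(r_falling, r_rising)
```

The sign of the result matches the sign of the slope, and the magnitude reaches 1 only because no point deviates from the line.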
109) Another name for the “Multiple R” that is given in Excel is ________.
Answer: correlation coefficient or coefficient of correlation
Diff: 2
Topic: USING COMPUTER SOFTWARE FOR REGRESSION
110) Describe a residual plot.
Answer: A residual plot is a plot of the error terms against the independent variable. Residual plots that show patterns often indicate violations of the assumptions of the regression model.
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
111) The standard deviation of the regression is also called ________.
Answer: the standard error of the estimate
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
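The residuals and the standard error of the estimate can be computed as in the sketch below, which assumes the usual definition s = sqrt(SSE / (n - k - 1)) with k independent variables. The data and the fitted line Y = 2.2 + 0.6X are illustrative only, and the helper names are hypothetical.

```python
import math


def residuals(y, y_hat):
    """Error terms: actual value minus predicted value."""
    return [yi - yh for yi, yh in zip(y, y_hat)]


def standard_error(errors, k):
    """Standard error of the estimate for a model with k independent variables."""
    n = len(errors)
    sse = sum(e ** 2 for e in errors)
    return math.sqrt(sse / (n - k - 1))


# Illustrative data only; b0 and b1 are the least-squares estimates for it.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

y_hat = [b0 + b1 * xi for xi in x]
errors = residuals(y, y_hat)
s = standard_error(errors, k=1)

# Pairs of (X, residual) are what a residual plot would display.
for xi, e in zip(x, errors):
    print(f"X={xi}  residual={e:+.2f}")
print(f"standard error of the estimate = {s:.4f}")
```

With a least-squares fit the residuals sum to zero, so it is their pattern against X, not their total, that reveals problems with the assumptions.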
112) List the four assumptions of the regression model.
Answer: (1) The errors are independent. (2) The errors are normally distributed. (3) The errors have a mean of zero. (4) The errors have a constant variance (regardless of the value of X).
Diff: 2
Topic: ASSUMPTIONS OF THE REGRESSION MODEL
113) What is the difference between simple linear regression models and multiple regression models?
Answer: Multiple regression models have more than one independent variable.
Diff: 1
Topic: MULTIPLE REGRESSION ANALYSIS
114) To include qualitative data in regression analysis, you must first create a ________ variable.
Answer: dummy or binary or indicator
Diff: 2
Topic: BINARY OR DUMMY VARIABLES
115) For each qualitative variable, the number of dummy variables must equal ________ the number of categories of the qualitative variable.
Answer: one less than
Diff: 2
Topic: BINARY OR DUMMY VARIABLES
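The "one less than" rule can be seen in a minimal sketch that encodes a qualitative variable by hand. The variable `condition`, its three categories, and the function name `make_dummies` are hypothetical, and treating the alphabetically first category as the baseline is just one possible convention.

```python
def make_dummies(values):
    """Encode a qualitative variable as 0/1 columns, omitting one baseline category."""
    categories = sorted(set(values))
    baseline, others = categories[0], categories[1:]
    dummies = {c: [1 if v == c else 0 for v in values] for c in others}
    return baseline, dummies


# Hypothetical qualitative variable with three categories.
condition = ["good", "excellent", "mint", "good", "mint"]

baseline, dummies = make_dummies(condition)
print("baseline category:", baseline)
for name, column in dummies.items():
    print(name, column)
```

Three categories yield two dummy columns; an observation in the baseline category is identified by zeros in both.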
116) As more variables are added to the model, what happens to the r² value?
Answer: It usually increases; it cannot decrease.
Diff: 2
Topic: MODEL BUILDING
117) Discuss the relationship between r² and adjusted r².
Answer: The value of r² can never decrease when more variables are added to the model; however, the adjusted r² may decrease when more variables are added to the model. This occurs because the adjusted r² takes into account the number of independent variables in the model.
Diff: 2
Topic: MODEL BUILDING
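A short sketch can show how the two measures diverge. It assumes the common definition adjusted r² = 1 - [SSE / (n - k - 1)] / [SST / (n - 1)], and the SSE, SST, and sample-size figures are hypothetical numbers chosen for illustration.

```python
def r_squared(sse, sst):
    """Proportion of variability in Y explained by the model."""
    return 1 - sse / sst


def adjusted_r_squared(sse, sst, n, k):
    """r-squared adjusted for the number of independent variables k."""
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))


# Hypothetical figures: adding a third variable lowers SSE only slightly.
n, sst = 20, 100.0
r2_two = r_squared(30.0, sst)
r2_three = r_squared(29.5, sst)
adj_two = adjusted_r_squared(30.0, sst, n, k=2)
adj_three = adjusted_r_squared(29.5, sst, n, k=3)

print(f"k=2: r^2={r2_two:.4f}  adjusted r^2={adj_two:.4f}")
print(f"k=3: r^2={r2_three:.4f}  adjusted r^2={adj_three:.4f}")
```

Here r² rises a little with the extra variable while the adjusted value falls, which suggests the added variable is not pulling its weight.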
118) When the independent variables are correlated with each other, ________ is said to exist.
Answer: multicollinearity or collinearity
Diff: 2
Topic: MODEL BUILDING
119) With a nonlinear relationship, a ________ is necessary to turn a nonlinear model into a linear model.
Answer: transformation
Diff: 2
Topic: NONLINEAR REGRESSION
120) List four pitfalls of regression.
Answer: Any four of the following: (1) If the assumptions are not met, the statistical tests may not be valid. (2) Correlation does not necessarily mean causation. (3) If multicollinearity is present, the model is still good for prediction, but interpretation of the individual coefficients is questionable. (4) Using the model outside the range of the observed X values is questionable. (5) The regression equation should not be used to predict a value of Y when X is zero if zero lies outside the range of the observed X values. (6) Concluding from the F-test that a linear relationship is helpful in predicting Y does not mean that it is the best relationship. (7) A statistically significant relationship does not necessarily have practical value.
Diff: 2
Topic: CAUTIONS AND PITFALLS IN REGRESSION ANALYSIS