# Chapter 4 Regression Models

Quantitative Analysis for Management, 11e (Render)

1) In regression, an independent variable is sometimes called a response variable.

Diff: 2

Topic:  INTRODUCTION

2) One purpose of regression is to understand the relationship between variables.

Diff: 1

Topic:  INTRODUCTION

3) One purpose of regression is to predict the value of one variable based on the other variable.

Diff: 1

Topic:  INTRODUCTION

4) The variable to be predicted is the dependent variable.

Diff: 1

Topic:  INTRODUCTION

5) The dependent variable is also called the response variable.

Diff: 2

Topic:  INTRODUCTION

6) A scatter diagram is a graphical depiction of the relationship between the dependent and independent variables.

Diff: 1

Topic:  SCATTER DIAGRAMS

7) In a scatter diagram, the dependent variable is typically plotted on the horizontal axis.

Diff: 2

Topic:  SCATTER DIAGRAMS

8) There is no relationship between variables unless the data points lie in a straight line.

Diff: 2

Topic:  SCATTER DIAGRAMS

9) In any regression model, there is an implicit assumption that a relationship exists between the variables.

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

10) In regression, there is random error that can be predicted.

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

11) Estimates of the slope, intercept, and error of a regression model are found from sample data.

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

12) Error is the difference in the actual value and the predicted value.

Diff: 1

Topic:  SIMPLE LINEAR REGRESSION

13) The regression line minimizes the sum of the squared errors.

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

14) In regression, a dependent variable is sometimes called a predictor variable.

Diff: 2

Topic:  INTRODUCTION

15) Summing the error values in a regression model is misleading because negative errors cancel out positive errors.

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

16) The SST measures the total variability in the dependent variable about the regression line.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

17) The SSE measures the total variability in the independent variable about the regression line.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

18) The SSR indicates how much of the total variability in the dependent variable is explained by the regression model.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

19) The coefficient of determination takes on values between −1 and +1.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

20) The coefficient of determination gives the proportion of the variability in the dependent variable that is explained by the regression equation.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

21) The correlation coefficient has values between −1 and +1.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

22) Errors are also called residuals.

Diff: 2

Topic:  USING COMPUTER SOFTWARE FOR REGRESSION

23) The regression model assumes the error terms are dependent.

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

24) The regression model assumes the errors are normally distributed.

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

25) The errors in a regression model are assumed to have an increasing mean.

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

26) The errors in a regression model are assumed to have zero variance.

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

27) If the assumptions of regression have been met, errors plotted against the independent variable will typically show patterns.

Diff: 1

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

28) Often, a plot of the residuals will highlight any glaring violations of the assumptions.

Diff: 1

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

29) The error standard deviation is estimated by MSE.

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

30) The standard error of the estimate is also called the variance of the regression.

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

31) An F-test is used to determine if there is a relationship between the dependent and independent variables.

Diff: 2

Topic:  TESTING THE MODEL FOR SIGNIFICANCE

32) The null hypothesis in the F-test is that there is a linear relationship between the X and Y variables.

Diff: 2

Topic:  TESTING THE MODEL FOR SIGNIFICANCE

33) If the significance level for the F-test is high enough, there is a relationship between the dependent and independent variables.

Diff: 2

Topic:  TESTING THE MODEL FOR SIGNIFICANCE

34) When the significance level is small enough in the F-test, we can reject the null hypothesis that there is no linear relationship.

Diff: 2

Topic:  TESTING THE MODEL FOR SIGNIFICANCE

35) The coefficients of each independent variable in a multiple regression model represent slopes.

Diff: 2

Topic:  MULTIPLE REGRESSION ANALYSIS

36) For statistical tests of significance about the coefficients, the null hypothesis is that the slope is 1.

Diff: 2

Topic:  MULTIPLE REGRESSION ANALYSIS

37) Both the p-value for the F-test and r² can be interpreted the same with multiple regression models as they are with simple linear models.

Diff: 2

Topic:  MULTIPLE REGRESSION ANALYSIS

38) The multiple regression model includes several dependent variables.

Diff: 1

Topic:  MULTIPLE REGRESSION ANALYSIS

39) In regression, a binary variable is also called an indicator variable.

Diff: 2

Topic:  BINARY OR DUMMY VARIABLES

40) Another name for a dummy variable is a binary variable.

Diff: 2

Topic:  BINARY OR DUMMY VARIABLES

41) The best model is a statistically significant model with a high r-square and few variables.

Diff: 2

Topic:  MODEL BUILDING

Diff: 2

Topic:  MODEL BUILDING

43) The value of r² can never decrease when more variables are added to the model.

Diff: 2

Topic:  MODEL BUILDING

44) A variable should be added to the model regardless of the impact (increase or decrease) on the adjusted r² value.

Diff: 2

Topic:  MODEL BUILDING

45) Multicollinearity exists when a variable is correlated to other variables.

Diff: 1

Topic:  MODEL BUILDING

46) If multicollinearity exists, then individual interpretation of the variables is questionable, but the overall model is still good for prediction purposes.

Diff: 2

Topic:  MODEL BUILDING

47) Transformations may be used when nonlinear relationships exist between variables.

Diff: 2

Topic:  NONLINEAR REGRESSION

48) A high correlation always implies that one variable is causing a change in the other variable.

Diff: 2

Topic:  CAUTIONS AND PITFALLS IN REGRESSION ANALYSIS

49) A dummy variable can be assigned up to three values.

Diff: 2

Topic:  BINARY OR DUMMY VARIABLES

50) Which of the following statements is true regarding a scatter diagram?

1. A) It provides very little information about the relationship between the regression variables.
2. B) It is a plot of the independent and dependent variables.
3. C) It is a line chart of the independent and dependent variables.
4. D) It has a value between -1 and +1.
5. E) It gives the percent of variation in the dependent variable that is explained by the independent variable.

Diff: 2

Topic:  SCATTER DIAGRAMS

51) The random error in a regression equation

1. A) is the predicted error.
2. B) includes both positive and negative terms.
3. C) will sum to a large positive number.
4. D) is used to estimate the accuracy of the slope.
5. E) is maximized in a least squares regression model.

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

52) Which of the following statements is/are not true about regression models?

1. A) Estimates of the slope are found from sample data.
2. B) The regression line minimizes the sum of the squared errors.
3. C) The error is found by subtracting the actual data value from the predicted data value.
4. D) The dependent variable is the explanatory variable.
5. E) The intercept coefficient is not typically interpreted.

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

53) Which of the following equalities is correct?

1. A) SST = SSR + SSE
2. B) SSR = SST + SSE
3. C) SSE = SSR + SST
4. D) SST = SSC + SSR
5. E) SSE = Actual Value – Predicted Value

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

54) The sum of squared error (SSE) is

1. A) a measure of the total variation in Y about the mean.
2. B) a measure of the total variation in X about the mean.
3. C) a measure in the variation of Y about the regression line.
4. D) a measure in the variation of X about the regression line.
5. E) None of the above

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

55) If computing a causal linear regression model of Y = a + bX and the resultant r² is very near zero, then one would be able to conclude that

1. A) Y = a + bX is a good forecasting method.
2. B) Y = a + bX is not a good forecasting method.
3. C) a multiple linear regression model is a good forecasting method for the data.
4. D) a multiple linear regression model is not a good forecasting method for the data.
5. E) None of the above

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

56) Which of the following statements is true about r²?

1. A) It is also called the coefficient of correlation.
2. B) It is also called the coefficient of determination.
3. C) It represents the percent of variation in X that is explained by Y.
4. D) It represents the percent of variation in the error that is explained by Y.
5. E) It ranges in value from -1 to +1.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

57) The coefficient of determination resulting from a particular regression analysis was 0.85.  What was the slope of the regression line?

1. A) 0.85
2. B) -0.85
3. C) 0.922
4. D) There is insufficient information to answer the question.
5. E) None of the above

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

58) The diagram below illustrates data with a

1. A) negative correlation coefficient.
2. B) zero correlation coefficient.
3. C) positive correlation coefficient.
4. D) correlation coefficient equal to +1.
5. E) None of the above

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

59) The correlation coefficient resulting from a particular regression analysis was 0.25.  What was the coefficient of determination?

1. A) 0.5
2. B) -0.5
3. C) 0.0625
4. D) There is insufficient information to answer the question.
5. E) None of the above

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

AACSB:  Analytic Skills

60) The coefficient of determination resulting from a particular regression analysis was 0.85.  What was the correlation coefficient, assuming a positive linear relationship?

1. A) 0.5
2. B) -0.5
3. C) 0.922
4. D) There is insufficient information to answer the question.
5. E) None of the above

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

AACSB:  Analytic Skills

61) Which of the following is an assumption of the regression model?

1. A) The errors are independent.
2. B) The errors are not normally distributed.
3. C) The errors have a standard deviation of zero.
4. D) The errors have an irregular variance.
5. E) The errors follow a cone pattern.

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

62) Which of the following is not an assumption of the regression model?

1. A) The errors are independent.
2. B) The errors are normally distributed.
3. C) The errors have constant variance.
4. D) The mean of the errors is zero.
5. E) The errors should have a standard deviation equal to one.

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

63) In a good regression model the residual plot shows

1. A) a cone pattern.
2. B) an arched pattern.
3. C) a random pattern.
4. D) an increasing pattern.
5. E) a decreasing pattern.

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

64) The problem of nonconstant error variance is detected in residual analysis by which of the following?

1. A) a cone pattern
2. B) an arched pattern
3. C) a random pattern
4. D) an increasing pattern
5. E) a decreasing pattern

Diff: 3

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

65) The problem of a nonlinear relationship is detected in residual analysis by which of the following?

1. A) a cone pattern
2. B) an arched pattern
3. C) a random pattern
4. D) an increasing pattern
5. E) a decreasing pattern

Diff: 3

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

66) The mean square error (MSE) is

1. A) denoted by s.
2. B) denoted by k.
3. C) the SSE divided by the number of observations.
4. D) the SSE divided by the degrees of freedom.
5. E) None of the above

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

67) Which of the following represents the underlying linear model for hypothesis testing?

1. A) Y = b0 + b1 X + ε
2. B) Y = b0 + b1 X
3. C) Y = β0 + β1 X + ε
4. D) Y = β0 + β1 X
5. E) None of the above

Diff: 2

Topic:  TESTING THE MODEL FOR SIGNIFICANCE

68) Which of the following statements is false concerning the hypothesis testing procedure for a regression model?

1. A) The F-test statistic is used.
2. B) The null hypothesis is that the true slope coefficient is equal to zero.
3. C) The null hypothesis is rejected if the adjusted r² is above the critical value.
4. D) An α level must be selected.
5. E) The alternative hypothesis is that the true slope coefficient is not equal to zero.

Diff: 2

Topic:  TESTING THE MODEL FOR SIGNIFICANCE

69) Suppose that you believe that a cubic relationship exists between the independent variable (of time) and the dependent variable Y.  Which of the following would represent a valid linear regression model?

1. A) Y = b0 + b1 X, where X = time³
2. B) Y = b0 + b1 X³, where X = time
3. C) Y = b0 + 3b1 X, where X = time³
4. D) Y = b0 + 3b1 X, where X = time
5. E) Y = b0 + b1 X, where X = time^(1/3)

Diff: 3

Topic:  NONLINEAR REGRESSION

70) A prediction equation for starting salaries (in \$1,000s) and SAT scores was performed using simple linear regression. In the regression printout shown below, what can be said about the level of significance for the overall model?

1. A) SAT is not a good predictor for starting salary.
2. B) The significance level for the intercept indicates the model is not valid.
3. C) The significance level for SAT indicates the slope is equal to zero.
4. D) The significance level for SAT indicates the slope is not equal to zero.
5. E) None of the above

Diff: 2

Topic:  TESTING THE MODEL FOR SIGNIFICANCE

71) A prediction equation for sales and payroll was performed using simple linear regression. In the regression printout shown below, which of the following statements is/are not true?

1. A) Payroll is a good predictor of Sales based on α = 0.05.
2. B) There is evidence of a positive linear relationship between Sales and Payroll based on α = 0.05.
3. C) Payroll is not a good predictor of Sales based on α = 0.01.
4. D) The coefficient of determination is equal to 0.833333.
5. E) Payroll is the independent variable.

Diff: 2

Topic:  TESTING THE MODEL FOR SIGNIFICANCE

72) A healthcare executive is using regression to predict total revenues. She has decided to include both patient length of stay and insurance type in her model. Insurance type can be grouped into the following categories: Medicare, Medicaid, Managed Care, Self-Pay, and Charity. Which of the following is true?

1. A) Insurance type will be represented in the regression model by five binary variables.
2. B) Insurance type will be represented in the regression model by six dummy variables.
3. C) Insurance type will be represented in the regression model by five dummy variables.
4. D) Insurance type will be represented in the regression model by four binary variables.
5. E) Neither binary nor dummy variables are necessary for the regression model.

Diff: 2

Topic:  BINARY OR DUMMY VARIABLES

73) A healthcare executive is using regression to predict total revenues. She has decided to include both patient length of stay and insurance type in her model. Insurance type can be grouped into three categories: Government-Funded, Private-Pay, and Other. Her model is

1. A) Y = b0.
2. B) Y = b0 + b1 X1.
3. C) Y = b0 + b1 X1 + b2 X2.
4. D) Y = b0 + b1 X1 + b2 X2 + b3 X3.
5. E) Y = b0 + b1 X1 + b2 X2 + b3 X3 + b4 X4.

Diff: 3

Topic:  BINARY OR DUMMY VARIABLES

AACSB:  Analytic Skills

74) A healthcare executive is using regression to predict total revenues. She is deciding whether or not to include both patient length of stay and insurance type in her model. Her first regression model included only patient length of stay. The resulting r² was .83, with an adjusted r² of .82, and her level of significance was .003. In the second model, she included both patient length of stay and insurance type. The r² was .84 and the adjusted r² was .80 for the second model, and the level of significance did not change. Which of the following statements is true?

1. A) The second model is a better model.
2. B) The first model is a better model.
3. C) The r² increased when additional variables were added because these variables significantly contribute to the prediction of total revenues.
5. E) None of the above statements are true.

Diff: 2

Topic:  MODEL BUILDING

75) The sum of the squares total (SST)

1. A) measures the total variability in Y about the mean.
2. B) measures the total variability in X about the mean.
3. C) measures the variability in Y about the regression line.
4. D) measures the variability in X about the regression line.
5. E) indicates how much of the total variability in Y is explained by the regression model.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

76) Which of the following statements provides the best guidance for model building?

1. A) If the value of r² increases as more variables are added to the model, the variables should remain in the model, regardless of the magnitude of increase.
2. B) If the value of the adjusted r² increases as more variables are added to the model, the variables should remain in the model.
3. C) If the value of r² increases as more variables are added to the model, the variables should not remain in the model, regardless of the magnitude of the increase.
4. D) If the value of the adjusted r² increases as more variables are added to the model, the variables should not remain in the model.
5. E) None of the statements provide accurate guidance.

Diff: 2

Topic:  MODEL BUILDING

77) An automated process to systematically add or delete independent variables from a regression model is known as

1. A) nonlinear transformations.
2. B) multicollinearity.
3. C) multiple regression.
4. D) least squares method.
5. E) None of the above

Diff: 2

Topic:  MODEL BUILDING

78) Which of the following is not a common pitfall of regression?

1. A) If the assumptions are not met, the statistical tests may not be valid.
2. B) Nonlinear relationships cannot be incorporated.
3. C) Two variables may be highly correlated to one another but one is not causing the other to change.
4. D) Concluding that a statistically significant relationship implies practical value.
5. E) Using a regression equation beyond the range of X is very questionable.

Diff: 2

Topic:  CAUTIONS AND PITFALLS IN REGRESSION ANALYSIS

79) The condition of an independent variable being correlated to one or more other independent variables is referred to as

1. A) multicollinearity.
2. B) statistical significance.
3. C) linearity.
4. D) nonlinearity.
5. E) The significance level for the F-test is not valid.

Diff: 2

Topic:  MODEL BUILDING

80) Which of the following is true regarding a regression model with multicollinearity, a high r² value, and a low F-test significance level?

1. A) The model is not a good prediction model.
2. B) The high value of r² is due to the multicollinearity.
3. C) The interpretation of the coefficients is valuable.
4. D) The significance level tests for the coefficients are not valid.
5. E) The significance level for the F-test is not valid.

Diff: 2

Topic:  MODEL BUILDING

81) An air conditioning and heating repair firm conducted a study to determine if the average outside temperature could be used to predict the cost of an electric bill for homes during the winter months in Houston, Texas. The resulting regression equation was:

Y = 227.19 – 1.45X, where Y = monthly cost, X = average outside air temperature

(a)           If the temperature averaged 48 degrees during December, what is the forecasted cost of December’s electric bill?

(b)           If the temperature averaged 38 degrees during January, what is the forecasted cost of January’s electric bill?

(a) \$227.19 – \$1.45(48) = \$157.59

(b) \$227.19 – \$1.45(38) = \$172.09

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

AACSB:  Analytic Skills
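
The arithmetic in question 81 can be checked with a short script. This is only a sketch: the function name `forecast_bill` is a hypothetical label chosen for illustration, and the intercept and slope are copied from the regression equation given in the question.

```python
def forecast_bill(avg_temp: float) -> float:
    """Forecast the monthly electric bill from Y = 227.19 - 1.45X (question 81)."""
    intercept = 227.19  # b0 from the question
    slope = -1.45       # b1: each extra degree lowers the forecast by 1.45
    return intercept + slope * avg_temp


december = forecast_bill(48)  # part (a)
january = forecast_bill(38)   # part (b)
print(f"December: {december:.2f}")
print(f"January:  {january:.2f}")
```

Because the slope is negative, the colder month (January) produces the larger forecast.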

82) A large school district is reevaluating its teachers’ salaries. They have decided to use regression analysis to predict mean teachers’ salaries at each elementary school. The researcher uses years of experience to predict salary. The resulting equation was:

Y = 23,313.22 + 1,210.89X, where Y = salary and X = years of experience

(a)           If a teacher has 10 years of experience, what is the forecasted salary?

(b)           If a teacher has 5 years of experience, what is the forecasted salary?

(c)           Based on this equation, for every additional year of service, a teacher could expect his or her salary to increase by how much?

(a) \$23,313.22 + \$1,210.89(10) = \$35,422.12

(b) \$23,313.22 + \$1,210.89(5) = \$29,367.67

(c) \$1,210.89

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

AACSB:  Analytic Skills

83) An air conditioning and heating repair firm conducted a study to determine if the average outside temperature, thickness of the insulation, and age of the heating equipment could be used to predict the electric bill for a home during the winter months in Houston, Texas. The resulting regression equation was:

Y = 256.89 – 1.45X1 – 11.26X2 + 6.10X3, where Y = monthly cost, X1 = average temperature, X2 = insulation thickness, and X3 = age of heating equipment

(a)           If December has an average temperature of 45 degrees and the heater is 2 years old with insulation that is 6 inches thick, what is the forecasted monthly electric bill?

(b)           If January has an average temperature of 40 degrees and the heating equipment is 12 years old with insulation that is 2 inches thick, what is the forecasted monthly electric bill?

(a) \$256.89 – 1.45(45) – 11.26(6) + 6.10(2) = \$136.28

(b) \$256.89 – 1.45(40) – 11.26(2) + 6.10(12) = \$249.57

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

AACSB:  Analytic Skills
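
A multiple regression forecast such as the one in question 83 is a weighted sum of the inputs plus the intercept. The sketch below assumes nothing beyond the coefficients stated in the question; the names `INTERCEPT`, `COEFFICIENTS`, and `forecast_bill_multi` are hypothetical.

```python
INTERCEPT = 256.89
# Slopes from question 83, keyed by the variable they multiply.
COEFFICIENTS = {
    "temperature": -1.45,   # X1: average temperature
    "insulation": -11.26,   # X2: insulation thickness in inches
    "heater_age": 6.10,     # X3: age of heating equipment in years
}


def forecast_bill_multi(temperature: float, insulation: float, heater_age: float) -> float:
    """Return b0 + b1*X1 + b2*X2 + b3*X3 for the question 83 model."""
    inputs = {
        "temperature": temperature,
        "insulation": insulation,
        "heater_age": heater_age,
    }
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in inputs.items())


december_bill = forecast_bill_multi(temperature=45, insulation=6, heater_age=2)
january_bill = forecast_bill_multi(temperature=40, insulation=2, heater_age=12)
print(f"(a) {december_bill:.2f}")
print(f"(b) {january_bill:.2f}")
```

Keyword arguments are used in the calls because the question lists the heater age before the insulation thickness, which makes positional mistakes easy.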

84) A large school district is reevaluating its teachers’ salaries. They have decided to use regression analysis to predict mean teacher salaries at each elementary school. The researcher uses years of experience to predict salary. The raw data is given in the table below. The resulting equation was:

Y = 19389.21 + 1330.12X, where Y = salary and X = years of experience

| Salary | Yrs Exp |
|---|---|
| \$24,265.00 | 8 |
| \$27,140.00 | 5 |
| \$22,195.00 | 2 |
| \$37,950.00 | 15 |
| \$32,890.00 | 11 |
| \$40,250.00 | 14 |
| \$36,800.00 | 9 |
| \$30,820.00 | 6 |
| \$44,390.00 | 21 |
| \$24,955.00 | 2 |
| \$18,055.00 | 1 |
| \$23,690.00 | 7 |
| \$48,070.00 | 20 |
| \$42,205.00 | 16 |

(a) Develop a scatter diagram.

(b) What is the correlation coefficient?

(c) What is the coefficient of determination?

(a)

(b) .936

(c) .877

Diff: 2

Topic:  VARIOUS

AACSB:  Analytic Skills
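
Parts (b) and (c) of question 84 can be reproduced from the raw data. The following is a minimal sketch that computes the Pearson correlation coefficient by hand with the standard library; the helper name `pearson_r` is an assumption for illustration, and the data are the fourteen salary and experience pairs listed in the question.

```python
import math

years = [8, 5, 2, 15, 11, 14, 9, 6, 21, 2, 1, 7, 20, 16]
salary = [24265, 27140, 22195, 37950, 32890, 40250, 36800,
          30820, 44390, 24955, 18055, 23690, 48070, 42205]


def pearson_r(x, y):
    """Correlation coefficient: Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(years, salary)
r_squared = r ** 2  # coefficient of determination for simple linear regression
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

In simple linear regression the coefficient of determination is the square of the correlation coefficient, so both answers come from one calculation.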

85) A large international sales organization has collected data on the number of employees and the annual gross sales during the last 7 years.

| # of employees | sales (in \$000s) |
|---|---|
| 1975 | 100 |
| 2010 | 110 |
| 2005 | 122 |
| 2020 | 130 |
| 2030 | 139 |
| 2031 | 152 |
| 2050 | 164 |
| 2100 | ? |

(a)           Develop a scatter diagram.

(b)           Determine the correlation coefficient.

(c)           Determine the coefficient of determination.

(d)           Determine the least squares trend line.

(e)           Determine the predicted value of sales for 2100 employees.

(a)

(b)           .937

(c)           .878

(d)           Y = -1663.03 + .889X1

(e)           203.87 or \$203,870

Diff: 2

Topic:  VARIOUS

AACSB:  Analytic Skills

86) A large department store has collected the following monthly data on lost sales revenue due to theft and the number of security guard hours on duty:

| Lost Sales Revenue (\$000s) | Total Security Guard Hours |
|---|---|
| 1.0 | 600 |
| 1.4 | 630 |
| 1.9 | 1000 |
| 2.0 | 1200 |
| 1.8 | 950 |
| 2.1 | 1300 |
| 2.3 | 1350 |

(a)           Determine the least squares regression equation.

(b)           Using the results of part (a), find the estimated lost sales revenues if the total number of security guard hours is 800.

(c)           Calculate the coefficient of correlation.

(d)           Calculate the coefficient of determination.

(a)           least squares equation: Y = .3780 + .0014X

(b)           Y = .3780 + 0.0014(800) = 1.498

(c)           coefficient of correlation = 0.955

(d)           coefficient of determination = 0.9121

Diff: 2

Topic:  VARIOUS

AACSB:  Analytic Skills
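
The least squares equation in question 86 can be recomputed from the seven observations. This sketch uses the textbook formulas b1 = Sxy / Sxx and b0 = mean(Y) − b1 · mean(X); the function name `least_squares` is hypothetical.

```python
hours = [600, 630, 1000, 1200, 950, 1300, 1350]   # X: security guard hours
lost_sales = [1.0, 1.4, 1.9, 2.0, 1.8, 2.1, 2.3]  # Y: lost sales revenue in thousands


def least_squares(x, y):
    """Return (intercept, slope) of the line minimizing the sum of squared errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


b0, b1 = least_squares(hours, lost_sales)
estimate_800 = b0 + b1 * 800  # part (b): 800 guard hours
print(f"Y = {b0:.4f} + {b1:.4f}X")
print(f"Estimate at 800 hours: {estimate_800:.3f}")
```

The answer key rounds the slope to .0014 before substituting, so its 1.498 can differ in the third decimal place from an estimate computed with the unrounded slope.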

87) Bob White is conducting research on monthly expenses for medical care, including over-the-counter medicine. His dependent variable is monthly expenses for medical care while his independent variable is number of family members.  Below is his Excel output.

(a)  What is the prediction equation?

(b)  Based on his model, each additional family member increases the predicted costs by how much?

(c)  Based on the significance F-test, is this model a good prediction equation?

(d)  What percent of the variation in medical expenses is explained by the size of the family?

(e)  Can the null hypothesis that the slope is zero be rejected? Why or why not?

(f)  What is the value of the correlation coefficient?

(a)  Y = 110.47 + 16.83X

(b)  \$16.83

(c)  Yes, because the p-value for the F-test is low.

(d)  48.3% of the variation in medical expenses is explained by family size.

(e)  The null hypothesis can be rejected. Based on the low p-value, the slope is not equal to zero.

(f)  0.695

Diff: 2

Topic:  USING COMPUTER SOFTWARE FOR REGRESSION

AACSB:  Analytic Skills

88) Consider the regression model Y = 389.10 – 14.6X.  If the r² value is 0.657, what is the correlation coefficient?

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

AACSB:  Analytic Skills

89) Bob White is conducting research on monthly expenses for medical care, including over-the-counter medicine. His dependent variable is monthly expenses for medical care while his independent variables are number of family members and insurance type (government funded, private insurance, and other). He has coded insurance type as the following:

X2 = 1 if government funded, X3 = 1 if private insurance

Below is his Excel output.

(a)           What is the prediction equation?

(b)           Based on the significance F-test, is this model a good prediction equation?

(c)           What percent of the variation in medical expenses is explained by the independent variables?

(d)           Based on his model, what are the predicted monthly expenses for a family of four with private insurance?

(e)           Based on his model, what are the predicted monthly expenses for a family of two with government funded insurance?

(f)            Based on his model, what are the predicted monthly expenses for a family of five with no insurance?

(a)           Y = 144.91 + 11.63X1 – 13.70 X2 – 9.11X3

(b)           The model is a good prediction equation because the significance level for the F-test is low.

(c)           73.79 percent of the variation in medical expenses is explained by family size and insurance type.

(d)           \$182.32

(e)           \$154.47

(f)            \$203.06

Diff: 3

Topic:  VARIOUS

AACSB:  Analytic Skills
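
The dummy-variable coding in question 89 can be sketched as follows. The function name `predict_expenses` and the string labels for insurance type are hypothetical; the coefficients come from answer (a). The "other" category is the baseline, so both dummies are zero for it, which is how part (f) treats a family with no insurance.

```python
def predict_expenses(family_size: int, insurance: str) -> float:
    """Y = 144.91 + 11.63*X1 - 13.70*X2 - 9.11*X3 from question 89."""
    x2 = 1 if insurance == "government" else 0  # X2 = 1 if government funded
    x3 = 1 if insurance == "private" else 0     # X3 = 1 if private insurance
    return 144.91 + 11.63 * family_size - 13.70 * x2 - 9.11 * x3


part_d = predict_expenses(4, "private")
part_e = predict_expenses(2, "government")
part_f = predict_expenses(5, "other")  # baseline: both dummies are 0
print(f"(d) {part_d:.2f}  (e) {part_e:.2f}  (f) {part_f:.2f}")
```

Three insurance categories need only two dummy variables because the third category is implied when both are zero.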

90) A large school district is reevaluating its teachers’ salaries. They have decided to use regression analysis to predict mean teacher salaries at each elementary school. The researcher would like to examine the significance of the following quadratic model for predicting salary based on years of experience.

Y = β0 + β1X1 + β2X2 + ε, where X1 = Yrs Exp and X2 = (Yrs Exp)²

| Salary | Yrs Exp |
|---|---|
| \$24,265.00 | 8 |
| \$27,140.00 | 5 |
| \$22,195.00 | 2 |
| \$37,950.00 | 15 |
| \$32,890.00 | 11 |
| \$40,250.00 | 14 |
| \$36,800.00 | 9 |
| \$30,820.00 | 6 |
| \$44,390.00 | 21 |
| \$24,955.00 | 2 |
| \$18,055.00 | 1 |
| \$23,690.00 | 7 |
| \$48,070.00 | 20 |
| \$42,205.00 | 16 |

(a) What is the adjusted r²?

(b)           What is the prediction equation?

(a)           0.855

(b)           Y = 19015.29 + 1437.49X1 – 4.98 X2

Diff: 3

Topic:  NONLINEAR REGRESSION

AACSB:  Analytic Skills

91) An air conditioning and heating repair firm conducted a study to determine if the average outside temperature could be used to predict the cost of an electric bill for a home during the winter months in Richmond, VA. It was determined that a quadratic model could be used and the following prediction equation was established:

Y = \$1557.76 – 55.05X1 + 0.56X2

where Y = monthly cost, X1 = average temperature, and X2 = (average temperature)²

(a)           If December has an average temperature of 43 degrees what is the forecasted monthly electric bill?

(b)           If January has an average temperature of 40 degrees what is the forecasted monthly electric bill?

(a) \$1557.76 – 55.05(43) + 0.56(43²) = \$226.05

(b) \$1557.76 – 55.05(40) + 0.56(40²) = \$251.76

Diff: 3

Topic:  NONLINEAR REGRESSION

AACSB:  Analytic Skills
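
The quadratic model in question 91 is still linear in its coefficients: the squared temperature is simply a second input. A minimal sketch, with the hypothetical function name `forecast_quadratic` and the coefficients taken from the question:

```python
def forecast_quadratic(avg_temp: float) -> float:
    """Y = 1557.76 - 55.05*X1 + 0.56*X2, where X1 = temperature and X2 = temperature squared."""
    x1 = avg_temp
    x2 = avg_temp ** 2  # the transformed variable
    return 1557.76 - 55.05 * x1 + 0.56 * x2


december_quadratic = forecast_quadratic(43)  # part (a)
january_quadratic = forecast_quadratic(40)   # part (b)
print(f"(a) {december_quadratic:.2f}")
print(f"(b) {january_quadratic:.2f}")
```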

92) A large school district is reevaluating its teachers’ salaries. They have decided to use regression analysis to predict mean teacher salaries at each elementary school. The researcher has come up with the following prediction equation:

Y = \$18012.24 + 1432.37X1 – 4.07X2, where X1 = Yrs Exp and X2 = (Yrs Exp)²

(a)           If a teacher has 7 years of experience, what is the expected salary?

(b) If a teacher has 10 years of experience, what is the expected salary?

(a) \$18012.24 + 1432.37(7) – 4.07(7²) = \$27,839.40

(b) \$18012.24 + 1432.37(10) – 4.07(10²) = \$31,928.94

Diff: 3

Topic:  NONLINEAR REGRESSION

AACSB:  Analytic Skills

93) In regression, the variable to be predicted is called the ________ variable.

Diff: 1

Topic:  INTRODUCTION

94) Explain the purposes of regression models.

Answer:  to understand the relationship between variables and to predict the value of one variable using the value of another variable

Diff: 2

Topic:  INTRODUCTION

95) Describe the purpose and structure of a scatter diagram.

Answer:  A scatter diagram is a graphical method used to investigate the relationship between variables.  Normally, the independent variable is plotted on the horizontal axis, and the dependent variable is plotted on the vertical axis.

Diff: 2

Topic:  SCATTER DIAGRAMS

96) In regression, the X variable is known as the ________ variable.

Answer:  independent or explanatory or predictor.

Diff: 2

Topic:  INTRODUCTION and SIMPLE LINEAR REGRESSION

97) What is the formula for r²?

Answer:  SSR / SST or 1 – SSE/SST

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL
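
Both forms of the formula in question 97 can be checked numerically. The sketch below reuses the lost-sales data from question 86 as an illustration (any data set would do) and confirms that SST = SSR + SSE, so SSR / SST and 1 − SSE / SST agree. All variable names are chosen for this example only.

```python
hours = [600, 630, 1000, 1200, 950, 1300, 1350]   # X from question 86
lost_sales = [1.0, 1.4, 1.9, 2.0, 1.8, 2.1, 2.3]  # Y from question 86

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(lost_sales) / n

# Least squares slope and intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, lost_sales))
sxx = sum((x - mean_x) ** 2 for x in hours)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in hours]

sst = sum((y - mean_y) ** 2 for y in lost_sales)                 # total variability about the mean
sse = sum((y - p) ** 2 for y, p in zip(lost_sales, predicted))   # variability about the line
ssr = sum((p - mean_y) ** 2 for p in predicted)                  # variability explained by the line

r2_from_ssr = ssr / sst
r2_from_sse = 1 - sse / sst
print(f"SST = {sst:.4f}, SSR + SSE = {ssr + sse:.4f}")
print(f"r^2 = {r2_from_ssr:.4f} (SSR/SST) = {r2_from_sse:.4f} (1 - SSE/SST)")
```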

98) If every point lies on the regression line, r² = ________.

Diff: 1

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

99) The regression line minimizes the sum of the ________.

Diff: 2

Topic:  SIMPLE LINEAR REGRESSION

100) The ________ measures the total variability in Y about the mean.

Answer:  SST or sum of squares total

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

101) The ________ measures the variability in Y about the regression line.

Answer:  SSE or sum of squared error

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

102) The ________ indicates how much total variability in Y is explained by the regression model.

Answer:  SSR or sum of squares due to regression

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

103) SST = SSR + ________.

Answer:  SSE or sum of squared error

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

104) Explain what r² is.

Answer:  It is a value between 0 and +1 and measures the proportion of variability in Y that is explained by the regression equation.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

105) What can be said about an r² value of 0.96?

Answer:  This indicates that 96% of the variation in the dependent variable is explained by the regression equation and that there is a strong correlation between the variables.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

106) Explain what the correlation coefficient is.

Answer:  It is a value between -1 and +1 that measures the strength of the linear relationship between the X and Y variables.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

107) What can be said about a correlation coefficient of -1?

Answer:  This is a perfect negative correlation where all of the values lie in a straight line. The negative value indicates that as X increases in value, Y decreases in value.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL

108) What can be said about a correlation coefficient of +1?

Answer:  This is a perfect positive correlation where all of the values lie in a straight line. The positive value indicates that as X increases in value, so does Y.

Diff: 2

Topic:  MEASURING THE FIT OF THE REGRESSION MODEL
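
The two extreme cases in questions 107 and 108 can be illustrated with a short computation. This is a sketch only: the helper name `correlation` and the data are assumptions made up for the example, and the formula used is the standard Pearson correlation coefficient.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient r for paired samples x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
y_down = [10 - 2 * xi for xi in x]  # every point on a downward-sloping line
y_up = [1 + 2 * xi for xi in x]     # every point on an upward-sloping line

r_down = correlation(x, y_down)
r_up = correlation(x, y_up)
print(round(r_down, 6), round(r_up, 6))
```

When every point lies exactly on a line with nonzero slope, r is -1 or +1 according to the sign of the slope, and r² is 1 in both cases.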

109) Another name for the “Multiple R” that is given in Excel is ________.

Answer:  correlation coefficient or coefficient of correlation

Diff: 2

Topic:  USING COMPUTER SOFTWARE FOR REGRESSION

110) Describe a residual plot.

Answer:  A residual plot is a plot of the error terms against the independent variable. Residual plots that show patterns often indicate violations in the assumptions of the regression model.

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL
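
As a rough illustration of question 110, the sketch below fits a straight line to deliberately curved, made-up data and lists the residuals in order of X instead of drawing a plot. All names and numbers are assumptions for the example.

```python
# Made-up curved data: Y grows with the square of X.
x = [1, 2, 3, 4, 5]
y = [xi ** 2 for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Fit a straight line by least squares, even though the data are curved.
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Residual = actual value minus predicted value, listed in order of X.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
signs = "".join("+" if e > 0 else "-" for e in residuals)

print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
print(signs)      # +---+
```

The residuals are positive at both ends and negative in the middle. A U-shaped pattern like this, rather than a random scatter about zero, suggests that a straight line is not an appropriate model for the data.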

111) The standard deviation of the regression is also called ________.

Answer:  the standard error of the estimate

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

112) List the four assumptions of the regression model.

Answer:  (1) The errors are independent. (2) The errors are normally distributed.  (3) The errors have a mean of zero. (4) The errors have a constant variance (regardless of the value of X).

Diff: 2

Topic:  ASSUMPTIONS OF THE REGRESSION MODEL

113) What is the difference between simple linear regression models and multiple regression models?

Answer:  Multiple regression models have more than one independent variable.

Diff: 1

Topic:  MULTIPLE REGRESSION ANALYSIS

114) To include qualitative data in regression analysis, you must first create a ________ variable.

Answer:  dummy or binary or indicator

Diff: 2

Topic:  BINARY OR DUMMY VARIABLES

115) For each qualitative variable, the number of dummy variables must equal ________ the number of categories of the qualitative variable.

Answer:  one less than

Diff: 2

Topic:  BINARY OR DUMMY VARIABLES
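
A minimal sketch of the coding described in questions 114 and 115, assuming a hypothetical qualitative variable with three categories. The helper `make_dummies` and the category labels are invented for this example. One category is chosen as the baseline and gets no dummy variable of its own.

```python
def make_dummies(values, baseline):
    """Return {category: list of 0/1 flags} for every category except the baseline."""
    categories = sorted(set(values) - {baseline})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical qualitative variable with three categories.
condition = ["good", "mint", "excellent", "good", "mint"]
dummies = make_dummies(condition, baseline="good")

for name, column in dummies.items():
    print(name, column)
# excellent [0, 0, 1, 0, 0]
# mint [0, 1, 0, 0, 1]
```

Three categories produce two dummy variables. An observation in the baseline category is the one for which both dummy variables are 0.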

116) As more variables are added to the model, what happens to the r² value?

Answer:  It usually increases; it cannot decrease.

Diff: 2

Topic:  MODEL BUILDING

117) Discuss the relationship between r² and adjusted r².

Answer:  The value of r² can never decrease when more variables are added to the model; however, the adjusted r² may decrease when more variables are added to the model.  This occurs because the adjusted r² takes into account the number of independent variables in the model.

Diff: 2

Topic:  MODEL BUILDING
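
The behavior described in question 117 can be sketched with the standard adjusted r² formula, 1 - (1 - r²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of independent variables. The r² values, n, and k below are hypothetical numbers chosen only to show the effect.

```python
def adjusted_r2(r2, n, k):
    """Adjusted r^2 for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical case: adding a third variable raises r^2 only slightly.
before = adjusted_r2(0.700, n=20, k=2)
after = adjusted_r2(0.705, n=20, k=3)

print(f"adjusted r^2: {before:.4f} -> {after:.4f}")
```

Here r² rises from 0.700 to 0.705, yet the adjusted r² falls, because the small gain in fit does not offset the penalty for the extra variable.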

118) When the independent variables are correlated with each other, ________ is said to exist.

Answer:  multicollinearity

Diff: 2

Topic:  MODEL BUILDING

119) With a nonlinear relationship, a ________ is necessary to turn a nonlinear model into a linear model.

Answer:  transformation

Diff: 2

Topic:  NONLINEAR REGRESSION
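
One common way to handle the situation in question 119 is to transform the independent variable and then fit an ordinary linear model to the transformed values. The sketch below assumes made-up data that follow Y = 2 + 3X² exactly. The choice of a squared term is an assumption for this example, not the only possible transformation.

```python
# Made-up data following Y = 2 + 3 * X^2 exactly.
x = [1, 2, 3, 4, 5]
y = [5, 14, 29, 50, 77]

# Transform the independent variable: Z = X^2.
z = [xi ** 2 for xi in x]

n = len(z)
z_bar = sum(z) / n
y_bar = sum(y) / n

# Ordinary least squares of Y on the transformed variable Z.
b1 = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
      / sum((zi - z_bar) ** 2 for zi in z))
b0 = y_bar - b1 * z_bar

print(f"Y = {b0:.1f} + {b1:.1f} * X^2")  # Y = 2.0 + 3.0 * X^2
```

The model is nonlinear in X but linear in Z, so the usual least-squares formulas apply without change once the new variable has been created.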

120) List four pitfalls of regression.

Answer:  Any four of the following: (1) If the assumptions are not met, the statistical tests may not be valid. (2) Correlation does not necessarily mean causation. (3) If multicollinearity is present, the model may still be good for prediction, but interpretation of the individual coefficients is questionable. (4) Using the regression equation to predict beyond the range of the observed X values is questionable. (5) The regression equation should not be used to predict a value of Y when X is zero unless the sample data include values of X near zero. (6) Concluding from the F-test that a linear relationship is helpful in predicting Y does not mean that it is the best relationship. (7) A statistically significant relationship does not necessarily have practical value.

Diff: 2

Topic:  CAUTIONS AND PITFALLS IN REGRESSION ANALYSIS