
IT 403 -- Oct 5, 2016

Review Exercises

  1. In the Correlation document, look at the scatterplots of bivariate datasets with various correlations.
  2. Estimate the correlation r in these situations:
    1. Height of father, height of son.
      i. -0.30    ii. 0.05    iii. 0.70    iv. 0.99    Ans: 0.70
    2. IQ of husband, IQ of wife.
      i. -0.70    ii. 0.00    iii. 0.60    iv. 1.00    Ans: 0.60
    3. Height of husband, height of wife, if men always married women who were exactly 6 inches shorter.
      i. -0.60    ii. 0.60    iii. 0.99    iv. 1.00    Ans: 1.00
    4. Weight of husband, weight of wife, if men always married women who weighed 70% of their husband's weight.
      i. 0.00    ii. 0.50    iii. 0.70    iv. 1.00     Ans: 1.00
  3. Match the correlation to the dataset:
    1. GPA in freshman year, GPA in sophomore year.   Ans: 0.70
    2. GPA in freshman year, GPA in senior year.    Ans: 0.30
    3. Length and weight of 2 by 4 boards.   Ans: 0.99
    Choices: -0.50   0.005   0.30   0.70   0.99
  4. What would happen to the correlation r if
    1. x were replaced by x + 10.
    2. y were replaced by 2 times y + 8.
    3. x and y were interchanged.
    Ans: in all three cases, the correlation would remain the same.
  5. How large must r be to be considered meaningful?
    Ans: See the table in the Correlation document.
  6. Why is the computed value of r the same whether the SD or the SD+ is used for the x and y standard deviations?
  7. Use SPSS to compute the pairwise correlations of the variables in the Nielsen Dataset. The rows of this dataset are the ratings for various television shows. Interpret the correlations.
  8. What is a bivariate normal dataset? Which statistics parsimoniously describe a bivariate normal dataset?
    Ans: x̄   ȳ   SDx   SDy   rxy (the two sample means, the two sample SDs, and the sample correlation).
  9. How many statistics would you need to parsimoniously describe a multivariate normal dataset with three variables x, y, and z? How many for a multivariate normal dataset with k variables?
    Ans: For three variables: three sample means, three sample SDs, three sample correlations rxy, rxz, ryz: 9 total summary statistics. For k variables: k sample means, k SDs, k(k-1)/2 sample correlations: (k^2 + 3k) / 2 total summary statistics.
  10. Compute the correlation r of this dataset "by hand" using SPSS, but without using Analyze >> Correlate >> Bivariate. If you use SD+ for x and y, don't forget the correction factor n / (n - 1). Check your answer with Analyze >> Correlate >> Bivariate.
    x y
    1 1
    2 3
    3 2
    4 4

    Ans: r = 0.8.
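
The "by hand" computation in Exercise 10 can also be checked outside SPSS. The sketch below is a minimal Python stand-in, not the SPSS procedure the exercise asks for. It assumes r is computed as the average product of z-scores, and the helper names (mean, sd, correlation) are made up for illustration. It computes r once with the SD and once with the SD+ plus the n / (n - 1) correction factor (Exercise 6), then re-checks the three transformations from Exercise 4.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sd(values, plus=False):
    """SD (divide by n), or SD+ (divide by n - 1) when plus=True."""
    m = mean(values)
    divisor = len(values) - 1 if plus else len(values)
    return sqrt(sum((v - m) ** 2 for v in values) / divisor)


def correlation(x, y, plus=False):
    """Average product of z-scores.

    When SD+ is used for the z-scores, the average is multiplied by the
    correction factor n / (n - 1) mentioned in Exercise 10.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sx, sy = sd(x, plus), sd(y, plus)
    avg_product = sum(((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y)) / n
    return avg_product * n / (n - 1) if plus else avg_product


# Dataset from Exercise 10.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

r_sd = correlation(x, y)                 # z-scores based on SD
r_sd_plus = correlation(x, y, plus=True)  # z-scores based on SD+, with correction
print(round(r_sd, 4), round(r_sd_plus, 4))

# Exercise 4: none of these changes should alter r.
r_shift = correlation([v + 10 for v in x], y)      # x replaced by x + 10
r_scale = correlation(x, [2 * v + 8 for v in y])   # y replaced by 2y + 8
r_swap = correlation(y, x)                         # x and y interchanged
print(round(r_shift, 4), round(r_scale, 4), round(r_swap, 4))
```

Both versions agree because switching from SD to SD+ shrinks each z-score by a factor of sqrt((n - 1) / n). The product of two z-scores therefore shrinks by (n - 1) / n, which the correction factor n / (n - 1) undoes exactly.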

Linear Correlation

Linear Regression

Project 3

The Regression Fallacy

Additional Regression Problem

Probability

Random Variables

We will discuss random variables next time, on Oct 12.