
IT 403 -- Oct 12, 2016

Review Exercises

  1. What is the regression fallacy?
    Ans: In a pretest/posttest situation, the regression fallacy is the mistaken notion that if someone does well on the pretest, he or she should do equally well on the posttest, and that if someone does poorly on the pretest, he or she should do equally poorly on the posttest. In fact, unless the pretest and posttest scores are perfectly correlated (r = 1), someone who obtains a pretest score k SDx above the pretest average will, on the average, obtain a posttest score only r × k SDy above the posttest average. In other words, on the average, the posttest score will be closer to the average than the pretest score was. The situation is mirrored for a below-average pretest score: if the pretest score is k SDx below the average, then, on the average, the posttest score will be r × k SDy below the average, again closer to the average. (See the first sketch after this list.)
  2. What is the RMSE for a regression model?
    Ans: The root mean squared error (RMSE) is the standard deviation of the residuals. In particular, RMSE is roughly the SD of the residuals in a thin vertical strip that contains a specific x value. For the regression line, RMSE = √(1 - r²) × SDy, so if |r| < 1, then RMSE < SDy.
  3. Find the errors with this probability distribution function:

      Outcome   Probability
         3          60%
         5          30%
         6           0%
         7         -10%
         9         110%

    Ans: Probabilities cannot be negative; probabilities cannot be more than 100%; and the probabilities in the table must sum to 100%. (The second sketch after this list checks these rules.)
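
Here is a minimal Python sketch of the regression estimate from Exercise 1 and the RMSE formula from Exercise 2. The averages, SDs, and correlation in it are made-up illustration values, not course data.

    # Regression estimate and RMSE from Exercises 1 and 2.
    # The averages, SDs, and correlation are hypothetical illustration values.
    import math

    avg_x, sd_x = 70, 10   # pretest average and SD (hypothetical)
    avg_y, sd_y = 70, 10   # posttest average and SD (hypothetical)
    r = 0.6                # correlation between pretest and posttest (hypothetical)

    def predicted_posttest(pretest):
        """A score k SDx above the pretest average predicts r * k SDy above the posttest average."""
        k = (pretest - avg_x) / sd_x      # pretest score in standard units
        return avg_y + r * k * sd_y

    rmse = math.sqrt(1 - r**2) * sd_y     # RMSE = sqrt(1 - r^2) * SDy

    print(predicted_posttest(90))   # 2 SDs above average predicts 0.6 * 2 = 1.2 SDs above: 82.0
    print(rmse)                     # 8.0, less than SDy = 10 because |r| < 1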
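
And here is a quick check of the three rules from Exercise 3, applied to the distribution in the table above.

    # Check the rules a probability distribution must satisfy, using the
    # (intentionally broken) table from Exercise 3.
    distribution = {3: 0.60, 5: 0.30, 6: 0.00, 7: -0.10, 9: 1.10}

    problems = []
    for outcome, p in distribution.items():
        if p < 0:
            problems.append(f"P({outcome}) = {p:.2f} is negative")
        if p > 1:
            problems.append(f"P({outcome}) = {p:.2f} is greater than 100%")
    total = sum(distribution.values())
    if abs(total - 1) > 1e-9:
        problems.append(f"probabilities sum to {total:.2f}, not 1.00")

    print(problems)   # flags -10%, 110%, and the 190% total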

Learning Outcomes for Today

Probability

Practice Problems

  1. What is wrong with this argument? Either the Cubs will win the World Series or they won't. Therefore the probability that the Cubs will win the World Series is 50%.
    Ans: Just because there are two outcomes doesn't mean they are equally likely. Some outcomes are very likely; they have probabilities close to 1. Some outcomes are unlikely; they have probabilities close to 0. Other outcomes have probabilities close to 0.5.
  2. What is wrong with this strategy? Double down after each loss. Eventually you win and recoup your losses. For example:
    -1 - 2 - 4 - 8 - 16 + 32 = 1.
    Now start over with 1 and repeat the double down strategy.
    Ans: The problem with this strategy is that, eventually, you will either reach the casino betting limit or run out of money. (See the simulation sketch after this list.)
  3. A bookmaker offers 7 to 4 odds that the Cubs will win the World Series. If this is a fair bet, what is the probability p that the Cubs will win the World Series?
    Ans: The expected amount that you win is 7p + (-4)(1 - p); this expression is 0 because we are assuming that this is a fair bet. Now solve for p:
         7p + (-4)(1 - p) = 0
         7p - 4 + 4p = 0
         11p = 4
         p = 4 / 11 ≈ 0.364 ≈ 36%.
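
Here is a short simulation of the double-down strategy from Problem 2. It is a minimal sketch; the starting bankroll, table limit, and win probability are made-up parameters. Play stops as soon as the required bet exceeds either the table limit or what is left of the bankroll.

    # Minimal simulation of the double-down (martingale) strategy from Problem 2.
    # The bankroll, table limit, win probability, and round count are hypothetical.
    import random

    def play_martingale(bankroll=1000, table_limit=500, p_win=18/38, rounds=10_000):
        """Bet 1, double after every loss; return the bankroll when forced to stop."""
        bet = 1
        for _ in range(rounds):
            if bet > bankroll or bet > table_limit:
                return bankroll          # cannot place the required bet
            if random.random() < p_win:  # win: recoup all losses in the streak, plus 1
                bankroll += bet
                bet = 1
            else:                        # loss: double the next bet
                bankroll -= bet
                bet *= 2
        return bankroll

    random.seed(1)
    finals = [play_martingale() for _ in range(1000)]
    print("average final bankroll:", sum(finals) / len(finals))
    print("players who lost money:", sum(f < 1000 for f in finals), "out of 1000")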

Random Variables

Expected Value

The Multiplication Rule for Independent Events

The Addition Rule for Mutually Exclusive Events

The Standard Deviation of a Random Variable

Practice Problems

  1. Recall that the expected value for the Rainfall on a Tropical Island example is 1.1 inches. Here is the probability distribution:

       Rainfall (inches)   Probability
       0                   0.3
       1                   0.4
       2                   0.2
       3                   0.1


    Compute the variance and standard deviation of this random variable.
    Ans: Var = (0 - 1.1)² × 0.3 + (1 - 1.1)² × 0.4 + (2 - 1.1)² × 0.2 + (3 - 1.1)² × 0.1
    = 1.21 × 0.3 + 0.01 × 0.4 + 0.81 × 0.2 + 3.61 × 0.1 = 0.89
    The standard deviation is √0.89 = 0.943. (See the sketch after this list.)
  2. Compute the variance and standard deviation of a Bernoulli random variable.
    Ans: The expected value of a Bernoulli random variable is 0(1 - p) + 1p = p.

    Variance = (0 - p)² (1 - p) + (1 - p)² p = p²(1 - p) + (1 - 2p + p²)p
    = p² - p³ + p - 2p² + p³ = p - p² = p(1 - p)
    The standard deviation is the square root of the variance: √(p(1 - p))

  3. Use your result from Problem 2 to obtain the mean and standard deviation of the number of heads obtained in a single coin flip.
    Ans: E(x) = p = 0.5; σx = √(0.5 × (1 - 0.5)) = √0.25 = 0.5
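
Here is a small sketch that reproduces the arithmetic in Problems 1-3: it computes the expected value, variance, and SD of a discrete random variable directly from its probability distribution.

    # Expected value, variance, and SD of a discrete random variable (Problems 1-3).
    import math

    def mean_var_sd(distribution):
        """distribution maps each outcome to its probability."""
        mean = sum(x * p for x, p in distribution.items())
        var = sum((x - mean) ** 2 * p for x, p in distribution.items())
        return mean, var, math.sqrt(var)

    rainfall = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}   # Problem 1
    print(mean_var_sd(rainfall))                   # approximately (1.1, 0.89, 0.943)

    p = 0.5                                        # Problem 3: one fair coin flip
    print(mean_var_sd({0: 1 - p, 1: p}))           # (0.5, 0.25, 0.5) = (p, p(1-p), sqrt(p(1-p)))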

Properties of Random Variables

  1. E(cx) = c E(x)
  2. E(x + y) = E(x) + E(y)
  3. E(x1 + ... + xn) = E(x1) + ... + E(xn)
  4. Var(x) = E(x²) - E(x)²
  5. If x and y are independent, then E(xy) = E(x)E(y).
    In that case, Var(x + y) = Var(x) + Var(y)

Here are the derivations.
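
As a quick sanity check of properties 1, 2, 4, and 5, here is a minimal sketch that verifies them by brute force over two small independent discrete distributions; the distributions and the constant c = 3 are made-up illustration values.

    # Numerical spot-check of properties 1, 2, 4, and 5 using two small
    # independent discrete random variables x and y (hypothetical distributions).
    import itertools

    x_dist = {0: 0.3, 1: 0.4, 2: 0.3}   # hypothetical distribution of x
    y_dist = {1: 0.5, 4: 0.5}           # hypothetical distribution of y
    c = 3

    def E(f, dist):
        return sum(f(v) * p for v, p in dist.items())

    def var(dist):
        m = E(lambda v: v, dist)
        return E(lambda v: (v - m) ** 2, dist)

    # Joint distribution of (x, y) under independence: P(x and y) = P(x)P(y).
    joint = {(a, b): px * py
             for (a, px), (b, py) in itertools.product(x_dist.items(), y_dist.items())}

    Ex, Ey = E(lambda v: v, x_dist), E(lambda v: v, y_dist)
    print(sum(c * a * p for (a, b), p in joint.items()), c * Ex)        # property 1
    print(sum((a + b) * p for (a, b), p in joint.items()), Ex + Ey)     # property 2
    print(var(x_dist), E(lambda v: v ** 2, x_dist) - Ex ** 2)           # property 4
    var_sum = sum(((a + b) - (Ex + Ey)) ** 2 * p for (a, b), p in joint.items())
    print(var_sum, var(x_dist) + var(y_dist))                           # property 5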

Practice Problem

  1. Compute the variance and SD of a Bernoulli random variable with the formula Var(x) = E(x²) - E(x)².
         Var(x) = E(x²) - E(x)² = (0² × (1 - p) + 1² × p) - p²
                  = p - p² = p(1 - p)
    This is the same result that we obtained earlier.

Sums of Random Variables

Practice Problems

  1. Compute the expected value and standard deviation of the sum of the random variable in Practice Problem 1 (daily rainfall) over 365 days.

    Ans: Recall that E(x) = 1.1 and σx = 0.943; E(S) = nE(x) = 365 × 1.1 = 401.5
    σS = σx√n = 0.943 × √365 ≈ 18.0. (See the sketch after this list.)

  2. Compute the expected value and standard deviation of the sum of n independent Bernoulli random variables (Practice Problem 2).
    Ans: E(S) = nE(x) = np
    σS = σx√n = √(p(1-p)) × √n = √(np(1-p))
  3. Suppose that my true percentage of making free throws is 70%. Out of 100 attempts, what is the expected value and standard deviation of the number of free throws made? Find a 95% confidence interval for the number of free throws made.
    Ans: E(S) = np = 100 × 0.7 = 70.
    Var(S) = np(1 - p) = 100 × 0.7 × (1 - 0.7) = 21.
    σS = √21 = 4.58.
    The confidence interval is
    [E(S) - 1.96 × σS, E(S) + 1.96 × σS]
    [70 - 1.96 × 4.58, 70 + 1.96 × 4.58] = [61.02, 78.98].
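
Here is a small sketch tying Problems 1-3 together: it applies E(S) = nE(x) and σS = σx√n, and, for Problem 3, compares the resulting 95% interval with a quick simulation of 100 free throws (the simulation is an added check, not part of the original problem).

    # E(S) = n*E(x) and SD(S) = SD(x)*sqrt(n) for the sums in Problems 1-3,
    # plus a simulation check of the free-throw interval in Problem 3.
    import math, random

    def sum_stats(mean, sd, n):
        """Expected value and SD of the sum of n independent copies of a random variable."""
        return n * mean, sd * math.sqrt(n)

    # Problem 1: total rainfall over 365 days.
    print(sum_stats(1.1, 0.943, 365))                    # approximately (401.5, 18.0)

    # Problem 3: number of free throws made out of n = 100 with p = 0.7.
    p, n = 0.7, 100
    mean_S, sd_S = sum_stats(p, math.sqrt(p * (1 - p)), n)
    print(mean_S, sd_S)                                  # 70 and about 4.58
    print(mean_S - 1.96 * sd_S, mean_S + 1.96 * sd_S)    # approximately (61.0, 79.0)

    # Simulation check: how often does the total land inside that interval?
    random.seed(1)
    totals = [sum(random.random() < p for _ in range(n)) for _ in range(10_000)]
    inside = sum(mean_S - 1.96 * sd_S <= t <= mean_S + 1.96 * sd_S for t in totals)
    print(inside / 10_000)                               # about 0.94, near the intended 95%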

Averages of Random Variables

This section will be discussed on Oct 24.

The Law of Averages

The Central Limit Theorem

Factorials and Counting Combinations