The χ² and F Distributions

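The F distribution is constructed from two χ² distributions: it is the distribution of the ratio of two independent χ² random variables, each divided by its degrees of freedom. To demonstrate this, type the following commands in R to generate two sets of 10⁵ random numbers: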

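set.seed(4689326)      # fix the seed so that the results are reproducible
x1 <- rchisq(1e5, 3)   # first set of 10^5 random numbers
x2 <- rchisq(1e5, 5)   # second set of 10^5 random numbers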

a. The random numbers in x1 follow the ____ distribution with ____ degrees of freedom.

The random numbers in x2 follow the ____ distribution with ____ degrees of freedom.

b. What are the mean and sample standard deviation of x1? Give your answers to at least 4 significant figures.

Mean of x1 = ____ , sd = ____

What are the mean and sample standard deviation of x2? Give your answers to at least 4 significant figures.

Mean of x2 = ____ , sd = ____
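
A minimal sketch of how the quantities in part b can be computed. The helper name summary_stats is only an illustration and is not part of the exercise; mean(), sd() and signif() are base R functions.

```r
set.seed(4689326)
x1 <- rchisq(1e5, 3)
x2 <- rchisq(1e5, 5)

# Hypothetical helper: returns the sample mean and the sample standard
# deviation of a numeric vector. sd() divides by n - 1, not n.
summary_stats <- function(v) c(mean = mean(v), sd = sd(v))

# signif() rounds to the requested number of significant figures.
signif(summary_stats(x1), 5)
signif(summary_stats(x2), 5)
```

The same two calls, mean() and sd(), apply unchanged to any other numeric vector.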

c. Generate a new set of random numbers by combining x1 and x2 as follows:

y <- (x1/3)/(x2/5)

What are the mean and sample standard deviation of y? Give your answers to at least 4 significant figures.

Mean of y = ____ , sd = ____

d. Plot a histogram of the random numbers in y. First try the simplest command hist(y,breaks=100,freq=FALSE). Do you like what you see? Next, we want to focus on the region between 0 and 10. Set 100 equally spaced break points in that region plus one final break point at max(y), so that the breaks span the whole range of y as hist() requires, using the R command points <- c(seq(0,10,length.out=100),max(y)). Then plot the histogram in the region of interest using the command hist(y,breaks=points,freq=FALSE,xlim=c(0,10)).
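
The commands in part d can be collected into one script, sketched below. It only restates the commands given in the exercise and regenerates x1, x2 and y with the same seed so that it runs on its own.

```r
set.seed(4689326)
x1 <- rchisq(1e5, 3)
x2 <- rchisq(1e5, 5)
y  <- (x1 / 3) / (x2 / 5)

# First attempt: the bins are spread over the full range of y.
hist(y, breaks = 100, freq = FALSE)

# 100 equally spaced break points on [0, 10], plus max(y) as the last
# break so that every value of y falls inside some bin.
points <- c(seq(0, 10, length.out = 100), max(y))

# Zoom in on the region of interest.
hist(y, breaks = points, freq = FALSE, xlim = c(0, 10))
```

Because freq=FALSE is used, the bar heights are densities, so the last, much wider bin does not distort the picture between 0 and 10.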

Which of the following histograms is closest to the histogram of y?

The random numbers in y follow the F distribution with parameters d₁=3 and d₂=5, where d₁ is called the numerator degrees of freedom and d₂ is called the denominator degrees of freedom. In general, if X₁ is a random variable following a χ² distribution with df=d₁, X₂ is a random variable following a χ² distribution with df=d₂, and X₁ and X₂ are independent, then the random variable

$Y = \frac{X_1/d_1}{X_2/d_2}$

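follows the F distribution with degrees of freedom d₁ and d₂. In R, the four functions associated with the F distribution are df() (density), pf() (cumulative distribution function), qf() (quantile function) and rf() (random number generation).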
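
The four functions can be tried out as in the short sketch below. The evaluation points (1 and 2), the seed, and the variable names p, q and z are arbitrary choices made for this illustration only.

```r
d1 <- 3  # numerator degrees of freedom
d2 <- 5  # denominator degrees of freedom

df(1, d1, d2)          # density of F(3, 5) at x = 1
p <- pf(2, d1, d2)     # cumulative probability P(Y <= 2)
q <- qf(p, d1, d2)     # qf() inverts pf(), so q equals 2 up to rounding

set.seed(1)            # arbitrary seed, used only in this illustration
z <- rf(1e5, d1, d2)   # 10^5 random draws from F(3, 5)

mean(z <= 2)           # empirical proportion, should be close to p
```

Replacing z with the vector y built from x1 and x2 gives another way to check that y behaves like a sample from this F distribution.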

Optional Exercise: Plot the histogram of y together with the density of the F distribution with d₁=3 and d₂=5:

hist(y,breaks=points,freq=FALSE,xlim=c(0,10))
curve(df(x,3,5),col="red",add=TRUE)
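
As a further check on part c, the sample mean and sd of y can be compared with the theoretical moments of the F distribution: the mean is d₂/(d₂−2) for d₂>2, and the variance is 2d₂²(d₁+d₂−2)/[d₁(d₂−2)²(d₂−4)] for d₂>4. The sketch below evaluates these formulas; the names f_mean, f_var and f_sd are illustrative only.

```r
set.seed(4689326)
x1 <- rchisq(1e5, 3)
x2 <- rchisq(1e5, 5)
y  <- (x1 / 3) / (x2 / 5)

d1 <- 3  # numerator degrees of freedom
d2 <- 5  # denominator degrees of freedom

# Theoretical moments of F(d1, d2); valid here because d2 > 4.
f_mean <- d2 / (d2 - 2)
f_var  <- 2 * d2^2 * (d1 + d2 - 2) / (d1 * (d2 - 2)^2 * (d2 - 4))
f_sd   <- sqrt(f_var)

# Print theory and sample values side by side.
c(theory_mean = f_mean, sample_mean = mean(y))
c(theory_sd = f_sd, sample_sd = sd(y))
```

The F distribution with d₂=5 has a heavy right tail (its fourth moment does not exist), so the sample sd converges slowly and may differ noticeably from the theoretical value even with 10⁵ numbers.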