The F distribution is constructed by combining two χ² distributions. To demonstrate this, type the following commands in R to generate two sets of 10^5 random numbers:
set.seed(4689326)
x1 <- rchisq(1e5,3)
x2 <- rchisq(1e5,5)
a. The random numbers in x1 follow the (normal / chi-square / t) distribution with ____ degrees of freedom.
The random numbers in x2 follow the (normal / chi-square / t) distribution with ____ degrees of freedom.
b. What are the mean and sample standard deviation of x1? Give your answers to at least 4 significant figures.
Mean of x1 = ____, sd = ____
What are the mean and sample standard deviation of x2? Give your answers to at least 4 significant figures.
Mean of x2 = ____, sd = ____
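A minimal sketch of how part b can be computed, assuming the commands above have been run unchanged. The comparison with theoretical values relies on a standard result that the exercise does not state: a χ² distribution with k degrees of freedom has mean k and standard deviation sqrt(2k).

```r
# Same draws as in the exercise
set.seed(4689326)
x1 <- rchisq(1e5, 3)
x2 <- rchisq(1e5, 5)

# Sample mean and sample standard deviation (sd() uses the n - 1 denominator)
c(mean = mean(x1), sd = sd(x1))
c(mean = mean(x2), sd = sd(x2))

# Assumption (standard result, not given in the exercise):
# chi-square with k degrees of freedom has mean k and sd sqrt(2 * k)
k <- c(3, 5)
rbind(mean = k, sd = sqrt(2 * k))
```

With 10^5 draws, the sample values should land close to the theoretical ones, which gives a quick sanity check before entering the answers.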
c. Generate a new set of random numbers by combining x1 and x2 as follows:
y <- (x1/3)/(x2/5)
What are the mean and sample standard deviation of y? Give your answers to at least 4 significant figures.
Mean of y = ____, sd = ____
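The calculation for part c can be sketched in the same way. The theoretical values below are an assumption brought in from the standard formulas for an F distribution with d1 and d2 degrees of freedom (mean d2/(d2-2) for d2 > 2, and a finite variance only for d2 > 4). They are not given in the exercise itself.

```r
# Regenerate the draws from the exercise so the block runs on its own
set.seed(4689326)
x1 <- rchisq(1e5, 3)
x2 <- rchisq(1e5, 5)
y <- (x1 / 3) / (x2 / 5)

# Sample mean and sample standard deviation of y
c(mean = mean(y), sd = sd(y))

# Assumption (standard result, not stated in the exercise): for F(d1, d2),
# mean = d2 / (d2 - 2) when d2 > 2, and
# variance = 2 * d2^2 * (d1 + d2 - 2) / (d1 * (d2 - 2)^2 * (d2 - 4)) when d2 > 4
d1 <- 3
d2 <- 5
f_mean <- d2 / (d2 - 2)   # 5/3
f_sd <- sqrt(2 * d2^2 * (d1 + d2 - 2) / (d1 * (d2 - 2)^2 * (d2 - 4)))   # 10/3
c(mean = f_mean, sd = f_sd)
```

Because the distribution of y has a heavy right tail, the sample standard deviation may sit noticeably further from its theoretical value than the sample mean does, even with 10^5 draws.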
d. Plot a histogram for the random numbers in y. First try the simplest command hist(y,breaks=100,freq=FALSE). Do you like what you see? Next, we want to focus on the region between 0 and 10. Set 100 break points in that region and one point outside using the R command points <- c(seq(0,10,length.out=100),max(y)). Then plot the histogram in the region of interest using the command hist(y,breaks=points,freq=FALSE,xlim=c(0,10)).
hist(y,breaks=100,freq=FALSE)
points <- c(seq(0,10,length.out=100),max(y))
hist(y,breaks=points,freq=FALSE,xlim=c(0,10))
Which of the following histograms is closest to the histogram of y?
The random numbers in y follow the F distribution with parameters d1=3 and d2=5, where d1 is called the numerator degrees of freedom and d2 is called the denominator degrees of freedom. In general, if X1 is a random variable following a χ² distribution with df=d1, X2 is a random variable following a χ² distribution with df=d2, and X1 and X2 are independent, then the random variable
F = (X1/d1)/(X2/d2)
follows the F distribution with degrees of freedom d1 and d2. In R, the four functions associated with the F distribution are df(), pf(), qf() and rf().
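As a brief illustration of the four functions, the sketch below evaluates each of them for d1=3 and d2=5. The evaluation point 1, the probability 0.95 and the seed are arbitrary choices made for this example, not values from the exercise.

```r
d1 <- 3
d2 <- 5

df(1, d1, d2)              # density of F(3, 5) at x = 1
pf(1, d1, d2)              # cumulative probability P(F <= 1)
q95 <- qf(0.95, d1, d2)    # 95th percentile (inverse of pf)
pf(q95, d1, d2)            # recovers 0.95

# rf() draws random numbers directly from F(3, 5)
set.seed(1)                # arbitrary seed, chosen only for reproducibility
z <- rf(1e5, d1, d2)
mean(z <= q95)             # fraction of draws below q95, close to 0.95
```

The same pattern (d for density, p for cumulative probability, q for quantile, r for random draws) applies to the other distributions in R, such as dchisq(), pchisq(), qchisq() and rchisq().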
Optional Exercise: Plot the histogram of y together with the density curve of the F distribution with d1=3 and d2=5:
hist(y,breaks=points,freq=FALSE,xlim=c(0,10))
curve(df(x,3,5),col="red",add=TRUE)