The normal distribution is widely used in statistics to construct confidence intervals and calculate p-values. However, not all distributions are normal. You saw in one of Week 4's exercises that the distribution of the correlation coefficient r is not normal. It is often possible to transform a variable so that its distribution is closer to a normal curve. To demonstrate this, we have created a data set for the distribution of the correlation coefficient r from that Week 4 problem.
The data are generated by a method similar to the one you used in Week 4: perform 10,000 simulations, each choosing 25 students from the Stat 100 final score-ExamAve data and calculating the sample r. We use a sample size of 25 instead of 100 so that the resulting distribution of the sample r is further from the normal curve. The data can be loaded into R (after saving the file to your R working directory) by the command
rdat <- read.csv("rdensity.csv")
There is only one column, labelled "r". It is more convenient to copy the column to a numeric vector 'r' using the command
r <- rdat$r
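For intuition, the sampling procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the Stat 100 data are not reproduced here, so the sketch uses a made-up population of 1,000 students with a population correlation of about 0.8, and it samples without replacement. The names n_students, exam_ave, final, one_r and r_sim are hypothetical and are not part of the exercise.

```r
# Hypothetical stand-in for the Stat 100 data (assumed values, not the real scores):
# 1,000 students whose final score is positively correlated with ExamAve.
set.seed(1)
n_students <- 1000
exam_ave <- rnorm(n_students, mean = 75, sd = 10)
final <- 0.8 * exam_ave + rnorm(n_students, mean = 15, sd = 6)

# One simulation: choose 25 students at random (without replacement, an assumption)
# and compute the sample correlation coefficient r.
one_r <- function(size = 25) {
  idx <- sample(n_students, size)
  cor(final[idx], exam_ave[idx])
}

# Repeat 10,000 times to approximate the sampling distribution of r.
r_sim <- replicate(10000, one_r())
```

The vector r_sim can then be explored with the same commands used for r below; its shape will depend on the assumed population, so it need not match the course data exactly.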
You can make a density plot by the command
plot(density(r))
You should see a plot similar to the following. It is clear that the distribution of r is left-skewed.
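One possible way to judge how far a density plot is from a normal curve is to overlay the normal density that has the same mean and standard deviation. The sketch below assumes a hypothetical helper named compare_to_normal() and uses made-up left-skewed data (demo) so that it runs on its own; with the course data you would pass r instead.

```r
# Hypothetical helper (not part of the exercise): plot the density of `values`
# and overlay a dashed normal curve with the same mean and standard deviation.
compare_to_normal <- function(values) {
  m <- mean(values)
  s <- sd(values)
  d <- density(values)
  plot(d, main = "Density (solid) vs. matching normal curve (dashed)")
  grid <- seq(min(d$x), max(d$x), length.out = 200)
  lines(grid, dnorm(grid, mean = m, sd = s), lty = 2)
  invisible(c(mean = m, sd = s))  # returned invisibly so the values can be inspected
}

# Made-up left-skewed sample used only for demonstration.
set.seed(2)
demo <- rbeta(5000, 8, 2)
res <- compare_to_normal(demo)
```

The closer the solid curve follows the dashed one, the closer the distribution is to normal.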
We would like to transform r to make the distribution closer to a normal curve. There are several candidates: r^2, r^3, r^4, e^r, √r, 1/r, log(r), ... Here log means the natural log (base e). Try the following plots:
plot(density(r^2))
plot(density(r^3))
plot(density(r^4))
plot(density(sqrt(r)))
plot(density(exp(r)))
plot(density(log(r)))
plot(density(1/r))
a. Which of the following transformations make the resulting distribution curve substantially closer to a normal curve? (select all that apply)
log(r)
√r
1/r
r^3
r^4
A lazy dude doesn't want to type the plot() command every time he wants to explore a new transformation. He creates the following function:
t <- function(r,f,...) { plot(density(f(r,...))) }
Note that ... is a special function argument. If you have forgotten how it works, review Section 15.5 of the textbook.
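As a quick refresher on how ... behaves, here is a minimal sketch using a hypothetical helper apply_fn(), which is unrelated to the plotting function above. Any arguments supplied after x and f are collected by ... and passed along unchanged to f.

```r
# Hypothetical helper for illustration: apply f to x,
# forwarding any extra arguments to f through `...`.
apply_fn <- function(x, f, ...) {
  f(x, ...)
}

x <- c(1, 4, 9, 16)
a <- apply_fn(x, sqrt)           # no extra arguments: same as sqrt(x)
b <- apply_fn(x, log, base = 2)  # `base = 2` is forwarded: same as log(x, base = 2)
```

Extra arguments can be named, as with base = 2 here, or given by position.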
b. Which of the following function calls is equivalent to the command plot(density(log(r)))?
t(log,r)
t(log(r))
t(r,log(r))
t(r,log)
none of the above
c. In order to use the t() function to plot power-law transformations, the lazy dude creates the following function:
p <- function(x,n) { x^n }
Which of the following function calls is equivalent to plot(density(1/r))?
t(1/r)
t(r,-1,p)
t(r,p)
t(r,p,-1)
none of the above
d. Another lazy dude scoffs at this t() function's handling of power-law transformations and its inability to deal with more general transformations such as 1/(1+r). He creates the following t() function that is both simpler and able to handle more general transformations:
t <- function(x) { plot(density(x)) }
With this new t() function, which of the following function calls is equivalent to plot(density(1/(1+r)))?
t(1/(1+r))
t(r,1/(1+r))
t(1/(1+r)=x)
t(1/(1+r),r)
none of the above