KEY VERSION

Below is a set of exercises to give you practice with R.

A. Manipulating Vector Data

Using subsetting syntax (square brackets), write code that completes the following using the vector y.

y <- c(3,2,15,-1,22,1,9,17,5)
  1. Display only the first value of y.
#ANSWER:
y[1]  #first value
## [1] 3
  1. Display the last value of y, in a way that would work if y were any length.
#ANSWER:
y[length(y)]
## [1] 5
  1. Display only the values in y that are greater than the mean of y.
#ANSWER:
y[y > mean(y)]  #logical subsetting: keep values above the mean
## [1] 15 22  9 17
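
For comparison, here is a minimal sketch of the same three tasks using base R helper functions instead of bare square brackets. It is only an alternative illustration, not the answer the exercise asks for, and the variable names (first_val, last_val, above_idx, above_vals) are made up for this example.

```r
y <- c(3, 2, 15, -1, 22, 1, 9, 17, 5)

first_val <- head(y, 1)          # same value as y[1]
last_val  <- tail(y, 1)          # same value as y[length(y)], for any length
above_idx <- which(y > mean(y))  # positions (not values) above the mean
above_vals <- y[above_idx]       # same values as y[y > mean(y)]

first_val
last_val
above_idx
above_vals
```

which() is handy when you need to know where the matching values sit in the vector, not just what they are.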

B. Manipulating Data Frames

Write code that completes the following using the built-in PlantGrowth data set:

  1. Print the data to the screen, and examine the data thoroughly.
#ANSWER:
PlantGrowth
##    weight group
## 1    4.17  ctrl
## 2    5.58  ctrl
## 3    5.18  ctrl
## 4    6.11  ctrl
## 5    4.50  ctrl
## 6    4.61  ctrl
## 7    5.17  ctrl
## 8    4.53  ctrl
## 9    5.33  ctrl
## 10   5.14  ctrl
## 11   4.81  trt1
## 12   4.17  trt1
## 13   4.41  trt1
## 14   3.59  trt1
## 15   5.87  trt1
## 16   3.83  trt1
## 17   6.03  trt1
## 18   4.89  trt1
## 19   4.32  trt1
## 20   4.69  trt1
## 21   6.31  trt2
## 22   5.12  trt2
## 23   5.54  trt2
## 24   5.50  trt2
## 25   5.37  trt2
## 26   5.29  trt2
## 27   4.92  trt2
## 28   6.15  trt2
## 29   5.80  trt2
## 30   5.26  trt2
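
Printing all 30 rows works for a data set this small. As a sketch of other ways to examine it, the base R calls below summarize the structure and contents without printing every row. The name group_counts is made up for this example.

```r
str(PlantGrowth)       # column types and dimensions
summary(PlantGrowth)   # numeric summary of weight, counts per group
head(PlantGrowth, 3)   # first three rows only

group_counts <- table(PlantGrowth$group)  # number of plants in each group
group_counts
```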
  1. Look at the help file that describes the data.
#ANSWER:
?PlantGrowth
  1. Sort the whole data set by weight.
#ANSWER:
PlantGrowth[order(PlantGrowth$weight), ]  #reorder all rows by increasing weight
##    weight group
## 14   3.59  trt1
## 16   3.83  trt1
## 1    4.17  ctrl
## 12   4.17  trt1
## 19   4.32  trt1
## 13   4.41  trt1
## 5    4.50  ctrl
## 8    4.53  ctrl
## 6    4.61  ctrl
## 20   4.69  trt1
## 11   4.81  trt1
## 18   4.89  trt1
## 27   4.92  trt2
## 22   5.12  trt2
## 10   5.14  ctrl
## 7    5.17  ctrl
## 3    5.18  ctrl
## 30   5.26  trt2
## 26   5.29  trt2
## 9    5.33  ctrl
## 25   5.37  trt2
## 24   5.50  trt2
## 23   5.54  trt2
## 2    5.58  ctrl
## 29   5.80  trt2
## 15   5.87  trt1
## 17   6.03  trt1
## 4    6.11  ctrl
## 28   6.15  trt2
## 21   6.31  trt2
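
The same order() pattern extends to other sort orders. The sketch below shows two variations that go beyond what the exercise asks: heaviest first, and sorted by group with weights increasing inside each group. The names by_weight_desc and by_group_weight are made up for this example.

```r
# Heaviest plants first
by_weight_desc <- PlantGrowth[order(PlantGrowth$weight, decreasing = TRUE), ]

# Sort by group first, then by weight within each group
by_group_weight <- PlantGrowth[order(PlantGrowth$group, PlantGrowth$weight), ]

head(by_weight_desc, 3)
head(by_group_weight, 3)
```

Note that order() returns row positions, so the original row labels travel with the rows after sorting.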
  1. Calculate the proportion of plants that weigh more than 5.
#ANSWER:
sum(PlantGrowth$weight > 5) / nrow(PlantGrowth)
## [1] 0.5666667
# or
sum(PlantGrowth$weight > 5) / length(PlantGrowth$weight)
## [1] 0.5666667
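
Both answers count the TRUE values and divide by the number of plants. Because R treats TRUE as 1 and FALSE as 0, the mean of a logical vector gives the same proportion in one step. A short sketch (the name prop_heavy is made up for this example):

```r
heavy <- PlantGrowth$weight > 5  # logical vector, one entry per plant
prop_heavy <- mean(heavy)        # proportion of TRUE values
prop_heavy
```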
  1. Compare the means and standard deviations of the two treatments and the controls. Use indexing, Boolean logic, and the mean() and sd() functions.
#ANSWER:
# mean and sd of control
mean(PlantGrowth[PlantGrowth$group == "ctrl", "weight"])
## [1] 5.032
sd(PlantGrowth[PlantGrowth$group == "ctrl", "weight"])
## [1] 0.5830914
# mean and sd of treatment 1
mean(PlantGrowth[PlantGrowth$group == "trt1", "weight"])
## [1] 4.661
sd(PlantGrowth[PlantGrowth$group == "trt1", "weight"])
## [1] 0.7936757
# mean and sd of treatment 2
mean(PlantGrowth[PlantGrowth$group == "trt2", "weight"])
## [1] 5.526
sd(PlantGrowth[PlantGrowth$group == "trt2", "weight"])
## [1] 0.4425733
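
The exercise asks for indexing and Boolean logic, which is what the answer above uses. As a cross-check, the sketch below computes the same per-group statistics in one call each with tapply(). This is an alternative approach, not the requested one, and the names group_means and group_sds are made up for this example.

```r
# Apply mean() and sd() to weight, split by group
group_means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
group_sds   <- tapply(PlantGrowth$weight, PlantGrowth$group, sd)

group_means
group_sds
```

The results should match the values computed group by group above.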
  1. Create a new data frame that contains only the control data.
#ANSWER:
newdata <- PlantGrowth[PlantGrowth$group == "ctrl", ]  #keep all columns, control rows only
newdata
##    weight group
## 1    4.17  ctrl
## 2    5.58  ctrl
## 3    5.18  ctrl
## 4    6.11  ctrl
## 5    4.50  ctrl
## 6    4.61  ctrl
## 7    5.17  ctrl
## 8    4.53  ctrl
## 9    5.33  ctrl
## 10   5.14  ctrl
  1. In the new data frame, permanently change the third entry of the weight column from 5.18 to 4.1.
#ANSWER:
newdata[3, "weight"] <- 4.1  #assign into row 3 of the weight column
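
The assignment prints nothing, so it is worth checking the result. The sketch below rebuilds newdata so it runs on its own, makes the change, and confirms that only the copy was modified. It assumes PlantGrowth has not been altered earlier in the session.

```r
newdata <- PlantGrowth[PlantGrowth$group == "ctrl", ]
newdata[3, "weight"] <- 4.1

newdata[3, ]              # the edited row in the new data frame
PlantGrowth[3, "weight"]  # the built-in data set still holds 5.18
```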