KEY VERSION

Below is a set of exercises to get you practicing with R.

A. Practice the analysis pipeline using control statements!

Write a script that completes the following tasks:

Be sure to write your name and the date at the top of the script in a comment

  1. Clear your working environment

  2. Load the data possum_trimmed.csv from D2L.

  3. Look at the structure of the possum data using str().

  4. Set the graphing parameters so that you have a 2 row x 4 column graphing window (8 graphs total). Use par(mfrow=c(2,4)).

  5. Write a for loop that runs through each column of possum except the last one (belly size), checks if that column is numeric, and if so:

  1. plots the relationship between that variable (as x) and belly size (as y),

This is a complicated one, but here is some pseudocode to help you out:

# read in the data using read.csv()

# check the structure using str()

# set the graphing window parameters using par() and the mfrow argument

# start the for loop (run it from 1 to 9, which is ncol(possum)-1)

# using an if statement, test if the ith column is numeric

# if the column is numeric, create a plot with the ith column as the x and the last column as the y

# end the if statement

# end the for loop
#ANSWER

#Name
#Date
rm(list=ls())
possum<-read.csv("possum_trimmed.csv")
str(possum)
## 'data.frame':    104 obs. of  10 variables:
##  $ sex     : chr  "m" "f" "f" "f" ...
##  $ age     : int  8 6 6 6 2 1 2 6 9 6 ...
##  $ hdlngth : num  94.1 92.5 94 93.2 91.5 93.1 95.3 94.8 93.4 91.8 ...
##  $ skullw  : num  60.4 57.6 60 57.1 56.3 54.8 58.2 57.6 56.3 58 ...
##  $ totlngth: num  89 91.5 95.5 92 85.5 90.5 89.5 91 91.5 89.5 ...
##  $ taill   : num  36 36.5 39 38 36 35.5 36 37 37 37.5 ...
##  $ footlgth: num  74.5 72.5 75.4 76.1 71 73.2 71.5 72.7 72.4 70.9 ...
##  $ earconch: num  54.5 51.2 51.9 52.2 53.2 53.6 52 53.9 52.9 53.4 ...
##  $ eye     : num  15.2 16 15.5 15.2 15.1 ...
##  $ belly   : num  36 33 34 34 33 32 34.5 34 33 32 ...
par(mfrow=c(2,4))
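for (i in 1:(ncol(possum) - 1)) {   # columns 1 to 9; column 10 is belly
  if (is.numeric(possum[, i])) {    # skips sex, which is character
    # scatterplot of belly size (y) against the ith column (x)
    plot(x = possum[, i], y = possum[, 10],
         xlab = colnames(possum)[i], ylab = "Belly Size")
    # add the least-squares line from a simple linear regression
    abline(lm(possum[, 10] ~ possum[, i]))
  }
}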
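
To see which columns the if statement lets through without opening a graphing window, the same loop can record column names instead of plotting. This is only a sketch: the data frame toy and its values are made up for illustration and stand in for possum. Note that integer columns such as age count as numeric, while character columns such as sex do not. In the str() output above, sex is the only non-numeric predictor, which is why eight plots fill the 2 x 4 window.

```r
# Hypothetical toy data frame standing in for possum (values are made up).
toy <- data.frame(
  sex   = c("m", "f", "f", "m"),
  age   = c(8L, 6L, 2L, 1L),
  taill = c(36.0, 36.5, 39.0, 38.0),
  belly = c(36.0, 33.0, 34.0, 32.0)
)

# Candidate x columns: everything except the last column (the response).
x_cols <- seq_len(ncol(toy) - 1)

# Collect the names of the columns that pass the is.numeric() check.
plotted <- character(0)
for (i in x_cols) {
  if (is.numeric(toy[, i])) {
    plotted <- c(plotted, colnames(toy)[i])
  }
}

print(plotted)
```

Swapping the line that extends plotted for a plot() call gives back the plotting loop from the exercise.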

B. More practice with for loops

Using the built-in data set USPersonalExpenditure, write a for loop that takes the mean amount spent on each category from 1940 to 1960. Use the result to make a barplot of the mean expenditures for each category.

Hints: First create an empty vector named uspe.means (use uspe.means <- vector()). In the for loop, populate the vector with the mean of each category, AND name each vector element using names(uspe.means). Then use barplot() to make the graphic.

#ANSWER: 
uspe.means <- vector()
for(i in 1:5){
  uspe.means[i] <- mean(USPersonalExpenditure[i,])
  names(uspe.means)[i]<-row.names(USPersonalExpenditure)[i]
}
barplot(uspe.means,col="goldenrod")
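
One way to check the loop is to compare its result against rowMeans(), which computes the mean of every row of a matrix in a single call. The sketch below is a variant of the answer, not a replacement for it: it preallocates the vector with numeric() and sets the names once before the loop, and the object names categories and uspe.check are introduced here only for illustration.

```r
# Loop version with a preallocated, named numeric vector.
categories <- rownames(USPersonalExpenditure)
uspe.means <- numeric(length(categories))
names(uspe.means) <- categories

for (i in seq_along(categories)) {
  # mean across all year columns for the ith spending category
  uspe.means[i] <- mean(USPersonalExpenditure[i, ])
}

# Vectorized cross-check: one mean per row of the matrix.
uspe.check <- rowMeans(USPersonalExpenditure)

# all.equal() compares values and names within numerical tolerance.
print(all.equal(uspe.means, uspe.check))
```

Passing either vector to barplot() should draw the same five bars, since the values and names agree.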

C. Writing your own functions

Write a function which:

Takes a file name and two numbers (the defaults for the two numbers should be 1 and 2)

Reads in a table of data (assume that the file is comma delimited)

Plots the columns represented by the two numbers against each other

Hints: Use the read.csv() function. Use print() to check the values of intermediate results (to see if your function is working). Use the test file "primates.csv" to check your program.

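plot.columns <- function(filename, col1=1, col2=2) {
   # row.names=1 turns the first file column into row names,
   # so col1 and col2 count the remaining data columns
   data <- read.csv(filename, header=TRUE, row.names=1)
   print(data)   # check the intermediate result
   plot(data[,col1], data[,col2])
}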

plot.columns(filename="primates.csv")
##               Bodywt Brainwt
## Potar monkey    10.0     115
## Gorilla        207.0     406
## Human           62.0    1320
## Rhesus monkey    6.8     179
## Chimp           52.2     440
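
Because the function reads the file with row.names=1, the species names become row labels rather than a data column, so the default col1=1 refers to Bodywt and col2=2 to Brainwt. The sketch below illustrates this by rebuilding the printed table in a temporary file. The exact layout of primates.csv is an assumption, in particular the header label "name" for the first column, which is hypothetical.

```r
# Recreate the printed primates table in a temporary CSV file.
# Assumption: the first column holds the species names; its header
# label "name" is hypothetical and not taken from the real file.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "name,Bodywt,Brainwt",
  "Potar monkey,10.0,115",
  "Gorilla,207.0,406",
  "Human,62.0,1320",
  "Rhesus monkey,6.8,179",
  "Chimp,52.2,440"
), tmp)

# Same read.csv() call as inside plot.columns().
primates <- read.csv(tmp, header = TRUE, row.names = 1)

# Only two data columns remain after the names are moved to row labels.
print(colnames(primates))

# Column 1 is therefore body weight, not the species name.
print(primates[, 1])

unlink(tmp)  # remove the temporary file
```

With this layout, a call such as plot.columns(filename="primates.csv", col1=2, col2=1) would put brain weight on the x axis and body weight on the y axis.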