Getting started with R

07/25/2010

Getting Help

R can be frustrating to learn because a Google search for the single letter “R” rarely returns anything meaningful. RSeek, a search engine restricted to R-related sites, may provide some help in this regard.

Installing Libraries

To use R on Linux, I recommend installing the latest version of the RKWard GUI program.

As an example, here we’ll display a stock chart using the quantmod library. To install a library, add yourself to the staff group (sudo usermod -a -G staff ${USER}) and then run install.packages:

install.packages("Defaults")
install.packages("quantstrat", repos="http://R-Forge.R-project.org", type="source")
install.packages("PerformanceAnalytics", repos="http://R-Forge.R-project.org", type="source")

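The c function combines its arguments into a vector and is a very common R command. It appears later in this post in calls such as col=c('red','red'), and it can also be used to pass several package names to install.packages at once. You’ll also notice that repos is passed as a named argument.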
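As a small illustration of both ideas, the sketch below builds a character vector with c and calls a function with named arguments. The package names are simply the ones used in this post, the variable names are made up for the example, and nothing is installed when it runs.

```r
# c() combines its arguments into a vector (here, a character vector)
pkgs <- c("quantmod", "TTR", "PerformanceAnalytics")
print(length(pkgs))

# Named arguments can be supplied in any order
a <- seq(from = 1, to = 10, by = 3)
b <- seq(by = 3, to = 10, from = 1)
print(identical(a, b))

# A vector of names can be handed to install.packages in one call:
# install.packages(pkgs)   # left commented out: needs network access
```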

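You can install packages from local source if needed. With repos = NULL, the first argument is treated as a path to the package source rather than as a package name: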

install.packages("xtime", lib="/home/bmccann/R/i486-pc-linux-gnu-library/2.10", repos = NULL, type="source")
install.packages("xts", lib="/home/bmccann/R/i486-pc-linux-gnu-library/2.10", repos = NULL, type="source")

You can also update packages as needed:

update.packages()

In order to load any libraries that you’ve installed, you’ll need to run library:

# load the quantmod library
library(quantmod)
library(TTR)
library(blotter)
library(PerformanceAnalytics)
library(FinancialInstrument)

# get the S&P 500 data from Yahoo!
getSymbols("^GSPC", from = "1900-01-01", to = Sys.Date())

# chart the past 5 years of the S&P 500 with 50-day SMA
chartSeries(GSPC, TA="addSMA(n = 50)", subset='last 5 years', theme='white')

# chart the past 5 years of the S&P 500 with 200 and 300 day SMA
# extract the closing prices first; the lines below depend on them
close<-Cl(GSPC)
sma200<-SMA(close, n=200)
sma300<-SMA(close, n=300)
chartSeries(close, subset='last 5 years', theme='white')
addTA(sma200, on = 1, col=c('red','red'))
addTA(sma300, on = 1, col=c('blue','blue'))

# 50-day SMA in a more manual fashion to demonstrate adding custom lines to chart
close<-Cl(GSPC)
sma<-SMA(close, n=50)
chartSeries(close, subset='last 5 years', theme='white')
addTA(sma, on = 1, col=c('red','red'))

# Reset the workspace
rm(list=ls(all=TRUE, pos=.blotter), pos=.blotter)
rm(list=ls(all=TRUE, pos=.instrument), pos=.instrument)
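
To make it clearer what the SMA calls above are computing, here is a rough base-R sketch of a simple moving average. The helper name simple_sma and the price vector are made up for illustration. This is not how TTR implements SMA, only an equivalent calculation on a plain numeric vector.

```r
# Hypothetical helper: simple moving average over a window of n observations
simple_sma <- function(x, n) {
  # One-sided filter: each value is the mean of the current and the
  # previous n - 1 observations; the first n - 1 results are NA
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

prices <- c(10, 11, 12, 13, 14)   # made-up prices
sma3 <- simple_sma(prices, n = 3)
print(sma3)
```

The leading NA values mirror the fact that a 3-period average is undefined until three observations are available.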

If you get a message about a public key the next time you run “sudo apt-get update”:

W: GPG error: http://cran.stat.ucla.edu lucid/ Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY D67FC6EAE2A11821

Then you should run:

gpg --keyserver keyserver.ubuntu.com --recv D67FC6EAE2A11821
gpg --export --armor D67FC6EAE2A11821 | sudo apt-key add -

Performance Problems

You may notice that good performance is harder to achieve in R than in many other languages. The element-by-element for loops that are standard elsewhere are often very inefficient in R and should be replaced with vectorized operations where possible. This Stack Overflow discussion gives some insight into R performance problems. You can time your code by surrounding it with system.time:

time <- system.time({
  # code here
})
print(paste("The code took", time["elapsed"], "seconds."))
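
To see the difference for yourself, a sketch along the following lines compares a for loop with the equivalent vectorized expression. The vector size and variable names are arbitrary choices for the example, and the timings will vary by machine and R version, so no particular speedup is implied.

```r
x <- runif(1e5)   # arbitrary test data

# Loop version: square each element one at a time
loop_time <- system.time({
  squares_loop <- numeric(length(x))   # preallocate the result
  for (i in seq_along(x)) {
    squares_loop[i] <- x[i]^2
  }
})

# Vectorized version: a single expression over the whole vector
vec_time <- system.time({
  squares_vec <- x^2
})

# Both approaches should produce the same values
stopifnot(isTRUE(all.equal(squares_loop, squares_vec)))

print(paste("Loop:", loop_time["elapsed"], "seconds."))
print(paste("Vectorized:", vec_time["elapsed"], "seconds."))
```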

Testing

RUnit is helpful in testing R code.  I wrote an RUnit test suite for blotter, which may be a helpful example.
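
For a feel of what an RUnit test looks like, here is a minimal sketch. It assumes the RUnit package is installed. The function mean_of_last is a made-up example, not part of blotter or of the test suite mentioned above.

```r
library(RUnit)

# Hypothetical function under test: mean of the last n values of x
mean_of_last <- function(x, n) {
  if (n > length(x)) stop("n exceeds the length of x")
  mean(tail(x, n))
}

# An RUnit test is a function of no arguments built from check* calls
test.mean_of_last <- function() {
  checkEquals(3.5, mean_of_last(c(1, 2, 3, 4), n = 2))
  checkException(mean_of_last(c(1, 2), n = 5), silent = TRUE)
}

# Calling the test directly raises an error if any check fails
test.mean_of_last()
```

In a larger project the test functions would normally live in their own files and be collected with defineTestSuite and executed with runTestSuite.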
