Using Python’s Pandas inside IPython Notebook

02/14/2012

IPython is a cool interactive shell to run Python from, and Pandas is a Python library for holding tabular data, similar to R’s data frame.

To install the software on Ubuntu or another Debian-based system, run:

sudo apt-get install python-pip libzmq-dev python-dev g++ libfreetype6-dev libpng12-dev libblas-dev liblapack-dev gfortran cython libhdf5-serial-dev
sudo pip install ipython
sudo pip install tornado
sudo pip install pyzmq
sudo pip install pygments
sudo pip install numpy
sudo pip install matplotlib
sudo pip install scipy
sudo pip install patsy
sudo pip install statsmodels
sudo pip install pandas
sudo pip install pytz
sudo pip install numexpr
sudo pip install tables
sudo pip install jinja2
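
To check that everything installed properly, a small script along these lines can help. It is only a sketch: check_packages is a helper name made up for this post, and the import names in the list are my assumption of how each pip package is imported (for example, pyzmq is imported as zmq and PyTables as tables).

```python
import importlib

# Import names for the pip packages installed above (assumed mapping:
# pyzmq -> zmq, tables -> PyTables, ipython -> IPython).
PACKAGES = ["IPython", "tornado", "zmq", "pygments", "numpy", "matplotlib",
            "scipy", "patsy", "statsmodels", "pandas", "pytz", "numexpr",
            "tables", "jinja2"]


def check_packages(names):
    """Map each import name to its version string, or None if it cannot be imported."""
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Any failure at import time means the package is not usable.
            results[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            results[name] = getattr(module, "__version__", "unknown")
    return results


if __name__ == "__main__":
    for name, version in sorted(check_packages(PACKAGES).items()):
        print("%-12s %s" % (name, version or "NOT INSTALLED"))
```

Any package reported as NOT INSTALLED can be reinstalled with pip before starting the notebook.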

To start the IPython notebook with plots displayed inline, run:

ipython notebook --pylab inline

As an example, you can run the following code in the IPython web notebook to draw a chart of the S&P 500:

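import datetime
import matplotlib.pyplot as plt
# Note: later pandas releases removed pandas.io.data; DataReader moved to the separate pandas-datareader package.
from pandas.io.data import DataReader

# Download daily S&P 500 quotes from Yahoo! Finance, starting in 2000
sp500 = DataReader("^GSPC", "yahoo", start=datetime.datetime(2000, 1, 1)) # returns a DataFrame

# Top panel (two thirds of the figure): adjusted closing price
top = plt.subplot2grid((3, 1), (0, 0), rowspan=2)
top.plot(sp500.index, sp500["Adj Close"])

# Bottom panel (remaining third): daily trading volume
bottom = plt.subplot2grid((3, 1), (2, 0))
bottom.bar(sp500.index, sp500.Volume)

# Make the figure wide enough to read the date axis
plt.gcf().set_size_inches(18, 8)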
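
If you want to get a feel for the DataFrame itself without downloading anything, here is a minimal sketch that builds one by hand. The numbers are made up (they are not real S&P 500 data); only the column names mirror the ones used above, and the code assumes a reasonably recent pandas.

```python
import pandas as pd

# Made-up prices and volumes for five business days (not real market data).
dates = pd.date_range("2000-01-03", periods=5, freq="B")
frame = pd.DataFrame(
    {"Adj Close": [100.0, 102.0, 101.0, 105.0, 107.0],
     "Volume": [1000, 1200, 900, 1500, 1100]},
    index=dates,
)

# Columns can be selected by key, or by attribute when the name is a valid identifier.
total_volume = frame.Volume.sum()

# Day-over-day fractional change of the adjusted close; the first row has no predecessor.
frame["Return"] = frame["Adj Close"].pct_change()

# Rows can be sliced by date label; both endpoints are included.
first_two = frame.loc["2000-01-03":"2000-01-04"]

print(frame)
print("Total volume:", total_volume)
```

The same column and row selection works on the sp500 frame returned by DataReader, since it is also indexed by date.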