Miskatonic University Press

Canadian library statistics visualized with R and Google Motion Charts

r librarystats

Here's an example of using the googleVis package for R, which makes it easy to use the Google Visualization API.

(If you're reading this through an RSS feed and don't see an interactive chart below, come see the full blog post. You'll want to play with it.)

I already had some cleaned-up Association of Research Libraries (ARL) statistics sitting around in arl-1989-2009.csv, and then all it took was this in R (setting up canada.toplot could have been done in one line):

> install.packages('googleVis')
> library(googleVis) # load the package so gvisMotionChart is available
> arl <- read.csv("http://www.miskatonic.org/files/arl-1989-2009.csv")
> canada <- subset(arl, REGION == 10) # keep only the Canadian libraries
> canada.toplot <- canada[, c(1, 3, 39, 41, 42, 44, 55, 57, 66, 67, 70)] # id, year, and the variables to plot
> M <- gvisMotionChart(canada.toplot, idvar="INAM", timevar="YEAR")
> plot(M) # to plot it locally
> cat(M$html$chart, file="chart.html") # so I could include it here
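The one-line version of setting up canada.toplot might look something like the sketch below, which filters rows and picks columns by name in a single subset() call. To keep it runnable without the real file, it uses a tiny made-up data frame standing in for the ARL data; the institution names and numbers are invented, and only a few of the real columns are included.

```r
# Toy stand-in for the ARL data. All values here are made up for illustration.
arl <- data.frame(
  YEAR    = c(2008, 2009, 2008, 2009),
  INAM    = c("Library A", "Library A", "Library B", "Library B"),
  REGION  = c(10, 10, 5, 5),
  TOTCIRC = c(1000, 1100, 2000, 2100),
  TOTSTU  = c(30000, 31000, 40000, 41000),
  FAC     = c(1500, 1550, 2000, 2050)
)

# Filter to REGION 10 and select columns by name in one call.
canada.toplot <- subset(arl, REGION == 10,
                        select = c(YEAR, INAM, TOTCIRC, TOTSTU, FAC))

print(canada.toplot)
```

Selecting by name is a little longer to type than the column numbers, but it keeps working if the column order in the CSV ever changes.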

The variable names are what the ARL uses. They are:

  • TOTCIRC = total circulation
  • PRFSTF = number of professional staff
  • NPRFSTF = number of non-professional staff
  • TOTSTF = total staff
  • TOTSAL = total salaries
  • TOTEXP = total expenditures
  • TOTSTU = total number of students
  • GRADSTU = number of graduate students
  • FAC = number of faculty

Try starting off with TOTSTU against FAC or TOTSTU against TOTCIRC. There are lots of other variables that could be plotted, but to keep it manageable I just picked out some I thought would be interesting. Try turning on the log view and seeing how that changes things.

(It's funny how Library and Archives Canada jumps in at the end out of nowhere with a very large number of staff. Did they just recently join the ARL? Is my data wrong?)
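One way to check a question like that would be to look at the first year each institution shows up in the data. Here is a rough sketch using base R's aggregate(); the data frame is a made-up stand-in with invented names and numbers, and the column names INAM, YEAR and TOTSTF are assumed to match the real file.

```r
# Toy stand-in for the Canadian subset. All values are made up for illustration.
canada <- data.frame(
  INAM   = c("Library A", "Library A", "Library A", "Library B"),
  YEAR   = c(2007, 2008, 2009, 2009),
  TOTSTF = c(100, 105, 110, 900)
)

# Earliest year reported for each institution.
first.year <- aggregate(YEAR ~ INAM, data = canada, FUN = min)
names(first.year)[2] <- "FIRST_YEAR"

# Institutions that first appear after the earliest year in the data.
late <- first.year[first.year$FIRST_YEAR > min(canada$YEAR), ]

print(late)
```

An institution listed in late only has rows for the later years, which would at least show whether the jump comes from when its data starts rather than from a bad value.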