Today, Brian Ripley committed revision 50 000 to the R svn repository.

------------------------------------------------------------------------
r50000 | ripley | 2009-10-09 10:34:17 +0200 (Fri, 09 Oct 2009) | 1 line
Changed paths:
   M /branches/R-2-10-branch/src/library/stats/R/plot.lm.R

port r49999 from trunk
------------------------------------------------------------------------
r49999 | ripley | 2009-10-09 10:33:28 +0200 (Fri, 09 Oct 2009) | 2 lines
Changed paths:
   M /trunk/src/library/stats/R/plot.lm.R

workaround for PR#13899 (that in the report is broken and fails make check!)

So it is time to celebrate and have some fun with the svn log by analyzing the 50 000 commits ... with R of course.

data extraction

First we need to grab the full svn log using the command-line svn client, with something like this:

$ svn log -v https://svn.r-project.org/R > rsvn.log

... or you can download it from my website if you don't have svn on your machine

Now we need to read the data into R:
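A minimal sketch of this step, assuming the log has the layout shown at the top of this post (one header line per revision, fields separated by `|`). The function name `parse_svn_log` and the column names are illustrative choices, not taken from the original analysis. It only keeps the header fields and ignores changed paths and commit messages.

```r
# Sketch: turn the header lines of `svn log -v` output into a data frame.
# parse_svn_log() and its column names are hypothetical, chosen for illustration.
parse_svn_log <- function(lines) {
  # Header lines look like:
  # r50000 | ripley | 2009-10-09 10:34:17 +0200 (Fri, 09 Oct 2009) | 1 line
  rx <- paste0(
    "^r([0-9]+) [|] ([^|]*) [|] ",
    "([0-9]{4}-[0-9]{2}-[0-9]{2}) ([0-9:]{8}) ([+-][0-9]{4}) [^|]*[|] ",
    "([0-9]+) lines?$"
  )
  hits <- regmatches(lines, regexec(rx, lines))
  hits <- hits[lengths(hits) > 0]          # drop lines that are not headers
  if (length(hits) == 0) stop("no revision headers found")
  fields <- do.call(rbind, hits)           # column 1 is the full match
  data.frame(
    revision = as.integer(fields[, 2]),
    author   = trimws(fields[, 3]),
    date     = as.Date(fields[, 4]),
    time     = fields[, 5],
    nlines   = as.integer(fields[, 7]),
    stringsAsFactors = FALSE
  )
}

# The two log entries quoted at the top of this post, used as sample input.
sample_log <- c(
  "------------------------------------------------------------------------",
  "r50000 | ripley | 2009-10-09 10:34:17 +0200 (Fri, 09 Oct 2009) | 1 line",
  "Changed paths:",
  "   M /branches/R-2-10-branch/src/library/stats/R/plot.lm.R",
  "",
  "port r49999 from trunk",
  "------------------------------------------------------------------------",
  "r49999 | ripley | 2009-10-09 10:33:28 +0200 (Fri, 09 Oct 2009) | 2 lines",
  "Changed paths:",
  "   M /trunk/src/library/stats/R/plot.lm.R",
  "",
  "workaround for PR#13899 (that in the report is broken and fails make check!)"
)
commits <- parse_svn_log(sample_log)
```

For the full log, the same function would be called as `parse_svn_log(readLines("rsvn.log"))`.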

We might also be interested in the release date, version number and distribution size of each R release archived on CRAN, which we can get like this:

graphics

Now we can do some graphics. I'm using lattice here because I am familiar with it, but I'm sure interesting plots could be done using ggplot2. In fact, check out this post from Yihui Xie using ggplot2.

First I need to define some helper panel functions that I'll use in the plots below.

Number of commits per day

commits_day.png
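A plot along these lines could be sketched as follows. This is not the original code: the `commits` data frame here is a toy stand-in with made-up authors and dates, and the plain `type = "h"` display replaces whatever custom panel functions were actually used.

```r
library(lattice)

# Toy stand-in for the parsed log (illustrative values, not real commit data).
commits <- data.frame(
  author = c("author_a", "author_a", "author_b", "author_a"),
  date   = as.Date(c("2009-10-08", "2009-10-09", "2009-10-09", "2009-10-09")),
  stringsAsFactors = FALSE
)

# Count commits per day; table() turns the dates into labels,
# so they are converted back to Date afterwards.
per_day <- as.data.frame(table(date = commits$date),
                         responseName = "n", stringsAsFactors = FALSE)
per_day$date <- as.Date(per_day$date)

# Same counts, split by author (days without commits get n = 0).
per_day_author <- as.data.frame(
  table(date = commits$date, author = commits$author),
  responseName = "n", stringsAsFactors = FALSE
)
per_day_author$date <- as.Date(per_day_author$date)

# One vertical bar per day; the second plot has one panel per author.
p_day <- xyplot(n ~ date, data = per_day, type = "h",
                xlab = "date", ylab = "commits per day")
p_author <- xyplot(n ~ date | author, data = per_day_author, type = "h",
                   xlab = "date", ylab = "commits per day")
```

The plots are stored rather than drawn; `print(p_day)` or `print(p_author)` renders them on the current device.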

... split by author

commits_author_day.png

The number of commits per month

commits_month.png
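The monthly counts only need a different binning of the dates. A minimal sketch, again on made-up toy dates rather than the real log, maps each commit to the first day of its month before counting:

```r
# Toy commit dates (illustrative values, not real commit data).
commits <- data.frame(
  date = as.Date(c("2009-09-30", "2009-10-08", "2009-10-09", "2009-10-09"))
)

# Map each date to the first day of its month, then count per month.
commits$month <- as.Date(format(commits$date, "%Y-%m-01"))
per_month <- as.data.frame(table(month = commits$month),
                           responseName = "n", stringsAsFactors = FALSE)
per_month$month <- as.Date(per_month$month)
```

Keeping `month` as a `Date` means it can go straight on the x axis of a lattice plot, just like the daily counts.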

... split by author

commits_author_month.png
