I hate to say it… but Microsoft really have produced an excellent tool for sysadmins and performance testers alike. That tool is called LogParser. I’ve mentioned it before, but that post was more IIS-specific; LogParser can be pointed at just about any type of log file. So with a bit of Perl (or equivalent) to clean up the input, plus LogParser, I was able to analyze gigabytes of IBM WebSEAL access logs on a production system.
Go to my fileshare at http://90kts.com/uploads (or google for the latest binaries), then download the following files and install them using the defaults:
LogParser.msi
gzip-1.3.12-1-setup.exe
Create a working directory e.g. C:\working
Copy the parser.pl to your working directory
parser.pl code looks like this:
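    # parser.pl: turn a WebSEAL access log into a CSV file for LogParser
    open(IN, "<$ARGV[0]") or die "can't open $ARGV[0]: $!";
    open(OUT, ">processed.csv") or die "can't write processed.csv: $!";
    print OUT "host, user, datetime, method, file, http, code, bytes\n";

    while(<IN>)
    {
        my($line) = $_;
        chomp($line);
        $line =~ s/\[|\]|\+1100|"//g;   # strip brackets, the +1100 offset and quotes
        $line =~ s/\s+/,/g;             # whitespace becomes the field separator
        $line =~ s/2009:/2009 /g;       # split date from time: 05/Feb/2009 06:54:11
        $line =~ s/,\-,/,/g;            # drop the empty "-" field
        $line =~ s/,$//g;               # drop any trailing comma
        print OUT "$line\n";
    }

    close(IN);
    close(OUT);

Copy a gz’d log file to the working directory and open a cmd shell there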
WebSEAL logs look a little like this:
10.1.2.3 - TKOOPS [05/Feb/2009:06:54:11 +1100] "GET HTTP://productionsite.com/wps/CacheProxyServlet/colorPalette/default/browserVendor/Microsoft/browserName/Internet+Explorer/browserVersion/6.0/locale/en/forwardurl/wps/themes/html/images/portlet-table-heading-bg-right.gif HTTP/1.1" 304 0
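To see what parser.pl does to a line like this, here is a rough Python sketch of the same substitutions applied in the same order. It is an illustration rather than a drop-in replacement: the function name `clean_line` is my own, the URL in the sample is shortened to a made-up path, and the hard-coded `+1100` offset and `2009` year are carried over from the Perl as-is.

```python
import re


def clean_line(line: str) -> str:
    """Apply the parser.pl substitutions to one WebSEAL log line."""
    line = line.rstrip("\r\n")
    line = re.sub(r'\[|\]|\+1100|"', "", line)  # strip brackets, offset, quotes
    line = re.sub(r"\s+", ",", line)             # whitespace becomes the separator
    line = line.replace("2009:", "2009 ")        # split date from time
    line = line.replace(",-,", ",")              # drop the empty "-" field
    line = re.sub(r",$", "", line)               # drop any trailing comma
    return line


# Sample line in the format shown above (path shortened for readability).
sample = ('10.1.2.3 - TKOOPS [05/Feb/2009:06:54:11 +1100] '
          '"GET HTTP://productionsite.com/wps/images/example.gif HTTP/1.1" 304 0')

print(clean_line(sample))
# 10.1.2.3,TKOOPS,05/Feb/2009 06:54:11,GET,HTTP://productionsite.com/wps/images/example.gif,HTTP/1.1,304,0
```

The result has eight comma-separated values, one for each column in the CSV header: host, user, datetime, method, file, http, code, bytes.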
Run the following commands:
gunzip request.log.2009-02-06-06-59-36_DIR.gz
parser.pl request.log.2009-02-06-06-59-36_DIR
To view the number of requests grouped by 5 minute intervals and HTTP result code run this from the cmd shell:
"C:\Program Files\Log Parser 2.2\LogParser.exe" "SELECT QUANTIZE(datetime,300) AS TimeRange, code AS HttpCode, Count(*) AS Count FROM C:\working\processed.csv GROUP BY TimeRange, code" -i:CSV -iTsFormat:"dd/MMM/yyyy hh:mm:ss" -o:DATAGRID
To view the number of requests grouped by 5 minute intervals and User ID run this from the cmd shell:
"C:\Program Files\Log Parser 2.2\LogParser.exe" "SELECT QUANTIZE(datetime,300) AS TimeRange, user AS User, Count(*) AS Count FROM C:\working\processed.csv GROUP BY TimeRange, user" -i:CSV -iTsFormat:"dd/MMM/yyyy hh:mm:ss" -o:DATAGRID
Note that these two commands send their results to a GUI datagrid, from which you can copy them into Excel and chart them if you wish. If you don’t want the datagrid and just want stdout, omit the -o:DATAGRID option from the command. Example output is shown as follows:

LogParser also has its own charting equivalent, which I haven’t explored. My parser is the bit that cleans up the file and adds the headers. Hopefully you can get enough from these instructions to tackle it on your own. For all you Perl haters out there, I already know the answer to the question “Is Perl dead yet?” Feel free to post your Python, Ruby or shell equivalents.
PS. We ended up modifying the parser.pl used above to iterate through a directory full of gz files, like so:
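If you are wondering what QUANTIZE(datetime,300) and GROUP BY are doing in the two queries earlier, here is a small Python sketch of the idea: round each timestamp down to the start of its 5-minute interval, then count rows per (interval, HTTP code). It assumes QUANTIZE truncates to the lower multiple, which is my reading of the LogParser docs. The `quantize` helper is my own and only handles intervals that divide evenly into a day, and the rows are invented sample data, not real results.

```python
from collections import Counter
from datetime import datetime


def quantize(ts: datetime, seconds: int = 300) -> datetime:
    """Round ts down to the start of its interval (interval must divide a day)."""
    since_midnight = ts.hour * 3600 + ts.minute * 60 + ts.second
    floored = since_midnight - since_midnight % seconds
    return ts.replace(hour=floored // 3600,
                      minute=floored % 3600 // 60,
                      second=floored % 60,
                      microsecond=0)


# Invented (datetime, user, code) rows in the processed.csv timestamp format.
rows = [
    ("05/Feb/2009 06:52:40", "TKOOPS", "200"),
    ("05/Feb/2009 06:54:11", "TKOOPS", "304"),
    ("05/Feb/2009 06:55:02", "JSMITH", "200"),
    ("05/Feb/2009 06:58:59", "JSMITH", "200"),
]

counts = Counter()
for stamp, user, code in rows:
    # %b expects English month abbreviations under the default C locale.
    bucket = quantize(datetime.strptime(stamp, "%d/%b/%Y %H:%M:%S"))
    counts[(bucket, code)] += 1

for (bucket, code), n in sorted(counts.items()):
    print(bucket, code, n)
```

Swapping `code` for `user` in the Counter key gives the per-user breakdown of the second query.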
    # Directory to scan for .gz logs; defaults to C:\working
    $dirname = $ARGV[0] || 'C:\working';
    opendir(DIR, $dirname) or die "can't open directory $dirname: $!";

    while (defined($file = readdir(DIR)))
    {
        if($file =~ /\.gz/)
        {
            # gzip -d replaces the .gz with the uncompressed file
            $unzip = `c:\\PROGRA~1\\GnuWin32\\bin\\gzip -d $dirname\\$file`;
            ($filename) = ($file =~ /(.+?)\.gz/);

            open(IN, "<$dirname\\$filename") or die "can't open $filename: $!";
            open(OUT, ">$filename.log") or die "can't write $filename.log: $!";
            print OUT "host, user, datetime, method, file, http, code, bytes\n";

            while(<IN>)
            {
                my($line) = $_;
                chomp($line);
                $line =~ s/\[|\]|\+1100|"//g;
                $line =~ s/\s+/,/g;
                $line =~ s/2009:/2009 /g;
                $line =~ s/,\-,/,/g;
                $line =~ s/,$//g;
                print OUT "$line\n";
            }

            close(IN);
            close(OUT);
            unlink("$dirname\\$filename");   # remove the uncompressed copy
        }
    }
    closedir(DIR);
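For anyone who would rather skip the external gzip binary, here is a hedged Python sketch of the same directory loop using the standard `gzip` module. It differs from the Perl above in a few ways that are my own choices: it reads the .gz files in place and leaves them untouched, it writes each .log file next to its source rather than in the current directory, and it assumes latin-1 so that odd bytes in URLs don’t abort the run. The names `clean_line` and `convert_directory` are mine.

```python
import gzip
import re
from pathlib import Path

HEADER = "host, user, datetime, method, file, http, code, bytes\n"


def clean_line(line: str) -> str:
    """Same substitutions as parser.pl, in the same order."""
    line = line.rstrip("\r\n")
    line = re.sub(r'\[|\]|\+1100|"', "", line)
    line = re.sub(r"\s+", ",", line)
    line = line.replace("2009:", "2009 ")
    line = line.replace(",-,", ",")
    return re.sub(r",$", "", line)


def convert_directory(dirname):
    """Write <name>.log for each <name>.gz in dirname; return the paths written."""
    written = []
    for gz_path in sorted(Path(dirname).glob("*.gz")):
        out_path = gz_path.with_suffix(".log")  # request...DIR.gz -> request...DIR.log
        with gzip.open(gz_path, "rt", encoding="latin-1") as src, \
                open(out_path, "w", encoding="latin-1", newline="\n") as dst:
            dst.write(HEADER)
            for line in src:
                dst.write(clean_line(line) + "\n")
        written.append(out_path)
    return written
```

Calling `convert_directory(r"C:\working")` would then produce one CSV-style .log file per compressed log, ready to be named in the FROM clause of the LogParser queries.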

LogParser download at:
http://www.microsoft.com/downloads/details.aspx?FamilyID=890cd06b-abf8-4c25-91b2-f8d975cf8c07&displaylang=en
It’s a great tool. I used it some time back and it has excellent processing power. I remember using it to process around 5 GB of IIS log files for troubleshooting.