I recently had a conversation with a colleague about how one might determine the concurrency for a given page or transaction at any point in time. The system under test was a typical web server and we had access to the web logs for analysis.
That is, we wanted to know how many virtual users were hitting a web page at the same time, without using a sledgehammer to crack a walnut, so to speak. To that end we fell back on simple Unix commands like awk, grep, sort and uniq.
A typical web log entry of the kind we were looking for looked like this:
10.1.20.10 - Unauth [16/Dec/2008:11:59:16 +1100] "GET HTTP://system.under.test/secure.html HTTP/1.1" 200 692
On Windows we would use a command similar to this to generate our report:
gawk "{print $4,$7}" web01-request.log | grep "\/secure" | uniq -c | sort -T C:\temp | gawk "{print $1}" | uniq -c
The first awk prints the date-time column ($4) and the URL column ($7), which we then grep for any entries containing our target string '/secure' (we could have done this in the awk statement too).
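Then we use uniq -c to count the consecutive rows that share the same date-time and URL, which gives us the number of hits in each one-second window. Since uniq only merges adjacent lines, this relies on the log being written in time order. We sort that (using C:\temp for sort's temporary files), awk again to keep only the first column (the count) and uniq -c again to count how often each per-second count occurs. The output looks like this: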
2218 1
1483 2
263 3
258 4
38 5
36 6
2 7
5 8
1 12
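The first column is the number of occurrences and the second is the number of hits logged in the same second. So we can determine, for example, that there were 1,483 occurrences where 2 virtual users hit the target at the same time, 258 occurrences where 4 virtual users hit the target at the same time, and so on. Note that "at the same time" here means within the same logged second, since the timestamps only have one-second resolution.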
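For anyone doing this on a Unix shell rather than Windows, here is a minimal sketch of the same idea with the grep folded into the first awk. It is not the command we ran. It assumes the log layout shown earlier ($4 is the timestamp, $7 is the URL), it uses single quotes where Windows needs double quotes, and the function name concurrency_histogram is made up for illustration. It also sorts the timestamps before the first uniq -c so it does not depend on the log being in strict time order, and sorts numerically before the second one.

```shell
# Hypothetical helper: prints "<occurrences> <hits-per-second>" pairs for a log file.
# Assumes the log layout shown earlier: $4 is the timestamp, $7 is the URL.
concurrency_histogram() {
  awk '$7 ~ /\/secure/ { print $4 }' "$1" |  # timestamp of each matching request
    sort |                                    # bring identical timestamps together
    uniq -c |                                 # hits per one-second timestamp
    awk '{ print $1 }' |                      # keep only the per-second count
    sort -n |                                 # numeric order, so 12 comes after 8
    uniq -c |                                 # how many seconds saw each count
    awk '{ print $1, $2 }'                    # strip the padding uniq adds
}
```

You would call it as concurrency_histogram web01-request.log and get output in the same two-column shape as the report above.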
You can then graph this using the Google Chart API. No Excel required =)
