Tracing JMeter transactions under load

JMeter Transaction Tracing

Imagine you’re running a JMeter load test on Flood IO’s distributed, cloud-based infrastructure and your response times start to waver, or worse, your rate of failed requests per minute starts to climb.

You check your Flood IO charts and they show failures under load.

Previously, as with most other cloud-based load testing tools, you’d be left wondering: what happened to that request? What was actually returned by the server? You might begin to grep the mountains of logs on your app servers looking for a clue, but wouldn’t it be nice to have Flood IO show you exactly what was returned, from the client’s point of view?

All JMeter tests run on Flood IO now let you inspect the request and response headers of the slowest successful and failed transactions, live, whilst your test is running.

In addition, you can also view the request and response data returned by the server for failed transactions. This gives you much better fault-finding capability under load.

You don’t have to configure anything else in your test plan. We do this automatically with some innovation inside JMeter and our own reporting engine. Coming soon is the ability to do the same for Gatling tests, as well as more detailed TCP network analysis. This feature is available to all users, with live data available for the duration of your test and until you start your next test, whilst your grid is still running.


Guide to JMeter Regular Expressions

This is a guide to using the JMeter regular expression extractor, to help correlate response values with future request parameters. It introduces you to regular expressions and how they’re implemented in JMeter. We’ll take a look at common scenarios we see on Flood IO and answer common support questions.

You may have heard the term correlation used by performance testers, but what does it mean?

Correlate: have a mutual relationship or connection, in which one thing affects or depends on another.

Rather than the statistical sense of correlation and causation, in load testing language this usually refers to the act of correlating a dynamic value from a response with a value in a future request. If you are converting from LoadRunner to JMeter then you will know what I mean; if not, read on!

Examples of dynamic data in response bodies

  • CSRF Tokens: the server sends a unique authenticity token in the body of a response, which is then used by the browser when it submits a form in the next request.
    first response
    <<<<<<<<<<<<<<<<<<<<<<<<
    <meta content="authenticity_token" name="csrf-param" />
    <meta content="VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=" name="csrf-token" />
    
    next request
    >>>>>>>>>>>>>>>>>>>>>>>>
    Content-Disposition: form-data; name="authenticity_token"
    VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=
    
  • __VIEWSTATE: Microsoft® ASP.NET web applications persist changes to the state of a form in the body of a response, which is used on each subsequent postback to the server.
    first response
    <<<<<<<<<<<<<<<<<<<<<<<<
    <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="qUJAF651pJ8WTL0r7dcB+HwCIu5roI89rxbRJhCalaDd5WwuJBR4XnqrSL+1ntHDz4JXmZX3J+uH1Z0yMMqHoN9lwc4qsduHqyB5IkMPTQtH7R7RLf3y+0JrfvE48s10Jo2WZ5X6kc3QM2jbBzG2VR1Fbnn9ZN9IV5nbN7Jc4+UQ3O8PuqpY+vG6hLdWsZOzo6FXAVa5ibL57KW7pPcUDzO+Zzi196o0WTz79HVUf2eQVK9uZEX4kWHOJcNmUcd8kyTU+Xobex3z0jnc29axbnsFZbbRnLLUeZw0Nnycn50qN7pafSBsEe2xG8FRGPdzVi6KNNfCLm7V/FGkJiDbFeopvkBNXXHx/gwJs1UONXqQm/YhTNraTb2B0fKzfbHcJ/2lTco+jQ/fPbNr6rPWrg+DGC5DohH1MdXb9Rtw9LDzJ0SUPC5B7kf+6uswY3jkrQPKYnp9hrhdqwvygjMe55Df0t6Um9voXv/vRR8HK60EZQQ8tYcG0Qnot/fYZw+Kt9lkY47mEZjL9YXoJDgQlNHK2uFBzCKB+L3CU1v7TanITLqWrOf7n6nujpUiQ5J0hbY9iPQpyvsyntA/cHGYr3vjF2OxmvurWAoZFA4f0r1Y2Ig0X16bz6OFJnY5IuJrzJuKTnGHHlTMyRkbbtSqnDN2yxYttHOQfrgXe5Y8pRK05vEHLtk8wBsZHl9xxzRJcBt6Gz+XcIFNa/5BiI8+cbiSP6ssdCYHsqumKevMotKFHdLRwY2OVijtKBrJFSDhKbtBPP3RM8zzc2KtY11+PKQGN58=" />
    
    next request
    >>>>>>>>>>>>>>>>>>>>>>>>
    Content-Disposition: form-data; name="__VIEWSTATE"
    qUJAF651pJ8WTL0r7dcB+HwCIu5roI89rxbRJhCalaDd5WwuJBR4XnqrSL+1ntHDz4JXmZX3J+uH1Z0yMMqHoN9lwc4qsduHqyB5IkMPTQtH7R7RLf3y+0JrfvE48s10Jo2WZ5X6kc3QM2jbBzG2VR1Fbnn9ZN9IV5nbN7Jc4+UQ3O8PuqpY+vG6hLdWsZOzo6FXAVa5ibL57KW7pPcUDzO+Zzi196o0WTz79HVUf2eQVK9uZEX4kWHOJcNmUcd8kyTU+Xobex3z0jnc29axbnsFZbbRnLLUeZw0Nnycn50qN7pafSBsEe2xG8FRGPdzVi6KNNfCLm7V/FGkJiDbFeopvkBNXXHx/gwJs1UONXqQm/YhTNraTb2B0fKzfbHcJ/2lTco+jQ/fPbNr6rPWrg+DGC5DohH1MdXb9Rtw9LDzJ0SUPC5B7kf+6uswY3jkrQPKYnp9hrhdqwvygjMe55Df0t6Um9voXv/vRR8HK60EZQQ8tYcG0Qnot/fYZw+Kt9lkY47mEZjL9YXoJDgQlNHK2uFBzCKB+L3CU1v7TanITLqWrOf7n6nujpUiQ5J0hbY9iPQpyvsyntA/cHGYr3vjF2OxmvurWAoZFA4f0r1Y2Ig0X16bz6OFJnY5IuJrzJuKTnGHHlTMyRkbbtSqnDN2yxYttHOQfrgXe5Y8pRK05vEHLtk8wBsZHl9xxzRJcBt6Gz+XcIFNa/5BiI8+cbiSP6ssdCYHsqumKevMotKFHdLRwY2OVijtKBrJFSDhKbtBPP3RM8zzc2KtY11+PKQGN58=
    

How do I know if I need to correlate response values with future requests?

An experienced tester will have an eye for this: by looking at network activity in their favourite proxy recorder or network debug tool, well-known parameters such as VIEWSTATE or JSESSIONID, or timestamps and tokens, should stand out like the proverbial. Other parameters may be more subtle to detect, especially single characters or obfuscated parameters that result from some form of JavaScript parsing / execution.

The only way to know for sure is to adapt the old proverb:

Measure twice and DIFF!

By that we mean take a snapshot or recording of your transaction twice, with the same user and use some form of file comparison tool to detect differences. If you can’t do that, then often the Mark I Human Eyeball will suffice.
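
A minimal sketch of the “measure twice and DIFF” idea in Ruby (the recording file names are placeholders): save the raw requests from two recordings of the same transaction to text files, then print any line that differs.

# Compare two recordings of the same transaction to spot dynamic values.
# "recording_1.txt" and "recording_2.txt" are placeholder file names.
first  = File.readlines("recording_1.txt")
second = File.readlines("recording_2.txt")

first.zip(second).each_with_index do |(a, b), i|
  next if a == b
  puts "line #{i + 1} differs:"
  puts "  < #{a.to_s.chomp}"
  puts "  > #{b.to_s.chomp}"
end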

Correlation using JMeter

Using the above CSRF token example, let’s take a look at how it’s done in JMeter.

We’d make our first request with an HTTP Request Sampler

Thread Group
  HTTP Request
    Server Name or IP: flood.io
    Path: /

To that request, we’d add a Regular Expression Extractor

Thread Group
  HTTP Request
    -> Regular Expression Extractor
      Reference Name: authenticity_token
      Regular Expression: content="(.+?)" name="csrf-token"
      Template: $1$
      Match No. (0 for Random): 1
      Default Value:

The Reference Name will store the result of the expression in the JMeter variable ${authenticity_token}.

Your First Regular Expression

If you don’t know any regex, take heart, you can get started with some basics. Consider the following expression:

content="(.+?)" name="csrf-token"

All regular expressions are pattern matching. This expression says:

  1. match the characters content=" literally (case sensitive)
  2. then capture the 1st group in brackets (.+?)
  3. inside that group, match any character .+? between one and unlimited times, as few times as possible, expanding as needed (otherwise known as lazy)
  4. then match the characters " name="csrf-token" literally (case sensitive)

Read left to right, the expression anchors on the literal text either side of the token and lazily captures everything in between.

Now that we have a regular expression, we specify a template for using it with $1$. This means the variable ${authenticity_token} will be populated with the first matched group only.

Our Match Number is set to 1 because we want the first instance of this matched string, as there could be multiple matches on a page.
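
You can sanity check an expression outside JMeter before wiring it into the extractor. A quick sketch in plain Ruby, using a cut-down version of the first response shown above:

# Sanity check the extractor expression against a cut-down response body.
body = <<~HTML
  <meta content="authenticity_token" name="csrf-param" />
  <meta content="VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=" name="csrf-token" />
HTML

token = body[/content="(.+?)" name="csrf-token"/, 1]
puts token  # => VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=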

Some Variations on Regex

The JMeter manual has some useful regular expressions which you can familiarise yourself with. Following is an example which fleshes out some of these concepts.

Often you will need to extract multiple attributes from an HTML tag, for example the ID and Name attributes associated with a particular class.

<input class="cats" id="meow" name="Buster">
<input class="cats" id="purr" name="Mac">
<input class="cats" id="roar" name="Sooky">

Consider the following regular expression extractor:

Thread Group
  HTTP Request
    -> Regular Expression Extractor
      Reference Name: cat
      Regular Expression: class="cats" id="(.+?)" name="(.+?)"
      Template: $2$ says $1$
      Match No. (0 for Random): 1
      Default Value: 

This would yield the following results:

cat=Buster says meow
cat_g=2
cat_g0=class="cats" id="meow" name="Buster"
cat_g1=meow
cat_g2=Buster

What does that all mean? We stored the results of the expression class="cats" id="(.+?)" name="(.+?)" in a JMeter variable called ${cat}. The template we used was the 2nd group, followed by the string says, followed by the 1st group. So in effect ${cat} now equals Buster says meow.

JMeter also breaks the expression up into other variables, which is handy when we only want parts of the matched expression. For example ${cat_g1} tells us the 1st group equals meow, and likewise ${cat_g2} equals the 2nd group, Buster. We can also get the entire matched string from the regular expression via ${cat_g0}.

What happens if we want all the matches on a page? This is where Match No. comes into play. In the previous examples we used the 1st match found on the page.

Match No. is not a zero based index! If you specify 0 then you will get a random match from the page.

So this expression:

Thread Group
  HTTP Request
    -> Regular Expression Extractor
      Reference Name: cat
      Regular Expression: class="cats" id=".+?" name="(.+?)"
      Template: $1$
      Match No. (0 for Random): 0
      Default Value: 

Would yield the following results:

cat=Sooky
cat_g=1
cat_g0=class="cats" id="roar" name="Sooky"
cat_g1=Sooky

In this case we only matched one group and used a random match on the page, so this iteration ${cat} equals Sooky.

In further iterations, we would see a random value from Buster, Mac and Sooky.

The other trick we might like to do is return all matches on the page.

This expression:

Thread Group
  HTTP Request
    -> Regular Expression Extractor
      Reference Name: cat
      Regular Expression: class="cats" id="(.+?)" name="(.+?)"
      Template: $1$
      Match No. (0 for Random): -1
      Default Value: 

Yields the following results:

cat=Mac
cat_1=meow
cat_1_g=2
cat_1_g0=class="cats" id="meow" name="Buster"
cat_1_g1=meow
cat_1_g2=Buster
cat_2=purr
cat_2_g=2
cat_2_g0=class="cats" id="purr" name="Mac"
cat_2_g1=purr
cat_2_g2=Mac
cat_3=roar
cat_3_g=2
cat_3_g0=class="cats" id="roar" name="Sooky"
cat_3_g1=roar
cat_3_g2=Sooky
cat_matchNr=3

There’s a new variable present called ${cat_matchNr} which equals 3. As the name suggests, this is the total number of matches on the page. This can be quite handy for a number of reasons. For example, if we wanted to loop through all the matches in the response and include them in the next request, we could do something like this using a BeanShell PreProcessor.

The following BeanShell script executes a basic for loop, from 1 up to cat_matchNr which equals 3, and for each iteration of the loop adds an HTTP request parameter as follows:

Thread Group
  HTTP Request
    -> BeanShell PreProcessor
      Script:
        // total number of matches stored by the Regular Expression Extractor
        int count = Integer.parseInt(vars.get("cat_matchNr"));

        // for each match, add a request parameter named <name>_says with the id as its value
        for (int i = 1; i <= count; i++)
        {
          String says = vars.get("cat_" + i + "_g1");
          String cat = vars.get("cat_" + i + "_g2");
          sampler.addArgument(cat + "_says", says);
        }

This yields the following request:

GET http://wheres.my.kitten.com/?Buster_says=meow&Mac_says=purr&Sooky_says=roar

Extracting from the Header

Sometimes the value you are after does not exist in the response body; it might exist in the response headers.

It’s easy to do this: just change the Response Field to Check to Headers.

So this expression:

Thread Group
  HTTP Request
    -> Regular Expression Extractor
      Response Field to Check: Headers
      Reference Name: auth_token
      Regular Expression: token":"([^"]+)"
      Template: $1$
      Match No. (0 for Random): 1

Would yield the following results:

auth_token=abcd1234
auth_token_g=1
auth_token_g0=token":"abcd1234"
auth_token_g1=abcd1234

Extracting values from JSON

Flood IO supports the use of JMeter plugins which make it very simple to extract JSON values from a response body.

Consider the following response body from a typical HTTP response containing JSON:

{"ok":true,"status":200,"name":"Armory","version":{"number":"0.19.8","snapshot_build":false},"tagline":"You Know, for Search"}

This expression:

Thread Group
  HTTP Request
    -> jp@gc - JSON Path Extractor
      Name: name
      JSON Path: $.name

    -> jp@gc - JSON Path Extractor
      Name: version_number
      JSON Path: $.version.number

Would yield the following results:

name=Armory
version_number=0.19.8

It doesn’t get simpler than that!
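
For comparison, the same lookups outside JMeter take only a couple of lines of plain Ruby:

require 'json'

# The example response body from above.
body = '{"ok":true,"status":200,"name":"Armory","version":{"number":"0.19.8","snapshot_build":false},"tagline":"You Know, for Search"}'

doc = JSON.parse(body)
puts doc["name"]               # => Armory
puts doc["version"]["number"]  # => 0.19.8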

The examples we’ve shown on this page are probably enough to get you started. Feel free to contact support with more specific questions.

 

Flood IO Enterprise

Do you have your own data centre?

Stuck behind a firewall?

Want to test on premise?

We already provide a great product online at flood.io

Now you can download Flood IO Enterprise as a Virtual Machine to run on premise in your own data centre or test environment.

Flood IO Virtual Machines are built on VMWare® Virtual Hardware Version 10. These images are compatible with Workstation 10.x, Player 6.x, Fusion 6.x and ESXi 5.5 or later.

It’s easy to get started: download the image and launch it in VMWare®. You can run it on Windows, Linux or OS X. You have the option of running nodes independently or clustered. The same great online product, packaged neatly for convenient deployment in your own environment.

Licensing is simple. The time limited version for any registered account can be used for free. Unlimited versions are available for business plans. Contact us for further details or a quote.

 

Dogfooding your Performance Test Platform

When it comes to self assessment, Captain Riker sums it up:

“The game isn’t big enough unless it scares you a little.” – William Riker

At Flood IO we take performance seriously and are always looking to improve. We find an increasing number of customers come to us for advice, not just execution of their test plans. A lot of that advice centres around how to improve first visit (empty cache) performance. Luckily you don’t need an expensive tool to figure this out. Tools such as Google Page Speed and Yahoo YSlow offer simple, straightforward advice on how to optimise web page performance.

There’s been a lot of media attention on this lately, just follow the #webperf hash tag on Twitter for example. If you’re not already familiar with the tools, go check them out. They’re easy to install and you can test your own site with a simple browser extension.

Which brings me back to the topic of this post. With all the SaaS performance test vendors out there and the barrage of advice being offered, are they dogfooding their own products?

We took a look at 12 vendors that advertise in this space including ourselves to see how we stand up.

The Sites

We chose from the following SaaS performance test sites to run the comparison tests:

Flood, Loadstorm, Apicasystem, Loader, Telerik, Neotys, Neustar, Loadimpact, Blazemeter, Blitz, Soasta and Smartbear.

The Comparison

We simulated a first visit to the home page of each product with an empty browser cache, using the free service of GTMetrix. This site runs reports from Vancouver, Canada using Firefox (Desktop) 25.0.1, Page Speed 1.12.16 and YSlow 3.1.7. Tests were automated using the GTMetrix API and a simple bash script we cooked up for this purpose. This makes the test repeatable, and it can be expanded to include other sites if you’re interested.
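
The original bash script isn’t reproduced here, but the gist is small. A rough Ruby sketch is shown below; it assumes the GTMetrix v0.1 REST API (HTTP basic auth with your account email and API key, POSTing a url form field to /api/0.1/test), so check the current API documentation before relying on the exact endpoint or response fields.

require 'net/http'
require 'json'
require 'uri'

# Assumed: GTMetrix v0.1 REST API with basic auth (account email + API key).
EMAIL   = ENV.fetch('GTMETRIX_EMAIL')
API_KEY = ENV.fetch('GTMETRIX_API_KEY')

# A few of the sites compared in this post.
sites = %w[https://flood.io http://loadstorm.com http://loader.io]

sites.each do |site|
  uri = URI('https://gtmetrix.com/api/0.1/test')
  req = Net::HTTP::Post.new(uri)
  req.basic_auth(EMAIL, API_KEY)
  req.set_form_data('url' => site)

  res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
  puts "#{site}: #{JSON.parse(res.body).inspect}"
end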

The Results

For each dimension of the test we rank sites from best to worst in terms of their individual score. You can view the summary results here which includes links to the individual GTMetrix formatted reports.

Page Load Times (ms)

This is a measure of page load time in milliseconds and is perhaps the most contentious dimension measured. Typical caveats to this score will be things like repeatability (hitting cold or warm servers), geographic distance from test to origin, TCP connection overheads, SSL negotiation and so on. In any case, Loadstorm was ranked number one here, with the majority under 3 seconds.

1639 http://loadstorm.com
2104 http://apicasystem.com
2215 http://loader.io
2585 http://www.telerik.com
2688 http://loadimpact.com
2816 https://flood.io
2876 http://www.neotys.com
3019 http://neustar.com
3153 http://soasta.com
3472 http://blitz.io
3683 http://smartbear.com
3848 http://blazemeter.com

Page Requests

AKA the how bad does my site suck on a high latency mobile network score or more commonly the I don’t know how to combine files and minimise requests score.

80% of the end-user response time is spent on the front-end. Most of this time is tied up in downloading all the components in the page: images, stylesheets, scripts, Flash, etc. Reducing the number of components in turn reduces the number of HTTP requests required to render the page. This is the key to faster pages.

It’s an area in which Flood IO is quite aggressive in terms of optimisation. In fact a second visit to our home page with a primed cache is just 2 requests. We take care to combine assets into single files, use CSS image sprites where possible and generally reduce the number of HTTP requests required. This is very important for high latency links such as mobile networks, and often overlooked by clients, including our competitors.

12 https://flood.io
30 http://loadstorm.com
38 http://www.telerik.com
40 http://loadimpact.com
41 http://apicasystem.com
45 http://www.neotys.com
46 http://loader.io
55 http://blitz.io
58 http://neustar.com
75 http://smartbear.com
87 http://soasta.com
110 http://blazemeter.com

Page Weight (B)

AKA the but I thought everyone was on 100 Gigabit Ethernet score.

Put simply, what is the total page weight in Bytes for all content on your page for a first visit with an empty cache.

Apparently web page size has grown +151% in the last 3 years with the average page size now 1575 KB. Flood IO is in the middle of that range with 719 KB and some of the worst offenders creeping above the average. It makes you wonder, just how good do those images need to be as part of your market-eering?

261625 http://loadstorm.com
544241 http://loadimpact.com
545797 http://apicasystem.com
594202 http://www.neotys.com
597154 http://www.telerik.com
654106 http://loader.io
690677 http://neustar.com
719168 https://flood.io
922117 http://blitz.io
1094106 http://blazemeter.com
1574796 http://smartbear.com
1857024 http://soasta.com

Google Page Speed

AKA the how does Google page speed rank my site score.

Google’s Page Speed best practices cover many of the steps involved in page load time, including resolving DNS names, setting up TCP connections, transmitting HTTP requests, downloading resources, fetching resources from cache, parsing and executing scripts, and rendering objects on the page.

It’s a useful comparison between sites because it’s effectively a static analysis against a common set of rules. The higher the score the better. Flood IO came equal first with Loadimpact and the majority of sites score well in the +90% range. Some sites can definitely improve though.

96 https://flood.io
96 http://loadimpact.com
93 http://www.telerik.com
93 http://loadstorm.com
92 http://soasta.com
92 http://smartbear.com
92 http://blazemeter.com
89 http://loader.io
87 http://apicasystem.com
84 http://www.neotys.com
68 http://neustar.com
52 http://blitz.io

Note: this is slightly different to the Google PageSpeed Insights score, which is not listed here, but that tool also gives you the ability to view your score for both mobile and desktop devices. It’s worth trying for yourself. Flood IO scores an impressive 98/100 on both platforms.

Yahoo YSlow

AKA the I don’t believe Google, what does Yahoo think score.

We’ve long been a fan of this browser extension. Some dimensions of the score you might need to take with a grain of salt, and it is generally a little harsher, but once again, as a static analysis for comparison, it’s a great measure of performance. It looks like Flood IO comes out on top again, but that’s no surprise since we focus a lot of our attention on improving these scores.

96 https://flood.io
93 http://apicasystem.com
90 http://loader.io
89 http://loadstorm.com
81 http://www.neotys.com
81 http://loadimpact.com
75 http://blazemeter.com
72 http://www.telerik.com
70 http://soasta.com
68 http://neustar.com
67 http://smartbear.com
63 http://blitz.io

Steps to Improve

So what have we been doing to improve our own score lately? We like to use another free service provided by WebPageTest, which lets us test these different performance dimensions in our own time and from different locations around the world. Example results are shown here.

We’ve since:

  1. ensured our resources are compressed and served via a CDN where possible.
  2. minified our resources to the extent we still maintain our core functionality, look and feel that we’re happy with.
  3. taken steps to improve server response time by removing or tuning back end system calls through application profiling.
  4. leveraged browser caching as much as possible including correct use of expires and cache-control headers.
  5. optimised images as much as possible including use of CSS image sprites to further reduce number of requests to the origin servers.

The waterfall chart for our first visit performance is quite telling and gives us ideas for further optimizations, although how much is enough?!

First impressions count, and we hope this post demonstrates that Flood IO is indeed eating its own dog food.

– reposted with permission from flood.io

Understanding the JMeter Cache

Overview

Web caching can be a complicated topic and is vital to web application performance. The main reasons for caching are to reduce latency and network traffic between a client and origin server. Most representations, such as HTML, JavaScript, CSS and images, are cacheable, yet caching implementations are often misunderstood. In this post we’ll deal with the most obvious cache, that is, the browser cache, and how JMeter emulates this browser behaviour.

JMeter is not a Browser

A common misconception of first time users of JMeter is that it functions exactly like a browser. It doesn’t. JMeter like most performance test tools can emulate browser behaviour in a number of ways. Static resource fetching, caching, cookie and header management are some of the main components to browser emulation. Let’s talk about caching and how that can be implemented in JMeter.

Browser Cache

You’re probably already familiar with your browser’s local cache. This lets you keep representations of content on local storage. When you hit your browser’s back button for example, chances are you’re seeing cached content. Based on simple rules, the browser cache typically checks representations are fresh according to the current browser session. If representations are determined to be stale, then new requests will be made to the origin server.

Other Caches

Other caches, such as proxies, can exist between your browser cache and the origin server. It’s important to be aware of these as intermediary caching obviously impacts the workload profile of the origin server itself. Often these are outside of your control, and are particularly prevalent in any form of testing outside of a controlled network environment.

Cache Management

Freshness and Validation are two important concepts of cache management. Fresh content is typically served straight from the browser cache, whilst validated content will avoid requesting the same content if it hasn’t changed. Cache management, in terms of determining if content is stale or fresh, is mostly implemented via HTTP headers.

Pragma: no-cache

Pragma headers specify optional behaviour. The Pragma: no-cache header is only defined for backwards compatibility with HTTP/1.0 and is equivalent to the Cache-Control: no-cache header directive. Some caches will simply ignore the Pragma directive so it’s not a great way of ensuring content is not cached.

Cache-Control

Cache-Control headers were introduced in HTTP/1.1 to specify directives that must be followed by all caching implementations in the request/response chain. The common Cache-Control directives you will come across are:

  • public
  • private
  • no-cache
  • no-store
  • no-transform
  • must-revalidate
  • proxy-revalidate
  • max-age
  • s-maxage

Expires

The Expires header gives the date/time after which the response is considered stale. The presence of an Expires header field with a date value of some time in the future on a response that otherwise would by default be non-cacheable indicates that the response is cacheable, unless indicated otherwise by a Cache-Control header field.

Last Modified

The Last-Modified header field indicates the date and time at which the origin server believes the variant was last modified.

ETag

The ETag header field provides the current value of the entity tag for the requested variant.
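
Put together, a typical cacheable response might carry a combination of these headers (the values here are purely illustrative):

HTTP/1.1 200 OK
Cache-Control: public, max-age=3600
Expires: Tue, 29 Oct 2013 21:20:07 GMT
Last-Modified: Tue, 29 Oct 2013 09:12:45 GMT
ETag: "5f3e-4b2a6c1e"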

JMeter Cache Manager

The JMeter HTTP Cache Manager is used to add caching functionality to HTTP requests within its scope. Essentially it will check response headers and respect the majority of headers related to cache management around Expires, ETag and Cache-Control directives.

LRU Map

It implements a per-user/thread Map which has a maximum size and uses a Least Recently Used algorithm to remove items from the Map when the maximum size is reached and new items are added.

The default size is 5000 items, an item being stored as the URL along with its Last Modified, Expires and ETag header information. It’s possible to increase this value, however setting an appropriate value is difficult with no feedback from JMeter itself. Flood IO provides a Response Code timeline chart: if you only see response code 200, or a much higher rate of response code 200 compared to the expected response codes 304 and 204 when using the Cache Manager, you may need to adjust the cache size.
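
To illustrate the eviction idea only (this is not JMeter’s implementation), a least recently used map can be sketched in a few lines of Ruby, leaning on the fact that Ruby hashes preserve insertion order:

# Sketch of an LRU map: the eviction idea only, not JMeter's actual code.
class LruMap
  def initialize(max_size)
    @max_size = max_size
    @entries  = {}   # insertion order doubles as recency order
  end

  def [](key)
    return nil unless @entries.key?(key)
    @entries[key] = @entries.delete(key)   # re-insert to mark as most recently used
  end

  def []=(key, value)
    @entries.delete(key)
    @entries[key] = value
    @entries.shift if @entries.size > @max_size   # evict the least recently used entry
  end
end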

Expected Response Codes

When using the JMeter Cache Manager you can expect to see HTTP response codes 304 or 204.

When JMeter makes a conditional GET request but the document has not been modified, a server will typically respond with a 304 response code and no response body.

Thread Name: Thread Group 1-1
Sample Start: 2013-10-29 20:20:07 EST
Load time: 3
Latency: 0
Size in bytes: 150
Headers size in bytes: 150
Body size in bytes: 0
Sample Count: 1
Error Count: 0
Response code: 304
Response message: Not Modified

When JMeter simulates using content directly from its cache, it will record a 204 response code and no response body.

Thread Name: Thread Group 1-1
Sample Start: 2013-10-29 20:20:07 EST
Load time: 0
Latency: 0
Size in bytes: 0
Headers size in bytes: 0
Body size in bytes: 0
Sample Count: 1
Error Count: 0
Response code: 204
Response message: No Content

Confusingly, this does not mean the server has fulfilled the request and returned updated meta information without an entity-body, as a 204 would indicate under the HTTP/1.1 specification. The HTTPClient4 implementation in JMeter simply uses a constant from HttpURLConnection to record this value. So rest assured, no actual request is made to the origin server in this case.

Testing the Directives

Putting this all together, we can test a simple scenario which has two iterations with a Cache Manager, to simulate the first visit (with an empty browser cache) and a second visit (with a primed cache). The target site is a simple stub using Ruby / Sinatra to exercise different cache control headers.
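
The stub itself isn’t included in this post, but a minimal sketch along the same lines in Ruby / Sinatra would expose one route per directive (route names and header values below are assumptions, not the original stub):

require 'sinatra'
require 'time'

# One route per cache directive we want to exercise.
get '/no_cache' do
  headers 'Cache-Control' => 'no-cache'
  'no-cache'
end

get '/no_store' do
  headers 'Cache-Control' => 'no-store'
  'no-store'
end

get '/max_age' do
  headers 'Cache-Control' => 'public, max-age=3600'
  'max-age'
end

get '/expires' do
  headers 'Expires' => (Time.now + 3600).httpdate
  'expires'
end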

no-cache

If a no-cache directive is set, JMeter will keep the item in its cache but set the expires date to null, which triggers a revalidation for each request. You will generally see an HTTP response code 304 for this type of conditional request. This is consistent with the HTTP/1.1 specification, which says a cache must NOT use the response to satisfy a subsequent request without successful revalidation with the origin server.

no-store

If sent in a response, a cache MUST NOT store any part of either this response or the request that elicited it. JMeter does in fact cache this type of response, so it is probably not consistent with the HTTP/1.1 specification.

This has since been fixed via bug ID 55721 thanks to prompt support by the JMeter core team!

remaining directives

The remaining directives including Expires / ETag headers all behave consistently with the HTTP/1.1 specification. In general, if the “Use Cache-Control/Expires header” option is selected in the Cache Manager, JMeter will issue a HTTP response code 204 for content in its cache, otherwise it will make a conditional request and you will see a HTTP response code 304 where appropriate.

A Real Example: squarefoot.hk

Let’s look at a realistic example using a real estate site based in Hong Kong. This example is interesting because it has a wide range of content and is fairly heavy in terms of a first visit to the site.

YSlow shows that the page has a total of 136 HTTP requests and a total weight of 933.6K bytes with an empty cache.

Subsequent visits to the site should see 17 requests with total weight of 165.1K bytes with a primed cache.

Using a simple JMeter plan with a HTTP Cache Manager, 1 thread, 1 iteration with 2 requests to the home page produces the following summary statistics when sampled via a debugging proxy (Charles):

First visit (empty cache): 100 requests, 1.03 MB of responses

Second visit (primed cache): 37 requests, 123 KB of responses

Not too bad in terms of anticipated drop in requests and response size, however not the same order of magnitude we observed with the simple YSlow comparison.

Part of the reason for this is that we’re getting JMeter to automatically parse the HTML file and send HTTP/HTTPS requests for all images, Java applets, JavaScript files, CSSs, etc. referenced in the file. This is not 100% accurate but it allows us to also use a pool of concurrent connections to get embedded resources, which provides a more realistic simulation of browser behaviour, evident in the following waterfall chart when using this approach:

What this means is that you will need to determine which resources were not downloaded using this approach and include them manually in your test plan if they are significant. There may also be 3rd party domains which we don’t want to test e.g.:

http://b.scorecardresearch.com

http://reagroup.122.2o7.net


http://secure-sg.imrworldwide.com

That’s why it’s extremely useful to run your JMeter test plans through a debugging proxy like Charles or Fiddler to get an idea of what your script is doing for a single user over multiple iterations. You can also do this more simply with a View Results Tree listener in JMeter, but the waterfall view is particularly nice for inspecting network traffic.

The other reason for discrepancies between first and second visits to the site is that the contents regularly change. Different images are displayed each time we visit the site so those items won’t already be cached, hence a new request to the origin server. In the case of YSlow, it is comparing the exact same request/response chain.

TL;DR

The JMeter HTTP Cache Manager is used to add caching functionality to HTTP requests within its scope. Essentially it stores a copy of the URL along with Last Modified, Expires and ETag headers taken from the response in a Hash Map which will evict items based on a Least Recently Used algorithm.

It will typically record HTTP response codes 204 or 304 for items which are cacheable and meet freshness or validation criteria. Response body will be empty for these items so you need to be careful with any assertions on content.

It’s extremely useful running your JMeter test plans through a debugging proxy like Charles or Fiddler to get an idea of what your script is doing for a single user over multiple iterations. Alternatively use a View Results Tree listener in JMeter.

Convert HAR files to JMeter Test Plans

Overview

Any seasoned performance and automation tester will be aware of the Number One Myth: that record & playback is rarely an accurate representation of user behavior. There is no such thing as a script-less process. Performance testers need to understand the sometimes complex chain of requests and responses their web application makes, most likely at the protocol level. It would be foolish to think otherwise.

However a recorded script is a great starting point for those cognizant of this fallacy. Recordings can save you a lot of time, particularly around preparation of complicated forms and parameters. That’s why we’ve created a HAR to JMX online converter at flood.io/har2jmx.

Recording HAR files

Forget about proxy recorders and nuisance SSL certificates. You already have a powerful tool right in front of you. Your browser!

.har is a common filename extension denoting an HTTP Archive file. It is a JSON-formatted file that contains a trace of your web browser’s interaction with a given site. Most modern browsers such as Chrome and Firefox have the ability to record network activity and export the trace as a HAR file or copy its contents to the clipboard.
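
Because a HAR file is just JSON, it’s easy to inspect before converting it. A minimal Ruby sketch (the file name is a placeholder) that lists the recorded requests:

require 'json'

# List the requests captured in a HAR trace ("trace.har" is a placeholder file name).
har = JSON.parse(File.read('trace.har'))

har['log']['entries'].each do |entry|
  request = entry['request']
  puts "#{request['method']} #{request['url']}"
end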

See below for a quick demonstration of tracing network requests and responses via Chrome for our own script challenge.

Converting HAR files to JMX

Our converter at flood.io/har2jmx will turn your copied HAR string into a JMeter formatted test plan (.jmx), saving you the hassle of creating it manually. It’s easy to try; see below for a quick demonstration.

Open in JMeter

Once converted, just open the downloaded test plan in JMeter and resume editing. The converter will do things like automatically create POST parameters and detect whether requests were made with XMLHttpRequest headers. Behind the scenes, we’re just using our popular ruby-jmeter gem. It’s all open source, so you could easily expand on the example here.

See below for a quick demonstration of the converted test plan.

Let Us Know

We’d love to know what you think. Have you tried our script challenge? Why don’t you test your scripting prowess and share the results.

Scaling WordPress from Zero to Hero

Overview

WordPress is arguably one of the “largest self-hosted blogging tool[s] in the world, used on millions of sites and seen by tens of millions of people every day.”

In this post we’re going to take a default WordPress installation from zero to hero in terms of performance. We’ll discover common bottlenecks with an iterative approach to performance testing using flood.io and the ruby-jmeter DSL, whilst highlighting some tips along the way.

All the results and code used are available on GitHub so the keen can re-create and explore the testing themselves.

Setup

We’re using an AWS CloudFormation template which builds a basic, single instance LAMP stack to host WordPress. Once it’s set up you should see a page similar to the following:

We’re using the ruby-jmeter DSL which lets us easily write test plans in Ruby and includes the necessary integration to get JMeter tests up and running on the cloud with flood.io. Our test plan is simple and straightforward; the guts of it are shown here:

visit name: 'app_home', url: '/wordpress/'
visit name: 'app_sample_page', url: '/wordpress/sample-page/'
visit name: 'app_search_random_word', url: '/wordpress/?s=${random_word}'

You can also download the JMeter version of the test plan.
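
For reference, wrapped in the rest of the DSL boilerplate the whole plan is only a little longer. A sketch is shown below; the host, thread count and the random_word variable are assumptions rather than the exact plan used in these tests.

require 'ruby-jmeter'

# Sketch only: host, thread count and the random_word variable are assumptions.
test do
  threads count: 50 do
    visit name: 'app_home', url: 'http://my-lamp-stack.example.com/wordpress/'
    visit name: 'app_sample_page', url: 'http://my-lamp-stack.example.com/wordpress/sample-page/'
    visit name: 'app_search_random_word', url: 'http://my-lamp-stack.example.com/wordpress/?s=${random_word}'
  end
end.run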

Essentially we visit the home page, then a sample page (with more content), and then search for a random word. In terms of coverage, this lets us hit static and dynamic content, as well as make trips to the database. In the default configuration of WordPress this will also include PHP bytecode interpreted by the Apache / PHP modules. This gives us basic coverage of typical user transactions.

More realistic scenarios would obviously explore more content, and ideally have data seeded in the database that represents production volumes. We’re effectively running on an empty database.

Performance Model

Being agile, we don’t have any performance requirements :troll: and our fictional product manager has approached us and said they need the site to support 1M concurrent users!

When we asked for more information to help clarify this we were met with empty looks and told “Look, it just needs to support one million users in one hour OK?”.

Static Analysis

In any case, in the absence of historical data .. “this is a new site!” .. and lacking any guidance in terms of target volumetrics we used YSlow to give us a static analysis of how the site might perform with just 1 user.

There’s important information we can glean from this without running a single performance test.

Throughput

We can see a first visit to the site with an empty (browser) cache makes 17 requests for 268 KB of content. A second visit to the site with a primed cache still makes 17 requests but for 18 KB of content. So in our 1M users per hour scenario, assuming they all visit the site with an empty browser cache means:

1M users x 268 KB content = roughly 255 GB of traffic per hour at 581 Mbps (1,000,000 x 268 KB x 8 / 1,024 / 3,600 seconds).

That throughput would be a fair challenge for our single server setup .. hint; we’ll need to scale out to accommodate that.

Furthermore, 1M users x 17 requests per page = 17M requests per hour, or roughly 283K requests per minute .. hint; we’ll need to wind that back for these types of volumes!

Concurrency

At this stage we’re starting to build a picture of what throughput might look like for a notional target of 1M users per hour, but we still don’t know what concurrency we’d be targeting. A fall back option for many testers is to divide the number of unique users by the average visit duration for a given period.

For example, 1M unique visitors per hour / ( 60 minutes / 15 minutes per visit ) would equal 250,000 concurrent users.
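
Keeping the back-of-envelope model in a small script makes it easy to re-run as better numbers arrive. A sketch of the arithmetic used above (the 15 minute visit duration is an assumption):

# Back-of-envelope model for the 1M users per hour target.
users_per_hour    = 1_000_000
page_weight_kb    = 268     # first visit, empty cache (from YSlow)
requests_per_page = 17
visit_minutes     = 15      # assumed average visit duration

traffic_gb_per_hour = users_per_hour * page_weight_kb / 1024.0 / 1024.0
mbps                = users_per_hour * page_weight_kb * 8 / 1024.0 / 3600
requests_per_minute = users_per_hour * requests_per_page / 60
concurrent_users    = users_per_hour / (60 / visit_minutes)

puts "traffic:     #{traffic_gb_per_hour.round} GB/hour (#{mbps.round} Mbps)"
puts "requests:    #{requests_per_minute} per minute"
puts "concurrency: #{concurrent_users} users"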

This approach has its disadvantages but it’s a start.

Targets

At this point we don’t have any better information so we lock down those figures as arbitrary targets:

250K concurrent users, 1M uniques per hour with up to 280K requests per minute.

A bad performance model is better than NO performance model. The model is always something you can test and adjust as feedback comes in from test results and/or subject matter experts.

Our 50 User Baseline

A baseline is useful as a point of reference before we leap off the deep end when running performance tests. Baselines are simply relative lines in the sand, not absolute. They can be set at different levels of load. A best case load scenario is useful to establish how the target site runs under negligible load. 50 users seems harmless enough in the context of our 250K concurrency target so our first test is exactly that.

We run 50 concurrent users for 10 minutes and observe a mean response time of 358 milliseconds at 359 kbps. No errors are observed so we’re left with warm tummy feelings that this will fly all the way to 250K users.

Our First 200 User Load Test

Now we have a best case scenario, we attempt our first load test. 4 times the baseline seems reasonable so we launch a load test with 200 concurrent users.

Everything looked great in the first 10 minutes, until we reached 200 users and mean response time blew out to more than 15 seconds at 853 kbps. With sweaty hands we console in and find the culprit:

[root@ip-172-31-3-33 ~]# service mysqld status
mysqld dead but subsys locked

The database has run out of connections. This highlights a single point of failure for us. Sure we can increase max_connections, but we’re just going to keep hitting that limit as concurrency goes up. We submit our first request back to the infrastructure team requesting a dedicated database server, with much more RAM and properly tuned configuration.

“Infrastructure team are on holidays .. can’t action your request until next week and even then, no promises on additional hardware to support.”

Luckily there’s some tuning you can do that doesn’t involve infrastructure provisioning just yet, and besides, why ask for more hardware if you don’t know how much capacity you will need? It’s a long bow to draw describing capacity requirements from 200 to 250,000 users …

Application Tuning

We know that the current max connections of 100 users is being reached when the web site has 200 users. If you fall to the temptation of just increasing limits, you may find yourself in an endless game of “whack the beaver”.

A better idea is to reduce demand on the database in the first place, as you can always revisit these settings later. I like this bracketed approach rather than opening all the pipes to begin with but it doesn’t always suit.

The Guerrilla Manifesto teaches us that “You never remove a bottleneck, you just move it.”

Consider …

In any collection of processes (whether manufacturing or computing) the bottleneck is the process with the longest service demand. In any set of processes with inhomogeneous service time requirements, one of them will require the maximum service demand. That’s the bottleneck. If you reduce that service demand (e.g., via performance tuning), then another process will have the new maximum service demand. These are sometimes referred to as the primary and the secondary bottlenecks. The performance tuning effort has simply re-ranked the bottleneck.

There are some obvious tuning candidates from our original YSlow report: namely the number of requests being made, the size of those requests and the transport compression (or lack thereof) being used. We know from experience that the more concurrent requests being served by the web server (Apache in this case), the greater the demand on downstream components such as the PHP engine and database. So when dealing with web stacks, the first obvious step is to reduce this demand.

A common strategy to deal with this is caching. Temporary storage of content such as HTML pages can help reduce bandwidth, request demand and service times by offloading content typically to memory or disk. Caching can be implemented at both the client and server level. In this case, reducing the number of requests made to the server from user’s browsers, reducing the payload of each request (and hopefully response time), and caching frequently accessed content at the web server without round trips to back end components can dramatically increase perceived performance.

The WordPress community has a ton of plugins that help deal with this, and we chose W3 Total Cache to implement some of these desired caching characteristics. There’s a bunch of things it does reasonably well, namely setting cache-control headers, caching PHP and database objects in memory/disk, and minification/compression of static assets. The fact that it’s a plugin which can be activated with relatively little experience in configuring these components is a huge plus.

After installing, activating and fixing some other issues related to the plugin we were able to revisit our earlier performance model with YSlow:

This time around the payload is significantly less, weighing in at 118 KB for a first visit and 5 KB for subsequent visits, so we’ll definitely take a chunk off our bandwidth bill for higher concurrency/volume tests. We’re still making the same number of requests as before, but this time making conditional requests for content, which in theory should be served more quickly by the web server (Apache). We’re also making better use of HTTP compression, namely gzip encoding.

What the YSlow chart can’t show is the benefits of caching at both page and database layers that are now in effect.

Our Second 200 User Load Test

Time for another load test.

Great, this time we averaged 152 milliseconds response time and halved the throughput whilst being able to get to 200 concurrent users with no errors. So instead of “fixing” the perceived bottleneck at the database layer, we simply reduced demand on the database at the web layer with some caching / compression which resulted in better response times for users at the same concurrent load as when it failed earlier.

Our First 1000 User Stress Test

We decide to up the ante and run a 1000 concurrent user stress test.

Once again, everything was looking great up until around 800 users, where the target site started hitting max_connections on the database again. So now the bottleneck has shifted back to the database layer. Is there anything left we can do before calling the infrastructure team?

Web App Acceleration

Thankfully there’s even more powerful in-memory caching we can put in front of the web application with tools like Varnish Cache. These tools are very simple to install out of the box, for example:

yum install varnish

Provided your caching strategy is relatively simple and not complicated by things like authenticated users or changing URIs, tools like Varnish are very effective at further reducing load, this time saving round trips to Apache itself.

Our Second 1000 User Stress Test

We launch another stress test to see what difference Varnish makes.

Great stuff! We’re now reaching 2K concurrent users off a single instance before we start to see resource contention on the server itself and response time degradation under load.

Next Steps

It’s at this point we need to consider scaling our infrastructure up and out. We will certainly need a dedicated, highly available database server. We also know that in our worst case scenario (web and database on the same instance) we’re serving up to 2K concurrent users without response time degradation. We’re still a long way off the arbitrary 250K concurrent user target, but we have an empirical baseline from which to test and improve.

Testing for this is what flood.io does best. Look at some results for 100K user benchmarks from a single region using JMeter and Gatling. You could execute this type of load from 8 regions for truly massive concurrency / volume. We understand performance testing is an iterative approach and don’t govern how many tests you run or constrain the number of users per test. It’s no holds barred testing at an affordable price and free to register / try for yourself.

Feel free to run any of these tests yourself and explore the next steps. How many web servers will we need?

Getting Started with Gatling

Overview

Gatling

Gatling is a high performance load testing tool which uses asynchronous event-driven IO. This makes it very efficient for high concurrency scenarios, allowing customers to get many thousands of simulated users from a single machine. We’ve tested as many as 40K simulated users from a single flood node. Simulation scripts are written in Scala with a user friendly DSL that is easily version controlled. At Flood we love the power and simplicity that Gatling affords and often use it for testing truly massive concurrent volumes.

Your First Script

We recommend you read the guide to producing your first script with Gatling. Once you have tested your scripts locally and you’d like to run them on flood.io, follow these important steps.

Parameterizing Threads, Rampup and Duration

Flood lets you pass in parameters to your scripts so that you can dynamically set things like the number of threads, the rampup and duration times in seconds. To access these from your scripts, include the following in your simulation class:

  val threads = Integer.getInteger("threads", 1)
  val rampup = Integer.getInteger("rampup", 60).toLong
  val duration = Integer.getInteger("duration", 120).toLong

Then when you configure your scenario you’ll be able to access these as standard variables within your test plan.

Flood Results

To ensure your results are aggregated per test, and include response code and throughput analysis, you will need to call request and response information extractors from your HTTP configuration element as follows:

  .requestInfoExtractor((request: Request) => { uuid })
  .responseInfoExtractor(response => Option(response.getHeader("Content-Length")).getOrElse("0") :: List(response.getStatusCode.toString))

Example Script

Putting it all together, an example script might look like the following:

import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import com.excilys.ebi.gatling.jdbc.Predef._
import com.excilys.ebi.gatling.http.Headers.Names._
import akka.util.duration._
import bootstrap._

class Benchmark extends Simulation {

  val threads = Integer.getInteger("threads", 2)
  val rampup = Integer.getInteger("rampup", 10).toLong
  val duration = Integer.getInteger("duration", 120).toLong

  val uuid = List(System.getProperty("uuid", "test"))

  val httpConf = httpConfig
    .baseURL("http://s1.site-staging.flood.io:8000")
    .acceptHeader("text/javascript, text/html, application/xml, text/xml, */*")
    .acceptEncodingHeader("gzip,deflate,sdch")
    .connection("keep-alive")
    .userAgentHeader("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.46 Safari/536.5")
    .requestInfoExtractor((request: Request) => { uuid })
    .responseInfoExtractor(response => Option(response.getHeader("Content-Length"))
    .getOrElse("0") :: List(response.getStatusCode.toString))

  val get_slow = exec(http("get_slow")
    .get("/slow"))
    .pause(15 seconds)

  val get_cacheable = exec(http("get_cacheable")
    .get("/plain_text.html"))
    .pause(15 seconds)

  val get_non_cacheable = exec(http("get_non_cacheable")
    .get("/non_cacheable")
    .check(regex("""Little Blind (\w+)""").saveAs("response_value"))
    .check(regex("""Little Blind Text""")))
    .pause(15 seconds)

  val post_slow = exec(http("post_slow")
    .post("/slow_post?id=${counter}"))
    .pause(15 seconds)

  val scn = scenario("Gatling 1 Benchmark")
    .during(duration, "counter") {
      randomSwitch(
        20 -> get_slow,
        40 -> get_cacheable,
        30 -> get_non_cacheable,
        10 -> post_slow)
    }

  setUp(scn.users(threads).ramp(rampup).protocolConfig(httpConf))
}

Feeders

A common requirement of test plans is to access static test data from another source. Whilst most people start out with CSV feeders this quickly becomes complex to manage for distributed load testing scenarios where you have many load generators (flood nodes) potentially needing access to a single source of data. Instead of carving up your test data into discrete chunks, we built a custom Redis backed solution which lets you effortlessly share data between flood nodes.

If you just want to access random data, which does not need to be unique per user, you can upload a standard CSV file to your grid and access it via our high performance test data API:

exec(http("get_test_data")
  .get("http://54.252.206.143:8080/SRANDMEMBER/postcodes?type=text")
  .check(regex("""(\d+)""").saveAs("postcode"))

Alternatively, if you’d like to use the Gatling provided Redis feeder you can also upload a standard CSV file to your grid, but make sure it is a Sorted list, the default queue strategy for Gatling. Each time the feed is called, the first record of the feeder is removed from the queue and injected into the session.

Then import the relevant package:

import com.excilys.ebi.gatling.redis.Predef.redisFeeder

Define a Redis client pool within your test plan:

val redisPool = new RedisClientPool("54.252.206.143", 6379)

You will then have access to the redis feeder within your scenario:

.feed(redisFeeder(redisPool, "random_words"))

Example Script with Redis Feeder

Putting it all together, your script should look like this:

import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import com.excilys.ebi.gatling.jdbc.Predef._
import com.excilys.ebi.gatling.http.Headers.Names._
import com.excilys.ebi.gatling.redis.Predef.redisFeeder
import akka.util.duration._
import bootstrap._
import com.redis._
import serialization._

class Redis extends Simulation {

  val threads = Integer.getInteger("threads", 2)
  val rampup = Integer.getInteger("rampup", 10).toLong
  val duration = Integer.getInteger("duration", 120).toLong

  val uuid = List(System.getProperty("uuid", "test"))

  val redisPool = new RedisClientPool("54.252.206.143", 6379)

  val httpConf = httpConfig
    .baseURL("http://s1.site-staging.flood.io:8000")
    .acceptHeader("text/javascript, text/html, application/xml, text/xml, */*")
    .acceptEncodingHeader("gzip,deflate,sdch")
    .connection("keep-alive")
    .userAgentHeader("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.46 Safari/536.5")
    .requestInfoExtractor((request: Request) => { uuid })
    .responseInfoExtractor(response => Option(response.getHeader("Content-Length"))
    .getOrElse("0") :: List(response.getStatusCode.toString))

  val scn = scenario("Gatling Redis Benchmark")
    .feed(redisFeeder(redisPool, "random_words"))
    .during(duration, "counter") {
      exec(http("get_slow")
        .get("/?postcode=${random_words}"))
        .pause(1 seconds)
    }

  setUp(scn.users(threads).ramp(rampup).protocolConfig(httpConf))
}

Announcing the first major pre-release version of the Gridinit JMeter DSL

Tim Koopmans:

If you’re tired of using the JMeter GUI or looking at hairy XML files then have a look at the solution available as a Ruby gem called “gridinit-jmeter”. Now at its first major pre-release version.

Originally posted on Gridinit:

For some time now, we’ve been beavering away on the first major version of our popular DSL for JMeter available as the Ruby gem gridinit-jmeter. In fact since we released the gem in November 2012 it’s been downloaded over 5000 times as well as people contributing to the code base with suggestions and improvements. We think that’s fantastic for the load testing community, thanks for the help!

That’s why we’re excited to announce the first major pre-release version for Gridinit-JMeter. You can preview the pre-release by installing:

gem install gridinit-jmeter --pre

Included in this release are some much needed syntax changes, better tests and more coverage of core JMeter functionality.

Syntax Changes

We’ve simplified options for DSL methods, by simply passing a hash of options as follows:

test do
  threads count: 10 do
    visit name: 'Google Search', url: 'http://google.com'
  end
end.run

Have a look at our growing list…


High Concurrency JMeter Tests

Tim Koopmans:

Some notes on concurrency and workload in general, and how to simulate that with JMeter and Gridinit.

Originally posted on Gridinit:

Concurrency, what is it? It can simply be defined as

The property of systems in which several computations are executing simultaneously, and potentially interacting with each other.

Concurrency is often used to define workload for load testing, as in concurrent users. Too often it’s the only input defined. In reality there are a number of factors which contribute to workload and affect concurrency itself. Some of these I will talk about in this post.

How to calculate target concurrent users?

A common method is to divide the number of unique users by the average visit duration for a given period.

For example, 12,000 unique visitors per hour / ( 60 minutes / 15 minutes per visit ) would equal 3,000 concurrent users.

Problems with this approach include:

  • Assumption that visitors are spread evenly across the 1 hour period sampled. Visitors are more likely to follow some form of Poisson process
