HTTP benchmarking using wrk. Parsing output to CSV or JSON using Python

Maarten Smeets

wrk is a modern HTTP benchmarking tool. Using a simple CLI you can put load on HTTP services and determine latency, response times and the number of successfully processed requests. It has a LuaJIT scripting interface which provides extensibility. A distinguishing feature of wrk compared to, for example, ab (Apache Bench) is that it requires far less CPU at higher concurrency (it uses threads very efficiently). It does have fewer CLI features than ab, so you need to do scripting to achieve specific functionality. You also need to compile wrk yourself since no binaries are provided, which might be a barrier to people who are not used to compiling code.

Parsing the wrk output is a challenge. It would be nice to have a feature to output the results in consistent units as a CSV or JSON file. More people have asked for this, and the answer was: do some LuaJIT scripting to achieve that. Since I’m no Lua expert and, to be honest, don’t have anyone in my vicinity who is, I decided to parse the output using Python (my favorite language for data processing and visualization) and provide you with the code so you don’t have to repeat this exercise.

You can see example Python code of this here.

wrk output

See for example the following output of running wrk against a custom API:

Command: wrk --timeout 20s -d20s -c65536 -t5 http://localhost:8080/people

Running 20s test @ http://localhost:8080/people
  5 threads and 65536 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   877.63ms  645.81ms   4.68s    69.42%
    Req/Sec   270.73    164.19     1.72k    72.01%
  21424 requests in 20.10s, 10.00MB read
  Socket errors: connect 64519, read 0, write 0, timeout 0
  Non-2xx or 3xx responses: 12271
Requests/sec:   1065.82
Transfer/sec:    509.59KB

If you want to obtain data which can be used for analysis, it helps if the results are in the same units. This is not the case with wrk. For example, the Req/Sec field contains the average 270.73 but the max 1.72k. If you want to have both of them in the same units, 1.72k needs to be multiplied by 1000. The same applies to the Latency, where the average is 877.63ms and the max 4.68s. For really short durations, it can even go to us (microseconds). Between us, ms and s the factor is also 1000, but when the latency increases, the unit can go to minutes and hours, for which the factor is 60. The Transfer/sec amount is in bytes for small amounts but can go to KB, MB, GB, etc. The factor to be used here is 1024. For my analyses I wanted all durations to be in milliseconds, all amounts of data in bytes and all counts as absolute numbers without suffix.

Looking at the wrk source here I found the cause of this challenge: a C file describing the units. This source is useful input for parsing since it indicates which units can occur.

  • For numbers: base, k, M, G, T, P
  • For amounts of data: K, M, G, T, P
  • For durations: us, ms, s, m, h

The lines in the wrk output are more or less structured, so they are suitable for extracting the numbers and suffixes with regular expressions.

Obtaining numbers and suffixes

I used regular expressions to obtain the numbers and suffixes and output them as a dict, a Python datatype for an associative array.

Regular expressions are far easier to write than to read. I parse every line and check whether it is one of the following:

Running 20s test @ http://localhost:8080/people
  5 threads and 65536 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   877.63ms  645.81ms   4.68s    69.42%
    Req/Sec   270.73    164.19     1.72k    72.01%
  21424 requests in 20.10s, 10.00MB read
  Socket errors: connect 64519, read 0, write 0, timeout 0
  Non-2xx or 3xx responses: 12271
Requests/sec:   1065.82
Transfer/sec:    509.59KB

These lines contain the most relevant data. Once I have the line, I try to be as flexible as possible in obtaining the numbers. For example, the decimal separator can be there or not and a suffix can be there or not. Thus, for an optional suffix I use \w*. The suffix can be absent, and if it is there, it can be multiple characters, like for example ms or MB. The case of the suffix might also differ. Some values are not always present in the wrk output, such as the Socket errors line. For these I fill in some defaults (0 for no errors).
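
As an illustration of this approach, here is a minimal sketch. It is not the actual parser from the example code: the patterns and the name parse_lines are assumptions made for this example. It pulls the raw number-plus-suffix strings from three of the line types and keeps defaults for the optional Socket errors line:

```python
import re

# One value: digits, optional decimal part, optional suffix such as ms, s, k or MB.
VALUE = r"(\d+\.?\d*\w*)"

# Hypothetical patterns for three of the wrk output lines.
LATENCY = re.compile(r"^\s*Latency\s+" + VALUE + r"\s+" + VALUE + r"\s+" + VALUE)
TOTALS = re.compile(r"^\s*(\d+) requests in " + VALUE + r", " + VALUE + r" read")
ERRORS = re.compile(
    r"^\s*Socket errors: connect (\d+), read (\d+), write (\d+), timeout (\d+)")


def parse_lines(output):
    # Defaults for the Socket errors line, which wrk only prints on errors.
    result = {'err_connect': '0', 'err_read': '0',
              'err_write': '0', 'err_timeout': '0'}
    for line in output.splitlines():
        m = LATENCY.search(line)
        if m:
            result['lat_avg'], result['lat_stdev'], result['lat_max'] = m.groups()
            continue
        m = TOTALS.search(line)
        if m:
            result['tot_requests'], result['tot_duration'], result['read'] = m.groups()
            continue
        m = ERRORS.search(line)
        if m:
            (result['err_connect'], result['err_read'],
             result['err_write'], result['err_timeout']) = m.groups()
    return result
```

The values are still strings with their suffixes at this point (for example '4.68s'); converting them to common units is a separate step.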

Of course when executing the command, I have information available like the number of threads, connections, duration and information about what I am testing. 

The output of the function is something like:

{'lat_avg': 877.63, 'lat_stdev': 645.81, 'lat_max': 4680, 'req_avg': 270.73, 'req_stdev': 164.19, 'req_max': 1720, 'tot_requests': 21424, 'tot_duration': 20100.0, 'read': 10485760, 'err_connect': 64519.0, 'err_read': 0.0, 'err_write': 0.0, 'err_timeout': 0.0, 'req_sec_tot': 1065.82, 'read_tot': 521820.16}

Normalizing data

The next parsing challenge is normalizing the data. For this I created three functions, one for each type of data: get_ms for durations, get_bytes for amounts of data and get_number for counts.

For example, the function to get a number (see the example code for the other functions here):

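import re


def get_number(number_str):
    # Split the input into a float part and an optional suffix, e.g. '1.72k'.
    x = re.search(r"^(\d+\.*\d*)(\w*)$", number_str)
    if x is not None:
        size = float(x.group(1))
        suffix = (x.group(2)).lower()
    else:
        # Not a number with optional suffix: return the input unchanged.
        return number_str

    if suffix == 'k':
        return size * 1000
    elif suffix == 'm':
        return size * 1000 ** 2
    elif suffix == 'g':
        return size * 1000 ** 3
    elif suffix == 't':
        return size * 1000 ** 4
    elif suffix == 'p':
        return size * 1000 ** 5
    else:
        # No (known) suffix: the number is already an absolute value.
        return size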

As you can see, this function also requires some flexibility. It is called with a string which is a float plus an optional suffix. I use a similar tactic as with parsing the wrk output lines: first I apply a regular expression to the input, then I apply the calculation relevant to the specific suffix. If there is no suffix, the number itself is returned.
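
The other two functions are not shown in this post. The following is a sketch of how they could look under the same tactic, not the code from the example script. The helper split_value and the lookup tables are assumptions, and the byte suffixes are matched with or without a trailing B:

```python
import re


def split_value(value_str):
    # Split e.g. '4.68s' into (4.68, 's'); None if the string does not match.
    m = re.search(r"^(\d+\.*\d*)(\w*)$", value_str)
    if m is None:
        return None
    return float(m.group(1)), m.group(2).lower()


# Factors to milliseconds: steps of 1000 up to seconds, then steps of 60.
MS_FACTORS = {'us': 0.001, 'ms': 1, 's': 1000,
              'm': 60 * 1000, 'h': 60 * 60 * 1000}

# Factors to bytes: steps of 1024.
BYTE_FACTORS = {'': 1, 'b': 1,
                'k': 1024, 'kb': 1024,
                'm': 1024 ** 2, 'mb': 1024 ** 2,
                'g': 1024 ** 3, 'gb': 1024 ** 3,
                't': 1024 ** 4, 'tb': 1024 ** 4,
                'p': 1024 ** 5, 'pb': 1024 ** 5}


def get_ms(time_str):
    parsed = split_value(time_str)
    if parsed is None:
        return time_str  # unparseable input is returned unchanged
    size, suffix = parsed
    # An unknown suffix leaves the value unscaled.
    return size * MS_FACTORS.get(suffix, 1)


def get_bytes(size_str):
    parsed = split_value(size_str)
    if parsed is None:
        return size_str  # unparseable input is returned unchanged
    size, suffix = parsed
    # An unknown suffix leaves the value unscaled.
    return size * BYTE_FACTORS.get(suffix, 1)
```

Note that the suffix m means minutes for a duration but mega for an amount of data, which is one reason to keep a separate function per type of data.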

Creating a CSV line

When you have a Python dict, it is relatively easy to make a CSV line from it. I created a small function to do this for me:

def wrk_data(wrk_output):
    return str(wrk_output.get('lat_avg')) + ',' + str(wrk_output.get('lat_stdev')) + ',' + str(wrk_output.get(
        'lat_max')) + ',' + str(wrk_output.get('req_avg')) + ',' + str(wrk_output.get('req_stdev')) + ',' + str(
        wrk_output.get('req_max')) + ',' + str(wrk_output.get('tot_requests')) + ',' + str(
        wrk_output.get('tot_duration')) + ',' + str(wrk_output.get(
        'read')) + ',' + str(wrk_output.get('err_connect')) + ',' + str(wrk_output.get('err_read')) + ',' + str(
        wrk_output.get('err_write')) + ',' + str(wrk_output.get('err_timeout')) + ',' + str(wrk_output.get('req_sec_tot')) + ',' + str(wrk_output.get('read_tot'))

It is just a single return statement concatenating the values. This does have the liability though that if certain values cannot be found, wrk_output.get returns None and the text None ends up in the CSV line. The function expects all the data to be there or to have default values. Luckily this should always be the case.
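
An alternative worth considering is to keep the column order in a list and join the values. The sketch below is an assumption for illustration, not part of the example script: the names FIELDS, wrk_csv_header and wrk_csv_line are made up here. It also produces a matching header line and leaves a field empty when a key is missing:

```python
# Column order, matching the order used in wrk_data.
FIELDS = ['lat_avg', 'lat_stdev', 'lat_max',
          'req_avg', 'req_stdev', 'req_max',
          'tot_requests', 'tot_duration', 'read',
          'err_connect', 'err_read', 'err_write', 'err_timeout',
          'req_sec_tot', 'read_tot']


def wrk_csv_header():
    # Header line with the same column order as the data lines.
    return ','.join(FIELDS)


def wrk_csv_line(wrk_output):
    # A missing key becomes an empty field rather than the text 'None'.
    return ','.join(str(wrk_output.get(key, '')) for key in FIELDS)
```

Since all values are numbers, no quoting or escaping is needed; for free-text columns the csv module from the standard library would be the safer choice.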

Running the example

First download the wrk sources here (git clone) and compile them by executing the ‘make’ command in the cloned repository. Most *NIX systems should have make already installed. I tried it on a pretty bare Ubuntu installation and did not need to install additional dependencies to get this to work.

You can obtain my Python code here and can execute it using python3. First of course update the wrk command path line at the top of the script.

The actual processing is done in the main function:

def main():
    print("****wrk output: \n\n")
    wrk_output = execute_wrk(1, 2, 100, 5, 10, '')
    print(str(wrk_output) + "\n\n")
    print("****wrk output dict: \n\n")
    wrk_output_dict = parse_wrk_output(wrk_output)
    print(str(wrk_output_dict) + "\n\n")
    print("****wrk output csv line: \n\n")
    wrk_output_csv = wrk_data(wrk_output_dict)
    print(str(wrk_output_csv))

execute_wrk(1, 2, 100, 5, 10, '') executes wrk and captures the output. You can of course adjust the parameters to your use-case (don’t make Google do too much work for your tests!). The parameters are the following:

  • cpuset
    Which CPU core to use. Example value 1 
  • threads
    The number of threads wrk should use. Should be greater than the number of cores. Example value 2
  • concurrency
    The number of concurrent requests to keep running. Example value 100.
  • duration
    The duration of the test. Example value 5 means 5 seconds.
  • timeout
    How long is a request allowed to take. Example value 10 means 10 seconds
  • url
    The URL to call. Example value
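
How execute_wrk turns these parameters into a command is not shown in this post. The following is a minimal sketch, assuming the process is pinned to a core with taskset and started through subprocess. The wrk path is a placeholder and the helper build_wrk_command is a name made up for this example; the wrk flags are the ones used in the command shown earlier:

```python
import subprocess

# Placeholder: update this to the location of your compiled wrk binary.
WRK_CMD = '/path/to/wrk'


def build_wrk_command(cpuset, threads, concurrency, duration, timeout, url):
    # taskset -c pins the process to the given CPU core (Linux only).
    return ['taskset', '-c', str(cpuset), WRK_CMD,
            '--timeout', str(timeout) + 's',
            '-d' + str(duration) + 's',
            '-c' + str(concurrency),
            '-t' + str(threads),
            url]


def execute_wrk(cpuset, threads, concurrency, duration, timeout, url):
    cmd = build_wrk_command(cpuset, threads, concurrency, duration, timeout, url)
    # Capture stdout and stderr together as text.
    completed = subprocess.run(cmd, stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT,
                               universal_newlines=True)
    return completed.stdout
```

Passing the command as a list avoids shell quoting issues with the URL.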

The captured output is printed first:

****wrk output:

Running 5s test @
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   576.71ms  612.14ms   4.57s    88.82%
    Req/Sec    89.09     41.48   171.00     66.67%
  869 requests in 5.03s, 45.55MB read
Requests/sec:    172.61
Transfer/sec:      9.05MB 

wrk_output_dict is the result of parsing the output to a dictionary using parse_wrk_output (the order of the fields carries no meaning). This calls get_number and the other functions to normalize the values. Printing the dict gives you a string which is JSON, except that the ' characters should be ".

****wrk output dict:

{'lat_avg': 576.71, 'lat_stdev': 612.14, 'lat_max': 4570.0, 'req_avg': 89.09, 'req_stdev': 41.48, 'req_max': 171.0, 'tot_requests': 869.0, 'tot_duration': 5030.0, 'read': 47762636.8, 'req_sec_tot': 172.61, 'read_tot': 9489612.8, 'err_connect': 0, 'err_read': 0, 'err_write': 0, 'err_timeout': 0}
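
Rather than replacing the quote characters by hand, the json module from the standard library can serialize the dict. A short sketch using a few of the values above:

```python
import json

# A subset of the parsed values shown above.
wrk_output_dict = {'lat_avg': 576.71, 'lat_stdev': 612.14,
                   'lat_max': 4570.0, 'err_connect': 0}

# json.dumps emits double-quoted keys, as JSON requires.
wrk_output_json = json.dumps(wrk_output_dict)
print(wrk_output_json)
```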

wrk_output_csv is the CSV output line. The complete output can look like:

****wrk output csv line:

576.71,612.14,4570.0,89.09,41.48,171.0,869.0,5030.0,47762636.8,0,0,0,0,172.61,9489612.8

