Intel Xeon 5500 Sequence Memory Latency

Memory latency and bandwidth are major performance factors. I once visualized the memory subsystems of my Xeon 5500 and their latency impact in an illustration.

images/xeon5500-latency.png

Managing various upstream Git Repositories

I track several Linux kernel development trees: Linus' tree, the perf tree, davem's net-next and lkvm. Additionally I added two historic Linux repositories: Thomas' post-2.4 tree and davej's pre-2.4 branch. In the early git days I used locally referenced clones; later I started to add remotes within one repository. To set up this environment the following commands may be helpful:

# create a vanilla git container
mkdir linux; cd linux
git init

# add remotes
git remote add --track torvalds git://
git remote add --track net-next git://
git remote add --track mingo git://
git remote add --track kvm-tool git://
# historic branches desired?
# git remote add --track post-2.4 git://
# git remote add --track pre-2.4 git://

# update all remotes
git remote update

git checkout --track -b origin-torvalds-master remotes/torvalds/master
git checkout --track -b origin-mingo-perf-core remotes/mingo/perf/core
git checkout --track -b origin-net-next-master remotes/net-next/master
git checkout --track -b origin-kvm-tool-master remotes/kvm-tool/master

Note the branch naming for the remotes: all names start with origin. This is my common prefix to distinguish remote tracking branches from local branches. Furthermore, I also have two hosting remotes under control (one as primary git master and github as a backup). To differentiate these two I prefix their branches with github and jauu.

The master branch is empty. This is ok; the one and only problem is that gitweb and friends will always present the master tree first.

It is also possible to connect the historic branches into one coherent whole, see the following command. But note: if you never or only seldom use the historic archive you may prefer a separate, referenced cloned repository! With the historic roots the repository is really large, which slows down day-to-day development. A separate repository, cloned via reference, is an optimal compromise: git operations stay fast in the normal repository, and the referenced clone does not consume much additional disk space because objects are shared rather than copied.

cat .git/info/grafts
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 e7e173af42dbf37b1d946f9ee00219cb3b2bea6a
7a2deb32924142696b8174cdf9b38cd72a11fc96 379a6be1eedb84ae0d476afbc4b4070383681178
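If you instead go for the separate repository approach mentioned above, a reference clone can be set up like this. This is only a sketch with made-up local paths; in practice the first clone would fetch the real historic repository:

```shell
# demo stand-in for the big historic repository (hypothetical content)
git init -q historic-repo
git -C historic-repo -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "historic root"

# one-time full clone of the historic tree
git clone -q historic-repo linux-historic

# day-to-day clone that borrows objects from the local historic clone
# instead of copying them
git clone -q --reference "$PWD/linux-historic" historic-repo linux-work

# the borrowed object store is recorded here:
cat linux-work/.git/objects/info/alternates
```

The alternates file is what makes the clone cheap: git looks up missing objects in the referenced repository before storing them itself.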

The following two aliases are really helpful to cope with remote branches:

lu = log ..@{upstream}
ft = merge --ff-only @{upstream}

You start by calling git remote update to update all remotes. Then git lu shows you all changes made upstream, and git ft finally integrates the tracking branch if a fast-forward is possible.
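The two aliases above can be installed from the command line as well (this writes them to your global ~/.gitconfig):

```shell
git config --global alias.lu 'log ..@{upstream}'
git config --global alias.ft 'merge --ff-only @{upstream}'
```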

If you're not sure about the current status of your branches and what they are tracking, simply use git branch -vv.
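A quick demo of what -vv reports, using a toy repository (all names here are made up for illustration):

```shell
# toy setup: a repository acting as "remote", then a clone with a tracking branch
git init -q -b master upstream-repo
git -C upstream-repo -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m init
git clone -q upstream-repo work
git -C work checkout -q --track -b origin-upstream-master origin/master

# each local branch is listed with the remote branch it tracks in brackets
git -C work branch -vv
```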

Add your own remotes where others can pull from and which serve as a backup service. Pushing local branches to github is easily done via: git push github torvalds-epoll-exclusive

git remote add github

If you are behind a proxy you can add a proxy setting:

export http_proxy='http://username:password@proxy.local:80'
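For HTTP(S) remotes, git can also pick up the proxy from its own configuration (credentials and proxy host are placeholders, as above); note that plain git:// URLs are not affected by an HTTP proxy:

```shell
git config --global http.proxy 'http://username:password@proxy.local:80'
```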

Captcp Gnuplot Beauty

Today, some ideas on how captcp-generated throughput graphs can be modified and improved.

We start with a pcap file (captured via tcpdump or wireshark). We create an output directory, called graphs, and finally start captcp to visualize one TCP flow. The number of the connection and flow (-f 45.2) comes from captcp's statistic module; I just picked one random flow within the capture file. The output of captcp is saved in the directory graphs. Note: because we add the -i option, captcp generates a Makefile and a gnuplot file, called throughput.gpi. So remember: remove the -i option to prevent an overwrite of these two files. The data file is always overwritten at each captcp call, of course:

@hal $ ls
@hal $ mkdir graphs
@hal $ captcp throughput -s 0.1 -i -f 45.2 -o graphs trace.pcap
@hal $ ls -l graphs
total 12
-rw-r--r-- 1 pfeifer pfeifer  611 Jan 22 00:17 Makefile
-rw-r--r-- 1 pfeifer pfeifer 3322 Jan 22 00:17
-rw-r--r-- 1 pfeifer pfeifer  766 Jan 22 01:06 throughput.gpi

If we call make in the directory, gnuplot is invoked and a PDF file (throughput.pdf) is generated. So why does captcp, by default and without any argument, generate a PDF file? Because the format is nice! It is a vector format (you can zoom into it), PDF files can be converted to raster graphics without any problems, and PDF files can be included in LaTeX documents as a vector format as well. Finally, the options to modify the image are much larger than with any other format: we can change the font, we can include Postscript math symbols and so on. Another possibility is SVG (scalable vector graphics), which Gnuplot also supports. The advantage is that SVG is supported out of the box by every web browser. The major disadvantage is the visual quality of Gnuplot-generated SVGs; I think those graphs look really deformed compared to PDF. But there is no hindrance to editing throughput.gpi and changing to SVG output or any other supported format. PDF is just the default format, no more.

To generate a PNG file (suitable as a web graphic or for M$ Word) you can execute make png. This will generate a PNG file from the PDF via epstopdf(1), so this program must be installed to generate PNG files. Why generate PNG files based on the PDF files, why not directly, you may ask? Because Gnuplot's PNG terminal is limited in functionality. If we always generate PDF files we have a single source (PDF) and just make modifications there. Here is a screenshot showing the directory structure and the current TCP throughput graph.


Open throughput.gpi to start editing the style, format and look of the graph. First we just draw the graph in red and with a thicker line. Maybe the new image will now look better on web pages. The standard version is probably more suitable for papers and publications.

set terminal postscript eps enhanced color "Times" 30
set output "throughput.eps"
set title "Throughput Graph"

set style line 99 linetype 1 linecolor rgb "#999999" lw 2
set key right bottom
set key box linestyle 99
set key spacing 1.2
set nokey

set grid xtics ytics mytics
set format y "%.0f"

set size 2
set size ratio 0.4

set ylabel "Data [byte]"
set xlabel "Time [seconds]"

set style line 1 lc rgb '#00004d' lt 1 lw 3

plot "" using 1:2 notitle with linespoints ls 1

The next image shows the result:


Increasing the font size, changing the paper ratio and drawing the line even thicker makes the graph suitable for presentations:


Another nice Gnuplot capability is data interpolation and approximation of data, called smoothing by Gnuplot. Gnuplot provides several algorithms: unique, csplines, acsplines, bezier, sbezier. The next image shows three algorithms in one illustration.
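Such a comparison can be sketched directly in the gpi file; "throughput.data" is a placeholder here for whatever data file captcp generated:

```gnuplot
plot "throughput.data" using 1:2 smooth csplines title "csplines", \
     "throughput.data" using 1:2 smooth bezier   title "bezier", \
     "throughput.data" using 1:2 smooth sbezier  title "sbezier"
```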

images/captcp-cmd6-small.png

CAPTCP - Throughput Graphs and Wireshark

Today I want to demonstrate how captcp outperforms wireshark for TCP flow analysis. This posting focuses on throughput analysis for a specific TCP flow. In a separate posting I will demonstrate how Gnuplot can be tweaked to generate a nice image, ready for web publishing or for your bachelor/master/phd thesis. The need for this posting arose from discussions with captcp users where I think the workflow with captcp and Gnuplot is not ideal. So take this posting as a best-practices tutorial. Sure, there will be other ways too.

If you did not download and install captcp yet, you can do it right now:

git clone
cd captcp
su -c 'make install'    # or: sudo make install, depending on your distribution
rehash                  # only if you use zsh

So first we start to capture a TCP flow. As I mentioned in the captcp documentation, it is always a good idea to capture network data and do an offline analysis of the data! And please don't forget to disable the offload capabilities of your network adapter! Read the section "What you see is NOT what you get" on the captcp homepage:

tcpdump -i eth5 -w trace.pcap &
pid=$!
# ... download a large file ...
kill $pid

This captures all data on device eth5 (please use your device) and saves the packet trace in trace.pcap. Then we download a large file (say 10 MByte) and afterwards just stop the capturing process. trace.pcap should now contain at least one complete TCP transfer, including three-way handshake, file transfer and termination. If any other network processes are running, like a browser or a cloud music player, you will capture those streams too.

Wireshark Throughput Analysis

We start with the wireshark analysis. We open wireshark directly with the trace file. My packet capture file contains many different connections, 47 to be exact. Wireshark can show information about every TCP connection via Statistics -> Conversation List -> TCP (IPv4 & IPv6). The following screenshot shows this:


The problem is that we want to limit the throughput graph to exactly one connection. The standard Statistics -> IO Graph is not usable because it will always show the IO impact of the whole capture. But with a trick we can bypass this by saving the particular TCP stream to another dumpfile. Via right click on one packet of the TCP stream, followed by Conversation Filter -> TCP, we can limit the wireshark view to only this TCP stream. Something like the following filter should be displayed in the filter box:

(ip.addr eq and ip.addr eq and
(tcp.port eq 80 and tcp.port eq 55173)

The next step is to save only the displayed packets: File -> Specified Packets and Save. A dialog informs you about the number of captured and displayed packets; the number of displayed packets should match your expectations. Save this file as stream.pcap and open this file now. You should only see the particular packets; if not, something went wrong and you should check the filter.

Now you can open Statistics -> IO Graph and the image and the control elements should look like the following:


I set the tick interval to 0.1 seconds and increased the pixels per tick to 5 to display more details. Furthermore, I changed the unit from Packets/s to Bytes/s.

You can save this image as PNG and voilà, you can integrate the file in your thesis and that's all. But you cannot modify the image afterwards (except with Gimp). There is also no possibility to add an axis label or to store the image in a vector format (PDF or SVG).

In the next section I show how captcp can display the throughput of a particular TCP connection.

Captcp Throughput Analysis

To identify the TCP stream we start the statistic module of captcp:

captcp statistic trace.pcap

This lists all TCP streams in the file. We search the list for the specific connection and remember the particular stream:


For trace.pcap, flow 45.2 is the right one! Now we can start the throughput module and graph the data. First we generate the raw data and in a second step we start Gnuplot (via make) to generate the PDF file. To keep the workspace tidy we create a temporary directory for all intermediate files. We name this directory simply "out".

mkdir out
captcp throughput -s 0.1 -i -f 45.2 -o out trace.pcap
cd out
make png
# view throughput.png

The generated PNG image look like the following:


The arguments to captcp are: -s specifies the sampling interval, here 0.1 seconds. -i means initial and generates a Gnuplot environment to build the image from the raw data file. You MUST skip this flag if you modify the Gnuplot file (throughput.gpi); if not, the gnuplot file is overwritten each time you execute captcp. The -f flag specifies the flow and -o the output directory where all files are generated.

I hope this helps to print some nice throughput graphs! In a later posting I will show how you can modify the gnuplot file and start with some Gnuplot magic! ;-)


Born to Code in C

Back in the days, with black monitors, leather jackets, red phones, 5 1⁄4" floppy disks, ... I like it!

images/born-c-1.jpg images/born-c-2.jpg

via Amazon


Differential Profiling

Just for the statistics guys reading my blog: Jiri Olsa posted a patch today to extend perf diff for differential profiling. Perf diff now supports three diff computations:

  • delta: the current default one
  • ratio: ratio differential profile
  • wdiff: weighted differential profile

The idea goes back to Paul E. McKenney and his paper Differential Profiling. Nice reading!


Once Again - Encryption and Authentication

For a new MANET protocol I started to verify the security concept. Not from a security point of view, rather from a protocol feasibility point of view. The security concept is based on existing security protocols. I just removed dynamic components like key exchange and the like (to be correct: dynamic aspects are done by special messages and are not that tightly linked with the protocol). The result should be a very lean TLV, containing only the required security data without padding issues or "reserved" bits. The default encryption algorithm is still AES128 and authentication (HMAC) is done via SHA-512 (if used separately). So nothing special here. By the way: it is a UDP based MANET protocol; IPSec and OpenSSL are not suitable, DTLS is also not suitable.

When I started to implement the security relevant bits I was a little bit annoyed by OpenSSL. OpenSSL has a usable API and the provided functionality is large. But that is the problem: OpenSSL is just a big monster of code and functionality. Even if you have a reduced demand for functionality, like en/decryption and checksumming, you have to link the whole OpenSSL library into your program (library).

Some days ago I stumbled across NaCl (pronounced "salt"), a lightweight encryption/decryption/checksumming library. Focus: speed and security. I like the API and started to use NaCl for the project. Another nice argument: there are packages for Debian and Ubuntu available in the official repositories!

The next listing shows the API and how to use a combination of Salsa20 (encryption) and Poly1305 (authentication).

#include "crypto_secretbox.h"

const unsigned char k[crypto_secretbox_KEYBYTES];
const unsigned char n[crypto_secretbox_NONCEBYTES];
const unsigned char m[...]; unsigned long long mlen;
unsigned char c[...]; unsigned long long clen;

crypto_secretbox(c, m, mlen, n, k);

Safe Arrays and Pointers for C

John Nagle (no, not the TCP Nagle ;-) proposed an extension to C that keeps array size information across the call context to do bounds checks. His proposal "Safe Arrays and Pointers for C through compatible additions to the language" is a nice read. On the GCC mailing list the responses are mixed: more questions and doubts whether such an extension will and can be adopted by the users.


Vacation in Vallis/Switzerland

In the 32nd week I am on vacation in Switzerland, in the mountains of Vallis. The picture was taken in Lötschental and the lake is called Schwarzsee, only a few meters from our accommodation.


The photo was taken by daoro


x86 Floating Point

x86 floating point was and still is a source of problems. First of all, the FPU (floating point coprocessor, x87) has 8 registers. Via gdb and info all-registers you can display the floating point registers. These registers are all 80 bit wide. And here the problems start: standardized floating point math is 32 or 64 bit wide (float or double). If you compile the next construct with -m32 on a 32 bit and on a 64 bit arch, the results will differ:

#include <iostream>
using namespace std;

int main()
{
        double val = 52.30582;
        double d = 3600.0 * 1000.0 * val;
        long l = long( d );
        long l2 = long( ( 3600.0 * 1000.0 * val ) );
        long l3 = (long)( 3600.0 * 1000.0 * val );
        long l4 = long( 3600.0 * 1000.0 * val );

        cout.precision( 20 );
        cout << "Original value    : " << val << endl;
        cout << "Double with mult  : " << d << endl;
        cout << "Casted to long v1 : " << l << endl;
        cout << "Casted to long v2 : " << l2 << endl;
        cout << "Casted to long v3 : " << l3 << endl;
        cout << "Casted to long v4 : " << l4 << endl;

        return 0;
}

So as I said, on both archs the results will differ! If you enable GCC optimization (e.g. -O3) the results may not differ! Huh? The problem, as I wrote in the first paragraph, is that the x87 floating point unit uses 80 bit wide registers. If you enable optimization, GCC will (can) use the MMX/SSE unit, which in turn uses 64 bit wide registers. That is a known problem, especially in numerically sensitive environments, which therefore often disable the x87 floating point unit. You can enforce standard-conforming arithmetic with -fexcess-precision=std / -ffloat-store. Or even better (because faster): -mfpmath=sse -msse2.

The background on why the 80 bit registers were introduced by Intel can be read in


If Ping was Designed by the Git People

net-ping host --no=dns,bind --proto=TCP,rfc:492 eth0@ipv4:: -ADDR.ARPA --stats -v




TCPM Agenda for IETF-84 Vancouver Meeting

Michael published the agenda for the TCPM Vancouver meeting and there are a lot of items on the list this time. The following list is sorted by personal interest:

  • Proportional Rate Reduction for TCP
  • TCP Fast Open
  • Increasing TCP's Initial Window
  • Impact of IW10 on Interactive Real-Time Communication
  • Additional negotiation in the TCP Timestamp Option field during the TCP handshake
  • TCP and SCTP RTO Restart
  • More Accurate ECN Feedback in TCP
  • TCP Loss Probe (TLP): An Algorithm for Fast Recovery of Tail Losses
  • Evaluating TCP Laminar
  • Highly Efficient Selective Acknowledgement (SACK) for TCP
  • Shared Use of Experimental TCP Options
  • A TCP Authentication Option NAT Extension
  • RFC 1323bis
  • HOST_ID TCP Options: Implementation and Preliminary Test Results
  • Shared Memory Communications over RDMA
  • Processing of IP Security/Compartment and Precedence Information by TCP
  • Processing of TCP segments with Mirrored End-points

I am especially interested in Ilpo's Interactive Real-Time Communication IW10 talk.

BTW: 9am PDT (UTC-7) will be 18:00:00 CEST (UTC+2)


Compiler Block Reordering and Memory Layout Optimization

GCC with -freorder-blocks enabled and an optimization level larger than 1 will reorder instructions at the basic block level. This optimization mainly compacts correlated code to provide an optimized, cache-aware memory layout. Because of some Linux kernel hacking I was forced to get the details of when and where GCC's optimization kicks in. The most effective way for userland programs without branch-taken knowledge is nowadays profile guided optimization. But this is not possible in every setup (lack of realistic input data, ...). Another way is GCC's __builtin_expect statement, but this requires exact knowledge of data paths and realistic input data. The next two snippets show how GCC reorders a simple if-else statement (just for demonstration):

4004c0: 83 ff 01              cmp    $0x1,%edi
4004c3: 7f 04                 jg     4004c9 <process+0x9>
4004c5: 8d 47 01              lea    0x1(%rdi),%eax
4004c8: c3                    retq
4004c9: 8d 47 02              lea    0x2(%rdi),%eax
4004cc: c3                    retq

4004c0: 83 ff 01              cmp    $0x1,%edi
4004c3: 7e 04                 jle    4004c9 <process+0x9>
4004c5: 8d 47 02              lea    0x2(%rdi),%eax
4004c8: c3                    retq
4004c9: 8d 47 01              lea    0x1(%rdi),%eax
4004cc: c3                    retq
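A plausible source for the two listings is a function like the following. This is a reconstruction for illustration, not the original code; only the branch structure and the +1/+2 results are taken from the disassembly:

```c
/* reconstructed from the disassembly above: one branch returns x + 1,
 * the other x + 2; block reordering swaps which branch falls through */
int process(int x)
{
        if (x > 1)
                return x + 2;
        return x + 1;
}
```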

The fascinating question is what the limits of this optimization are. What are the influences of nested statements? Are there thresholds where blocks are not reordered? I just prepared a code generation script to analyze this characteristic. Now I need to do some tests and write an analyzer script too.


3.5 Released

Just a few bugfixes went in since -rc7, so here we go: Linux 3.5 released. The changeset of the upcoming 3.6 should be smaller because of the vacation period.


IETF I-D and RFC News

Vancouver is near, so there was a lot of I-D submission progress. The I-D cutoff is over, so I-D submission settles down. Subsequently, a list of some interesting I-Ds (this list is sorted):

From a TCP perspective not that much. Nice that PCN is in now, plus some other minor pearls, but that's all. Anyway: a lot of progress in other areas. So here is the list of new RFCs:


Epson WP 4535 - Multi-Function Inkjet and Linux

Today I bought a new printer: an Epson WP 4535. The printer can scan up to 600 DPI to a USB device (as JPEG or PDF). A nice feature, because I don't want to install sane or any other scanner software. Printing works out of the box via CUPS. No additional driver installation required. My room mate tried to test the printer with Windows, which failed: under Windows you must download the current driver from the Epson homepage. No big deal, but with Linux everything works out of the box! The printer is connected to the home network via WiFi, so there is just one cable. We are currently debating where to place the printer.


Further, you can copy documents and send or receive faxes. So the first impressions are quite positive!

Here is a more detailed review for the printer: WP 4535 CNET review


Cscope and Code Navigation

Cscope is a crucial part of my development tool zoo. Not for small projects, but for larger projects like Linux kernel development I use cscope really often. But as usual, I only use a small subset of all cscope features. This normally boils down to Ctrl-] to jump to the declaration of a function and later Ctrl-t to jump back. With Cscope this acts like gD (goto declaration), but does not stop at file boundaries.

But Cscope has more features! Cscope supports several programming languages and can find all calls to a function, or all functions called by the function under the cursor. So here is a list of all features, copied from the cscope vim plugin. You can call any function by typing Ctrl-\ followed by one of the following characters:

's'   symbol: find all references to the token under cursor
'g'   global: find global definition(s) of the token under cursor
'c'   calls:  find all calls to the function name under cursor
't'   text:   find all instances of the text under cursor
'e'   egrep:  egrep search for the word under cursor
'f'   file:   open the filename under cursor
'i'   includes: find files that include the filename under cursor
'd'   called: find functions that function under cursor calls

CAPTCP News - TCP Throughput Module

The development of captcp has flattened out: no more features were implemented in the last months. Today I merged a branch from Gábor Molnár to print TCP stream throughputs separately. The new option can be enabled via -x. Many thanks Gábor!

This time I used the github merge feature for the first time. I rarely use github features. For me it was (and still is) a public git hoster. But github features like the included issue tracking and the whole collaboration thing are great!

Sourceforge was fine some years ago, but they missed adjusting their tooling. And this for the second time! Years back I used sourceforge daily, at that time with CVS. With the advent of Subversion, sourceforge missed supporting subversion too; they focused on CVS. I switched from sourceforge to another hoster (which was a sourceforge clone, but with subversion support). And now they missed reorganizing their model into a more distributed, fast-dev, easy-commit style model. Sad, really sad! But hey, this is how innovation works.

I have some more TODOs on my captcp list, but they are currently not high priority. If you have some ideas or (better) patches: you are welcome! ;-)


IETF I-D and RFC News - I

I just decided to summarize IETF relevant news, like new I-Ds or RFCs, in a weekly blog posting. The summary will not cover all areas, nor will it list the most prominent developments. I focus on the topics I am most interested in. This mainly includes the following areas: Applications, Internet, Operations and Management, Routing and of course Transport. But occasionally I will post I-Ds or new RFCs from other areas as well. So here we go:

New I-D's

New RFC's:


Quit User Tracking

This blog used two third party extensions:

  • Google Analytics
  • Disqus

Both set cookies and have a user coverage which is horrifying. Nowadays Google Analytics, together with Google Search, Android activity and Google Ads, has a database of ... in the end they know too much about you.

I decided today that web analysis data is not crucial for me compared to the privacy of my readers. So I removed all third party content from my blog.


By the way: I suggest two Google Chrome extensions:

  • Google Analytics Opt-out Add-on (by Google), Tells the Google Analytics JavaScript (ga.js) not to send information to Google
  • Ghostery, Protect your privacy. See who's tracking your web browsing with Ghostery.