drio

Golang's performance (read fastq files)

October 21, 2012

I have been spending some time coding in golang. So far I really like the language and what it brings to the table.

I came across a very fast fasta/fastq parser written in C, and I wanted to see how golang compares against it in terms of performance. Note that the C version is heavily optimized: everything is done in macros, and in fact the library ships as a single header file.

I decided to use the first benchmark the readfq author used: parsing a fastq file with 100 million Illumina reads. In my first attempt, my Go code finished faster than Perl, Python and LuaJIT, but it was almost 3 times slower than C. I was happy to see it was faster than LuaJIT, which uses a very efficient JIT compiler.

0.03u 3.97s 41.34r 2176kB c
0.06u 8.69s 109.50r 2176kB go
0.03u 4.93s 131.96r 2192kB luajit
0.02u 2.97s 132.41r 2176kB python27
0.07u 9.89s 275.16r 2192kB perl

After some profiling (kudos to the golang team for releasing such fantastic profiling tools), I saw I was burning most of my CPU cycles in GC operations. I got great feedback from the golang community suggesting I should use slices of bytes instead of strings. That made sense: since strings are immutable, a new string has to be created for pretty much every operation. Things were looking better after those changes:

0.02u 4.06s 41.33r 2176kB c
0.02u 5.02s 58.55r 2192kB go

I pushed a little further by applying more suggestions from the community, all aimed at avoiding extra work for the GC. When reusing slices, I can reslice with my_slice[:0], which resets the length but keeps the memory already allocated for that slice, so nothing new has to be allocated. Also, when splitting slices, if I know I am only going to use the first few fields, I can use SplitN() so the split stops early and fewer slices of bytes are created.
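A rough sketch of both tricks, under my own assumptions: the header format ("@name description") and the helper readName are made up for illustration and are not taken from the final code.

```go
package main

import (
	"bytes"
	"fmt"
)

// readName (hypothetical) returns the read name from a header line such
// as "@r1 lane=1": the bytes after '@' up to the first space.
func readName(header []byte) []byte {
	if len(header) == 0 {
		return nil
	}
	// n=2 makes SplitN stop after the first separator, so the rest of
	// the line is left as one piece instead of being split further.
	parts := bytes.SplitN(header[1:], []byte(" "), 2)
	return parts[0]
}

func main() {
	// One buffer, allocated once and reused for every record.
	buf := make([]byte, 0, 64)
	headers := [][]byte{
		[]byte("@r1 lane=1 tile=7"),
		[]byte("@r2 lane=1 tile=8"),
	}
	for _, h := range headers {
		buf = buf[:0] // length back to 0, same backing array
		buf = append(buf, readName(h)...)
		fmt.Printf("%s cap=%d\n", buf, cap(buf))
	}
}
```

As long as the appended data fits in the existing capacity, append writes into the same backing array and the capacity stays put, so the loop does not allocate for buf.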

I had also used a closure to express the record iteration, but it was suggested that more idiomatic golang code would follow a more object-like approach. After refactoring, the results were spectacular:

0.04u 8.58s 40.03r 2192kB go
0.03u 4.19s 41.34r 2192kB c

The final golang code can be found here.