My previous blog post, comparing a simple character counting algorithm between python and D, created an interesting discussion on Google+, with many tuning tips for the D version, and finally we also had a C++ version (kindly provided by Harald Achitz).
Then I could not resist to try out something I've been thinking about for a while (comparing D with Free Pascal (FPC)), so I went away and jotted down an FPC version as well (my first FPC code ever, I think).
Anyway, I now just wanted to summarize a comparison between all of these! I have added below the python code, the faster of the two D versions in my previous post, as well as the new additions: Harald's C++ version and my FPC version.
Python | D (DMD) | D (GDC) | D (LDC) | C++ | FPC |
---|---|---|---|---|---|
7.904s | 0.652s | 0.668s | 0.538s | 0.399s | 0.629s |
Numbers are execution time in seconds, for counting the fraction of "G" and "C" characters compared to "A", "T", "C" and "T" in a 58 MB text file of nearly 1 million lines of text. And so, why not plot it as well:
Not surprising maybe that C++ is the fastest. I find it interesting though that both D and FPC perform in the same ballpark at least, with code written by a novice user (although with some minor tweaking).
For me personally, I also got a first answer on the question I've been wondering about for a while: Will D or FreePascal be faster, given that Pascal is an older and more mature language, and doesn't use a Garbage collector, while D on the other hand have better compiler support, which it can benefit from (like LLVM).
Since I've been betting the most on D lately, I find it interesting to see that still, D can perform better, at least with the right compiler. Still I'm impressed that FPC is totally on par with D, and beaten only when D is compiled with the LLVM backed LDC.
On another note though, I tend to like the compactness of the D code. It's not really that far from the python version, if omitting the closing curly brackets, no? Not bad for a statically compiled high performance language! :) (That is, while FreePascal has it's merits from using mostly natural language words, over quirky symbols, which IMO can enhance readability a bit).
import re import string def main(): file = open("Homo_sapiens.GRCh37.67.dna_rm.chromosome.Y.fa","r") gcCount = 0 totalBaseCount = 0 for line in file: line = line.strip("\n") if not line.startswith(">"): gcCount += len(re.findall("[GC]", line)) totalBaseCount += len(re.findall("[GCTA]", line)) gcFraction = float(gcCount) / totalBaseCount print( gcFraction * 100 ) if __name__ == '__main__': main()
import std.stdio; import std.string; import std.algorithm; import std.regex; void main() { File file = File("Homo_sapiens.GRCh37.67.dna_rm.chromosome.Y.fa","r"); int gcCount = 0; int totalBaseCount = 0; string line; char c; while (!file.eof()) { line = file.readln(); if (line.length > 0 && !startsWith(line, ">")) { for(uint i = 0; i < line.length; i++) { c = line[i]; if (c == 'C' || c == 'T' || c == 'G' || c == 'A') { totalBaseCount++; if ( c == 'G' || c == 'C' ) { gcCount++; } } } } } float gcFraction = ( cast(float)gcCount / cast(float)totalBaseCount ); writeln( gcFraction * 100 ); }
(Kindly shared by Harald Achitz!)
#include <string> #include <fstream> #include <iostream> int main() { std::fstream fs("Homo_sapiens.GRCh37.67.dna_rm.chromosome.Y.fa", std::ios::in); int gcCount = 0; int totalBaseCount = 0; if ( fs.is_open() ) { std::string line; while (std::getline(fs, line)) { if(*(line.begin()) != '>') { for(std::string::const_iterator pos = line.begin(); pos!=line.end() ; ++pos){ switch (*pos){ case 'A' : totalBaseCount++; break; case 'C' : gcCount++; totalBaseCount++; break; case 'G' : gcCount++; totalBaseCount++; break; case 'T' : totalBaseCount++; break; default: break; } } } } //std::cout << "gcCount: " << gcCount << " / totalBaseCount: "<< totalBaseCount << " = " ; std::cout << ( (float)gcCount / (float)totalBaseCount ) * 100 << std::endl ; } else { std::cout << "can't open file" ; } return 0 ; }
program gc; {$mode objfpc} // Do not forget this ever uses Sysutils; var FastaFile: TextFile; CurrentLine: String; GCCount: Integer; TotalBaseCount: Integer; i: Integer; c: Char; GCFraction: Single; begin GCCount := 0; TotalBaseCount := 0; AssignFile(FastaFile, 'Homo_sapiens.GRCh37.67.dna_rm.chromosome.Y.fa'); {$I+} //use exceptions try Reset(FastaFile); repeat Readln(FastaFile, CurrentLine); i := 0; while i < Length(CurrentLine) do begin c := CurrentLine[i]; if (c = 'A') or (c = 'G') or (c = 'C') or (c = 'T') then begin TotalBaseCount := TotalBaseCount + 1; if (c = 'G') or (c = 'C') then GCCount := GCCount + 1; end; i := i + 1; end; until(EOF(FastaFile)); CloseFile(FastaFile); except on E: EInOutError do begin Writeln('File handling error occurred. Details: '+E.ClassName+'/'+E.Message); end; end; GCFraction := GCCount / TotalBaseCount; Writeln(FormatFloat('00.0000', GCFraction * 100)); end.
I used the following compile commands / flags for the test above (from the Makefile):
d: dmd -ofgc_d_dmd -O -release -inline gc.d gdmd -ofgc_d_gdc -O -release -inline gc.d ldmd2 -ofgc_d_ldc -O -release -inline gc.d cpp: g++ -ogc_cpp -O3 gc.cpp fpc: fpc -ogc_fpc -Ur -O3 -TLINUX gc.pas
Let me know if there is other flags that should be used, for some of the above compilers!
Comments
Optimize FPC version
I believe you don't need try-except for fpc (thus you can remove {$mode objfpc} and change integer to longint), and the code can be optimized away:
You might get additional speed by replacing CurrentLine indexing with pointer dereference, due to fpc is currently missing array indexing optimization feature. So the final code would be:
I get a little speedup with that on my machine. Please test on yours.
Cool!
Cool, thanks a million! :) Do you mind if I include it in the post, with credit, and a link? (You suggest one!)
fpc with improved file buffer
Reading the file with a size-adjusted buffer, working with pointers and skipping the 'N's gives a lot a speed.
Some Pascal dialects allow even shorter code
DWScript version, performance here is slightly faster than your Python version
And you can even go a wee bit faster and shorter in the inner loop at the expense of readability with
general comments and speed improvements, mainly for fpc.
I think that the program spends about 40% of its time in reading the file. As a quick test, i took out all the analysis in the fpc program and the execution time dropped to 40%. I did not test it, but i expect that reading the file in one or only a few large chunks will increase the speed quite a lot.
Regarding the Pascal example replacing the while loop with a for loop, using the builtin function inc and switching to set logic gives a small improvement in speed for the fpc example. The analysis loop then becomes:
Some additional improvement can be achieved with Whole Program Optimization, which needs two compiler runs:
fpc -Ur -O3 -Xs- -OWall -FWgc -XX -CX gc.pas
fpc -Ur -O3 -Xs- -Owall -Fwgc -XX -CX gc.pas
Altogether, execution time dropped by 15%.
I think changing the logic to this version, also gives another small speed increase:
Many thanks Mischi! The
Many thanks Mischi! The compiler flags and double compilation indeed shaved off some milliseconds! Would be interesting with some short explanation of what it does! :) (Didn't find much in the man pages for fpc, for some of them?)
I didn't manage to get the code faster with the code changes though ... don't know if I do something stupid.
Whole Program Optimization
This is the link to the respective wiki page: http://wiki.freepascal.org/Whole_Program_Optimization
fpc -i shows which wpo is possible for the respective processor. I have not much more clue about it.
Mischi
Ah, thanks! There's something
Ah, thanks! There's something to study ... and more things to try, if one gets the time :)
Re: reading the file
The speed of reading the file can be improved quite a bit with tuning the file buffer.
adding
Buf : Array[0..4*1024] of byte;
to the var section and the additional Settextbuf statement after the assignfile yields quite a gain:
Assign(FastaFile, 'Homo_sapiens.GRCh37.67.dna_rm.chromosome.Y.fa');
Settextbuf(FastaFile, Buf, sizeof(Buf));
Reset(FastaFile);
Hi, I'm not a fan of either
Hi,
I'm not a fan of either Python or D, but your Python code could be speed up a bit I think:
I have not tested it on my own. I'd be very curious to see the effect of my suggestions if you'd like to test and post.
In case you're interested in High Performance Python:
http://ianozsvald.com/2011/07/25/high-performance-python-tutorial-v0-2-f...
http://ianozsvald.com/2012/03/18/high-performance-python-1-from-pycon-20...
Thanks! Counting them
Thanks! Counting them separtely definitely saved a bit. Upcoming update post in the works! Stay tuned!
Separate counting
I also tried this in the pascal example, but it did not really gain something on a first try. I think that the integer increases might be done in parallel to something else with no extra cost. The time is basially determined by the comparisons. I could shave off quite a bit by starting with a test on the character 'N', which is the most frequent. Although this is one additional test for all the valid characters, it saves two tests for 'N'.
One could also take advantage of the fact that nearly all lines have the same length.
Using a profiler will probably tell you more about where the time is spent.
The fastest version for such a simple program is probably one with pointers, labels and gotos, but no one wants to maintain such code. I could do it in Pascal, but i do not even want to see such code in the public ;-)
MiSchi.
Implementation in nimrod
I liked your small test case idea so I made my own version of your test in Nimrod (http://nimrod-code.org) which you can grab at https://gist.github.com/gradha/5555512. I wrote as a comment the performance values I get on my machine. If I use your measurements as reference and normalize my results against those I get:
cpp: 0.399 (normalized), python: 6.353 and nimrod: 0.5329
So it would seem that nimrod is faster than the alternatives you present, though my python version is faster than yours too, so it would be better if you could run the nimrod version on your machine to compare.
Wow, didn't even know there
Wow, didn't even know there is such a language. Does it have ubuntu packages? :)
The guys at IRC tell me there
The guys at IRC tell me there was some stale package at some point. Looks like https://aur.archlinux.org/packages/ni/nimrod-git/PKGBUILD seems to be some recipe for making one, though compiling nimrod yourself should not be hard following the instructions from https://github.com/Araq/Nimrod#compiling.
I'm using the git version on macosx. IIRC the last stable release doesn't work unless you manually switched to the clang compiler, which is fixed on the git version.