About captin nod


Captin Nod is still living under a desk, and now works in the VFX industry. He sometimes wears a hat, and can occasionally be found gibbering and giggling in a corner for no readily apparent reason.

View Bhautik Joshi's profile on LinkedIn


16.2.06

The queue

 
You'll have to pardon me if I've been a bit incomprehensible recently; I've been busy writing yet another paper. Hopefully this'll be the last one before I start my write-up in earnest. The basic gist is that I'm writing a paper on squashing brains, which always makes for some rather confusing after-dinner conversation.

Anyway, I've been wanting to keep up with Hiren's stream-of-consciousness dissertation on programming, so here's my attempt at turning your own brain to a liquefied pulp.

I've recently had the displeasure of needing to build ATLAS, LAPACK and CVMLIB, which together provide an optimised linear algebra library needed for our real-time simulation work. Building on x86 presents few problems, with ATLAS only needing three hours to build (!!), and gives me a bunch of static (.a) libraries which I can happily link against. For me, a good set of gcc optimisation flags on Pentium 4 processors is:

-O3 -pipe -march=pentium4 -ffast-math -mfpmath=sse -msse2

On Pentium M (e.g. laptop/Centrino):

-O3 -pipe -march=pentium-m -ffast-math -mfpmath=sse -msse2

On x86_64 (notably AMD64 chips such as the Athlon 64 and Opteron), it's a whole different kettle of fish. Here, I've discovered, if you build your own libraries as shared (.so), then the libraries you link against also need to be built as shared libraries, with their objects compiled as position-independent code. This is complicated by the fact that the LAPACK, ATLAS and CVMLIB build scripts are mostly written to generate static libraries :|

The solution lay in manually adding -fPIC to pretty much every compile line in building the libraries (usually by editing the top-level make include file), and then repackaging the ordinary static libraries as shared .so files. This can be achieved using a small shell script:

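#!/bin/sh
# Repackage a static archive (built with -fPIC) as a shared library.
# Usage: ./conv foo   (reads foo.a, writes foo.so)
set -e
mkdir tmp
cd tmp
# Unpack the object files from the archive
ar x "../$1.a"
# Relink them into a shared object
gcc -shared *.o -o "../$1.so"
cd ..
rm -rf tmp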

To create foo.so from foo.a, simply run:

./conv foo
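
The -fPIC edit that has to happen before the conversion can be scripted too. This is only a rough sketch: the add_fpic function is my own invention, and it assumes the make include file sets its flags in plain assignments to variables whose names end in FLAGS (CFLAGS, FFLAGS and so on). The real include files for LAPACK, ATLAS and CVMLIB may well use other names, so check before trusting it.

```shell
#!/bin/sh
# add_fpic FILE: append -fPIC to every "...FLAGS = ..." line in FILE
# that does not already mention it.
# Assumption: flags live in variables ending in FLAGS (hypothetical;
# real make include files may name them differently).
add_fpic() {
    file=$1
    tmp="$file.tmp.$$"
    # Lines already containing -fPIC are skipped; other FLAGS
    # assignments get " -fPIC" appended. Everything else is untouched.
    sed -e '/-fPIC/!s/^\([A-Z0-9_]*FLAGS[[:space:]]*=.*\)$/\1 -fPIC/' \
        "$file" > "$tmp" && mv "$tmp" "$file"
}
```

With that defined, something like "add_fpic Make.inc" (file name hypothetical) would patch the include file in place; keep a backup copy first, since the original is overwritten.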

The conversion script is a little ugly, but it does work. I've also found that the set of optimisation flags that works for me on x86_64 is:

-fPIC -funroll-loops -march=k8 -ffast-math -fpeel-loops -m64

The -m64 does make a significant difference; I've found that, in practice, the linear-algebra routines I run (matrix inversion etc.) run about 10-20% faster with -m64 on AMD platforms. If I had time, I'd like to properly check that out. Oh well.
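
To save retyping the three flag sets above, they could be wrapped in a small helper. This is just a sketch: pick_cflags is a made-up name, and the CPU labels it accepts are simply the -march values used in this post. Anything else is rejected rather than guessed at.

```shell
#!/bin/sh
# pick_cflags CPU: print the gcc flags from this post for a given CPU.
# CPU must be one of the -march values used above: pentium4, pentium-m, k8.
pick_cflags() {
    case "$1" in
        pentium4|pentium-m)
            # 32-bit x86: SSE2 floating point
            echo "-O3 -pipe -march=$1 -ffast-math -mfpmath=sse -msse2" ;;
        k8)
            # x86_64 (AMD64): position-independent, 64-bit
            echo "-fPIC -funroll-loops -march=k8 -ffast-math -fpeel-loops -m64" ;;
        *)
            echo "pick_cflags: unknown CPU '$1'" >&2
            return 1 ;;
    esac
}
```

A build script could then set CFLAGS=$(pick_cflags k8) before calling make, assuming the makefiles in question actually honour CFLAGS from the environment.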

I've also been keeping up with building a variety of KDE packages for Mandriva 2006; the latest selection of (mostly eye-candy) RPMs is here, including a 1.3.7 build of the comix window decoration.

And now, since the medium of blogging seems to demand it: indulge yourself with some alternative entertainment.