2007-12-18

ksh93 performance through builtins - a small example

I recently held a presentation for the Dutch OpenSolaris Users Group about the work of Roland Mainz on integrating ksh93. I focused on the history of unix shells, how ksh93 was accepted in the OpenSolaris project, specifically highlighting the OpenSolaris ARC process.

I did not discuss the relative merits and reasons for getting ksh93 included, but today I'll mention one reason in particular: shell script performance. Ksh93 is faster than any other POSIX conforming shell in executing code, not in the least because many standard unix commands have been included as builtins in the shell, allowing scripts to bypass fork() and exec().

Here is a comparison of two lines of code in ksh93. Their effect is exactly the same: Ten thousand times, they create and remove a unique directory. The difference is that the second time around, the ksh builtins for mkdir and rmdir are used.


glorantha 10 $ time for i in {0..9999}; do /bin/mkdir $i; /bin/rmdir $i; done


real 0m35.63s
user 0m1.44s
sys 0m1.65s
glorantha 11 $ time for i in {0..9999}; do mkdir $i; rmdir $i; done

real 0m1.44s
user 0m0.33s
sys 0m1.09s

And this explains why a builtin matters. Both the internal and external command are just as efficient if they're only invoked once. When you have to invoke an external command inside an inner loop, the unix fork()/exec() overhead add up.

But there are other ways of improving the above bit of code, if you're satisfied with a slightly different sequence of actions, and non-POSIX code due to the compact {start .. end } notation.

What I like about this code is that it's compact, and quite clear in its intention.

glorantha 12 $ time mkdir {0..9999} && rmdir {0..9999}

real 0m0.53s
user 0m0.02s
sys 0m0.51s
glorantha 13 $ time /bin/mkdir {0..9999} && /bin/rmdir {0..9999}

real 0m0.53s
user 0m0.03s
sys 0m0.48s

More later

No comments: