This post is from the CollabNet VersionOne blog and has not been updated since the original publish date.
The Invisible Launch: Threaded SetupDude Speeds Up Subversion
In late December, we implemented a core system upgrade that significantly increased the speed of all processes handled by Codesion and introduced the concept of job prioritization. And while you may have noticed the system acting more quickly, you probably didn’t even notice this launch of “threaded SetupDude”. I’d like to share with you some of the technical story.
SetupDude allows a 2X to 10X increase in job performance
SetupDude is an invisible part of our system, a daemon responsible for the actual setup of svn, trac, and all our other services across our distributed system. Whenever you make a change in the UI, SetupDude is notified and makes sure the configurations are all updated. If you delete a project, guess who makes it go away? SetupDude also handles a range of deferred tasks, including email notifications of commits.
To make sure that changes in the UI are reflected in our services as quickly as possible, we’ve given SetupDude an overhaul: it is now “threaded”. This means it can run several jobs at once, using as many CPU cores as we can throw at it.
For most of our servers that’s 8 cores, plus we make sure each core has a second thread it can switch to if it has to wait on anything (for example, hard disk access or a response from another server). Also, some jobs stack well (for example, if there are several changes to apply to one config file). We were already stacking these jobs, but now we do it more easily and can “stack higher” to get the most out of it.
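To make the “stacking” idea concrete, here is a minimal sketch in Python (not the Perl that SetupDude is written in). The job tuples, file names, and helper functions are all hypothetical, and Python threads are only a stand-in for SetupDude’s workers. It groups pending changes by the file they touch, then hands each group to a pool sized at two workers per core, as described above.

```python
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor


def stack_jobs(jobs):
    """Group (target, change) jobs so each target is handled once.

    Targets keep their order of first appearance, and the changes for
    one target keep their original order.
    """
    stacked = OrderedDict()
    for target, change in jobs:
        stacked.setdefault(target, []).append(change)
    return list(stacked.items())


def worker_count(cores, threads_per_core=2):
    """Two workers per core, so one can run while the other waits on IO."""
    return cores * threads_per_core


def apply_batch(batch):
    """Hypothetical stand-in for rewriting one config file in a single pass."""
    target, changes = batch
    return f"{target}: applied {len(changes)} change(s) in one pass"


# Made-up pending jobs: three of the four touch the same file.
pending = [
    ("svn/authz", "add user alice"),
    ("trac/trac.ini", "enable plugin"),
    ("svn/authz", "remove user bob"),
    ("svn/authz", "add group devs"),
]

batches = stack_jobs(pending)

# 8 cores is the figure quoted for most servers, giving 16 workers.
with ThreadPoolExecutor(max_workers=worker_count(8)) as pool:
    results = list(pool.map(apply_batch, batches))
```

With this input, the three changes to the same file collapse into one batch, so that file would be rewritten once instead of three times.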
This graph illustrates the relationship we found between number of cores and the optimal number of threads, ignoring IO and memory limitations. We find that overall system speed increases by at least 2X, and under certain conditions by up to 10X.
Job Prioritization for Professional Edition accounts
Codesion’s finely tuned engine means your service setups will be lightning fast. In the rare event that we’re flooring it (e.g., if we’re rolling out a feature that requires setup changes to every account), we’ve made sure our Professional Edition customers get the quickest possible delivery by adding a prioritization scheme. Don’t worry: Team Edition customers will still see a significant speedup, as the gains from the new threading capabilities far outweigh the effect of job prioritization for Pro customers.
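One common way to build a prioritization scheme like this is a priority queue with a tie-breaking counter. The sketch below is in Python and is only an assumption about the general shape, not SetupDude’s actual scheme. The tier names, the `JobQueue` class, and the job labels are hypothetical.

```python
import heapq
import itertools

# Assumed tiers for illustration; a lower number runs first.
PRIORITY = {"professional": 0, "team": 1}


class JobQueue:
    """Priority queue that stays first-in, first-out within each tier."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so equal-priority jobs keep arrival order.
        self._counter = itertools.count()

    def push(self, edition, job):
        heapq.heappush(self._heap, (PRIORITY[edition], next(self._counter), job))

    def pop(self):
        _priority, _seq, job = heapq.heappop(self._heap)
        return job

    def __len__(self):
        return len(self._heap)


queue = JobQueue()
queue.push("team", "team-1")
queue.push("professional", "pro-1")
queue.push("team", "team-2")
queue.push("professional", "pro-2")

order = [queue.pop() for _ in range(len(queue))]
```

Here `order` comes out as `pro-1`, `pro-2`, `team-1`, `team-2`: Professional jobs jump the line, and nothing is reordered inside a tier.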
It’s also reliable. In testing we randomly killed worker processes. It was like a shooting gallery! Many, many worker processes were harmed in the making of this feature :). Like Terminator 2, the system absorbed it all, regenerated and kept on coming.
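The “kill a worker, watch it come back” test can be sketched roughly as follows. This is Python for a Unix-like system, not the real test harness, and `spawn_worker` and `reap_and_respawn` are hypothetical names. A supervisor forks a few idle workers, one is killed with SIGKILL, and the supervisor notices the dead process and forks a replacement.

```python
import os
import signal
import time


def spawn_worker():
    """Fork one worker; the child idles until it is killed."""
    pid = os.fork()
    if pid == 0:
        # Child process: a stand-in for a real job loop.
        try:
            while True:
                time.sleep(0.05)
        finally:
            os._exit(0)  # never fall back into the parent's code
    return pid


def reap_and_respawn(workers):
    """Replace every worker that has exited; return how many were replaced."""
    replaced = 0
    for pid in list(workers):
        done, _status = os.waitpid(pid, os.WNOHANG)
        if done == pid:  # this worker is dead
            workers.remove(pid)
            workers.add(spawn_worker())
            replaced += 1
    return replaced


workers = {spawn_worker() for _ in range(3)}

# Shooting gallery: kill one worker outright.
victim = next(iter(workers))
os.kill(victim, signal.SIGKILL)

# Poll until the supervisor has noticed and replaced it.
replaced = 0
deadline = time.monotonic() + 5
while replaced == 0 and time.monotonic() < deadline:
    replaced = reap_and_respawn(workers)
    time.sleep(0.01)

pool_size = len(workers)

# Clean up the demo workers.
for pid in workers:
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
```

A real supervisor would also have to re-queue whatever job the dead worker was holding; this sketch only shows the pool healing back to full size.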
To be a Perl fork or DudeFork?
Perl geeks might be curious to know that we ended up creating our own variant on the “forks” library for this. Actual Perl threads were too co-dependent: they render the ALRM signal unusable for timeouts and have issues with db handles and some other Perl libraries. And when you start looking into their “all variables are local to the thread” feature, you find it’s not copy-on-write doing that; they really copy every variable, making this feature not as lightweight as we’ve come to expect from threads in other environments.
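For readers who haven’t used alarm-based timeouts, here is roughly what the pattern looks like, sketched in Python rather than Perl and for Unix-like systems only. The names `JobTimeout`, `run_with_timeout`, `slow_job`, and `fast_job` are made up for the example. Python has a comparable restriction, since signal handlers can only be installed from the main thread, which is one reason this pattern pairs more naturally with separate worker processes than with threads.

```python
import signal
import time


class JobTimeout(Exception):
    """Raised when a job runs past its time limit."""


def _on_alarm(signum, frame):
    raise JobTimeout()


def run_with_timeout(func, seconds):
    """Run func(); raise JobTimeout if it takes longer than `seconds`."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return func()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def slow_job():
    time.sleep(2)  # stands in for a hung disk or network call
    return "finished"


def fast_job():
    return "finished"


try:
    slow_result = run_with_timeout(slow_job, 0.1)
except JobTimeout:
    slow_result = "timed out"

fast_result = run_with_timeout(fast_job, 0.1)
```

The slow job is cut off after a tenth of a second, while the fast one returns normally and leaves no alarm pending.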
Perl’s “forks” library solves some of these problems but goes out of its way to emulate others so that it can be a “drop-in replacement” for threads. The “fork()” system call won’t work as expected in forks, and the inter-process communication turns out to be via shared variables that are emulated via sockets… so why not just use plain fork() and sockets in the first place? So we wrote “DudeForks”, which works a bit differently and offers a lightweight, robust way to manage a set of worker processes.
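DudeForks itself is Perl and its internals aren’t shown here, so the following is only a guess at the general “fork plus sockets” shape, written in Python for a Unix-like system. The function names, the JSON-lines message format, and the `apply_change` handler are all assumptions for illustration. The parent forks a worker, sends it a job over a socket pair, reads the reply, and shuts the worker down by closing its end of the socket.

```python
import json
import os
import socket


def serve(sock, handler):
    """Worker loop: read one JSON job per line, write one JSON reply per line."""
    reader = sock.makefile("r")
    writer = sock.makefile("w")
    for line in reader:  # ends when the parent closes its end
        writer.write(json.dumps(handler(json.loads(line))) + "\n")
        writer.flush()


def start_worker(handler):
    """Fork a worker process connected to the parent by a socket pair."""
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: keep only its own end of the pair.
        parent_sock.close()
        try:
            serve(child_sock, handler)
        finally:
            os._exit(0)  # never fall back into the parent's code
    child_sock.close()
    return pid, parent_sock


def apply_change(job):
    """Hypothetical job handler."""
    return {"target": job["target"], "status": "applied"}


pid, sock = start_worker(apply_change)
reader, writer = sock.makefile("r"), sock.makefile("w")

writer.write(json.dumps({"target": "svn/authz"}) + "\n")
writer.flush()
reply = json.loads(reader.readline())

# Closing the parent's end sends EOF, which ends the worker's loop.
writer.close()
reader.close()
sock.close()
_, status = os.waitpid(pid, 0)
exited_cleanly = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

Nothing is shared between the two processes except the socket, so the ALRM signal, database handles, and variables all stay private to each worker.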
Thanks to our development and testing staff for making this upgrade go so smoothly that you probably didn’t even notice it.