[tahoe-dev] darcs vs. mercurial

Sun Nov 4 08:39:26 PST 2007

Brian:

Thanks for the extensive feedback on darcs and hg!  I have some  
comments in-line.

On Nov 4, 2007, at 1:39 AM, Brian Warner wrote:

> Sure thing! First though, I do want to make it clear that I'm not  
> trying to
> provoke a darcs-vs-mercurial flamewar..

Okay!  I hereby agree not to denigrate your lineage, moral character,  
or your good looks during this conversation!

Seriously, I have three motivations:  1.  Help us decide which tool
is best for our use.  2.  Contribute feedback to the authors of the
two tools.  3.  If our feedback stimulates them to fix the issues
that we care about, then we benefit.

> I used 'hg convert --source darcs tahoe-trunk' to convert a copy of  
> the Tahoe
> source repository

I just ran a tailor process to do the same thing (here is the command- 
line to do it yourself [1]).

> (obtainable from http://allmydata.org/repos/tahoe) into a
> mercurial repository. The process took about 70 minutes,
> (btw, after looking more closely at the hg output, I see it is  
> incorrect:

The tailor process took about 120 minutes.  Tailor did it correctly  
-- there are no differences between the resulting hg-managed source  
tree and the original darcs-managed source tree.  Do you want to  
report a bug to hg or shall I?

> A direct darcs checkout of our tree consumes (according to 'du', on a
> filesystem that probably uses 4KiB blocks) 20MB, and has 1893  
> files. An hg
> checkout consumes 13MB under the same conditions. (that's including  
> the bogus
> directories.. it consumes 12MB and 824 files once I clean those out  
> to make
> the hg tree look like the darcs ones).

On Mac OS 10.4/Intel (the filesystem is HFS+?), a darcs repository  
takes 17 MB.  If you create a checkpoint (by running "darcs  
optimize") and then do a partial get (with "darcs get --partial")  
then the resulting repository takes 9 MB.  However, I don't trust
darcs's handling of partial repositories in general -- I think there
are many open bugs against that feature -- so I don't use it.

On my machine the hg version takes around 10 MB.

> A 'darcs get' over SSH takes 167 seconds, almost all of which is  
> fetching
> patches (one at a time: network utilization is very low during this  
> period);
> perhaps the last 5 or 10 seconds is actually applying them to build  
> up the
> new tree. A 'hg clone' over SSH takes 18 seconds, out of which  
> probably 16
> seconds is fetching revisions (which appears to be bandwidth  
> limited: network
> utilization is very high during this period), and the last second  
> or two is
> building the local tree.
>
> A 'darcs push -a' of a 3-line change over SSH takes 2.67 seconds.  
> An 'hg
> push' of the same change over SSH takes 1.16 seconds.
>
> A 'darcs pull -a' of a new 3-line change over SSH takes 4.82  
> seconds. An 'hg
> pull -u' of the same change takes 1.15 seconds. (the '-u' tells hg  
> to update
> the local tree to reflect the new changes, to match the darcs  
> behavior).

If I understand correctly, the SSH Control Master feature was added  
in darcs v1.0.8 in order to improve this behavior, but that feature  
ran afoul of an undiagnosed problem so the Control Master feature is  
off by default in darcs v1.0.9.  That problem was subsequently  
identified as a bug in OpenSSH on Mac OS X [2].

If you could rerun your measurements with the "--ssh-cm" option, I  
would appreciate it.  It certainly won't close the gap between darcs  
and hg performance, but it will give darcs developers an indication  
of how much gap will remain after ssh Control Master works.
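
For anyone rerunning these measurements, here is a minimal sketch of a timing harness in Python.  It is only an illustration: the helper names are hypothetical, and the command it times at the bottom is a harmless stand-in rather than a real darcs or hg invocation.  Repeating a run is only meaningful for commands that do the same work each time, so a push or pull of a single change should be timed with runs=1.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times; return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so failed runs are not timed.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to min / median / max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


# Stand-in command (assumption): replace with the darcs or hg command
# under test, e.g. the pull or push command with and without "--ssh-cm".
demo = [sys.executable, "-c", "pass"]
print(summarize(time_command(demo, runs=3)))
```

Timing the same operation with and without the option, from the same machine and over the same link, should show how much of the gap Control Master accounts for.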

> MOTIVATIONS:
>
> My motivation for considering a switch to hg has three components:  
> speed,
> accessibility, and release management.

I too am motivated by these things.

> SPEED: ...
...
> The speed differences of push and pull are my biggest concerns,  
> because I
> think they will get worse as the repository grows.

I believe that using "darcs optimize" and allowing it to create a
checkpoint from the most recent patch means that push and pull
times are proportional to the number of patches after the most recent
checkpoint that both ends share.

In practice, this made a huge difference with local pushes and
pulls.  I haven't yet carefully measured it with remote ones.

> Our coworker Rob is constantly frustrated by inexplicable minutes- 
> long delays
> with 'darcs push'. I haven't seen nearly the sorts of problems he has
> (perhaps because I never commit from windows or OS-X), but I'm  
> worried that
> this is an inevitable consequence of managing large trees with  
> darcs, and so
> I worry that as the Tahoe repository gets bigger, it (and we) will  
> start to
> suffer the same problems. I assume this delay comes from two  
> things: the
> algorithmic complexity of doing the patch algebra to determine what  
> must be
> sent, and the one-at-a-time nature of the patch transfer steps (I  
> believe it
> is using scp or sftp to copy things one at a time, whereas hg seems  
> to be
> using a custom protocol over ssh like the way cvs and svn do it).

There are actually a host of different issues that combine to cause  
this problem.  None of them can be discounted.  Let's enumerate them:

1.  Darcs has unacceptable algorithmic complexity in the face of  
certain kinds of conflicts [3].  Actually I think that this rarely  
contributes anything at all to the slowdowns that we experience.  You  
can tell if this is the problem because your CPU runs at max for long  
enough that you notice that something is amiss.  Although this is  
rare, it is not unheard of in our experience, and the fact that we  
can't tell if it is this case or one of the other cases below makes  
it frustrating and difficult to debug and to use the tool effectively.

2.  Darcs's use of SSH triggers an OpenSSH bug, possibly only on Mac  
OS X, and possibly only if the "Control Master" feature is enabled  
[2].  I think that the Control Master feature was enabled by default  
in 1.0.8 and disabled by default in 1.0.9.  The symptom of this bug
is that darcs push or pull to a remote repository simply hangs.

3.  Darcs's use of multiple sequential ssh (or sftp or scp or  
something) operations is inefficient.  If you don't have Control  
Master working, then this means you have to wait for full TCP  
connection setup and SSH crypto negotiation multiple times.  I'm not  
sure how many times.  It's more than four or five, and less than the  
total number of patches to be transferred, I think.  Even if you do  
have Control Master working, you still have to wait for multiple  
round trips -- I'm not sure how many, and I'm not sure how much this  
contributes to the total wall-clock time.  I don't know if darcs is  
sophisticated enough to pipeline all of the round trips that it can.

4.  Darcs has a few different bugs in its user interface so that it  
doesn't give proper feedback to the user about what is going on,  
leaving the user to sit there wondering if it is going to take 1 sec,  
100 seconds, or forever.  This is an important part of the user  
experience.  It is a lot more tolerable for a program to take 100  
seconds if it tells you what it is doing!

5.  There is some kind of problem I don't fully understand in which
your ssh server or possibly your ssh client attempts a reverse DNS
lookup and waits for a 30 or 60 second timeout before allowing a
connection to continue.  This is technically not darcs's fault, but
combined with the other possible slowdowns and the lack of
transparency (as per the above issues), it is hard to debug, making
the overall experience difficult and frustrating.

6.  Darcs operations tend to be semi-proportional (?) to the number  
of patches under consideration (in network operations and/or CPU  
operations).  If you haven't run "darcs optimize" in order to create  
a checkpoint in both of the relevant repositories, then the number of  
patches under consideration gets large.

7.  Generally darcs seems to be a constant factor slower at  
everything than hg is.
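
To make issue 3 concrete, here is a back-of-the-envelope model in Python of how connection setup and round trips add up.  It is a rough sketch under stated assumptions: every number in it (handshake time, round-trip time, bandwidth, payload size, and the connection and round-trip counts) is made up for illustration, not measured from darcs or hg.

```python
def estimated_transfer_seconds(connections, round_trips, payload_bytes,
                               setup_seconds, rtt_seconds, bytes_per_second):
    """Rough estimate: connection setup + round-trip latency + raw transfer."""
    return (connections * setup_seconds
            + round_trips * rtt_seconds
            + payload_bytes / bytes_per_second)


# All figures below are illustrative assumptions, not measurements.
SETUP = 0.5            # seconds per TCP connection + SSH negotiation
RTT = 0.05             # seconds per network round trip
BANDWIDTH = 1_000_000  # bytes per second
PAYLOAD = 2_000_000    # total bytes of patches to move

# One new connection per fetched file (no Control Master).
per_file_connections = estimated_transfer_seconds(
    100, 100, PAYLOAD, SETUP, RTT, BANDWIDTH)
# One shared connection, but still one round trip per file.
shared_connection = estimated_transfer_seconds(
    1, 100, PAYLOAD, SETUP, RTT, BANDWIDTH)
# One connection with requests pipelined into a couple of round trips.
pipelined = estimated_transfer_seconds(
    1, 2, PAYLOAD, SETUP, RTT, BANDWIDTH)

for label, seconds in [("per-file connections", per_file_connections),
                       ("shared connection", shared_connection),
                       ("pipelined", pipelined)]:
    print(f"{label}: {seconds:.1f}s")
```

With numbers like these, the setup cost dominates until connections are shared, and the round trips dominate until requests are pipelined.  The actual ratios depend on the link and on how many operations darcs really performs, which I don't know.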

My overall recommendation to the darcs developers, now that I've  
thought through these issues, is to start on issue #4 first!

If darcs v1.1.0 reliably prints out information to the user before  
attempting every potentially long-running operation and after  
completing it (i.e., all network operations including connection  
setup, all potentially long-running merges, all potentially long  
sequences of disk operations, etc.) then users like us can take  
advantage of that transparency to help optimize our usage of darcs  
and to submit more useful bug reports.  Obviously this detailed  
performance information doesn't need to be output unless a "-v" or  
even "--debug-performance" flag is turned on.  Timestamps on the  
output lines would be nice to have.
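
As an illustration of the kind of reporting I mean, here is a minimal sketch in Python.  It is not how darcs is written (darcs is a Haskell program), and the names "reported" and "enabled" are hypothetical.  The idea is just a wrapper that prints a timestamped line before and after each potentially slow step, and stays silent unless a verbose flag is on.

```python
import contextlib
import sys
import time
from datetime import datetime


def _stamp():
    """Current local time, to the second, for prefixing output lines."""
    return datetime.now().isoformat(timespec="seconds")


@contextlib.contextmanager
def reported(description, stream=sys.stderr, enabled=True):
    """Print a timestamped line before and after a potentially slow step."""
    if enabled:
        print(f"{_stamp()} starting: {description}", file=stream, flush=True)
    start = time.monotonic()
    try:
        yield
    finally:
        # Runs even if the step raises, so a hang or failure is attributable.
        if enabled:
            elapsed = time.monotonic() - start
            print(f"{_stamp()} finished: {description} ({elapsed:.2f}s)",
                  file=stream, flush=True)


# Stand-in for a slow step (assumption): a real tool would wrap connection
# setup, patch transfer, merging, and long disk operations this way.
with reported("connecting to remote repository"):
    time.sleep(0.1)
```

If the "starting" line appears and the "finished" line never does, the user at least knows which step is stuck.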

> ACCESSIBILITY: mercurial is an order of magnitude easier to compile  
> than
> darcs. This has never affected me personally, since I'm running debian
> everywhere and tend to only use i386 platforms, but I seem to  
> recall some
> folks using more outlying operating systems or hardware (amd64? OS- 
> X/ppc?
> opensolaris?) who did not yet have a functioning ghc and thus  
> couldn't get a
> darcs binary running. Mercurial only needs python and gcc, so I do  
> not think
> we'll have any community members who are unable to get the latest HEAD
> because they lack the tools to do a checkout.
>
> Mind you, I can't think of specific examples of platforms or people  
> for whom
> the lack of a darcs binary was preventing them from playing with  
> tahoe.

Well, I personally run linux/amd64, OS-X/PowerPC, and opensolaris/ 
amd64, and I downloaded precompiled ghc binaries for each of them,  
and then ran "./configure && make && make install" to build darcs.  ;-)

Your general point is certainly valid, but I'm not sure how many  
people it actually bothers, nowadays.

< Snipping out long thoughtful comments on work flow.  If you haven't  
read Brian's message, read it thoroughly.  I'm not sure whether I  
agree with his beliefs about the effect on workflow. >

> But, hg has these other spiffy features, like the 'hg view' graphical
> revision history browser, and the built-in quilt-like patch management
> thingy, and gpg signatures of revisions (although I do not yet  
> understand
> what exactly this provides nor what value it offers). Really this is a
> reflection of the ease with which hg plugins can be written, which  
> I think is
> a strong argument in hg's favor.

I agree that this is a strong argument.  As I understand it, hg was
designed to accommodate plugins from the beginning, and it has
attracted an active community of plugin developers.

> Anyways, those are my thoughts. I'm not really pushing to make  
> changes any
> time soon, but I'm always looking to learn more about these tools.  
> I really
> need to spend some time with bzr or monotone to learn those  
> perspectives too.

Thanks a lot!

Regards,

Zooko

[1]
$ tailor --verbose -s darcs -R ~/playground/allmydata/tahoe -r INITIAL -t hg > tailor.config
$ tailor --configfile=./tailor.config

[2] http://bugs.darcs.net/issue437
[3] http://wiki.darcs.net/index.html/ConflictsFAQ