badmerge -- concrete version with bad semantics

See also the abstract version, in which every line of the file is simply an arbitrary character, to focus attention on the question of "Where does this change go?" instead of on the distracting issue of whether a merge algorithm can be more or less faithful to C programming language semantics. My contention is that darcs merge's decision of "where does the change go", as demonstrated in the abstract version, is better than 3-merge's decision, regardless of the semantic meaning of the text.

The following concrete version was developed by Peter Backes. He and I argued about whether the merge behavior of darcs or that of 3-merge is "safer" or "better" for preserving correct semantics when merging source code.

    a
   / \
  b1  c1
  |    
  b2

Now the task is to merge b2 and c1. As far as I know all revision control tools except for Darcs, Codeville, SCCS, and perhaps BitKeeper treat this as a 3-merge of (a, b2, c1).

The svn 3-merge algorithm and the GNU diff3 algorithm apply the changed line from c1 to the wrong place in b2. At least, I think it is the wrong place, but Peter Backes has constructed this example so that applying the change to the wrong place actually fixes a bug! One way to understand this example is to say that it introduces a different bug in the same location as the original bug, while fixing the original bug (which had been moved to a different location by patch b1), so that when 3-merge applies the bugfix, it accidentally applies to the new bug in the old place.

merged by SVN v1.2.0 or GNU diff3 v3.8.1

Whereas darcs applies the changed from from c1 to the right place in b2, resulting in a clean merge with more bugs! In this example, patch b2 had already fixed the bug, but had done so in a different way than patch c1 had done, and on a different line. Since the two fixes didn't touch the same line, darcs merged the two fixes together, resulting in a new program with two bugs instead of zero.

merged by darcs v1

I believe that this is not actually a counter-example to my thesis that darcs merge is strictly better than 3-merge at the task of tracking the location of changes. Rather, this example simply demonstrates that correctly tracking the location of changes is neither necessary nor sufficient for fixing bugs in programs.

Consider the possibility of a merge algorithm that occasionally flipped bits -- a hypothetical algorithm which, when merging certain kinds of changes, would spuriously convert every instance of the character 'a' in a patch into the character 'b'. It would be easy to construct a merge example where this arbitrary change from a's to b's would accidentally fix a bug. This should be taken as proof that correctly preserving the characters in patches is neither necessary nor sufficient for merges to fix bugs. It should not be taken as proof that correctly preserving the characters in patches is unimportant, or is useless for the human task of fixing bugs. Likewise, the example that Peter Backes provided and that I reproduce on this page demonstrates that correctly preserving the locations of changes is neither necessary nor sufficient for fixing bugs, but nonetheless the human task of understanding the consequences of modifying the source code, and thereby of fixing bugs, is aided by the merge algorithm correctly preserving the locations of changes.

code listings

listing a, C code

int square(int x) {int y = x;
	/* This does some squaring */
	/* Update y to equal the result. */
	/* Question: what is the order of magnitude of this algorithm with respect to x? */
	for (int i = 0; i < x; i++) y += x;
	return y;
}

listing b1, C code

int very_slow_square(int x) {int y;
	y = 0;
	/* Update y to equal the result. */
	/* Question: what is the order of magnitude of this algorithm with respect to x? */
	for (int i = 0; i < x; i++)
		for (int j = 0; j < x; j++)
			y += 1;
	return y;
}

int square(int x) {int y = x;
	/* This does some squaring */
	/* Update y to equal the result. */
	/* Question: what is the order of magnitude of this algorithm with respect to x? */
	for (int i = 0; i < x; i++) y += x;
	return y;
}

listing b2, C code

int square(int x) {int y = x;
	/* This does some squaring */
	/* Update y to equal the result. */
	/* Question: what is the order of magnitude of this algorithm with respect to x? */
	for (int i = 0; i < x; i++) y += x; return y;
}

int very_slow_square(int x) {int y;
	y = 0;
	/* Update y to equal the result. */
	/* Question: what is the order of magnitude of this algorithm with respect to x? */
	for (int i = 0; i < x; i++)
		for (int j = 0; j < x; j++)
			y += 1;
	return y;
}

int slow_square(int x) {int y = -3*x*x; x *= 2;
	/* This does some squaring */
	/* Update y to equal the result. */
	/* Question: what is the order of magnitude of this algorithm with respect to x? */
	for (int i = 0; i < x; i++) y += x;
	return y;
}

listing c1, C code

int square(int x) {int y = x;
	y = 0;
	/* Update y to equal the result. */
	/* Question: what is the order of magnitude of this algorithm with respect to x? */
	for (int i = 0; i < x; i++) y += x;
	return y;
}

listing 3-way merged, C code

int square(int x) {int y = x;
        y = 0;
        /* Update y to equal the result. */
        /* Question: what is the order of magnitude of this algorithm with respect to x? */
        for (int i = 0; i < x; i++) y += x; return y;
}

int very_slow_square(int x) {int y;
        y = 0;
        /* Update y to equal the result. */
        /* Question: what is the order of magnitude of this algorithm with respect to x? */
        for (int i = 0; i < x; i++)
                for (int j = 0; j < x; j++)
                        y += 1;
        return y;
}

int slow_square(int x) {int y = -3*x*x; x *= 2;
        /* This does some squaring */
        /* Update y to equal the result. */
        /* Question: what is the order of magnitude of this algorithm with respect to x? */
        for (int i = 0; i < x; i++) y += x;
        return y;
}

listing darcs merged, C code

int square(int x) {int y = x;
        /* This does some squaring */
        /* Update y to equal the result. */
        /* Question: what is the order of magnitude of this algorithm with respect to x? */
        for (int i = 0; i < x; i++) y += x; return y;
}

int very_slow_square(int x) {int y;
        y = 0;
        /* Update y to equal the result. */
        /* Question: what is the order of magnitude of this algorithm with respect to x? */
        for (int i = 0; i < x; i++)
                for (int j = 0; j < x; j++)
                        y += 1;
        return y;
}

int slow_square(int x) {int y = -3*x*x; x *= 2;
        y = 0;
        /* Update y to equal the result. */
        /* Question: what is the order of magnitude of this algorithm with respect to x? */
        for (int i = 0; i < x; i++) y += x;
        return y;
}

zooko
Last modified: 2008-02-25