Using Mercurial’s bisect extension to find bugs

A week ago the Openbravo ERP code was moved from Subversion to Mercurial. I am completely new to Mercurial and to distributed SCMs, since I have always worked with Subversion, but apart from the new concepts it introduces, the transition has been very smooth so far. I’ve spent some time over the last few days looking at the extensions Mercurial has and, for me, one of the nicest is bisect. Bisect can be very useful for finding the changeset in which a bug was introduced.

A real example

Recently I was assigned this bug. I discovered that it was not present in r2.40 but it was in the development branch; furthermore, I found out that the bug was caused by a line that had been removed from the code. At this point I wanted to know which commit removed that line, to learn whether it was a mistake or had been done on purpose to fix another bug. So the tedious work started: given two revisions, one that has the bug (the head of the development branch) and one that does not (the r2.40 tag), try different revisions in between to find the one that removed the line. This is not only tedious but also very time consuming.

A good solution: bisect

Bisect is an extension for Mercurial that makes this kind of work much faster. Its behavior is pretty simple: you tell it a known good and a known bad changeset and it updates your working copy to one in between. You test whether the bug is present there and mark that changeset as good or bad, and the process repeats until it finds the changeset that introduced the bug. In fact, bisect just decides for you which changeset to test next, halving the remaining range each time. That may not sound like a big improvement, but if you combine it with a simple (at least on some occasions) automated test, the results can be awesome.
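To make the "decides which changeset to test next" step concrete, here is a minimal sketch of the underlying binary search, assuming a linear history in which every revision before the buggy one is good and every later one is bad. It does not call Mercurial at all: `is_bad`, `find_first_bad` and the revision numbers are made up for illustration, with `is_bad` standing in for a real test.

```shell
#!/bin/sh
# Hypothetical revision that introduced the bug; only the fake test below knows it.
FIRST_BAD=37

# Stand-in for a real test: succeeds when revision $1 contains the bug.
is_bad() {
  [ "$1" -ge "$FIRST_BAD" ]
}

# find_first_bad GOOD BAD
# GOOD is a revision known to work, BAD a later one known to fail.
# Prints the first bad revision and how many revisions had to be tested.
find_first_bad() {
  good=$1
  bad=$2
  tests=0
  while [ $((bad - good)) -gt 1 ]; do
    mid=$(( (good + bad) / 2 ))   # next revision to test: the midpoint
    tests=$((tests + 1))
    if is_bad "$mid"; then
      bad=$mid                    # bug already present: look earlier
    else
      good=$mid                   # still fine: look later
    fi
  done
  echo "first bad: $bad after $tests tests"
}

find_first_bad 0 100   # prints: first bad: 37 after 7 tests
```

With about a hundred candidate revisions seven tests are enough, and doubling the history only adds roughly one more test, which is why the approach scales so well.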

Let me explain it through a simple example:

  • First prepare the environment: a file with a lot of lines, and a commit somewhere there removing a line which we’ll look for afterwards.
$ hg init testBisect
$ cd testBisect
$ cat testFile
This is
a file
with some feature
and this
line here
is needed to work fine
$ hg ci -A -m "init file"
adding testFile
$ for (( i = 0; i < 335; i++ )); do echo "line$i" >> testFile; hg commit -m "change here"; done
$ sed -i 's/line here//' testFile
$ hg ci -m "this commit is buggy"
$ for (( i = 0; i < 872; i++ )); do echo "line$i" >> testFile; hg commit -m "change here"; done
$ hg parents
changeset:   1208:7568a581b554
tag:         tip
user:        Asier Lostalé
date:        Mon Mar 02 11:13:39 2009 +0100
summary:     change here
  • Now we have 1209 changesets (revisions 0 to 1208)! Let’s write a script that decides whether a revision is buggy or not and keeps looking while bisect has more changesets to test:
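$ cat test1.sh
#!/bin/sh
# Usage: test1.sh FILE TEXT_TO_FIND
# Marks the current revision good if FILE contains TEXT_TO_FIND, bad otherwise,
# and repeats until bisect stops proposing changesets to test.
MIN_ARGS=2
if [ $# -lt $MIN_ARGS ]; then
  echo "Usage: $(basename "$0") FILE TEXT_TO_FIND" >&2
  exit 1
fi
FILE=$1
shift
TEXT_TO_FIND=$*

# Test the working copy and report the result to bisect.
check() {
  grep -q "$TEXT_TO_FIND" "$FILE" && RESULT=good || RESULT=bad
  echo $RESULT
  hg bisect --$RESULT
}

while :
do
  # bisect prints 'Testing changeset' while there are revisions left to test
  if check | grep -q 'Testing changeset'
  then
    echo
    hg bisect
  else
    hg bisect
    exit 0
  fi
done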
  • Now we are ready to start testing. First of all, reset bisect and tell it which revisions are known to be bad and good: bad is the current one and good is revision 1, one of the earliest.
$ hg bisect --reset
$ hg bisect --bad
$ hg bisect --good 1
Testing changeset 604:9d6a42635e81 (1207 changesets remaining, ~10 tests)
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  • Finally just execute the test to find out who removed the line.
$ time ./test1.sh testFile 'line here'
Testing changeset 302:132a5339324e (603 changesets remaining, ~9 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 453:aa92eb899545 (302 changesets remaining, ~8 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 377:5c8e69bdb1ce (151 changesets remaining, ~7 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 339:05f7bb18e505 (75 changesets remaining, ~6 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 320:3107aee2dbd2 (37 changesets remaining, ~5 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 329:0d907ee53cdb (19 changesets remaining, ~4 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 334:8a0d38375333 (10 changesets remaining, ~3 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved

Testing changeset 336:dc2037e24dfc (5 changesets remaining, ~2 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
Testing changeset 335:939ca611ae0f (2 changesets remaining, ~1 tests)
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
The first bad revision is:
changeset:   336:dc2037e24dfc
user:        Asier Lostalé
date:        Mon Mar 02 11:11:52 2009 +0100
summary:     this commit is buggy
real    0m1.374s
user    0m1.128s
sys    0m0.204s

And we are done: in less than 1.5s we know which commit removed the line!
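Mercurial's bisect can also drive this loop by itself through its `--command` (`-c`) option, which runs a command at each step and marks the revision from its exit status: 0 means good, 125 skips the revision, 127 aborts the bisection, and any other non-zero status means bad. The sketch below shows a check that fits that contract. `check_revision` and `check.sh` are hypothetical names, and it assumes a Mercurial version that supports `--command`; treat it as a sketch under those assumptions rather than something run against the repository above.

```shell
#!/bin/sh
# check_revision FILE TEXT (hypothetical helper)
# Returns 0 (good) when TEXT is found in FILE and 1 (bad) otherwise,
# which is the exit-status convention hg bisect --command expects.
check_revision() {
  if grep -q -- "$2" "$1"; then
    return 0
  else
    return 1
  fi
}

# Assumed usage: save the function in an executable check.sh (hypothetical
# name) that ends with
#   check_revision "$@"
# and let bisect call it on every step:
#   hg bisect --reset
#   hg bisect --bad
#   hg bisect --good 1
#   hg bisect --command "./check.sh testFile 'line here'"
```

The design choice is the same as in test1.sh: the test only answers "is this revision good?", and choosing the next revision is left entirely to bisect.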

Though this example is quite theoretical, I think bisect is a very good solution for this kind of search. Do you have experience with this extension? Any comment about it is welcome.


Subversion 1.5 merge problems

Over the last few days we have had several problems trying to merge two branches using Subversion. I wanted to merge trunk into the modularity branch, but I always got this error:

svn: Working copy path 'lib/runtime' does not exist in repository

This happened with any merge command (svn merge modularity, svn merge trunk@r1 trunk@r2…).
It seems to be related to Subversion issue 3067, and the only way to make it work was to check out the svn branch that fixes this issue, compile it and use it to do the merge. The steps to do that are:

1) svn co http://svn.collab.net/repos/svn/branches/issue-3067-deleted-subtrees/ svn-mod
2) cd svn-mod
3) ./autogen.sh
4) ./configure
5) make

Note that after doing the merge with that svn client, the working copy can no longer be used with the old svn client.

Reintegrating multiple branches to trunk

I have some doubts about how to merge the following case back to trunk: I have a project branch that has already been reintegrated into trunk and another branch that was started from the first one. The problem is how to reintegrate the second branch into trunk.


Merging with subversion 1.5

Some weeks ago the Openbravo Subversion repository was upgraded to version 1.5.

The greatest feature of this new version is merge tracking. From now on it is not necessary to keep track manually of the revisions that have already been merged between branches.

Now merging is as easy as:

myWCbranch$ svn merge https://dev.openbravo.com/svn/openbravo/trunk

This command will merge all the changes in trunk that have not been merged yet into my branch working copy. And now the best part: when I’ve finished with my branch, this command will do all the work of merging it back to trunk:

myWCtrunk$ svn merge --reintegrate https://dev.openbravo.com/svn/openbravo/myBranch

I hope this will save us a lot of headaches…

There is one new feature I personally don’t like: interactive conflict resolution. Especially when merging big projects, it is annoying to have to decide manually about each conflict while the merge has not finished yet. It can be deactivated by editing the .subversion/config file and adding the following line to the [miscellany] section:

interactive-conflicts = no

There is some documentation about branches and merges in the Openbravo wiki.

Openbravo ERP r2.50 provisional branch

In this post I will introduce some guidelines and recommendations on how to manage the source code branches for Openbravo R2.40 and R2.50.

Currently trunk is used for r2.40 bug fixing until the r2.40 stable version is released.

A branch for r2.50 developments was created from trunk and will be merged back to trunk once r2.40 stable is released. Periodically the r2.50 branch will be updated from trunk to bring in the bugs fixed there.

Taking all this into account the way of working should be:

  • Bug fixing: Commit only to trunk. It is not necessary to commit to r2.50 as well, because these commits will be merged periodically. If you do want the fix in r2.50 too, bring it over via merge: do not make separate commits on both sides, which could cause conflicts.
  • New r2.50 developments: Create a branch for the project from the r2.50 branch. Once the project is finished, merge it back (reintegrate) into the r2.50 branch; if r2.40 stable has already been released by the time the project is finished, merge it back into trunk instead.