Transitive link dependencies with cmake

CMake has a great feature: transitive link dependencies. Let’s imagine an executable A that depends on library B which itself depends on library C. CMake is smart enough to link A with B and C, even if the dependency list of A only includes B.

Knowing this, I organize my sources in a particular way. A class is usually declared in a .h file and implemented in a cpp file. Well, for almost each .h/.cpp pair I define a library and list in the CMakeLists.txt the dependencies directly used by the cpp file.

Such an approach as huge advantages. First, the code base can scale easily, with many utility classes and executables that link only what is required. For example, each h/cpp file pair should ideally have a unit test. This test probably needs to be linked with only a small fraction of the code base. Relying an CMake transitive dependencies is a great way of selecting only what is required and all that is required.

There is another advantage. Imagine a library L that depends on boost. Of course, all users of L must link boost. Later in the life of the project, L is refactored to rely on c++11 features and it does not need boost anymore. Well, simply changing the dependency list of L will remove boost from everything that does not need it anymore. How convenient!

Of course, this scheme blows up the number of libraries, but this is not a problem. Even visual studio can manage workspaces with hundreds of projects quite well.

CMakeLists.txt example:

cmake_minimum_required(VERSION 2.8)
add_library(logging logging.h logging.cpp)
cxx_test(loggingTest loggingTest.cpp logging gtest_main)

Unit testing with gtest and cmake

After coding two years for Google, I got used to unit tests. Fortunately, part of the google testing framework is open source, it is called gtest. Using gtest is great, but it is usually not installed on the machine on which I want to compile my stuff. One option to address this problem would be to copy gtest sources in all my projects. That is not very satisfying.

I came up with another solution. Since I’m using cmake anyway, I wrote cmake files that would download sources and compile gtest at project compilation time. This solution avoids code duplication and limits system dependency to subversion which is rather common and easy to install on most development platforms.

If you like the simplicity of this approach, have a look at it and let me know if it works for you. Your pull requests on github are welcome if there is something wrong.

Optimizing RANSAC with SSE

When developing computer vision systems that needs to find plane projections from point-to-point correspondences, a RANdom SAmple Consensus (RANSAC) implementation is necessary. The algorithm rejects wrong correspondences (outliers) and find the geometric transformation, usually a homography, explaining inliers. The algorithm randomly picks 4 correspondences, find the corresponding homography, and count how many correspondences it explains. After a fixed number of iteration, the homography with the best support is chosen.

OpenCV offers an implementation in findHomography. It is unfortunately rather slow. A great optimization approach is to leverage SIMD instructions such as SSE or NEON. In theory, SIMD instructions could allow the processor to test 4 homographies in a single pass. However, conditional jumps are forbidden since the execution flow has to be common for the 4 tested homographies.

Computing homographies without conditional jumps

We need a jumpless way of computing a homography from 4 correspondences. Since I’m lazy, I asked maple to generate the function for me. Maple had problem inverting analytically the 8 by 8 matrix to solve the problem. It managed to compute analytically a homography sending the unit square to arbitrary points, though.

Here’s the maple code:

with(linalg);
eq := (x,y) -> <<x[1]| x[2] | 1 | 0| 0| 0| -y[1]*x[1]/1 | -y[1]*x[2]/1>, <0|0|0|x[1]|x[2]|1|- (y[2]/1)*x[1]|- (y[2]/1)*x[2]>>;
mat := < eq([0, 0],x) , eq([0,1],y), eq([1,1],z), eq([1,0],w)>;
inv := inverse(mat);
homo := simplify(evalm(inv &* <x[1],x[2],y[1],y[2],z[1],z[2],w[1],w[2]>));
homo_func := unapply([homo[1],homo[2],homo[3],homo[4],homo[5],homo[6],homo[7],homo[8]],x,y,z,w);
CodeGeneration[C](homo_func, optimize);

This little code produces a rather large function that does not use SIMD instructions at all: it uses double. To compute an arbitrary homography, I just call this function twice, invert one result, and multiply both matrix together. These function do not need conditional jumps.

Thank to the power of C++ (operator overloading in particular), replacing “double” with a SSE type simply amounts to defining a class.

Replacing double with fvec4

Compilers give access to SSE instructions through intrinsics that do not look very friendly. The following example is rather hard to read:

// declares a variable containing "1, 2, 3, 4"
__m128 a = _mm_set_ps(4, 3, 2, 1);
// Adds "7" to all entries of a
a = _mm_add_ps(a, _mm_set1_ps(7));

This version is much easier to read:

fvec4 a(1, 2, 3, 4);
fvec4 a += fvec4(7);

The fvec4 class can be implemented by declaring proper constructors and operators. The only data member is a single __m128 field. It might look like:

struct fvec4 {
    __m128 data;

    fvec4(float a) {data = _mm_set_ps1(a);}
    fvec4(float a, float b, float c, float d) { data = _mm_set_ps(a,b,c,d); }

    fvec4 operator += (const fvec4 &a) {
      this->data = _mm_add_ps(data, a.data);
      return *this;
    }
};

Once we have such a class (see my fvec4 class, used in polyora), we can simply replace “double” by “fvec4″ in the function generated by maple. The result is a function that computes 4 homographies in a single jumpless execution.

Parallel RANSAC loop

We now have everything we need to write our RANSAC loop:

// compute the homography
fvec4 H[3][3];
homography4_from_4corresp(pts1[0], pts1[1], pts1[2], pts1[3],
                          pts2[0], pts2[1], pts2[2], pts2[3],
                          H);
// evaluate support
fvec4 support(0);
for (int i=0; i<n; i++) {
  // Fetch the 4 next correspondences
  p[0] = fvec4( row(i,uv1,stride1)[0] );
  p[1] = fvec4( row(i,uv1,stride1)[1] );
  g[0] = fvec4( row(i,uv2,stride2)[0] );
  g[1] = fvec4( row(i,uv2,stride2)[1] );

  // Projects the points with the homography
  fvec4 t[2];
  homography4_transform(p, H, t);

  // Compute reprojection distances
  fvec4 d = dist2(t,g);

  // Compute support.
  support += (d < threshold) & fvec4(1 - .1 * (d / threshold));
}

And here we go ! See the full implementation here.

Results

Using SSE instructions to find homographies significantly speeds up computation. My implementation is much faster than OpenCV’s findHomography function which is a bit more more accurate, because it uses 64 bits doubles instead of 32 bits floats.