Shape Matching with Polar Boundaries

I was made aware of the use of radius-vector functions (θ-s curves) and tangent-angle functions for shape recognition many years ago (see Simple geometrical shape parameters by Volodymyr Kindratenko). One of the primary limitations of these techniques is that they are based on shape outline alone and disregard any 'holes'. Since I'm currently acquainting myself with computer vision techniques, I thought that it would be interesting and educational to see if I could come up with a variation on these algorithms that can operate with regard to 'holes' within shapes. I've made no significant attempt to provide a formalism for my approach since I'm sure that much better procedures exist than the one I've come up with; having said that, my approach does seem to work with modest success at the basic level to which I've tested it.

Introduction

My approach is to generate a normalized polar projection of the boundary of each shape to be compared. In the images below, each shape (in this instance characters in the typeface "Times New Roman") is displayed together with its polar projection which has been superimposed in the top-left-hand corner.

Each image contains the following information:

The value of this projection is solely based on its usefulness for efficiently matching similar shapes, irrespective of position, size or orientation. Matching proceeds by measuring the degree of overlap between polar projections. A large degree of overlap is regarded in the approach presented here as indicative of similarity-of-shape. From this perspective, the algorithm can be regarded as a variant of the simplest possible tool for comparing binary images – the xor metric. However, the projection has the advantage of normalizing the shape presentation to a degree that allows for efficient evaluation independent of translation, rotation or scaling (albeit upto an estimation of the true xor distance between two shapes).

Algorithm

Given a binary image with the shape white and the background black, the algorithm proceeds to create a polar boundary as follows:

  1. All shape perimeters are extracted from the image as a list of directions (N,S,E,W) that trace the pixel edges. The perimeters of white (resp. black) areas proceed with positive (resp. negative) winding. This is achieved naturally as a consequence of the marching squares algorithm used.
  2. All perimeters are assembled into a single boundary that represents the outline of the shape.
  3. The centroid of the boundary is calculated with a fairly high degree of efficiency as follows: Each perimeter in the boundary is considered separately. A perimeter's centroid is computed by first deriving scanline decompositions from the perimeter in both the vertical and horizontal directions. Each of these structures can be efficiently scanned to calculate the centroid's horizontal and vertical coordinates. Perimeters with positive (resp. negative) winding are assigned centroids with positive (resp. negative) mass. Under this convention, the boundary centroid can readily be computed as the mass weighted average of its perimeter centroids.
  4. The extreme point of the boundary is computed; that being the point (not necessarily unique) on the boundary which is furthest from the boundary's centroid.
  5. A polar plot of the boundary can then be generated. The plot is constructed over a fixed number of angles (this number can be regarded as a global constant for most purposes) that define a set of lines, radiating from the boundary centroid, at which the plot is sampled. The boundary perimeters are walked, and each time a radius is crossed, the square of the perimeter's distance from the centroid is recorded.
  6. The crossings thus identified are sorted by their distance from the centroid and henceforth treated as boundary intervals. In the case where the centroid falls within the shape's interior, there will be an odd number of crossings, in this case zero is added to ensure that all intervals are closed. The polar plot is normalized using the distance to the shape's extreme point; The maximum distance that any interval extends from the centroid is unit length.

Once two or more polar boundaries have been constructructed in this way, they can be compared for similarity in a single orientation as follows:

  1. For each radius, the intervals of each polar boundary are intersected and their lengths summed.
  2. The sums arising from each set of intervals are themselves summed. This is used as a measure of goodness of fit.
  3. This goodness of fit is normalized into the range [0,1] by dividing it by the arithmetic mean of the total length of all intervals in both polar boundaries.

This can be extended to multiple orientations simply by performing the above evaluation at each angle under consideration and choosing the best fit (though there may be more sensible approaches to this - see the notes).

Note the following about the algorithms and my implementation of them:

Evaluation Methodology

To examine the viability of the algorithm for shape recognition, I wrote a program that generated a set of 36 polar boundaries from the upper case letters of the alphabet together with the ten digits 0–9. The shapes and boundaries generated are displayed in a grid below:

The 36 shapes generated together with the boundaries that were extracted from them.

I then calculated the similarity (expressed as a percentage) of each shape to itself and every other shape, these scores were compiled into the following matrix.

      A    B    C    D    E    F    G    H    I    J    K    L    M    N    O    P    Q    R    S    T    U    V    W    X    Y    Z    0    1    2    3    4    5    6    7    8    9  
  A 100%  46%  39%  32%  53%  56%  30%  38%  31%  36%  40%  47%  42%  31%  11%  56%  35%  38%  38%  35%  38%  86%  41%  32%  57%  29%  32%  32%  32%  49%  56%  42%  46%  50%  35%  45% 
  B  46% 100%  30%  64%  38%  43%  53%  73%  34%  36%  40%  34%  42%  45%  25%  36%  49%  48%  50%  36%  42%  46%  40%  38%  40%  41%  57%  30%  41%  34%  44%  48%  46%  34%  47%  44% 
  C  39%  30% 100%  31%  44%  42%  28%  28%  22%  27%  40%  35%  33%  24%  19%  37%  34%  39%  28%  25%  52%  40%  36%  25%  25%  23%  29%  21%  29%  34%  33%  31%  49%  36%  24%  52% 
  D  32%  64%  31% 100%  26%  28%  66%  67%  25%  25%  30%  27%  41%  48%  20%  40%  57%  35%  34%  28%  42%  35%  37%  33%  29%  25%  60%  23%  28%  22%  35%  32%  39%  27%  33%  40% 
  E  53%  38%  44%  26% 100%  49%  29%  36%  34%  40%  42%  40%  40%  28%  33%  54%  28%  42%  40%  32%  37%  48%  35%  33%  47%  42%  31%  35%  50%  57%  43%  46%  43%  32%  38%  43% 
  F  56%  43%  42%  28%  49% 100%  31%  34%  41%  57%  37%  50%  35%  37%  22%  57%  31%  42%  42%  42%  31%  56%  45%  36%  58%  36%  30%  42%  33%  52%  56%  44%  42%  41%  40%  41% 
  G  30%  53%  28%  66%  29%  31% 100%  49%  26%  27%  30%  30%  46%  39%  24%  36%  49%  32%  36%  37%  38%  32%  28%  31%  30%  26%  55%  23%  31%  25%  37%  40%  37%  22%  28%  37% 
  H  38%  73%  28%  67%  36%  34%  49% 100%  35%  33%  38%  31%  43%  56%  22%  36%  49%  42%  50%  36%  41%  40%  37%  44%  31%  38%  63%  33%  36%  32%  43%  42%  40%  32%  48%  40% 
  I  31%  34%  22%  25%  34%  41%  26%  35% 100%  71%  49%  50%  35%  60%  29%  33%  20%  43%  54%  66%  24%  30%  34%  62%  41%  65%  29%  84%  44%  41%  38%  37%  24%  35%  58%  23% 
  J  36%  36%  27%  25%  40%  57%  27%  33%  71% 100%  44%  54%  33%  50%  28%  37%  19%  39%  56%  56%  24%  37%  35%  51%  49%  56%  29%  70%  42%  52%  46%  45%  27%  40%  58%  26% 
  K  40%  40%  40%  30%  42%  37%  30%  38%  49%  44% 100%  37%  45%  45%  13%  38%  29%  64%  48%  49%  47%  36%  39%  61%  40%  43%  29%  50%  41%  36%  42%  44%  43%  36%  47%  44% 
  L  47%  34%  35%  27%  40%  50%  30%  31%  50%  54%  37% 100%  33%  33%  27%  37%  20%  38%  36%  46%  29%  48%  42%  40%  56%  43%  30%  48%  53%  47%  51%  39%  31%  45%  38%  33% 
  M  42%  42%  33%  41%  40%  35%  46%  43%  35%  33%  45%  33% 100%  39%  14%  39%  34%  39%  36%  35%  42%  42%  38%  40%  37%  30%  41%  33%  26%  31%  43%  38%  44%  33%  37%  45% 
  N  31%  45%  24%  48%  28%  37%  39%  56%  60%  50%  45%  33%  39% 100%  17%  32%  36%  39%  54%  53%  32%  30%  35%  61%  36%  51%  43%  62%  30%  32%  39%  35%  31%  33%  51%  30% 
  O  11%  25%  19%  20%  33%  22%  24%  22%  29%  28%  13%  27%  14%  17% 100%  11%   9%  12%  32%  19%  15%  12%   8%  22%  15%  34%  31%  24%  41%  28%  16%  26%  13%   6%  30%  12% 
  P  56%  36%  37%  40%  54%  57%  36%  36%  33%  37%  38%  37%  39%  32%  11% 100%  35%  48%  29%  35%  34%  52%  45%  32%  49%  28%  31%  36%  30%  41%  53%  36%  51%  31%  32%  52% 
  Q  35%  49%  34%  57%  28%  31%  49%  49%  20%  19%  29%  20%  34%  36%   9%  35% 100%  36%  24%  25%  56%  36%  36%  25%  26%  17%  47%  18%  20%  23%  41%  30%  48%  34%  25%  45% 
  R  38%  48%  39%  35%  42%  42%  32%  42%  43%  39%  64%  38%  39%  39%  12%  48%  36% 100%  41%  43%  47%  36%  42%  51%  40%  38%  32%  43%  34%  34%  45%  43%  48%  29%  48%  49% 
  S  38%  50%  28%  34%  40%  42%  36%  50%  54%  56%  48%  36%  36%  54%  32%  29%  24%  41% 100%  51%  32%  37%  33%  56%  47%  45%  48%  52%  40%  48%  42%  55%  29%  29%  64%  28% 
  T  35%  36%  25%  28%  32%  42%  37%  36%  66%  56%  49%  46%  35%  53%  19%  35%  25%  43%  51% 100%  31%  37%  35%  55%  50%  59%  31%  69%  43%  42%  37%  46%  29%  33%  44%  29% 
  U  38%  42%  52%  42%  37%  31%  38%  41%  24%  24%  47%  29%  42%  32%  15%  34%  56%  47%  32%  31% 100%  38%  33%  32%  31%  21%  35%  23%  25%  32%  43%  31%  55%  34%  29%  53% 
  V  86%  46%  40%  35%  48%  56%  32%  40%  30%  37%  36%  48%  42%  30%  12%  52%  36%  36%  37%  37%  38% 100%  39%  30%  56%  28%  34%  30%  34%  44%  60%  43%  43%  51%  33%  42% 
  W  41%  40%  36%  37%  35%  45%  28%  37%  34%  35%  39%  42%  38%  35%   8%  45%  36%  42%  33%  35%  33%  39% 100%  42%  39%  31%  28%  33%  27%  36%  49%  35%  41%  46%  36%  43% 
  X  32%  38%  25%  33%  33%  36%  31%  44%  62%  51%  61%  40%  40%  61%  22%  32%  25%  51%  56%  55%  32%  30%  42% 100%  38%  54%  30%  64%  35%  36%  39%  42%  29%  29%  53%  27% 
  Y  57%  40%  25%  29%  47%  58%  30%  31%  41%  49%  40%  56%  37%  36%  15%  49%  26%  40%  47%  50%  31%  56%  39%  38% 100%  36%  26%  40%  30%  45%  50%  44%  36%  40%  42%  35% 
  Z  29%  41%  23%  25%  42%  36%  26%  38%  65%  56%  43%  43%  30%  51%  34%  28%  17%  38%  45%  59%  21%  28%  31%  54%  36% 100%  31%  69%  58%  42%  34%  32%  23%  31%  45%  22% 
  0  32%  57%  29%  60%  31%  30%  55%  63%  29%  29%  29%  30%  41%  43%  31%  31%  47%  32%  48%  31%  35%  34%  28%  30%  26%  31% 100%  22%  32%  29%  39%  40%  40%  22%  55%  39% 
  1  32%  30%  21%  23%  35%  42%  23%  33%  84%  70%  50%  48%  33%  62%  24%  36%  18%  43%  52%  69%  23%  30%  33%  64%  40%  69%  22% 100%  45%  41%  36%  36%  22%  37%  52%  22% 
  2  32%  41%  29%  28%  50%  33%  31%  36%  44%  42%  41%  53%  26%  30%  41%  30%  20%  34%  40%  43%  25%  34%  27%  35%  30%  58%  32%  45% 100%  53%  35%  33%  27%  33%  30%  26% 
  3  49%  34%  34%  22%  57%  52%  25%  32%  41%  52%  36%  47%  31%  32%  28%  41%  23%  34%  48%  42%  32%  44%  36%  36%  45%  42%  29%  41%  53% 100%  45%  62%  39%  31%  39%  37% 
  4  56%  44%  33%  35%  43%  56%  37%  43%  38%  46%  42%  51%  43%  39%  16%  53%  41%  45%  42%  37%  43%  60%  49%  39%  50%  34%  39%  36%  35%  45% 100%  44%  40%  43%  44%  41% 
  5  42%  48%  31%  32%  46%  44%  40%  42%  37%  45%  44%  39%  38%  35%  26%  36%  30%  43%  55%  46%  31%  43%  35%  42%  44%  32%  40%  36%  33%  62%  44% 100%  43%  33%  43%  42% 
  6  46%  46%  49%  39%  43%  42%  37%  40%  24%  27%  43%  31%  44%  31%  13%  51%  48%  48%  29%  29%  55%  43%  41%  29%  36%  23%  40%  22%  27%  39%  40%  43% 100%  43%  30%  93% 
  7  50%  34%  36%  27%  32%  41%  22%  32%  35%  40%  36%  45%  33%  33%   6%  31%  34%  29%  29%  33%  34%  51%  46%  29%  40%  31%  22%  37%  33%  31%  43%  33%  43% 100%  28%  42% 
  8  35%  47%  24%  33%  38%  40%  28%  48%  58%  58%  47%  38%  37%  51%  30%  32%  25%  48%  64%  44%  29%  33%  36%  53%  42%  45%  55%  52%  30%  39%  44%  43%  30%  28% 100%  29% 
  9  45%  44%  52%  40%  43%  41%  37%  40%  23%  26%  44%  33%  45%  30%  12%  52%  45%  49%  28%  29%  53%  42%  43%  27%  35%  22%  39%  22%  26%  37%  41%  42%  93%  42%  29% 100% 

The similarity of each shape to itself is 100%, as would be expected. The 10 shapes from this set that are least strongly differentiated by this algorithm are:

6 and 9 (similarity  93%)
A and V (similarity  86%)
I and 1 (similarity  84%)
B and H (similarity  73%)
I and J (similarity  71%)
J and 1 (similarity  70%)
T and 1 (similarity  69%)
Z and 1 (similarity  69%)
D and H (similarity  67%)
D and G (similarity  66%)
I and T (similarity  66%)

To examine the algorithm's sensitivity to translation, rotation and scaling (which is intended to be low) I then wrote a program that:

  1. randomly positioned, scaled and orientated a character from the set [A-Z0-9];
  2. extracted the shape boundary in a manner identical to the way that the candidate shape's boundaries were obtained;
  3. generated a polar-boundary for the shape and tested it against each candidate in turn to establish the shape with the greatest similarity to the shape presented;
  4. outputed the details of the shape generated, the candidate chosen and whether a correct match was made.

This was repeated 1000 times to obtain an estimate of the algorithm's error rate.

This test should be considered ideal conditions for the execution of the shape matching technique since there is no noise or distortion beyond that generated by the limitations of the algorithms employed; we should expect the technique to do well on this data or not at all.

The result of the first test run was that 18 tests failed to match giving a failure rate of 1.8%. The letters that failed to match were generally those that were least strongly differentiated: 6, 9, 1 and V. In addition to these the letter Q also failed to match on two occasions.

The number of errors that occured for each letter in the first test run.

I also produced a scatter-plot to identify if errors were clustered at particular angles or scales. From this I inferred that matching was weaker for smaller sized shapes due to increasing irregularity in their boundaries due to pixelation.

Successes and failures for each size & angle in the first test run.

Since it was clear (though unsurprising) that distortion of the shape boundary through its rectilinear construction could lower the algorithm's accuracy I repeated the test with one modification: the candidate shapes were rendered at 45° before being converted into a polar boundary; in all other regards the test was identical. The angle of 45° was chosen to maximize the difference between the shape boundaries extracted (since at that angle straight lines will become perfectly jagged and vice-versa).

The result of this second test run was that 16 tests failed to match giving a failure rate of 1.6%. Again the letters that failed to match were those that were least strongly differentiated and the scatter-plot was similar.

The number of errors that occured for each letter in the second test run.
Successes and failures for each size & angle in the first test run.

Conclusion

The algorithm presented here clearly has a few weaknesses. Its failure rate is not inconsequential even in the straightforward test presented here though it is possible that the general similarity of the shapes may have skewed the results; the characters 6 and 9 in the chosen typeface are all-but indistinguishable, disregarding these errors the failure rate over both tests was 1.2%. Secondly, the algorithm still requires an exhaustive search over all angles to identify a best-match irrespective of orientation; there are however a number of optimizations that this algorithm in particular lends itself to.

With some extra work, the algorithm could prove useful in some specialized contexts.

Source Code

At this time, the source code for this algorithm is in a highly variable state – I intend to publish it at a later date. Anyone who is keenly interested in obtaining the source code is welcome to email me for a copy until then.