I was made aware of the use of radius-vector functions (θ-s curves) and tangent-angle functions for shape recognition many years ago (see Simple geometrical shape parameters by Volodymyr Kindratenko). One of the primary limitations of these techniques is that they are based on shape outline alone and disregard any 'holes'. Since I'm currently acquainting myself with computer vision techniques, I thought that it would be interesting and educational to see if I could come up with a variation on these algorithms that can operate with regard to 'holes' within shapes. I've made no significant attempt to provide a formalism for my approach since I'm sure that much better procedures exist than the one I've come up with; having said that, my approach does seem to work with modest success at the basic level to which I've tested it.

My approach is to generate a normalized polar projection of the boundary of each shape to be compared. In the images below, each shape (in this instance characters in the typeface "Times New Roman") is displayed together with its polar projection which has been superimposed in the top-left-hand corner.

Each image contains the following information:

*The shape*indicated by the white area.*The centroid*indicated with a small red circle.*An extremum*indicated with a small red square.*Exterior perimeters*indicated with a green outline.*Interior perimeters*indicated with a blue outline.*The polar boundary*in which the shape is dark grey against a light-grey background.*The percentage area*covered by the shape in its polar projection.

The value of this projection is solely based on its usefulness for efficiently matching similar shapes, irrespective of position, size or orientation. Matching proceeds by measuring the degree of overlap between polar projections. A large degree of overlap is regarded in the approach presented here as indicative of similarity-of-shape. From this perspective, the algorithm can be regarded as a variant of the simplest possible tool for comparing binary images – the xor metric. However, the projection has the advantage of normalizing the shape presentation to a degree that allows for efficient evaluation independent of translation, rotation or scaling (albeit upto an estimation of the true xor distance between two shapes).

Given a binary image with the shape white and the background black,
the algorithm proceeds to create a *polar boundary* as
follows:

- All shape perimeters are extracted from the image as a list of
directions (N,S,E,W) that trace the pixel edges. The perimeters of
white (
*resp. black*) areas proceed with positive (*resp. negative*) winding. This is achieved naturally as a consequence of the marching squares algorithm used. - All perimeters are assembled into a single boundary that represents the outline of the shape.
- The centroid of the boundary is calculated with a fairly high
degree of efficiency as follows: Each perimeter in the boundary is
considered separately. A perimeter's centroid is computed by first
deriving scanline decompositions from the perimeter in both the
vertical and horizontal directions. Each of these structures can
be efficiently scanned to calculate the centroid's horizontal and
vertical coordinates. Perimeters with positive
(
*resp. negative*) winding are assigned centroids with positive (*resp. negative*) mass. Under this convention, the boundary centroid can readily be computed as the mass weighted average of its perimeter centroids. - The extreme point of the boundary is computed; that being the point (not necessarily unique) on the boundary which is furthest from the boundary's centroid.
- A polar plot of the boundary can then be generated. The plot is
constructed over a fixed number of angles (this number can be regarded
as a global constant for most purposes) that define a set of lines,
radiating from the boundary centroid, at which the plot is
sampled. The boundary perimeters are walked, and each time a radius is
crossed, the
**square**of the perimeter's distance from the centroid is recorded. - The crossings thus identified are sorted by their distance from the centroid and henceforth treated as boundary intervals. In the case where the centroid falls within the shape's interior, there will be an odd number of crossings, in this case zero is added to ensure that all intervals are closed. The polar plot is normalized using the distance to the shape's extreme point; The maximum distance that any interval extends from the centroid is unit length.

Once two or more polar boundaries have been constructructed in this way, they can be compared for similarity in a single orientation as follows:

- For each radius, the intervals of each polar boundary are intersected and their lengths summed.
- The sums arising from each set of intervals are themselves summed. This is used as a measure of goodness of fit.
- This goodness of fit is normalized into the range [0,1] by dividing it by the arithmetic mean of the total length of all intervals in both polar boundaries.

This can be extended to multiple orientations simply by performing the above evaluation at each angle under consideration and choosing the best fit (though there may be more sensible approaches to this - see the notes).

Note the following about the algorithms and my implementation of them:

- The square of a perimeter's distance from the centroid is used
because, for the purpose of measuring the difference between polar
boundaries to compute shape similarity, discrepencies at a greater
distance from the centroid represent a greater difference in shape
area (as per the r
^{2}term in polar integration). - Interpolation is required when a single step along the perimeter spans more than one radius (it is possible for just one step to span upto π/2 radians) . At present linear interpolation is used - this could be improved.
- No use is made of a perimeter's distance between radii. Using the average of these values instead of a single data point, might result in smoother and less noisy polar boundaries.
- Using the square of differences was investigated as a measure of goodness of fit but the results were inferior to those obtained by simply measuring the intersection.
- The value assigned as the goodness of fit is bounded by
min(t
_{A}, t_{B})/2(t_{A}+ t_{B}) where t_{X}is the total interval length (over all radii) in the polar boundary X. This is an obvious consequence of the fact that the combined length of the intersection of two sets of intervals can be no longer than the length of either set considered individually. This could be used to accelerate comparisons between shapes. - Matching polar boundaries in all possible orientations to find a best fit regardless of the relative angles of the shapes could be accelerated by choosing one, or a small number of, distinguished orientations between which comparisons could be made. One possible approach would be to choose, say, the four angles at which the shape extends furthest from the centroid and to then orientate the shape for comparison at only these angles. This might be sufficent for many shapes (though shapes which have a high degree of symmetry, but for one small feature, might fail to match).

To examine the viability of the algorithm for shape recognition, I wrote a program that generated a set of 36 polar boundaries from the upper case letters of the alphabet together with the ten digits 0–9. The shapes and boundaries generated are displayed in a grid below:

The 36 shapes generated together with the boundaries that were extracted from them.

I then calculated the similarity (expressed as a percentage) of each shape to itself and every other shape, these scores were compiled into the following matrix.

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 A 100% 46% 39% 32% 53% 56% 30% 38% 31% 36% 40% 47% 42% 31% 11% 56% 35% 38% 38% 35% 38% 86% 41% 32% 57% 29% 32% 32% 32% 49% 56% 42% 46% 50% 35% 45% B 46% 100% 30% 64% 38% 43% 53% 73% 34% 36% 40% 34% 42% 45% 25% 36% 49% 48% 50% 36% 42% 46% 40% 38% 40% 41% 57% 30% 41% 34% 44% 48% 46% 34% 47% 44% C 39% 30% 100% 31% 44% 42% 28% 28% 22% 27% 40% 35% 33% 24% 19% 37% 34% 39% 28% 25% 52% 40% 36% 25% 25% 23% 29% 21% 29% 34% 33% 31% 49% 36% 24% 52% D 32% 64% 31% 100% 26% 28% 66% 67% 25% 25% 30% 27% 41% 48% 20% 40% 57% 35% 34% 28% 42% 35% 37% 33% 29% 25% 60% 23% 28% 22% 35% 32% 39% 27% 33% 40% E 53% 38% 44% 26% 100% 49% 29% 36% 34% 40% 42% 40% 40% 28% 33% 54% 28% 42% 40% 32% 37% 48% 35% 33% 47% 42% 31% 35% 50% 57% 43% 46% 43% 32% 38% 43% F 56% 43% 42% 28% 49% 100% 31% 34% 41% 57% 37% 50% 35% 37% 22% 57% 31% 42% 42% 42% 31% 56% 45% 36% 58% 36% 30% 42% 33% 52% 56% 44% 42% 41% 40% 41% G 30% 53% 28% 66% 29% 31% 100% 49% 26% 27% 30% 30% 46% 39% 24% 36% 49% 32% 36% 37% 38% 32% 28% 31% 30% 26% 55% 23% 31% 25% 37% 40% 37% 22% 28% 37% H 38% 73% 28% 67% 36% 34% 49% 100% 35% 33% 38% 31% 43% 56% 22% 36% 49% 42% 50% 36% 41% 40% 37% 44% 31% 38% 63% 33% 36% 32% 43% 42% 40% 32% 48% 40% I 31% 34% 22% 25% 34% 41% 26% 35% 100% 71% 49% 50% 35% 60% 29% 33% 20% 43% 54% 66% 24% 30% 34% 62% 41% 65% 29% 84% 44% 41% 38% 37% 24% 35% 58% 23% J 36% 36% 27% 25% 40% 57% 27% 33% 71% 100% 44% 54% 33% 50% 28% 37% 19% 39% 56% 56% 24% 37% 35% 51% 49% 56% 29% 70% 42% 52% 46% 45% 27% 40% 58% 26% K 40% 40% 40% 30% 42% 37% 30% 38% 49% 44% 100% 37% 45% 45% 13% 38% 29% 64% 48% 49% 47% 36% 39% 61% 40% 43% 29% 50% 41% 36% 42% 44% 43% 36% 47% 44% L 47% 34% 35% 27% 40% 50% 30% 31% 50% 54% 37% 100% 33% 33% 27% 37% 20% 38% 36% 46% 29% 48% 42% 40% 56% 43% 30% 48% 53% 47% 51% 39% 31% 45% 38% 33% M 42% 42% 33% 41% 40% 35% 46% 43% 35% 33% 45% 33% 100% 39% 14% 39% 34% 39% 36% 35% 42% 42% 38% 40% 37% 30% 41% 33% 26% 31% 43% 38% 44% 33% 37% 45% N 31% 45% 24% 48% 28% 37% 39% 56% 60% 50% 45% 33% 39% 100% 17% 32% 36% 39% 54% 53% 32% 30% 35% 61% 36% 51% 43% 62% 30% 32% 39% 35% 31% 33% 51% 30% O 11% 25% 19% 20% 33% 22% 24% 22% 29% 28% 13% 27% 14% 17% 100% 11% 9% 12% 32% 19% 15% 12% 8% 22% 15% 34% 31% 24% 41% 28% 16% 26% 13% 6% 30% 12% P 56% 36% 37% 40% 54% 57% 36% 36% 33% 37% 38% 37% 39% 32% 11% 100% 35% 48% 29% 35% 34% 52% 45% 32% 49% 28% 31% 36% 30% 41% 53% 36% 51% 31% 32% 52% Q 35% 49% 34% 57% 28% 31% 49% 49% 20% 19% 29% 20% 34% 36% 9% 35% 100% 36% 24% 25% 56% 36% 36% 25% 26% 17% 47% 18% 20% 23% 41% 30% 48% 34% 25% 45% R 38% 48% 39% 35% 42% 42% 32% 42% 43% 39% 64% 38% 39% 39% 12% 48% 36% 100% 41% 43% 47% 36% 42% 51% 40% 38% 32% 43% 34% 34% 45% 43% 48% 29% 48% 49% S 38% 50% 28% 34% 40% 42% 36% 50% 54% 56% 48% 36% 36% 54% 32% 29% 24% 41% 100% 51% 32% 37% 33% 56% 47% 45% 48% 52% 40% 48% 42% 55% 29% 29% 64% 28% T 35% 36% 25% 28% 32% 42% 37% 36% 66% 56% 49% 46% 35% 53% 19% 35% 25% 43% 51% 100% 31% 37% 35% 55% 50% 59% 31% 69% 43% 42% 37% 46% 29% 33% 44% 29% U 38% 42% 52% 42% 37% 31% 38% 41% 24% 24% 47% 29% 42% 32% 15% 34% 56% 47% 32% 31% 100% 38% 33% 32% 31% 21% 35% 23% 25% 32% 43% 31% 55% 34% 29% 53% V 86% 46% 40% 35% 48% 56% 32% 40% 30% 37% 36% 48% 42% 30% 12% 52% 36% 36% 37% 37% 38% 100% 39% 30% 56% 28% 34% 30% 34% 44% 60% 43% 43% 51% 33% 42% W 41% 40% 36% 37% 35% 45% 28% 37% 34% 35% 39% 42% 38% 35% 8% 45% 36% 42% 33% 35% 33% 39% 100% 42% 39% 31% 28% 33% 27% 36% 49% 35% 41% 46% 36% 43% X 32% 38% 25% 33% 33% 36% 31% 44% 62% 51% 61% 40% 40% 61% 22% 32% 25% 51% 56% 55% 32% 30% 42% 100% 38% 54% 30% 64% 35% 36% 39% 42% 29% 29% 53% 27% Y 57% 40% 25% 29% 47% 58% 30% 31% 41% 49% 40% 56% 37% 36% 15% 49% 26% 40% 47% 50% 31% 56% 39% 38% 100% 36% 26% 40% 30% 45% 50% 44% 36% 40% 42% 35% Z 29% 41% 23% 25% 42% 36% 26% 38% 65% 56% 43% 43% 30% 51% 34% 28% 17% 38% 45% 59% 21% 28% 31% 54% 36% 100% 31% 69% 58% 42% 34% 32% 23% 31% 45% 22% 0 32% 57% 29% 60% 31% 30% 55% 63% 29% 29% 29% 30% 41% 43% 31% 31% 47% 32% 48% 31% 35% 34% 28% 30% 26% 31% 100% 22% 32% 29% 39% 40% 40% 22% 55% 39% 1 32% 30% 21% 23% 35% 42% 23% 33% 84% 70% 50% 48% 33% 62% 24% 36% 18% 43% 52% 69% 23% 30% 33% 64% 40% 69% 22% 100% 45% 41% 36% 36% 22% 37% 52% 22% 2 32% 41% 29% 28% 50% 33% 31% 36% 44% 42% 41% 53% 26% 30% 41% 30% 20% 34% 40% 43% 25% 34% 27% 35% 30% 58% 32% 45% 100% 53% 35% 33% 27% 33% 30% 26% 3 49% 34% 34% 22% 57% 52% 25% 32% 41% 52% 36% 47% 31% 32% 28% 41% 23% 34% 48% 42% 32% 44% 36% 36% 45% 42% 29% 41% 53% 100% 45% 62% 39% 31% 39% 37% 4 56% 44% 33% 35% 43% 56% 37% 43% 38% 46% 42% 51% 43% 39% 16% 53% 41% 45% 42% 37% 43% 60% 49% 39% 50% 34% 39% 36% 35% 45% 100% 44% 40% 43% 44% 41% 5 42% 48% 31% 32% 46% 44% 40% 42% 37% 45% 44% 39% 38% 35% 26% 36% 30% 43% 55% 46% 31% 43% 35% 42% 44% 32% 40% 36% 33% 62% 44% 100% 43% 33% 43% 42% 6 46% 46% 49% 39% 43% 42% 37% 40% 24% 27% 43% 31% 44% 31% 13% 51% 48% 48% 29% 29% 55% 43% 41% 29% 36% 23% 40% 22% 27% 39% 40% 43% 100% 43% 30% 93% 7 50% 34% 36% 27% 32% 41% 22% 32% 35% 40% 36% 45% 33% 33% 6% 31% 34% 29% 29% 33% 34% 51% 46% 29% 40% 31% 22% 37% 33% 31% 43% 33% 43% 100% 28% 42% 8 35% 47% 24% 33% 38% 40% 28% 48% 58% 58% 47% 38% 37% 51% 30% 32% 25% 48% 64% 44% 29% 33% 36% 53% 42% 45% 55% 52% 30% 39% 44% 43% 30% 28% 100% 29% 9 45% 44% 52% 40% 43% 41% 37% 40% 23% 26% 44% 33% 45% 30% 12% 52% 45% 49% 28% 29% 53% 42% 43% 27% 35% 22% 39% 22% 26% 37% 41% 42% 93% 42% 29% 100%

The similarity of each shape to itself is 100%, as would be expected. The 10 shapes from this set that are least strongly differentiated by this algorithm are:

6 and 9 (similarity 93%) A and V (similarity 86%) I and 1 (similarity 84%) B and H (similarity 73%) I and J (similarity 71%) J and 1 (similarity 70%) T and 1 (similarity 69%) Z and 1 (similarity 69%) D and H (similarity 67%) D and G (similarity 66%) I and T (similarity 66%)

To examine the algorithm's sensitivity to translation, rotation and scaling (which is intended to be low) I then wrote a program that:

- randomly positioned, scaled and orientated a character from the set [A-Z0-9];
- extracted the shape boundary in a manner identical to the way that the candidate shape's boundaries were obtained;
- generated a polar-boundary for the shape and tested it against each candidate in turn to establish the shape with the greatest similarity to the shape presented;
- outputed the details of the shape generated, the candidate chosen and whether a correct match was made.

This was repeated 1000 times to obtain an estimate of the algorithm's error rate.

This test should be considered ideal conditions for the execution of the shape matching technique since there is no noise or distortion beyond that generated by the limitations of the algorithms employed; we should expect the technique to do well on this data or not at all.

The result of the first test run was that 18 tests failed to match giving a failure rate of 1.8%. The letters that failed to match were generally those that were least strongly differentiated: 6, 9, 1 and V. In addition to these the letter Q also failed to match on two occasions.

The number of errors that occured for each letter in the first test run.

I also produced a scatter-plot to identify if errors were clustered at particular angles or scales. From this I inferred that matching was weaker for smaller sized shapes due to increasing irregularity in their boundaries due to pixelation.

Successes and failures for each size & angle in the first test run.

Since it was clear (though unsurprising) that distortion of the shape boundary through its rectilinear construction could lower the algorithm's accuracy I repeated the test with one modification: the candidate shapes were rendered at 45° before being converted into a polar boundary; in all other regards the test was identical. The angle of 45° was chosen to maximize the difference between the shape boundaries extracted (since at that angle straight lines will become perfectly jagged and vice-versa).

The result of this second test run was that 16 tests failed to match giving a failure rate of 1.6%. Again the letters that failed to match were those that were least strongly differentiated and the scatter-plot was similar.

The number of errors that occured for each letter in the second test run.

Successes and failures for each size & angle in the first test run.

The algorithm presented here clearly has a few weaknesses. Its failure rate is not inconsequential even in the straightforward test presented here though it is possible that the general similarity of the shapes may have skewed the results; the characters 6 and 9 in the chosen typeface are all-but indistinguishable, disregarding these errors the failure rate over both tests was 1.2%. Secondly, the algorithm still requires an exhaustive search over all angles to identify a best-match irrespective of orientation; there are however a number of optimizations that this algorithm in particular lends itself to.

With some extra work, the algorithm could prove useful in some specialized contexts.

At this time, the source code for this algorithm is in a highly variable state – I intend to publish it at a later date. Anyone who is keenly interested in obtaining the source code is welcome to email me for a copy until then.