I was made aware of the use of radius-vector functions (θ-s curves) and tangent-angle functions for shape recognition many years ago (see Simple geometrical shape parameters by Volodymyr Kindratenko). One of the primary limitations of these techniques is that they are based on shape outline alone and disregard any 'holes'. Since I'm currently acquainting myself with computer vision techniques, I thought that it would be interesting and educational to see if I could come up with a variation on these algorithms that can operate with regard to 'holes' within shapes. I've made no significant attempt to provide a formalism for my approach since I'm sure that much better procedures exist than the one I've come up with; having said that, my approach does seem to work with modest success at the basic level to which I've tested it.
My approach is to generate a normalized polar projection of the boundary of each shape to be compared. In the images below, each shape (in this instance characters in the typeface "Times New Roman") is displayed together with its polar projection which has been superimposed in the top-left-hand corner.
Each image contains the following information:
The value of this projection is solely based on its usefulness for efficiently matching similar shapes, irrespective of position, size or orientation. Matching proceeds by measuring the degree of overlap between polar projections. A large degree of overlap is regarded in the approach presented here as indicative of similarity-of-shape. From this perspective, the algorithm can be regarded as a variant of the simplest possible tool for comparing binary images – the xor metric. However, the projection has the advantage of normalizing the shape presentation to a degree that allows for efficient evaluation independent of translation, rotation or scaling (albeit upto an estimation of the true xor distance between two shapes).
Given a binary image with the shape white and the background black, the algorithm proceeds to create a polar boundary as follows:
Once two or more polar boundaries have been constructructed in this way, they can be compared for similarity in a single orientation as follows:
This can be extended to multiple orientations simply by performing the above evaluation at each angle under consideration and choosing the best fit (though there may be more sensible approaches to this - see the notes).
Note the following about the algorithms and my implementation of them:
To examine the viability of the algorithm for shape recognition, I wrote a program that generated a set of 36 polar boundaries from the upper case letters of the alphabet together with the ten digits 0–9. The shapes and boundaries generated are displayed in a grid below:
I then calculated the similarity (expressed as a percentage) of each shape to itself and every other shape, these scores were compiled into the following matrix.
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 A 100% 46% 39% 32% 53% 56% 30% 38% 31% 36% 40% 47% 42% 31% 11% 56% 35% 38% 38% 35% 38% 86% 41% 32% 57% 29% 32% 32% 32% 49% 56% 42% 46% 50% 35% 45% B 46% 100% 30% 64% 38% 43% 53% 73% 34% 36% 40% 34% 42% 45% 25% 36% 49% 48% 50% 36% 42% 46% 40% 38% 40% 41% 57% 30% 41% 34% 44% 48% 46% 34% 47% 44% C 39% 30% 100% 31% 44% 42% 28% 28% 22% 27% 40% 35% 33% 24% 19% 37% 34% 39% 28% 25% 52% 40% 36% 25% 25% 23% 29% 21% 29% 34% 33% 31% 49% 36% 24% 52% D 32% 64% 31% 100% 26% 28% 66% 67% 25% 25% 30% 27% 41% 48% 20% 40% 57% 35% 34% 28% 42% 35% 37% 33% 29% 25% 60% 23% 28% 22% 35% 32% 39% 27% 33% 40% E 53% 38% 44% 26% 100% 49% 29% 36% 34% 40% 42% 40% 40% 28% 33% 54% 28% 42% 40% 32% 37% 48% 35% 33% 47% 42% 31% 35% 50% 57% 43% 46% 43% 32% 38% 43% F 56% 43% 42% 28% 49% 100% 31% 34% 41% 57% 37% 50% 35% 37% 22% 57% 31% 42% 42% 42% 31% 56% 45% 36% 58% 36% 30% 42% 33% 52% 56% 44% 42% 41% 40% 41% G 30% 53% 28% 66% 29% 31% 100% 49% 26% 27% 30% 30% 46% 39% 24% 36% 49% 32% 36% 37% 38% 32% 28% 31% 30% 26% 55% 23% 31% 25% 37% 40% 37% 22% 28% 37% H 38% 73% 28% 67% 36% 34% 49% 100% 35% 33% 38% 31% 43% 56% 22% 36% 49% 42% 50% 36% 41% 40% 37% 44% 31% 38% 63% 33% 36% 32% 43% 42% 40% 32% 48% 40% I 31% 34% 22% 25% 34% 41% 26% 35% 100% 71% 49% 50% 35% 60% 29% 33% 20% 43% 54% 66% 24% 30% 34% 62% 41% 65% 29% 84% 44% 41% 38% 37% 24% 35% 58% 23% J 36% 36% 27% 25% 40% 57% 27% 33% 71% 100% 44% 54% 33% 50% 28% 37% 19% 39% 56% 56% 24% 37% 35% 51% 49% 56% 29% 70% 42% 52% 46% 45% 27% 40% 58% 26% K 40% 40% 40% 30% 42% 37% 30% 38% 49% 44% 100% 37% 45% 45% 13% 38% 29% 64% 48% 49% 47% 36% 39% 61% 40% 43% 29% 50% 41% 36% 42% 44% 43% 36% 47% 44% L 47% 34% 35% 27% 40% 50% 30% 31% 50% 54% 37% 100% 33% 33% 27% 37% 20% 38% 36% 46% 29% 48% 42% 40% 56% 43% 30% 48% 53% 47% 51% 39% 31% 45% 38% 33% M 42% 42% 33% 41% 40% 35% 46% 43% 35% 33% 45% 33% 100% 39% 14% 39% 34% 39% 36% 35% 42% 42% 38% 40% 37% 30% 41% 33% 26% 31% 43% 38% 44% 33% 37% 45% N 31% 45% 24% 48% 28% 37% 39% 56% 60% 50% 45% 33% 39% 100% 17% 32% 36% 39% 54% 53% 32% 30% 35% 61% 36% 51% 43% 62% 30% 32% 39% 35% 31% 33% 51% 30% O 11% 25% 19% 20% 33% 22% 24% 22% 29% 28% 13% 27% 14% 17% 100% 11% 9% 12% 32% 19% 15% 12% 8% 22% 15% 34% 31% 24% 41% 28% 16% 26% 13% 6% 30% 12% P 56% 36% 37% 40% 54% 57% 36% 36% 33% 37% 38% 37% 39% 32% 11% 100% 35% 48% 29% 35% 34% 52% 45% 32% 49% 28% 31% 36% 30% 41% 53% 36% 51% 31% 32% 52% Q 35% 49% 34% 57% 28% 31% 49% 49% 20% 19% 29% 20% 34% 36% 9% 35% 100% 36% 24% 25% 56% 36% 36% 25% 26% 17% 47% 18% 20% 23% 41% 30% 48% 34% 25% 45% R 38% 48% 39% 35% 42% 42% 32% 42% 43% 39% 64% 38% 39% 39% 12% 48% 36% 100% 41% 43% 47% 36% 42% 51% 40% 38% 32% 43% 34% 34% 45% 43% 48% 29% 48% 49% S 38% 50% 28% 34% 40% 42% 36% 50% 54% 56% 48% 36% 36% 54% 32% 29% 24% 41% 100% 51% 32% 37% 33% 56% 47% 45% 48% 52% 40% 48% 42% 55% 29% 29% 64% 28% T 35% 36% 25% 28% 32% 42% 37% 36% 66% 56% 49% 46% 35% 53% 19% 35% 25% 43% 51% 100% 31% 37% 35% 55% 50% 59% 31% 69% 43% 42% 37% 46% 29% 33% 44% 29% U 38% 42% 52% 42% 37% 31% 38% 41% 24% 24% 47% 29% 42% 32% 15% 34% 56% 47% 32% 31% 100% 38% 33% 32% 31% 21% 35% 23% 25% 32% 43% 31% 55% 34% 29% 53% V 86% 46% 40% 35% 48% 56% 32% 40% 30% 37% 36% 48% 42% 30% 12% 52% 36% 36% 37% 37% 38% 100% 39% 30% 56% 28% 34% 30% 34% 44% 60% 43% 43% 51% 33% 42% W 41% 40% 36% 37% 35% 45% 28% 37% 34% 35% 39% 42% 38% 35% 8% 45% 36% 42% 33% 35% 33% 39% 100% 42% 39% 31% 28% 33% 27% 36% 49% 35% 41% 46% 36% 43% X 32% 38% 25% 33% 33% 36% 31% 44% 62% 51% 61% 40% 40% 61% 22% 32% 25% 51% 56% 55% 32% 30% 42% 100% 38% 54% 30% 64% 35% 36% 39% 42% 29% 29% 53% 27% Y 57% 40% 25% 29% 47% 58% 30% 31% 41% 49% 40% 56% 37% 36% 15% 49% 26% 40% 47% 50% 31% 56% 39% 38% 100% 36% 26% 40% 30% 45% 50% 44% 36% 40% 42% 35% Z 29% 41% 23% 25% 42% 36% 26% 38% 65% 56% 43% 43% 30% 51% 34% 28% 17% 38% 45% 59% 21% 28% 31% 54% 36% 100% 31% 69% 58% 42% 34% 32% 23% 31% 45% 22% 0 32% 57% 29% 60% 31% 30% 55% 63% 29% 29% 29% 30% 41% 43% 31% 31% 47% 32% 48% 31% 35% 34% 28% 30% 26% 31% 100% 22% 32% 29% 39% 40% 40% 22% 55% 39% 1 32% 30% 21% 23% 35% 42% 23% 33% 84% 70% 50% 48% 33% 62% 24% 36% 18% 43% 52% 69% 23% 30% 33% 64% 40% 69% 22% 100% 45% 41% 36% 36% 22% 37% 52% 22% 2 32% 41% 29% 28% 50% 33% 31% 36% 44% 42% 41% 53% 26% 30% 41% 30% 20% 34% 40% 43% 25% 34% 27% 35% 30% 58% 32% 45% 100% 53% 35% 33% 27% 33% 30% 26% 3 49% 34% 34% 22% 57% 52% 25% 32% 41% 52% 36% 47% 31% 32% 28% 41% 23% 34% 48% 42% 32% 44% 36% 36% 45% 42% 29% 41% 53% 100% 45% 62% 39% 31% 39% 37% 4 56% 44% 33% 35% 43% 56% 37% 43% 38% 46% 42% 51% 43% 39% 16% 53% 41% 45% 42% 37% 43% 60% 49% 39% 50% 34% 39% 36% 35% 45% 100% 44% 40% 43% 44% 41% 5 42% 48% 31% 32% 46% 44% 40% 42% 37% 45% 44% 39% 38% 35% 26% 36% 30% 43% 55% 46% 31% 43% 35% 42% 44% 32% 40% 36% 33% 62% 44% 100% 43% 33% 43% 42% 6 46% 46% 49% 39% 43% 42% 37% 40% 24% 27% 43% 31% 44% 31% 13% 51% 48% 48% 29% 29% 55% 43% 41% 29% 36% 23% 40% 22% 27% 39% 40% 43% 100% 43% 30% 93% 7 50% 34% 36% 27% 32% 41% 22% 32% 35% 40% 36% 45% 33% 33% 6% 31% 34% 29% 29% 33% 34% 51% 46% 29% 40% 31% 22% 37% 33% 31% 43% 33% 43% 100% 28% 42% 8 35% 47% 24% 33% 38% 40% 28% 48% 58% 58% 47% 38% 37% 51% 30% 32% 25% 48% 64% 44% 29% 33% 36% 53% 42% 45% 55% 52% 30% 39% 44% 43% 30% 28% 100% 29% 9 45% 44% 52% 40% 43% 41% 37% 40% 23% 26% 44% 33% 45% 30% 12% 52% 45% 49% 28% 29% 53% 42% 43% 27% 35% 22% 39% 22% 26% 37% 41% 42% 93% 42% 29% 100%
The similarity of each shape to itself is 100%, as would be expected. The 10 shapes from this set that are least strongly differentiated by this algorithm are:
6 and 9 (similarity 93%) A and V (similarity 86%) I and 1 (similarity 84%) B and H (similarity 73%) I and J (similarity 71%) J and 1 (similarity 70%) T and 1 (similarity 69%) Z and 1 (similarity 69%) D and H (similarity 67%) D and G (similarity 66%) I and T (similarity 66%)
To examine the algorithm's sensitivity to translation, rotation and scaling (which is intended to be low) I then wrote a program that:
This was repeated 1000 times to obtain an estimate of the algorithm's error rate.
This test should be considered ideal conditions for the execution of the shape matching technique since there is no noise or distortion beyond that generated by the limitations of the algorithms employed; we should expect the technique to do well on this data or not at all.
The result of the first test run was that 18 tests failed to match giving a failure rate of 1.8%. The letters that failed to match were generally those that were least strongly differentiated: 6, 9, 1 and V. In addition to these the letter Q also failed to match on two occasions.
I also produced a scatter-plot to identify if errors were clustered at particular angles or scales. From this I inferred that matching was weaker for smaller sized shapes due to increasing irregularity in their boundaries due to pixelation.
Since it was clear (though unsurprising) that distortion of the shape boundary through its rectilinear construction could lower the algorithm's accuracy I repeated the test with one modification: the candidate shapes were rendered at 45° before being converted into a polar boundary; in all other regards the test was identical. The angle of 45° was chosen to maximize the difference between the shape boundaries extracted (since at that angle straight lines will become perfectly jagged and vice-versa).
The result of this second test run was that 16 tests failed to match giving a failure rate of 1.6%. Again the letters that failed to match were those that were least strongly differentiated and the scatter-plot was similar.
The algorithm presented here clearly has a few weaknesses. Its failure rate is not inconsequential even in the straightforward test presented here though it is possible that the general similarity of the shapes may have skewed the results; the characters 6 and 9 in the chosen typeface are all-but indistinguishable, disregarding these errors the failure rate over both tests was 1.2%. Secondly, the algorithm still requires an exhaustive search over all angles to identify a best-match irrespective of orientation; there are however a number of optimizations that this algorithm in particular lends itself to.
With some extra work, the algorithm could prove useful in some specialized contexts.
At this time, the source code for this algorithm is in a highly variable state – I intend to publish it at a later date. Anyone who is keenly interested in obtaining the source code is welcome to email me for a copy until then.