How the marker locator works
This section describes how the marker is designed and how it is detected.
Fourier transform
Given \(N\) observations (\(x_0\), \(x_1\), …, \(x_{N-1}\)), the \(k\)th term in the discrete Fourier transform is given by the equation:
\[X_k = \sum_{m=0}^{N-1} x_m \, e^{-i 2 \pi k m / N}\]
Notice that the discrete Fourier transform is a weighted sum over a set of observations, that is, a convolution. In the standard situation, the set of observations is sampled along a linear dimension.
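To illustrate the weighted-sum view, the following minimal sketch evaluates a single term of the discrete Fourier transform directly from its definition, assuming the common convention with a negative exponent. The function name `dft_term` and the test signal are chosen for illustration only.

```python
import cmath
import math


def dft_term(samples, k):
    """Return the k-th DFT term of `samples` as a weighted sum."""
    n_samples = len(samples)
    total = 0j
    for m, x_m in enumerate(samples):
        # Complex weight applied to observation number m.
        weight = cmath.exp(-2j * math.pi * k * m / n_samples)
        total += x_m * weight
    return total


# A cosine with three periods over 64 samples.
signal = [math.cos(2 * math.pi * 3 * m / 64) for m in range(64)]
magnitudes = [abs(dft_term(signal, k)) for k in range(6)]
```

Among the terms computed here, only the one with `k = 3` has a magnitude that is not close to zero, which is what makes the weighted sum usable as a detector for a given number of repetitions.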
Instead of sampling along a linear dimension, the sampling is done over a 2D area, the kernel window. This is similar to a 2D convolution, which for an image \(f\) and a kernel \(g\) is defined as follows:
\[(f * g)(x, y) = \sum_{u} \sum_{v} f(u, v) \, g(x - u, y - v)\]
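A 2D convolution can be sketched in a few lines of plain Python. This is a minimal illustration that only evaluates positions where the kernel fits entirely inside the image; the function name `convolve2d_valid` is hypothetical, and a practical implementation would normally rely on an image processing library.

```python
def convolve2d_valid(image, kernel):
    """2D convolution at all positions where the kernel fits in the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    result = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for a in range(kh):
                for b in range(kw):
                    # The kernel is flipped in both directions, which is
                    # what distinguishes convolution from correlation.
                    acc += image[i + a][j + b] * kernel[kh - 1 - a][kw - 1 - b]
            row.append(acc)
        result.append(row)
    return result


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, -1]]
response = convolve2d_valid(image, kernel)
```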
The task is now to design a pattern that can be added to the object to be tracked and that can be detected with the convolution-based approach described above.
Square wave
A square wave \(x(t)\) with amplitude \(1\) and frequency \(f\) can be represented with the Fourier series
\[x(t) = \frac{4}{\pi} \sum_{k=1}^{\infty} \frac{\sin\left(2 \pi (2k - 1) f t\right)}{2k - 1}\]
Given the function \(x(t)\), the Fourier transform can be used to determine the elements of the Fourier series.
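As a numerical check, the sketch below samples one period of a square wave and estimates the amplitude of its first harmonics from the discrete Fourier transform. The number of samples and the helper name `harmonic_amplitude` are assumptions made for this example.

```python
import cmath
import math

N = 1024  # samples over exactly one period of the square wave

# Sample at mid-points so that no sample falls on a jump of the square wave.
square = [1.0 if math.sin(2 * math.pi * (m + 0.5) / N) > 0 else -1.0
          for m in range(N)]


def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic, estimated from the k-th DFT term."""
    n_samples = len(samples)
    term = sum(x * cmath.exp(-2j * math.pi * k * m / n_samples)
               for m, x in enumerate(samples))
    return 2 * abs(term) / n_samples


amplitudes = {k: harmonic_amplitude(square, k) for k in range(1, 6)}
```

The odd harmonics come out close to \(4 / (\pi k)\), while the even harmonics vanish, in agreement with the Fourier series of the square wave.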
The plain marker
Instead of placing a linear square wave in an image, the square wave is bent around a certain point and its high and low values are rendered as high and low intensities. This is illustrated in Table 2. The generated pattern has a well-defined spatial center and, as will be seen later, the pattern can be detected using convolution.
By altering the number of repetitions of the black and white pattern around the central point, a set of different markers can be generated. The number of repetitions of the pattern is denoted the order (\(n\)) of the pattern. Fig. 11 shows 20 patterns with the orders \(n = 1\) to \(n = 20\).
Fig. 11 Markers with different orders from 1 to 20.
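A marker of a given order can be generated with a few lines of Python. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the choice of which sectors are white (those where \(\sin(n\theta) \geq 0\)) is arbitrary.

```python
import math


def marker_intensity(x, y, order):
    """Intensity (1 = white, 0 = black) of a plain marker centered at (0, 0)."""
    theta = math.atan2(y, x)
    # A square wave in the angle: `order` white and `order` black sectors.
    return 1 if math.sin(order * theta) >= 0 else 0


def plain_marker(size, order):
    """Render the marker on a size x size grid covering [-1, 1] x [-1, 1]."""
    half = (size - 1) / 2
    return [[marker_intensity((col - half) / half, (row - half) / half, order)
             for col in range(size)]
            for row in range(size)]


def count_transitions(order, steps=360):
    """Count black/white changes along a circle around the marker center."""
    angles = [2 * math.pi * (j + 0.5) / steps for j in range(steps)]
    values = [marker_intensity(math.cos(a), math.sin(a), order)
              for a in angles]
    return sum(values[j] != values[j - 1] for j in range(steps))


image = plain_marker(65, 4)
```

Walking once around the center of a marker of order \(n\) crosses \(2n\) edges between black and white regions, which `count_transitions` confirms for the generated pattern.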
Detection of marker
Detecting a square wave with \(n\) repetitions using Fourier analysis relies on a convolution of the \(N\) measurements of the signal with the kernel \(Y_n\):
\[Y_n(m) = e^{-i 2 \pi n m / N}, \quad m = 0, 1, \ldots, N - 1\]
A somewhat similar kernel is used to detect the n-fold edge markers. The kernel is specified using polar coordinates as follows:
where \(\theta\) is the direction and \(r\) is the distance to the actual position in the kernel. The origin of the polar coordinate system is placed in the middle of the kernel, and the coordinates are scaled such that a circle with radius 1 is the largest circle that fits inside the kernel. Four different views of a kernel with order \(n = 4\) are shown in Table 3.
Fig. 12 Real part of kernel.
Fig. 13 Imaginary part of kernel.
Fig. 14 Magnitude of kernel.
Fig. 15 Argument of kernel.
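A kernel of this kind can be sketched in Python as shown below. The angular factor \(e^{i n \theta}\) is what makes the kernel respond to \(n\)-fold patterns. The radial weighting \(r^n e^{-8 r^2}\), the function name `symmetry_kernel`, and the kernel size are assumptions made for illustration, not values taken from the text above.

```python
import cmath
import math


def symmetry_kernel(order, size):
    """Complex kernel on a size x size grid spanning [-1, 1] x [-1, 1]."""
    half = (size - 1) / 2
    kernel = []
    for row in range(size):
        y = (row - half) / half
        line = []
        for col in range(size):
            z = complex((col - half) / half, y)
            r = abs(z)
            # z ** order equals r**order * exp(i * order * theta).
            # The Gaussian window is an assumed radial weighting.
            line.append(z ** order * math.exp(-8 * r ** 2))
        kernel.append(line)
    return kernel


kernel = symmetry_kernel(order=4, size=21)

# The four views of the kernel: real part, imaginary part, magnitude, argument.
real_part = [[value.real for value in line] for line in kernel]
imag_part = [[value.imag for value in line] for line in kernel]
magnitude = [[abs(value) for value in line] for line in kernel]
argument = [[cmath.phase(value) for value in line] for line in kernel]
```

With this construction, a kernel of order 4 is unchanged when it is rotated by a quarter turn, mirroring the symmetry of the marker it is meant to detect.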
To detect a marker with a known order, the input image is converted to grayscale and then convolved with the \(Z_n\) kernel. As the \(Z_n\) kernel contains complex weights, the resulting image contains complex values. The magnitude of each complex value tells how well the input image matches the kernel at that position; this is visualized in Table 4. The argument of the complex value gives the orientation of the pattern that best matches the input image.
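The detection step can be illustrated on a synthetic image. The sketch below evaluates the kernel response at a single position, the marker center, as a plain weighted sum (the kernel flip of a true convolution only changes the sign of the response for odd orders). The kernel shape, the marker convention, and all names are assumptions made for this example; a full detector would evaluate the response at every pixel.

```python
import cmath
import math

SIZE = 41       # image and kernel size in pixels (illustrative)
ROTATION = 0.3  # rotation of the synthetic marker in radians (illustrative)


def grid_coordinates(size):
    """Pixel coordinates scaled to the interval [-1, 1]."""
    half = (size - 1) / 2
    return [(index - half) / half for index in range(size)]


def marker_image(order, rotation, size=SIZE):
    """Synthetic grayscale image (0 = black, 1 = white) of a rotated marker."""
    coords = grid_coordinates(size)
    return [[1.0 if math.sin(order * (math.atan2(y, x) - rotation)) >= 0
             else 0.0
             for x in coords]
            for y in coords]


def symmetry_kernel(order, size=SIZE):
    """Assumed kernel: factor exp(i * order * theta) with a radial window."""
    coords = grid_coordinates(size)
    return [[complex(x, y) ** order * math.exp(-8 * (x * x + y * y))
             for x in coords]
            for y in coords]


def response_at_center(image, kernel):
    """Weighted sum of the image with the kernel placed over its center."""
    return sum(pixel * weight
               for image_row, kernel_row in zip(image, kernel)
               for pixel, weight in zip(image_row, kernel_row))


image = marker_image(order=4, rotation=ROTATION)
matching = response_at_center(image, symmetry_kernel(4))
mismatched = response_at_center(image, symmetry_kernel(5))

# With the conventions of this sketch, the argument of the response is
# pi/2 + order * rotation, so the rotation is recovered modulo 2*pi/order.
estimated_rotation = ((cmath.phase(matching) - math.pi / 2) / 4) % (math.pi / 2)
```

In this idealized setting the kernel of order 5 gives practically no response at the center of a marker of order 4, while the kernel of order 4 gives a clear response whose argument encodes the rotation of the marker.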
Estimating the quality of the detected marker
Interpreting the magnitude response to the \(Z_n\) kernel poses an issue when markers with different (but nearby) orders are present in the input image. As can be seen in Table 4, where a marker of order \(n = 4\) is being detected, there is a moderate response around the marker with order \(n = 5\) mounted on the Hubsan UAV. This issue can of course be reduced by avoiding markers with similar orders in the same image, but a better solution is to check that the algorithm actually found a marker with the requested order, that is, to assign a quality score to the detection result.
The approach used for estimating the quality of a detected marker is to utilize information about the orientation of the marker (from the argument of the kernel response) to align the located marker with the expected pattern of white and black regions. An example of a template for the position of the white and black regions is shown in Table 5. For all pixels in the white and black regions of the template, the average image intensity and the standard deviation are calculated. This gives the values \(\mu_w\), \(\mu_b\), \(\sigma_w\) and \(\sigma_b\).
A marker that matches the pattern that has been searched for (regarding order of the kernel) and is positioned correctly above the center of the pattern, will have well separated grayscale values for pixels in the white and black regions respectively. Whether this is the case is quantified by calculating the normalized difference (\(t\)) between the grayscale values in the white and black regions:
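The calculation can be sketched in Python as follows. The normalization used here, the mean of the two standard deviations, is an assumption made for illustration, as are the function name and the handling of perfectly uniform regions.

```python
import statistics


def normalized_difference(white_pixels, black_pixels):
    """Difference of the region means, scaled by the average region spread."""
    mu_w = statistics.mean(white_pixels)
    mu_b = statistics.mean(black_pixels)
    sigma_w = statistics.pstdev(white_pixels)
    sigma_b = statistics.pstdev(black_pixels)
    # Assumed normalization: the mean of the two standard deviations.
    spread = 0.5 * sigma_w + 0.5 * sigma_b
    if spread == 0:
        # Perfectly uniform regions: treat any positive difference as
        # unbounded and everything else as no separation at all.
        return float("inf") if mu_w > mu_b else 0.0
    return (mu_w - mu_b) / spread


# Well separated regions give a large value of t ...
t_good = normalized_difference([200, 210, 190, 205], [20, 30, 25, 15])
# ... while overlapping regions give a value close to zero.
t_poor = normalized_difference([100, 110, 90, 105], [95, 105, 100, 90])
```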
If \(t\) has a value above 7, there is a very large difference between the grayscale values in the white and black regions of the template. To make the estimated quality easier to interpret, the following mapping between the \(t\) value and the resulting quality score is used.
The quality score gives a number between zero and one. A low score indicates that the detected marker does not match what was searched for and is likely to be a random match. When the quality score gets above 0.5 the tracker is quite confident that the detected marker is actually what was searched for.
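One mapping with these properties is a logistic curve centered at \(t = 7\), sketched below. The logistic form and the steepness of 0.75 are assumptions made for illustration; only the range of the score and the role of the values 7 and 0.5 follow from the description above.

```python
import math


def quality_score(t, midpoint=7.0, steepness=0.75):
    """Map the normalized difference t to a score between zero and one."""
    z = steepness * (t - midpoint)
    # Evaluate the logistic function without overflowing math.exp.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


scores = [quality_score(t) for t in (0, 5, 7, 10, 20)]
```

With this choice the score is exactly 0.5 at \(t = 7\), approaches zero for small \(t\), and approaches one for large \(t\).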
The oriented marker
Even though the plain marker contains some information about its orientation (it is possible to discriminate between markers with different orientations), it is not possible for the pattern to point in a certain direction, for example to specify the orientation of a tracked object. By changing one of the black legs of the pattern to white, the pattern is given a unique orientation. This is illustrated in Fig. 18.
Fig. 18 Markers where one of the black legs has been removed to indicate an orientation of the marker. The markers have the orders \(n = 3\) to \(n = 6\).
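An oriented marker can be generated by a small change to the intensity function of the plain marker. In the sketch below, the choice of which black leg is turned white (the one in the first sector) and all function names are assumptions made for illustration.

```python
import math


def oriented_marker_intensity(x, y, order):
    """Intensity (1 = white, 0 = black) of an oriented marker at (x, y)."""
    theta = math.atan2(y, x) % (2 * math.pi)
    sector_width = 2 * math.pi / order
    sector = min(int(theta // sector_width), order - 1)
    # Within each sector the first half is white and the second half black.
    is_black = (theta % sector_width) >= sector_width / 2
    # Assumption: the black leg of sector 0 is the one turned white.
    if is_black and sector == 0:
        is_black = False
    return 0 if is_black else 1


def count_black_legs(order, steps=360):
    """Count the black legs met along a circle around the marker center."""
    angles = [2 * math.pi * (j + 0.5) / steps for j in range(steps)]
    values = [oriented_marker_intensity(math.cos(a), math.sin(a), order)
              for a in angles]
    # Each black leg starts at exactly one change from white to black.
    return sum(values[j - 1] == 1 and values[j] == 0 for j in range(steps))
```

A marker of order \(n\) generated this way has \(n - 1\) black legs, and the wide white region left by the missing leg marks the direction of the marker.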