Next: Example: A Small Database Up: Visual Tracking Previous: Landmark Recognition

Landmark Tracking

In order to describe the environment, images must be obtained from representative viewpoints. For the purposes of this discussion, let us assume that we select viewpoints that cover the configuration space in a uniform grid. This is by no means a requirement or constraint, but rather a simplifying assumption. In order to achieve computational efficiency, viewpoints are selected such that the camera is facing in a consistent orientation. Once the sample images have been acquired, they are used to automatically learn a suitable set of tracked landmarks for subsequent positioning.

The set of tracked landmarks is initially defined by the set of single candidate landmarks observed in a selected bootstrap image from the database. These candidate landmarks, which become prototypes for matching, are selected in this manner in order to guarantee uniqueness - no two landmark candidates will overlap within the same image. Matching is based on a minimisation of the Euclidean distance between the principal components encodings of the prototype and of the observed candidate landmarks in each image. Typically, we select the initial bootstrap image to be the one that is taken from a camera position closest to the centroid of all visited camera positions. Given this initial set of prototypes, the candidate landmarks in each of the remaining images are considered for inclusion in one of the tracked landmarks. Consideration for inclusion in a set is based on the following methodology:
algorithm419

The goal of this method is to grow landmark sets as much as possible in configuration space so that a candidate landmark can be matched to the correct target over a large portion of the space. The local search in the neighbourhood of is performed in order to counter the effects of any instabilities in the underlying landmark detector. Figure 4.3 shows a typical landmark set. Each thumbnail image corresponds to the landmark as detected in the image taken at the corresponding grid position in configuration space. Grid positions with no corresponding thumbnail image indicate positions in the configuration space where no landmark candidate was found that matched the prototype. This can occur under three separate conditions: first, no suitable landmark candidate was detected by the landmark detector; second, a landmark candidate was detected but found a better match to a different prototype in the local neighbourhood; or third, a landmark candidate was detected but differed too greatly in appearance from the prototype - that is, the distance in the subspace was greater than the user-defined threshold.

Figure 4.3: A typical landmark set. Each thumbnail corresponds to the landmark as detected in the image taken at the corresponding grid position in camera space.

A tracked landmark is the essential modelling primitive that defines the ``map'' and which is used for subsequent correspondence and position estimation. It should be noted that the tracking method makes no assumptions regarding position within the image, which somewhat relaxes some constraints that could be imposed on the pose of the camera - landmarks can be matched regardless of their image position.

Next: Example: A Small Database Up: Visual Tracking Previous: Landmark Recognition

Robert Sim
Tue Jul 21 10:30:54 EDT 1998