The Denser the Better?

    Technology  2022-06-24

          Does sampling features densely from an image give better clustering results?

          2006 CVPR, Kristen Grauman et al.

          Unsupervised Learning of Categories from Sets of Partially Matching Image Features

          Another open question regarding the concepts presented is what particular types of local features and interest operators are most appropriate for discerning object categories within this framework. For instance, in our recent experiments doing supervised classification with a large set of categories, we saw that sets of features sampled densely from the images may provide more accurate categorization results.

          2010 CVPR, Tinne Tuytelaars

          Dense Interest Points

          In the recent Pascal Visual Object Classes Challenges, it has been shown that the best performance is obtained by using a combination of both interest points and densely sampled image patches.

    Interest Points

          Interest points yield a high repeatability, i.e. they can be extracted reliably and are often found again at similar locations in other images of the same object or scene. Also, with interest points the user can choose the appropriate level of viewpoint and illumination invariance, depending on the application and expected variability. As the name suggests, interest point detectors focus on ‘interesting’ regions, which are typically regions with high information content, that can be localized precisely. These are in some sense optimal when the goal is to find correspondences between two views of the same object or scene. However, when the goal is image interpretation, it is unclear whether the heuristics on which these feature extraction methods are based actually perform a good feature selection task or not.

          On the downside, the number of interest points extracted from an image varies a lot based on the image content – often from a few hundred to several thousand without changing the parameters. Sometimes, when the contrast in an image is low, not a single interest point is found, making the image representation useless.

    Dense Sampling Features

          Dense sampling on a regular grid, on the other hand, results in a good coverage of the entire object or scene and a constant amount of features per image area. Regions with less contrast contribute equally to the overall image representation. This is based on the idea that, even if such patches cannot be matched accurately, they do contain valuable information regarding the image content, that may help to interpret the scene. Also, spatial relations between features follow a regular pattern that is easily represented in a simple model. This is important when using Markov or conditional random fields or when modeling the spatial configuration of features. This does not hold for interest points, where spatial relations are more arbitrary and, as a consequence, incorporating information on the spatial configuration is more complicated. For instance, consider the definition of the concept of ‘neighbors’. Sivic et al. have proposed taking the N nearest neighbors, but these are in some cases quite far away. Alternatively, one can use all interest points within a certain radius around the given interest point, but then the number of neighbors can vary greatly.
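
          The contrast between the two notions of 'neighbors' can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 64x48 image size, the 8-pixel step and the `sparse` point set are all made up for the example and do not come from either paper.

```python
import math

def dense_grid(width, height, step):
    """Centers (x, y) of a regular sampling grid with the given step."""
    offset = step // 2
    return [(x, y)
            for y in range(offset, height, step)
            for x in range(offset, width, step)]

def k_nearest(points, query, k):
    """The k points closest to `query`, excluding the query itself."""
    others = [p for p in points if p != query]
    return sorted(others, key=lambda p: math.dist(p, query))[:k]

def within_radius(points, query, radius):
    """All points within `radius` of `query`, excluding the query itself."""
    return [p for p in points
            if p != query and math.dist(p, query) <= radius]

# Regular grid: every interior point has the same neighborhood.
grid = dense_grid(64, 48, 8)
print(len(within_radius(grid, (28, 20), 8)))   # 4
print(len(within_radius(grid, (36, 28), 8)))   # 4

# Hypothetical interest points: clustered in one corner, sparse elsewhere.
sparse = [(5, 5), (6, 7), (7, 4), (50, 40), (60, 10)]
print(len(within_radius(sparse, (5, 5), 8)))    # 2
print(len(within_radius(sparse, (50, 40), 8)))  # 0
nearest = k_nearest(sparse, (50, 40), 1)[0]
print(nearest, round(math.dist(nearest, (50, 40)), 1))  # (60, 10) 31.6
```

          On the grid, a fixed radius always returns the same number of neighbors at the same offsets. On the irregular set, the radius rule returns anywhere from zero to several neighbors, and the nearest-neighbor rule returns a point more than 30 pixels away.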

          On the downside, dense sampling cannot reach the same level of repeatability as obtained with interest points, unless sampling is performed extremely densely – but then the number of features quickly grows unacceptably large. In practice, researchers often use an overlap between patches of 50% or less. This is clearly not enough to guarantee similar descriptors in case of structured scenes, especially when combined with SIFT-like descriptors, which further divide the region into smaller subpatches (in spite of SIFT's robustness to small misalignments).
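
          To see how quickly the feature count grows with overlap, here is a small sketch. It assumes square patches laid out on a regular grid that stays inside the image; the 640x480 image size, the 16-pixel patch and the function names are illustrative choices, not values from the paper.

```python
def step_for_overlap(patch_size, overlap):
    """Grid step (pixels) so adjacent patches share `overlap` of their width."""
    return max(1, round(patch_size * (1.0 - overlap)))

def patch_count(width, height, patch_size, overlap):
    """Number of patches that fit fully inside a width x height image."""
    step = step_for_overlap(patch_size, overlap)
    nx = (width - patch_size) // step + 1
    ny = (height - patch_size) // step + 1
    return nx * ny

for overlap in (0.0, 0.5, 0.75, 0.9375):
    print(overlap, patch_count(640, 480, 16, overlap))
# 0.0 1200
# 0.5 4661
# 0.75 18369
# 0.9375 290625
```

          Each halving of the step roughly quadruples the number of patches, so moving from 50% overlap to a one-pixel step multiplies the count by about 60 in this example.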

     

