Global contrast based salient region detection.pdf

发布时间：2022-05-31 发布人：admin 分类：说明书资料大小：5.60M 资料格式：pdf 举报版权申诉

wu_luke-9964541-4744302542998519430.pdf-第1页.png

第1页 / 共9页

wu_luke-9964541-4744302542998519430.pdf-第2页.png

第2页 / 共9页

wu_luke-9964541-4744302542998519430.pdf-第3页.png

第3页 / 共9页

wu_luke-9964541-4744302542998519430.pdf-第4页.png

第4页 / 共9页

wu_luke-9964541-4744302542998519430.pdf-第5页.png

第5页 / 共9页

wu_luke-9964541-4744302542998519430.pdf-第6页.png

第6页 / 共9页

wu_luke-9964541-4744302542998519430.pdf-第7页.png

第7页 / 共9页

wu_luke-9964541-4744302542998519430.pdf-第8页.png

第8页 / 共9页

ReprintCoverPage

0159

Global Contrast based Salient Region Detection ∗ Ming-Ming Cheng1 Guo-Xin Zhang1 Niloy J. Mitra2 Xiaolei Huang3 1 TNList, Tsinghua University Shi-Min Hu1 3 Lehigh University 2 KAUST chengmingvictor@gmail.com Abstract Reliable estimation of visual saliency allows appropriate processing of images without prior knowledge of their contents, and thus remains an important step in many computer vi- sion tasks including image segmentation, object recognition, and adaptive compression. We propose a regional contrast based saliency extraction algorithm, which simultaneously eval- uates global contrast diﬀerences and spatial coherence. The proposed algorithm is simple, eﬃcient, and yields full resolution saliency maps. Our algorithm consistently outperformed existing saliency detection methods, yielding higher precision and better recall rates, when evaluated using one of the largest publicly available data sets. We also demonstrate how the extracted saliency map can be used to create high quality segmentation masks for subsequent image processing. Keywords: Saliency detection, global contrast, early and biologically-inspired Vision. The documents contained in these pages are included to ensure timely dissemina- Disclaimer: tion of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have oﬀered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder. @conference{11cvpr/Cheng_Saliency, title={Global Contrast based Salient Region Detection}, author={Ming-Ming Cheng and Guo-Xin Zhang and Niloy J. Mitra and Xiaolei Huang and Shi-Min Hu}, booktitle={IEEE CVPR}, pages={409--416}, year={2011}, } ∗Project page: http://cg.cs.tsinghua.edu.cn/people/∼cmm/saliency/ 1

Global Contrast based Salient Region Detection Ming-Ming Cheng1 Guo-Xin Zhang1 Niloy J. Mitra2 Xiaolei Huang3 Shi-Min Hu1 3 Lehigh University 1 TNList, Tsinghua University 2 KAUST chengmingvictor@gmail.com Abstract Reliable estimation of visual saliency allows appropriate processing of images without prior knowledge of their con- tents, and thus remains an important step in many computer vision tasks including image segmentation, object recog- nition, and adaptive compression. We propose a regional contrast based saliency extraction algorithm, which simul- taneously evaluates global contrast differences and spatial coherence. The proposed algorithm is simple, efﬁcient, and yields full resolution saliency maps. Our algorithm con- sistently outperformed existing saliency detection methods, yielding higher precision and better recall rates, when eval- uated using one of the largest publicly available data sets. We also demonstrate how the extracted saliency map can be used to create high quality segmentation masks for sub- sequent image processing. 1. Introduction Humans routinely and effortlessly judge the importance of image regions, and focus attention on important parts. Computationally detecting such salient image regions re- mains a signiﬁcant goal, as it allows preferential allocation of computational resources in subsequent image analysis and synthesis. Extracted saliency maps are widely used in many computer vision applications including object- of-interest image segmentation [13, 18], object recogni- tion [25], adaptive compression of images [6], content- aware image editing [28, 33, 30, 9], and image retrieval [4]. Saliency originates from visual uniqueness, unpre- dictability, rarity, or surprise, and is often attributed to vari- ations in image attributes like color, gradient, edges, and boundaries. Visual saliency, being closely related to how we perceive and process visual stimuli, is investigated by mul- tiple disciplines including cognitive psychology [26, 29], neurobiology [8, 22], and computer vision [17, 2]. Theo- ries of human attention hypothesize that the human vision system only processes parts of an image in detail, while leaving others nearly unprocessed. Early work by Treis- man and Gelade [27], Koch and Ullman [19], and subse- Figure 1. Given input images (top), a global contrast analysis is used to compute high resolution saliency maps (middle), which can be used to produce masks (bottom) around regions of interest. quent attention theories proposed by Itti, Wolfe and others, suggest two stages of visual attention: fast, pre-attentive, bottom-up, data driven saliency extraction; and slower, task dependent, top-down, goal driven saliency extraction. We focus on bottom-up data driven saliency detection using image contrast. It is widely believed that human cor- tical cells may be hard wired to preferentially respond to high contrast stimulus in their receptive ﬁelds [23]. We pro- pose contrast analysis for extracting high-resolution, full- ﬁeld saliency maps based on the following observations: A global contrast based method, which separates a large-scale object from its surroundings, is preferred over local contrast based methods producing high saliency values at or near object edges. Global considerations enable assignment of compara- ble saliency values to similar image regions, and can uniformly highlight entire objects. Saliency of a region depends mainly on its contrast to the nearby regions, while contrasts to distant regions are less signiﬁcant. Saliency maps should be fast and easy to generate to allow processing of large image collections, and facil- itate efﬁcient image classiﬁcation and retrieval. We propose a histogram-based contrast method (HC) to measure saliency. HC-maps assign pixel-wise saliency val- 409

(b) IT[17] (c) MZ[21] (d) GB[14] (a) original Figure 2. Saliency maps computed by different state-of-the-art methods (b-i), and with our proposed HC (j) and RC methods (k). Most results highlight edges, or are of low resolution. See also Figure 6 (and our project webpage). (e) SR[15] (f) AC[1] (k) RC (g) CA[12] (h) FT[2] (i) LC[32] (j) HC ues based simply on color separation from all other image pixels to produce full resolution saliency maps. We use a histogram-based approach for efﬁcient processing, while employing a smoothing procedure to control quantization artifacts. Note that our algorithm is targeted towards natu- ral scenes, and maybe suboptimal for extracting saliency of highly textured scenes (see Figure 12). As an improvement over HC-maps, we incorporate spa- tial relations to produce region-based contrast (RC) maps where we ﬁrst segment the input image into regions, and then assign saliency values to them. The saliency value of a region is now calculated using a global contrast score, mea- sured by the region’s contrast and spatial distances to other regions in the image. We have extensively evaluated our methods on publicly available benchmark data sets, and compared our methods with (eight) state-of-the-art saliency methods [17, 21, 32, 14, 15, 1, 2, 12] as well as with manually produced ground truth annotations1. The experiments show signiﬁcant im- provements over previous methods both in precision and re- call rates. Overall, compared with HC-maps, RC-maps pro- duce better precision and recall rates, but at the cost of in- creased computations. Encouragingly, we observe that the saliency cuts extracted using our saliency maps are, in most cases, comparable to manual annotations. We also present application of the extracted saliency maps to segmentation, context aware resizing, and non-photo realistic rendering. 2. Related Work We focus on relevant literature targeting pre-attentive bottom-up saliency detection, which may be biologically motivated, or purely computational, or involve both as- pects. Such methods utilize low-level processing to deter- mine the contrast of image regions to their surroundings, us- ing feature attributes such as intensity, color, and edges [2]. We broadly classify the algorithms into local and global schemes. Local contrast based methods investigate the rarity of image regions with respect to (small) local neighbor- hoods. Based on the highly inﬂuential biologically inspired early representation model introduced by Koch and Ull- man [19], Itti et al. [17] deﬁne image saliency using central- surrounded differences across multi-scale image features. 1Results for 1000 images and prototype software are available at the project webpage: http://cg.cs.tsinghua.edu.cn/people/%7Ecmm/saliency/ Ma and Zhang [21] propose an alternative local contrast analysis for generating saliency maps, which is then ex- tended using a fuzzy growth model. Harel et al. [14] nor- malize the feature maps of Itti et al., to highlight conspic- uous parts and permit combination with other importance maps. Liu et al. [20] ﬁnd multi-scale contrast by linearly combining contrast in a Gaussian image pyramid. More recently, Goferman et al. [12] simultaneously model local low-level clues, global considerations, visual organization rules, and high-level features to highlight salient objects along with their contexts. Such methods using local contrast tend to produce higher saliency values near edges instead of uniformly highlighting salient objects (see Figure 2). Global contrast based methods evaluate saliency of an image region using its contrast with respect to the entire image. Zhai and Shah [32] deﬁne pixel-level saliency based on a pixel’s contrast to all other pixels. However, for efﬁ- ciency they use only luminance information, thus ignoring distinctiveness clues in other channels. Achanta et al. [2] propose a frequency tuned method that directly deﬁnes pixel saliency using a pixel’s color difference from the average image color. The elegant approach, however, only consid- ers ﬁrst order average color, which can be insufﬁcient to analyze complex variations common in natural images. In Figures 6 and 7, we show qualitative and quantitative weak- nesses of such approaches. Furthermore, these methods ig- nore spatial relationships across image parts, which can be critical for reliable and coherent saliency detection (see Sec- tion 5). 3. Histogram Based Contrast Based on the observation from biological vision that the vision system is sensitive to contrast in visual signal, we propose a histogram-based contrast (HC) method to deﬁne saliency values for image pixels using color statistics of the input image. Speciﬁcally, the saliency of a pixel is deﬁned using its color contrast to all other pixels in the image, i.e., the saliency value of a pixel Ik in image I is deﬁned as, S(Ik) = ∀Ii∈I D(Ik, Ii), (1) where D(Ik, Ii) is the color distance metric between pixels Ik and Ii in the L∗a∗b∗space (see also [32]). Equation 1 can be expanded by pixel order to have the following form, S(Ik) = D(Ik, I1) + D(Ik, I2) + + D(Ik, IN ), (2) 410

n y c n e u q e r f Figure 3. Given an input image (left), we compute its color his- togram (middle). Corresponding histogram bin colors are shown in the lower bar. The quantized image (right) uses only 43 his- togram bin colors and still retains sufﬁcient visual quality for saliency detection. where N is the number of pixels in image I. It is easy to see that pixels with the same color value have the same saliency value under this deﬁnition, since the measure is oblivious to spatial relations. Hence, rearranging Equation 2 such that the terms with the same color value cj are grouped together, we get saliency value for each color as, S(Ik) = S(cl) = fjD(cl, cj), (3) j=1 where cl is the color value of pixel Ik, n is the number of distinct pixel colors, and fj is the probability of pixel color cj in image I. Note that in order to prevent salient region color statistics from being corrupted by similar colors from other regions, one can develop a similar scheme using vary- ing window masks. However, given the strict efﬁciency re- quirement, we take the simple global approach. Histogram based speed up. Naively evaluating the saliency value for each image pixel using Equation 1 takes O(N 2) time, which is computationally too expensive even for medium sized images. The equivalent representation in Equation 3, however, takes O(N) + O(n2) time, implying that computational efﬁciency can be improved to O(N) if O(n2) O(N). Thus, the key to speed up is to reduce the number of pixel colors in the image. However, the true- color space contains 2563 possible colors, which is typically larger than the number of image pixels. Zhai and Shah [32] reduce the number of colors, n, by only using luminance. In this way, n2 = 2562 (typically 2562 N). However, their method has the disadvantage that the distinctiveness of color information is ignored. In this work, we use the full color space instead of luminance only. To reduce the number of colors needed to consider, we ﬁrst quantize each color channel to have 12 different values, which reduces the number of colors to 123 = 1728. Con- sidering that color in a natural image typically covers only a small portion of the full color space, we further reduce the number of colors by ignoring less frequently occurring colors. By choosing more frequently occurring colors and ensuring these colors cover the colors of more than 95% of the image pixels, we typically are left with around n = 85 colors (see Section 5 for experimental details). The colors Figure 4. Saliency of each color, normalized to the range [0, 1], be- fore (left) and after (right) color space smoothing. Corresponding saliency maps are shown in the respective insets. of the remaining pixels, which comprise fewer than 5% of the image pixels, are replaced by the closest colors in the histogram. A typical example of such quantization is shown in Figure 3. Note that again due to efﬁciency requirements we select the simple histogram based quantization instead of optimizing for an image speciﬁc color palette. Color space smoothing. Although we can efﬁciently compute color contrast by building a compact color his- togram using color quantization and choosing more fre- quent colors, the quantization itself may introduce artifacts. Some similar colors may be quantized to different values. In order to reduce noisy saliency results caused by such randomness, we use a smoothing procedure to reﬁne the saliency value for each color. We replace the saliency value of each color by the weighted average of the saliency val- ues of similar colors (measured by L∗a∗b∗distance). This is actually a smoothing process in the color feature space. Typically we choose m = n/4 nearest colors to reﬁne the saliency value of color c by, S(c) = 1 (4) (T − D(c, ci))S(ci) (m − 1)T where T =m tion factor comes fromm i=1 D(c, ci) is the sum of distances between color c and its m nearest neighbors ci, and the normaliza- i=1(T − D(c, ci)) = (m − 1)T. Note that we use a linearly-varying smoothing weight (T − D(c, ci)) to assign larger weights to colors closer to c in the color feature space. In our experiments, we found that such linearly-varying weights are better than Gaussian weights, which fall off too sharply. Figure 4 shows the typical ef- fect of color space smoothing with the corresponding his- tograms sorted by decreasing saliency values. Note that similar histogram bins are closer to each other after such smoothing, indicating that similar colors have higher likeli- hood of being assigned similar saliency values, thus reduc- ing quantization artifacts (see Figure 7). m i=1 Implementation details. To quantize the color space into 123 different colors, we uniformly divide each color chan- nel into 12 different levels. While the quantization of colors is performed in the RGB color space, we measure color dif- ferences in the L∗a∗b∗color space because of its perceptual accuracy. However, we do not perform quantization directly 411

use the number of pixels in ri as w(ri) to emphasize color contrast to bigger regions. The color distance between two regions r1 and r2 is deﬁned as, n1 n2 Dr(r1, r2) = f(c1,i)f(c2,j)D(c1,i, c2,j) (6) i=1 j=1 where f(ck,i) is the probability of the i-th color ck,i among all nk colors in the k-th region rk, k = f1, 2g. Note that we use the probability of a color in the probability density function (i.e. normalized color histogram) of the region as the weight for this color to emphasize more the color differ- ences between dominant colors. Storing and calculating the regular matrix format his- togram for each region is inefﬁcient since each region typ- ically contains a small number of colors in the color his- togram of the whole image. Instead, we use a sparse his- togram representation for efﬁcient storage and computation. Spatially weighted region contrast. We further incorpo- rate spatial information by introducing a spatial weighting term in Equation 5 to increase the effects of closer regions and decrease the effects of farther regions. Speciﬁcally, for any region rk, the spatially weighted region contrast based saliency is deﬁned as: exp(Ds(rk, ri)/σ2 s)w(ri)Dr(rk, ri) (7) S(rk) = rk=ri where Ds(rk, ri) is the spatial distance between regions rk and ri, and σs controls the strength of spatial weighting. Larger values of σs reduce the effect of spatial weighting so that contrast to farther regions would contribute more to the saliency of the current region. The spatial distance between two regions is deﬁned as the Euclidean distance between s = 0.4 their centroids. In our implementation, we use σ2 with pixel coordinates normalized to [0, 1]. 5. Experimental Comparisons We have evaluated the results of our approach on the publicly available database provided by Achanta et al. [2]. To the best of our knowledge, the database is the largest of its kind, and has ground truth in the form of accu- rate human-marked labels for salient regions. We com- pared the proposed global contrast based methods with 8 state-of-the-art saliency detection methods. Following [2], we selected these methods according to: number of cita- tions (IT[17] and SR[15]), recency (GB[14], SR, AC[1], FT[2] and CA[12]), variety (IT is biologically-motivated, MZ[21] is purely computational, GB is hybrid, SR works in the frequency domain, AC and FT output full resolution saliency maps), and being related to our approach (LC[32]). Figure 5. Image regions generated by Felzenszwalb and Hutten- locher’s segmentation method [11] (left), region contrast based segmentation with (left-middle) and without (right-middle) dis- tance weighting. Incorporating the spatial context, we get a high quality saliency cut (right) comparable to human labeled ground truth. in the L∗a∗b∗color space since not all colors in the range L∗ 2 [0, 100], and a∗, b∗ 2 [127, 127] necessarily cor- respond to real colors. Experimentally we observed worse quantization artifacts using direct L∗a∗b∗color space quan- tization. Best results were obtained by quantization in the RGB space while measuring distance in the L∗a∗b∗color space, as opposed to performing both quantization and dis- tance calculation in a single color space, either RGB or L∗a∗b∗. 4. Region Based Contrast Humans pay more attention to those image regions that contrast strongly with their surroundings [10]. Besides con- trast, spatial relationships play an important role in human attention. High contrast to its surrounding regions is usually stronger evidence for saliency of a region than high contrast to far-away regions. Since directly introducing spatial rela- tionships when computing pixel-level contrast is computa- tionally expensive, we introduce a contrast analysis method, region contrast (RC), so as to integrate spatial relationships into region-level contrast computation. In RC, we ﬁrst seg- ment the input image into regions, then compute color con- trast at the region level, and deﬁne the saliency for each region as the weighted sum of the region’s contrasts to all other regions in the image. The weights are set according to the spatial distances with farther regions being assigned smaller weights. Region contrast by sparse histogram comparison. We ﬁrst segment the input image into regions using a graph- based image segmentation method [11]. Then we build the color histogram for each region as in Section 3. For a region rk, we compute its saliency value by measuring its color contrast to all other regions in the image, w(ri)Dr(rk, ri), (5) S(rk) = rk=ri where w(ri) is the weight of region ri and Dr(,) is the color distance metric between the two regions. Here we 412

(a) original (b) LC (c) CA (d) FT (e) HC-maps (f) RC-maps (g) RCC Figure 6. Visual comparison of saliency maps. (a) original images, saliency maps produced using (b) Zhai and Shah [32], (c) Goferman et al. [12], (d) Achanta et al. [2], (e) our HC and (f) RC methods, and (g) RC-based saliency cut results. Our methods generate uniformly highlighted salient regions. (See our project webpage for all results on the full benchmark dataset.) We used our methods and the others to compute saliency maps for all the 1000 images in the database. Table 1 com- pares the average time taken by each method. Our algo- rithms, HC and RC, are implemented in C++. For the other methods namely IT, GB, SR, FT and CA, we used the au- thors’ implementations, while for LC, we implemented the algorithm in C++ since we could not ﬁnd the authors’ im- plementation. For typical natural images, our HC method needs O(N) computation time and is sufﬁciently efﬁcient for real-time applications. In contrast, our RC variant is slower as it requires image segmentation [11], but produces superior quality saliency maps. In order to comprehensively evaluate the accuracy of our methods for salient object segmentation, we performed two experiments using different objective comparison measures. In the ﬁrst experiment, to segment salient objects and cal- culate precision and recall curves [15], we binarized the saliency map using every possible ﬁxed threshold, similar to the ﬁxed thresholding experiment in [2]. In the second experiment, we segment salient objects by iteratively ap- plying GrabCut [24] initialized using thresholded saliency maps, as we will describe later. We also use the obtained saliency maps as importance weighting for content aware image resizing and non-photo realistic rendering. Segmentation by xed thresholding. The simplest way to get a binary segmentation of salient objects is to thresh- old the saliency map with a threshold Tf 2 [0, 255]. To reli- ably compare how well various saliency detection methods highlight salient regions in images, we vary the threshold Tf from 0 to 255. Figure 7 shows the resulting precision vs. recall curves. We also present the beneﬁts of adding the color space smoothing and spatial weighting schemes, along with objective comparison with other saliency extrac- tion methods. Visual comparison of saliency maps obtained by the various methods can be seen in Figures 2 and 6. The precision and recall curves clearly show that our methods outperform the other eight methods. The extrem- ities of the precision vs. recall curve are interesting: At maximum recall where Tf = 0, all pixels are retained as positives, i.e., considered to be foreground, so all the meth- ods have the same precision and recall values; precision 0.2 and recall 1.0 at this point indicate that, on average, there are 20% image pixels belonging to the ground truth salient regions. At the other end, the minimum recall values of our methods are higher than those of the other methods, because the saliency maps computed by our methods are smoother and contain more pixels with the saliency value 255. Saliency cut. We now consider the use of the com- puted saliency map to assist in salient object segmentation. 413

Method Time(s) Code SR[15] IT[17] MZ[21] GB[14] 1.614 0.611 0.064 Matlab Matlab Matlab 0.070 C++ FT[2] AC[1] CA[12] LC[32] 0.018 0.016 C++ C++ 0.109 53.1 C++ Matlab HC 0.019 C++ RC 0.253 C++ Table 1. Average time taken to compute a saliency map for images in the database by Achanta et al. [2]. Most images in the database (see our project webpage) have resolution 400 × 300. Algorithms were tested using a Dual Core 2.6 GHz machine with 2GB RAM. 1 0.8 0.6 n o i s i c e r P 0.4 0.2 0 GB MZ FT NHC HC NRC RC 0.2 0.4 1 0.8 0.6 n o i s i c e r P 0.4 0.2 0.6 0.8 1 0 IT SR AC CA LC HC RC 0.2 0.4 Precision Recall F−beta 1 0.8 0.6 0.4 0.2 0.6 0.8 0 1 IT MZ GB SR AC FT CA LC HC RC Recall Recall Figure 7. Precision-recall curves for naive thresholding of saliency maps using 1000 publicly available benchmark images. (Left, mid- dle) Different options of our method compared with GB[14], MZ[21], FT[2], IT[17], SR[15], AC[1], CA[12], and LC[32]. NHC denotes a naive version of our HC method with color space smoothing disabled, and NRC denotes our RC method with spatial related weighting disabled. (Right) Precision-recall bars for our saliency cut algorithm using different saliency maps as initialization. Our method RC shows high precision, recall, and Fβ values over the 1000-image database. (Please refer to our project webpage for the respective result images.) Saliency maps have been previously employed for unsuper- vised object segmentation: Ma and Zhang [21] ﬁnd rect- angular salient regions by fuzzy region growing on their saliency maps. Ko and Nam [18] select salient regions us- ing a support vector machine trained on image segment fea- tures, and then cluster these regions to extract salient ob- jects. Han et al. [13] model color, texture, and edge features in a Markov random ﬁeld framework to grow salient ob- ject regions from seed values in the saliency maps. More recently, Achanta et al. [2] average saliency values within image segments produced by mean-shift segmentation, and then ﬁnd salient objects by identifying image segments that have average saliency above a threshold that is set to be twice the mean saliency value of the entire image. In our approach, we iteratively apply GrabCut [24] to reﬁne the segmentation result initially obtained by thresh- olding the saliency map (see Figure 8). Instead of manually selecting a rectangular region to initialize the process, as in Figure 8. Saliency Cut. (Left to right) Initial segmentation, trimap after ﬁrst iteration, trimap after second iteration, ﬁnal segmenta- tion, and manually labeled ground truth. In the segmented images, blue is foreground, gray is background, while in the trimaps, fore- ground is red, background is green, and unknown regions are left unchanged. 414 classical GrabCut, we automatically initialize GrabCut us- ing a segmentation obtained by binarizing the saliency map using a ﬁxed threshold; the threshold is chosen empirically to be the threshold that gives 95% recall rate in our ﬁxed thresholding experiments. Once initialized, we iteratively run GrabCut to improve the saliency cut result. (At most 4 iterations are run in our experiments.) After each iteration, we use dilation and ero- sion operations on the current segmentation result to get a new trimap for the next GrabCut iteration. As shown in Figure 8, the region outside the dilated region is set to back- ground, the region inside the eroded region is set to fore- ground, and the remaining areas are set to unknown in the trimap. GrabCut, which by itself is an iterative process us- ing Gaussian mixture models and graph cut, helps to re- ﬁne salient object regions at each step. Regions closer to an initial salient object region are more likely to be part of that salient object than far-away regions. Thus, our new initialization enables GrabCut to include nearby salient re- gions, and exclude non-salient regions according to color feature dissimilarity. In the implementation, we set a nar- row image-border region (15 pixels wide) to be always in the background in order to avoid slow convergence in the border region. Figure 8 shows two examples of our saliency cut algo- rithm. In the ﬂag example, unwanted regions are correctly excluded during GrabCut iterations. In the ﬂower exam- ple, our saliency cut method successfully expanded the ini- tial salient regions (obtained directly from the saliency map) and converged to an accurate segmentation result.

(a) original (b) LC[32] (c) CA[12] (d) FT[2] (e) HC-maps (f) RC-maps (g) ground truth Figure 9. Saliency cut using different saliency maps for initialization. The respective saliency maps are shown in Figure 6. original CA RC original CA RC Figure 10. Comparison of content aware image resizing [33] re- sults using CA[12]saliency maps and our RC saliency maps. To objectively evaluate our new saliency cut method us- ing our RC-map as initialization, we compare our results with results obtained by coupling iterative GrabCut with ini- tializations from saliency maps computed by other methods. For consistency, we binarize each such saliency map using a threshold that gives 95% recall rate in the corresponding ﬁxed thresholding experiment (see Figure 7). A visual com- parison of the results is shown in Figure 9. Average preci- sion, recall, and F -Measure are compared over the entire ground-truth database [2], with the F -Measure deﬁned as: Fβ = (1 + β2)P recision Recall β2 P recision + Recall . (8) We use β2 = 0.3 as in Achanta et al. [2] to weigh preci- sion more than recall. As can be seen from the comparison (see Figures 7-right and 9), saliency cut using our RC and HC saliency maps signiﬁcantly outperform other methods. Compared with the state-of-the-art results on this database (precision = 75%, recall = 83%), we by Achanta et al. achieved better accuracy (precision = 90%, recall = 90%). (Our demo software is available at our project webpage.) Content aware image resizing. In image re-targeting, saliency maps are usually used to specify relative impor- tance across image parts (see also [3]). We experimented with using our saliency maps in the image resizing method proposed by Zhang et al. [33]2, which distributes distortion energy to relatively non-salient regions of an image while preserving both global and local image features. Figure 10 compares the resizing results using our RC-maps with the results using CA[12] saliency maps. Our RC saliency maps help produce better resizing results since the saliency val- ues in salient object regions are piece-wise smooth, which is important for energy based resizing methods. CA saliency maps, having higher saliency values at object boundaries, are less suitable for applications like resizing, which require entire salient objects to be uniformly highlighted. 2The authors’ original implementation is publicly available. Figure 11. (Middle, right) FT[2] and RC saliency maps are used respectively for stylized rendering [16] of an input image (left). Our method produces a better saliency map, see insets, resulting in improved preservation of details, e.g., around the head and the fence regions. Figure 12. Challenging examples for our histogram based meth- ods involve non-salient regions with similar colors as the salient parts (top), or an image with textured background (bottom). (Left to right) Input image, HC-map, HC saliency cut, RC-map, RC saliency cut. Non-photorealistic rendering. Artists often abstract im- ages and highlight meaningful parts of an image while masking out unimportant regions [31]. Inspired by this ob- servation, a number of non-photorealistic rendering (NPR) efforts use saliency maps to generate interesting effects [7]. We experimentally compared our work with the most re- lated, state-of-the-art saliency detection algorithm [2] in the context of a recent NPR technique [16] (see Figure 11). Our RC-maps give better saliency masks, which help the NPR method to better preserve details in important image parts and region boundaries, while smoothing out others. 6. Conclusions and Future Work We presented global contrast based saliency computa- tion methods, namely Histogram based Contrast (HC) and spatial information-enhanced Region based Contrast (RC). While the HC method is fast and generates results with ﬁne details, the RC method generates spatially coherent high quality saliency maps at the cost of reduced computational efﬁciency. We evaluated our methods on the largest pub- licly available data set and compared our scheme with eight other state-of-the-art methods. Experiments indicate the 415

分享到：

赞收藏

资料库

Global contrast based salient region detection.pdf

相关推荐

人工智能

热门标签

最新资料