Unsupervised Global Urban Area Mapping via Automatic Labeling from ASTER and PALSAR Satellite Images
<p>Methodology of unsupervised urban area mapping. (<b>a</b>) Two examples of traditional unsupervised classification under different distributions; (<b>b</b>) Step 1/2 of our method: find some salient candidates based on common prior knowledge; (<b>c</b>) Step 2/2 of our method: Propagate the confidence of candidates based on current distribution, select training samples automatically and perform classification.</p>
<p>Processing flow diagram of unsupervised global urban area mapping.</p>
<p>The structure of the confusion matrix for urban area mapping.</p>
<p>Distribution of investigated urban areas, which are marked by red crosses.</p>
<p>Processing results of our proposed method at Mexicali, Mexico (32.65°N, 115.52°W). (<b>a</b>) ASTER/VNIR false color image; (<b>b</b>) PALSAR false color image (image contrast was enhanced for better visual effect); (<b>c</b>) Prediction of urban/non-urban area; (<b>d</b>) Urban area confidence map derived by improved LLGC; (<b>e</b>) Generated training data; (<b>f</b>) Final urban area map.</p>
<p>Comparison results of urban area maps. (<b>a</b>) ASTER/VNIR false color image; (<b>b</b>) PALSAR false color image (image contrast was enhanced for better visual effect); (<b>c</b>) Urban area map derived by our method; (<b>d</b>) MCD urban area map; (<b>e</b>) SVM urban area map.</p>
Abstract
In this study, a novel unsupervised method for global urban area mapping is proposed. Different from traditional clustering-based unsupervised methods, in our approach a labeler is designed, which is able to automatically select training samples from satellite images by propagating common urban/non-urban knowledge through the unlabeled data. Two kinds of satellite images, captured by the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) and the Phased Array L-band Synthetic Aperture Radar (PALSAR), are exploited here. In this method, spectral features are first extracted from the original dataset, followed by coarse prediction of urban/non-urban areas via weak classifiers. By developing an improved belief-propagation based clustering algorithm, a confidence map is obtained and training data are selected via weighted sampling. Finally, the urban area map is obtained by employing the Support Vector Machine (SVM) classifier. The proposed method can generate urban area maps at a resolution of 15 m, while the same settings are used for all test cases. Experimental results involving 75 scenes from different climate zones show that our proposed method achieves an overall accuracy of 84.4% and a kappa coefficient of 0.628, which is competitive relative to the supervised SVM method.
1. Introduction
Urbanization has always been an important issue with great impacts on applications ranging from regional and global environmental change [1,2] and socio-economic problems [3] to urban planning and disaster management [4,5]. The global urbanization rate has been increasing over the past decades, and more than half of the world’s population now lives in urban settlements [6], which further increases the importance and impact of urbanization and makes it a topic of growing interest worldwide [7,8]. Global urban area maps are exploited in various studies to evaluate the influence of urbanization on the natural and human environments and to estimate important aspects of urbanization such as the size, scale and shape of cities [9]. Compared with traditional methods, satellite-based remote sensing offers advantages in monitoring such properties of urbanization, owing to its timeliness, efficiency and global coverage. Therefore, the derivation of global urban area maps and corresponding attributes from different kinds of satellite images is attracting increasing attention worldwide [10–14].
Various classification and clustering methods have been employed to recognize urban areas from satellite images; the reader may refer to [15] for an overview. However, this task remains very challenging due to the large diversity of spectral characteristics of urban areas. An example is the AVHRR 1 km global land cover product [16], which employs an unsupervised clustering algorithm on most land cover classes with the assistance of manual image interpretation. Urban areas cannot be consistently classified with this method because of their heterogeneous features and complex land use patterns. Therefore, the AVHRR product integrates additional maps from the Defense Mapping Agency to identify urban areas.
In general, urban area classification methods can be divided into two categories: supervised and unsupervised. Among supervised methods, support vector machine (SVM) based classifiers are very popular due to their good performance and robustness [17]. In [18], an urban area mapping method is proposed by combining multiple SVM classifiers via fuzzy integral and attractor dynamics. In [19], an SVM-based region growing method is presented for extracting urban areas from data captured by the Defense Meteorological Satellite Program’s Operational Line-scan System (DMSP-OLS) and Satellites Pour l’Observation de la Terre (SPOT) Vegetation (VGT). In addition, artificial neural network (ANN) based methods are also widely used [20,21], especially in early studies. Other supervised classification methods such as decision tree, random forest and logarithmic regression can also be found in urban area related studies [22–24], which achieve plausible results. In [25], several supervised classification methods are exploited together: first, logistic regression models are created to represent the prior probability of urban areas, and then supervised classification is performed by combining decision tree and boosting techniques. Among unsupervised methods, traditional clustering methods such as K-Means and the iterative self-organizing data analysis technique (ISODATA) are often exploited. In the GLC2000 land cover products, unsupervised classification methods are applied to multi-spectral and multi-temporal datasets for generating land cover maps, but regional products are produced and tuned independently by different groups [13]. In the IGBP-DIS global 1 km land cover products, an optimized K-Means algorithm for handling large datasets is utilized [26]. In some studies, both supervised and unsupervised methods are employed for recognizing urban areas. In the GlobCover product, a supervised spectral classification is first conducted to identify some specific land cover classes. Then an unsupervised clustering algorithm is applied to the spectro-temporal characteristics, followed by an automated reference-based labeling step [27,28].
Usually, supervised methods produce higher accuracy than unsupervised ones, but they require more processing steps in order to build reliable training data [19,29]. These studies provide very valuable information about urban settlements, especially for regions in developing countries, which are less documented.
However, there is a common issue for both supervised and unsupervised urban area mapping methods. Due to the large variety of local landscapes in different areas, most classifiers need to be tuned for the local study area, and the accuracy of urban maps may drop sharply if the same settings are applied directly to other areas [16]. This means that a large amount of interaction by experienced researchers is needed for parsing the results, which can be very time-consuming and expensive. The cost is even higher for supervised methods, since training samples of good quality need to be collected for each scene. In addition, most global urban mapping products have limited resolution, ranging from about 300 m to 9000 m [11].
In this paper, a robust unsupervised global urban area mapping method is proposed, which performs urban classification fully automatically for all 75 test scenes and is able to generate urban area maps at a resolution of 15 m with an average overall accuracy of 84.4%. The rest of this paper is organized as follows. In Section 2, we briefly introduce the problem and the dataset used in this study. Section 3 describes the details of the proposed method. Experimental results are presented in Section 4 and the paper is summarized in Section 5.
2. Problem Statement and Dataset
2.1. Defining Urban Area
In social and economic studies, an urban area is characterized by high population density and is usually defined by its demographic attributes according to the available information of administrative units. However, as pointed out in [30], this definition of urban area suffers from heterogeneity: national definitions can vary considerably across countries and over time. In addition, it depends on the available information of administrative units, which is less documented in developing countries and therefore results in low-resolution urban area maps.
Independent of demographic attributes of regions, in this paper urban areas are defined according to their spectral features, i.e., the value of pixels from multi-spectral satellite images. In remote sensing literature [3,31], urban areas are usually defined as places which are recognized as “built up” objects, such as buildings, roads and dams. Correspondingly, non-urban areas are defined as places without any artificial objects, such as grassland, forests, rivers and agricultural fields. This definition of urban area is homogeneous and can be applied for analyzing urban areas across different countries over time.
For global urban area mapping, the issue of sub-pixel mixing plays an important role. It is considered as one of the main reasons why various products of urban area maps which are derived from low-resolution satellite images can have significant differences [32]. In this research, two kinds of high-resolution satellite images are exploited, and the spatial resolution of our urban area map is 15 m, much higher than most existing maps. According to the analysis in [33], we believe that the 15 m urban area map is sufficient for representing most features of urban land covers. Therefore, the problem of sub-pixel mixing will not be discussed and is considered as a future task of our research. In addition, due to the limitation on the mechanism of satellite remote sensing, urban areas which are covered by non-urban objects, such as houses hidden by dense canopy of trees, will be classified as non-urban in our method.
2.2. ASTER and PALSAR Satellite Images
In the proposed method, satellite images captured by ASTER and PALSAR are exploited for generating urban area maps. The ASTER instrument is provided by the Japanese Ministry of Economy, Trade and Industry (METI) and has been operating for global coverage since December 1999 [34]. ASTER includes three separate optical subsystems with different ground resolutions: the visible and near-infrared (VNIR) radiometer, the shortwave-infrared (SWIR) radiometer, and the thermal infrared (TIR) radiometer. It supplies VNIR satellite images with a spatial resolution of 15 m, which is finer than the resolution of most existing global urban maps. In addition, VNIR is especially useful since it can provide stereo coverage in Band 3, through its nadir (Band 3N) and backward (Band 3B) views. Therefore, ASTER/VNIR images attract increasing attention and have been exploited in a number of urban area related studies such as [35–38].
In this work, four types of ASTER VNIR satellite images from three spectral bands (Band 1, Band 2, Band 3N and Band 3B) are utilized, denoted as Asterb1, Asterb2, Asterb3 and Asterb4, respectively. In addition, the research in [39] shows that terrain information is very helpful for recognizing urban areas. Thus the degree of slope, which is calculated from the digital elevation model (DEM) generated by stereoscopic analysis of ASTER/VNIR data, is also exploited.
PALSAR was developed by METI as a joint project with the Japan Aerospace Exploration Agency (JAXA), and was launched in 2006 on board the Advanced Land Observing Satellite (ALOS) [40]. Features of PALSAR, such as multi-polarization and off-nadir pointing, improved the accuracy of recognizing geological structure [41]. PALSAR satellite images have been applied to urban area mapping in recent studies [42,43], and the study in [44] shows that ALOS/PALSAR data perform better than ASTER images at distinguishing bare lands and deserts from urban areas. Therefore, PALSAR HH (horizontal transmitting, horizontal receiving) and HV (horizontal transmitting, vertical receiving) polarization images obtained in the Fine Beam Dual polarization (FBD) mode are exploited here (denoted as hh and hv, respectively). In addition, to reduce the distortion caused by large local incident angles in mountainous areas, a correction step on HH images was performed based on the method in [45], resulting in local-incident-angle corrected HH images (denoted as hhcor).
3. Methodology
3.1. Overview
Recognizing urban areas in a fully automatic way is very challenging. Traditional supervised classification methods need to build sample data for different scenes, while unsupervised ones require manual parameter tuning when handling different cases. These steps are very labor-intensive and can be quite expensive. Inspired by recent advances in semi-supervised learning methods, which combine a small number of labeled data with unlabeled data [46–48], a novel unsupervised urban area mapping method is proposed here.
The general idea of our method can be explained as follows. In Figure 1a, case 1 and case 2 stand for two examples of distributions of urban and non-urban areas, where x stands for the value of their spectral features. It can be seen that to distinguish urban/non-urban areas, the optimal threshold for case 1 is u1. However, the distribution of case 2 is somewhat different and its optimal threshold is u2. Applying u1 to case 2 will lead to many misclassifications, and vice versa. Therefore, using exactly the same classifier for both cases will suffer considerably from the difference between these two distributions. For this reason, our proposed method tries to adapt the prior knowledge to the unlabeled input data. As shown in Figure 1b, based on some general prior knowledge of the spectral distributions of landscapes, some pixels in the satellite images are recognized as urban/non-urban areas (denoted by blue/red points). Meanwhile, there are still a large number of unlabeled pixels (denoted by black points). First the similarity among all pixels is evaluated and the confidence of belonging to an urban/non-urban area is propagated based on this similarity. Training samples are then selected based on the confidence and used to train a traditional supervised classifier. The final result is shown in Figure 1c, where the optimal thresholds v1 and v2 are obtained for case 1 and case 2, respectively. In this way the proposed method can build urban area classifiers for different scenes based on the distributions of the input data.
The key part of our proposed method for global urban area mapping is building training samples in a fully automatic way. A labeler is designed for this task via analyzing ASTER VNIR images, ASTER slope data and PALSAR HH/HV images. Figure 2 shows the detailed processing flow of the proposed method. Firstly, various spectral features are extracted for further analysis. Then in the labeler coarse prediction of urban/non-urban areas is performed by applying prior knowledge to weak classifiers based on these features, resulting in a small number of urban/non-urban pixels. By improving a clustering algorithm known as Learning with Local and Global Consistency (LLGC) [49], an urban area confidence map is obtained and training samples are selected correspondingly. Finally, the urban area map is achieved by utilizing the Support Vector Machine (SVM) classifier with training samples and extracted features.
The main advantages of our proposed method consist of three aspects:
(1) The designed labeler only employs some common knowledge about urban areas for coarse prediction and is able to refine the result adaptively according to the distribution of the current unlabeled data. Therefore, our method shows a strong ability to learn from input data without supervision, which is demonstrated in our experiment involving 75 scenes over different climate zones.
(2) The proposed method provides competitive accuracy, even when compared with the traditional supervised SVM method.
(3) The proposed method is fully automatic and its performance is quite robust. No manual interaction is needed and the same parameter settings are applied to all test scenes.
3.2. Feature Extraction
In addition to the original satellite images (Asterb1 ∼ Asterb4, hh, hv) and preprocessing results (slope, hhcor), some other features are also employed. Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI) have been widely used after their appearance in [50,51] and are considered as very effective descriptors about vegetation features [52] and surface water features [53], respectively. For ASTER data, the definitions of NDVI and NDWI are given by:
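As a minimal sketch of these two indices, assuming the usual ASTER VNIR band assignment (Band 1 green, Band 2 red, Band 3N near-infrared) and the green/NIR form of NDWI, the computation per pixel could look as follows. The function names and the handling of a zero denominator are illustrative choices, not taken from the paper.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 where the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(red, nir):
    # Assumed band mapping: red = ASTER Band 2, nir = ASTER Band 3N.
    return normalized_difference(nir, red)


def ndwi(green, nir):
    # Assumed green/NIR form: green = ASTER Band 1, nir = ASTER Band 3N.
    return normalized_difference(green, nir)


# One vegetated pixel and one water-like pixel (made-up reflectance values).
vegetation = {"green": 0.08, "red": 0.05, "nir": 0.45}
water = {"green": 0.10, "red": 0.06, "nir": 0.02}
print(ndvi(vegetation["red"], vegetation["nir"]))
print(ndwi(water["green"], water["nir"]))
```

Both indices lie in [-1, 1], so thresholds relative to the scene mean remain comparable across scenes.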
According to [44], in mountain areas the difference between PALSAR hh and hhcor images can be quite large, which is useful for recognizing non-urban areas. Therefore, here we define hhsub as follows:
Moreover, entropy filtering is a common and effective technique for describing the richness of texture, by calculating the local entropy of pixels within a given window. Therefore, to describe the rich texture in urban areas, entropy filtering [54,55] is performed on the PALSAR hh image with a neighboring window size of 15 × 15 pixels, denoted as follows:
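The local entropy computation can be sketched as below. This is an illustrative pure-Python version, assuming the hh image has already been quantized to integer gray levels and that windows are clipped at the image border; the function name `local_entropy` is hypothetical and the paper's actual filter implementation may differ.

```python
import math
from collections import Counter


def local_entropy(image, window=15):
    """Shannon entropy (in bits) of the gray levels in a window around each pixel."""
    rows, cols = len(image), len(image[0])
    half = window // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Windows are clipped at the image border (an assumption of this sketch).
            values = [image[i][j]
                      for i in range(max(0, r - half), min(rows, r + half + 1))
                      for j in range(max(0, c - half), min(cols, c + half + 1))]
            n = len(values)
            out[r][c] = -sum((k / n) * math.log2(k / n)
                             for k in Counter(values).values())
    return out


flat = [[7] * 5 for _ in range(5)]      # no texture at all
print(local_entropy(flat, window=3)[2][2])
```

A homogeneous window has zero entropy, while a window whose gray levels are all different reaches the maximum of log2(number of pixels in the window).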
3.3. Predict Non-Urban and Urban Area
As mentioned above, in this step the prediction of urban/non-urban areas is performed based on some common prior knowledge. Several independent weak classifiers are utilized, generating a number of urban/non-urban pixels which will later be used as seeds for the LLGC clustering algorithm. Please note that we do not expect a single weak classifier to recognize urban/non-urban areas with high accuracy. The purpose of this step is to make a coarse prediction of salient urban/non-urban areas by combining these weak classifiers. In our design, some misclassified pixels are acceptable because the subsequent LLGC algorithm is robust against noise.
When applying this step, we assume that a sufficient number of urban and non-urban pixels exist in the scene. Otherwise, the result of urban/non-urban area prediction will be inaccurate, leading to poor performance on urban area classification. It is suggested that at least 10^5 points for both urban and non-urban land cover classes should appear in the scene to ensure the diversity of spectral features of urban/non-urban areas. Usually, this requirement can be easily satisfied as long as a city is included in the selected image.
The predictor for non-urban areas is designed as follows:
Here mean(.) stands for the average value of the input images and std(.) for the standard deviation. This predictor consists of 6 independent classifiers, which generate 6 masks correspondingly. For these masks, the value of mask is 1 if the condition is satisfied, and 0 otherwise. mask1 and mask2 are defined according to NDVI and NDWI. Pixels whose values are much higher than the mean value are marked, indicating obvious vegetation and water areas. mask3 is defined in a similar way, intending to recognize non-urban areas that have large difference between hh and hhcor due to the influence of the incident angle, such as mountain areas. Based on our observations, usually the values of the PALSAR HH image in urban areas are much higher than those of non-urban objects and mask4 is designed based on this rule. By analyzing the slope data and the richness of the texture, mask5 and mask6 are proposed based on given thresholds. Based on our experience, the values of thresh5 and thresh6 are set as 15 and 4.5, respectively, for all cases in the following experiments.
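The six weak classifiers can be sketched as follows. Only the thresholds 15 and 4.5 come from the text; the multiplier `k` on the standard deviation, the cut used for mask4, and the comparison directions for mask5 and mask6 are assumptions made for illustration, and the inputs are flattened per-pixel feature lists.

```python
from statistics import mean, pstdev

THRESH5 = 15.0   # slope threshold stated in the text
THRESH6 = 4.5    # entropy threshold stated in the text


def above_mean(values, k=1.0):
    """Mark values exceeding mean + k * std; k is an assumed multiplier."""
    cut = mean(values) + k * pstdev(values)
    return [v > cut for v in values]


def nonurban_masks(ndvi, ndwi, hhsub, hh, slope, hhent):
    """Six weak non-urban classifiers over flattened per-pixel feature lists."""
    hh_mean = mean(hh)
    return [
        above_mean(ndvi),                # mask1: obvious vegetation
        above_mean(ndwi),                # mask2: obvious water
        above_mean(hhsub),               # mask3: strong incident-angle effect
        [v < hh_mean for v in hh],       # mask4: low HH backscatter (assumed cut)
        [v > THRESH5 for v in slope],    # mask5: steep terrain (assumed direction)
        [v < THRESH6 for v in hhent],    # mask6: poor texture (assumed direction)
    ]


masks = nonurban_masks(
    ndvi=[0.1, 0.1, 0.1, 0.9], ndwi=[0.0, 0.0, 0.0, 0.0],
    hhsub=[0.0, 0.0, 0.0, 0.0], hh=[10, 10, 10, 10],
    slope=[0, 20, 0, 0], hhent=[5, 5, 3, 5],
)
print(masks[0], masks[4], masks[5])
```

Because most cuts are relative to the scene mean and standard deviation, the same settings can be reused across scenes with different brightness or vegetation levels.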
For all 6 masks, binary morphological operations [56], denoted as MorphFilt(.) here, are utilized for refining the masks. First a morphological close operation is performed, followed by a morphological open operation, with fixed structuring elements of 10 × 10 pixels. The purpose is to remove isolated non-urban areas which include few pixels and masknonurban is obtained by taking the union of these refined masks.
The predictor for urban areas is designed in a similar way and masknonurban is also integrated. The definitions are as follows:
Here Not(.) means the inverse of the binary mask. Similar to mask4, mask7 also exploits the rule about the high reflectance in urban areas. Finally, maskurban is obtained by using morphological operations to refine the intersection of Not(masknonurban) and mask7.
3.4. Confidence Estimate by LLGC
It is noteworthy that maskurban and masknonurban do not stand for the full set of urban/non-urban pixels, respectively. Theoretically, they only represent a subset of urban/non-urban pixels, which are salient enough to be recognized by the prior knowledge. In practice, this prediction result is not error-free and the label of some pixels may be incorrect. Here the prediction result will be regarded as initial seeds of the LLGC clustering algorithm and the urban area confidence map will be built by propagating the belief from seeds through the whole feature space where the input data reside.
The LLGC algorithm [49] was first proposed in 2004 and has been widely used due to its good performance and stability against noisy initialization. Here we provide a brief introduction about the LLGC algorithm and how it is implemented in our method.
Given a set of pixels X = {x1, x2, …, xN}, the initial value of the N × 2 non-negative label matrix F is defined as follows:
In our method, ASTER/VNIR Band 1, Band 2 and Band 3 images are exploited as clustering features and are merged to generate a color image Asterrgb. The dist(.) function is defined by:
However, there are two problems if the original LLGC algorithm is applied. First, the numbers of urban and non-urban pixels produced by the coarse predictor are imbalanced. The number of urban pixels is much smaller than that of non-urban pixels, and the LLGC algorithm will mark almost all pixels as non-urban since labeled non-urban pixels have a much stronger influence in the propagation step. Second, in our test cases the variable N, which stands for the number of valid pixels in a satellite image, can be as large as 2,000,000, and it is impossible to construct the dense matrices W and S at such a size.
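For reference, the original LLGC propagation that these two problems refer to can be sketched in its commonly described form: a Gaussian affinity with zero diagonal, symmetric normalization, and the iteration F ← αSF + (1 − α)Y. The values of `alpha`, `sigma` and the iteration count below are assumptions for illustration, and plain Euclidean distance stands in for the dist(.) function.

```python
import math


def llgc(points, seeds, alpha=0.99, sigma=1.0, iterations=200):
    """Propagate seed labels over a fully connected graph.

    points: list of feature tuples; seeds: dict index -> class
    (0 = urban, 1 = non-urban). Returns an N x 2 list of confidences.
    """
    n = len(points)
    # Gaussian affinity with zero diagonal, as in the original formulation.
    w = [[0.0 if i == j else
          math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    d = [sum(row) for row in w]
    # Symmetric normalization S = D^(-1/2) W D^(-1/2).
    s = [[w[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    y = [[1.0 if seeds.get(i) == k else 0.0 for k in range(2)] for i in range(n)]
    f = [row[:] for row in y]
    for _ in range(iterations):
        f = [[alpha * sum(s[i][j] * f[j][k] for j in range(n))
              + (1 - alpha) * y[i][k] for k in range(2)] for i in range(n)]
    return f


pts = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0), (5.2, 5.0)]
conf = llgc(pts, seeds={0: 0, 3: 1})
print([0 if u > v else 1 for u, v in conf])
```

The dense N × N matrices `w` and `s` built here are exactly what becomes infeasible for millions of pixels, which motivates the quantized variant.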
To solve these problems, we improve the LLGC algorithm in two aspects:
(1) Quantize the Asterrgb image by converting it into an indexed image and then apply LLGC to the indexed colors. Pixels with the same indexed color are treated as one entry in F, with the number of pixels incorporated into the matrices W and S correspondingly. In this way, N is no more than the maximum number of indexed colors, which is set to 300 in all test cases.
Now the pixels are represented by X = {(xi, ni), i = 1, 2, …, M}, where M stands for the total number of indexed colors and ni for the number of pixels which belong to the i-th indexed color. The affinity matrix W is defined in a way slightly different from Equation (21), and its size becomes M×M:
Note that now Wi,i = 1. It can be proved that the propagation matrix SM×M = [Si,j] can be expressed as follows:
By this means, the improved LLGC algorithm can achieve promising results while the computation cost is greatly reduced.
(2) Based on maskurban, find the largest connected urban area and choose the sub-image according to its bounding rectangle. The number of urban/non-urban pixels in this sub-image is balanced and the LLGC algorithm is performed for this region. The urban area confidence map of the sub-image can be mapped back to the whole image, according to the rule that pixels with the same indexed color share the same confidence.
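The bookkeeping behind the quantization step can be sketched as follows. Uniform binning per channel is an assumed stand-in for whatever indexed-color conversion was actually used, and the helper names `quantize` and `map_back` are hypothetical; the sketch only shows how pixels collapse into (color, count) pairs and how per-color confidences are mapped back to pixels.

```python
from collections import Counter


def quantize(pixels, levels=6):
    """Map 8-bit RGB pixels to indexed colors by uniform binning.

    levels=6 gives at most 6**3 = 216 colors, below the cap of 300 in the text.
    """
    step = 256 / levels
    keys = [tuple(int(ch // step) for ch in p) for p in pixels]
    palette = {key: idx for idx, key in enumerate(sorted(set(keys)))}
    index_of_pixel = [palette[k] for k in keys]
    counts = Counter(index_of_pixel)      # n_i: number of pixels per indexed color
    return index_of_pixel, palette, counts


def map_back(index_of_pixel, confidence_per_color):
    """Every pixel inherits the confidence of its indexed color."""
    return [confidence_per_color[i] for i in index_of_pixel]


pixels = [(0, 0, 0), (10, 5, 3), (255, 255, 255)]
index_of_pixel, palette, counts = quantize(pixels)
print(len(palette), dict(counts))
print(map_back(index_of_pixel, [0.2, 0.9]))
```

With this representation the propagation runs over M indexed colors instead of N pixels, and the same lookup extends the sub-image result to the whole scene.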
In this way, the improved LLGC algorithm is able to generate the urban area confidence map efficiently and effectively. Training data for further classification, i.e., samples of urban/non-urban pixels, are obtained through weighted sampling, where the confidence of each pixel is used as the weight.
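Weighted sampling of this kind can be sketched with the standard library as below. Taking the non-urban weight as 1 − confidence and sampling with replacement are assumptions of this sketch, and `sample_training_pixels` is a hypothetical helper name.

```python
import random


def sample_training_pixels(confidence, n_urban, n_nonurban, seed=0):
    """Draw pixel indices with probability proportional to class confidence.

    confidence: per-pixel urban confidence in [0, 1]. The non-urban weight is
    taken as 1 - confidence (assumed). Sampling is with replacement, so an
    index may be drawn more than once.
    """
    rng = random.Random(seed)
    indices = list(range(len(confidence)))
    urban = rng.choices(indices, weights=confidence, k=n_urban)
    nonurban = rng.choices(indices, weights=[1 - c for c in confidence],
                           k=n_nonurban)
    return urban, nonurban


urban, nonurban = sample_training_pixels([1.0, 0.0, 0.8, 0.1], 5, 3)
print(urban, nonurban)
```

Pixels with higher confidence are drawn more often, so a few wrongly labeled seeds with low propagated confidence rarely end up in the training set.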
3.5. Urban Area Classification
Based on our proposed labeler, training data are obtained automatically and now traditional supervised methods for urban area classification can be applied. Here the widely used Support Vector Machine (SVM) classifier [57] is exploited. In our method, a total of 10 features (Asterb1, Asterb2, Asterb3, Asterb4, slope, NDVI, NDWI, hh, hv, hhent) are used for classification, via the classical SVM classifier with a linear kernel function. The SVM classifier was implemented by using the LIBSVM library [58].
3.6. Accuracy Assessment
To evaluate the accuracy of extracted urban area maps, a widely used assessment method based on the confusion matrix is employed here. The confusion matrix is generated by cross-tabulation of the class labels from the classification results against the ground truth data. The diagonal elements in the confusion matrix represent the cases where the classification results agree with the ground truth data, while the off-diagonal ones show disagreements in the labels. For urban area mapping, there are two classes: urban and non-urban (abbreviated as U/NU, respectively) and the size of the confusion matrix is 2 × 2. The structure of the confusion matrix is shown in Figure 3.
According to comments in [59], the performance of urban area mapping will be evaluated via 4 parameters, defined as follows:
In general, overall accuracy indicates the rate of correct classification, while user’s accuracy and producer’s accuracy show whether the urban areas have been overestimated or underestimated. Kappa coefficient represents the inter-rater agreement of the confusion matrix and sometimes is regarded as a more robust measure than overall accuracy. For more detailed interpretation about these parameters, please refer to [59,60].
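The four measures can be computed from the 2 × 2 confusion matrix as sketched below, using the standard definitions of overall, user's and producer's accuracy and Cohen's kappa. The argument naming (classified class first, ground truth second) is an assumption and may not match the layout of Figure 3.

```python
def urban_accuracy(uu, un, nu, nn):
    """Accuracy measures from a 2 x 2 confusion matrix.

    Arguments are named (classified, truth): uu = classified urban and truly
    urban, un = classified urban but truly non-urban, nu and nn analogously.
    """
    total = uu + un + nu + nn
    overall = (uu + nn) / total
    users = uu / (uu + un)        # share of mapped urban that is truly urban
    producers = uu / (uu + nu)    # share of true urban that was mapped
    # Chance agreement from the row and column totals.
    expected = ((uu + un) * (uu + nu) + (nu + nn) * (un + nn)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, users, producers, kappa


print(urban_accuracy(uu=40, un=10, nu=5, nn=45))
```

A low user's accuracy with a high producer's accuracy indicates that urban area is overestimated, and the reverse indicates underestimation.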
4. Experimental Results
4.1. Study Area
In this experiment, 75 urban areas are investigated and their locations are shown in Figure 4. Considering that the performance of urban area classification may vary based on the landscapes in the scene, these areas are selected from different climate zones, following a similar proportion of the number of cities by climate zone in GRUMP settlement points [22]. In total, 10 scenes are from cities in the tropical zone, 16 from the arid zone, 33 from the temperate zone, and 16 from the cold zone. Here all ASTER images are obtained within the period from January 2000 to March 2008, and are aligned based on the Global Earth Observation Grid (GEO Grid) as described in [61]. As for PALSAR images, Level 4.1 product (see the user’s guide in [40] for more details) was utilized and the pixel spacing of HH/HV polarization images is 12.5 m. The PALSAR HH/HV images are captured from January 2006 to March 2011, and spatial resampling has been performed to align with the ASTER data of 15 m resolution, based on the GEO Grid service [62]. We assume that there are no significant changes of urban area in these scenes between the capture date of ASTER and PALSAR images and generally this assumption is reasonable for most cases.
4.2. Ground Truth Data
To provide a quantitative evaluation of the accuracy of the extracted urban area maps, ground truth data were collected via manual interaction, based on the false color images consisting of ASTER/VNIR satellite images (see Figure 5a for an example). One author and two trained assistants manually selected urban/non-urban pixels from the false color image based on their visual appearance in color tone and texture. Each operator separately selected a set of possible urban/non-urban points at random, and then submitted the data to the other two operators for verification. For a point to be interpreted as urban/non-urban, two of the three operators had to interpret it as urban/non-urban. For each scene, about 80∼90 pixels in total for urban/non-urban areas were sampled at random. It is noteworthy that the ground truth data are not involved in our method and are only used for evaluating the performance of different methods.
4.3. Criterion of Performance Evaluation
To verify the performance of our proposed method, the accuracy of its urban area maps is compared with that of two baseline maps. First, we employed the global urban area map of 2001 from MCD12Q1 [63], which is derived from Terra- and Aqua-MODIS data. It has a resolution of about 500 m and covers all investigated cities in our experiment. In addition, it was considered the most accurate urban area map over 140 cities among 8 maps [32]. Here the MCD maps were resampled to 15 m resolution by using the resample function in the GRASS GIS software [64]. Second, we designed a supervised urban area extraction method based on SVM. Half of the ground truth points are used as training data and the same procedures as in Section 3.5 are performed to classify pixels and generate the urban area map.
To evaluate the quality of extracted urban area maps, the accuracy parameters (user’s accuracy, producer’s accuracy, overall accuracy and kappa) are calculated and the corresponding confusion matrix is listed. In addition, the visual appearance of some cases is presented.
Here traditional unsupervised classification methods were not selected for comparison, for two reasons. First, as shown in [13,16,28], a large amount of human interaction is needed for post refinement. Second, the performance of such methods heavily depends on the characteristics of local landscapes and parameters usually need to be carefully tuned for different scenes. Our proposed method is fully automatic and the experimental results were obtained on 75 different scenes with fixed parameter settings, so it has a clear advantage over such methods in these two aspects. As for the accuracy of urban area mapping, we believe that the comparison with the SVM method, which achieves promising results in urban area mapping studies [17,19,65], is sufficient to demonstrate the performance of our method.
4.4. Processing Results of Proposed Method
In this subsection, we demonstrate how each part of the proposed method works via an example. Figure 5 shows an example taken at Mexicali, Mexico. Figure 5a is the ASTER/VNIR false color image, where the red channel stands for VNIR Band 3N (0.76–0.86 μm), green for VNIR Band 1 (0.52–0.60 μm), and blue for VNIR Band 2 (0.63–0.69 μm), respectively. Figure 5b is the PALSAR false color image, where the red channel stands for hh, green for hhcor, and blue for hv, respectively.
Following the method described in Section 3.3, the predicted urban/non-urban areas are obtained (see Figure 5c), where blue points stand for urban and green for non-urban. A number of locations remain unknown and are marked by white points. The urban area confidence map derived by the improved LLGC method and the automatically selected samples are displayed in Figure 5d,e, respectively. In the confidence map, the intensity of a pixel stands for the likelihood of belonging to an urban area, where a higher value indicates a larger likelihood. In total, 500 urban points and 300 non-urban points (marked as blue/green crosses, respectively) are selected by our labeler and used to train the SVM-based urban area model. The final urban area map according to the classification result of SVM is given in Figure 5f.
It can be seen that the prediction map generated by the common prior knowledge can only make a rough estimate about the urban/non-urban areas. Some points are marked with incorrect labels and some are still unknown. The result is refined by using the improved LLGC method to propagate the confidence of points, selecting corresponding training samples, and utilizing the SVM method to build the urban area classifier. It is clear that the final urban area map matches much better with the ASTER and PALSAR images than the prediction map.
4.5. Comparison Results and Discussions
As mentioned in Section 4.3, the urban area maps extracted by our method are compared with the maps from MCD12Q1 and the maps generated by the supervised SVM method (abbreviated as MCD/SVM, respectively). The accuracy parameters by climate zones are listed in Table 1 and the corresponding confusion matrices are displayed in Table 2.
For different climate zones, the overall accuracy of our method is about 10–14% higher than that of MCD, and about 3–5% lower than that of SVM. For the kappa coefficient, our method also outperforms MCD and performs close to SVM. In addition, the performance of the proposed method is quite stable across climate zones. SVM is also stable for all zones, while the performance of MCD differs slightly between the cold and temperate zones.
It is noteworthy that MCD has the best producer’s accuracy and the worst user’s accuracy. The reason can be found in the confusion matrix: in the MCD maps, most of the ground truth urban points have been successfully included, but a large percentage of non-urban points have also been incorrectly classified as urban points. In contrast, although the SVM maps and our maps missed more ground truth urban points, the number of misclassified non-urban points is much smaller than that of the MCD maps.
Figure 6 shows the extracted urban area maps of 5 cities (Mexicali, Addis Ababa, Niamey, Khulna and Fes), in comparison with the MCD and SVM maps. It can be seen that the spectral characteristics of these scenes vary to a large extent due to differences in landscape, which is very challenging for traditional unsupervised methods. Our method automatically adapts to the differences between scenes and extracts high-resolution urban area maps with promising accuracy. The urban area maps produced by our method describe urban areas better than the MCD maps, and their performance is quite similar to that of the SVM maps.
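The relationship between the confusion matrix and the four accuracy measures can be checked with a short sketch. It assumes that the rows of each matrix in Table 2 are the mapped classes and the columns are the ground truth classes, in the order (NU, U); this orientation is inferred from the shared column totals and is consistent with the values in Table 1. The function name `accuracy_measures` is hypothetical.

```python
import numpy as np

def accuracy_measures(cm):
    """Return overall accuracy, kappa, producer's and user's accuracy (urban).

    cm[i, j] counts points mapped as class i whose ground truth is class j,
    with class order (NU, U). The orientation is an assumption of this sketch.
    """
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    po = np.trace(cm) / n                                      # observed agreement
    pe = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n ** 2      # chance agreement
    kappa = (po - pe) / (1.0 - pe)
    producers = cm[1, 1] / cm[:, 1].sum()  # correct urban / ground truth urban
    users = cm[1, 1] / cm[1, :].sum()      # correct urban / mapped urban
    return po, kappa, producers, users

# "Total" confusion matrix of the MCD maps, taken from Table 2.
mcd_total = [[2614, 210],
             [1532, 1826]]
oa, kappa, pa, ua = accuracy_measures(mcd_total)
print(f"OA={oa:.1%}  kappa={kappa:.3f}  PA={pa:.1%}  UA={ua:.1%}")
```

For the MCD totals this reproduces the corresponding row of Table 1 (71.8%, 0.453, 89.7% and 54.4%): the 1532 non-urban points mapped as urban lower the user's accuracy, while the small number of missed urban points (210) keeps the producer's accuracy high.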
In general, the experimental results indicate that our proposed unsupervised method performs better than the low-resolution MCD maps and is comparable to the supervised SVM method. However, the method has two limitations. First, as mentioned in Section 3.3, it is assumed that the images to be classified include a sufficient number of urban and non-urban pixels. To satisfy this assumption, a small amount of manual work for urban area selection or confirmation is usually required. Second, the key contribution of our proposed method is building training samples in a fully automatic way. In the most ideal case, its performance should be close to that of the supervised SVM method. Therefore, it is not realistic to expect the proposed method to outperform the supervised SVM method with manually selected samples. Nevertheless, since our proposed method is automatic and uses fixed parameter settings, we believe its performance is very promising for many potential urban area mapping applications.
5. Conclusions and Future Work
In this paper, we present an unsupervised method for global urban area mapping based on ASTER and PALSAR satellite images. With our carefully designed labeler, common prior knowledge about urban/non-urban areas is propagated through the unlabeled dataset via the improved LLGC clustering algorithm, and training samples are selected automatically. The urban area map is generated by applying the SVM classifier to the extracted samples and spectral features.
The proposed method shows a strong ability for unsupervised learning from input datasets, which is demonstrated in an experiment including 75 scenes from different climate zones. The same parameter settings are used for all cases and no manual interaction is needed. Our method achieves an overall accuracy of 84.4% and a kappa coefficient of 0.628, which is comparable to the supervised SVM method.
More importantly, the proposed method suggests a novel framework for unsupervised learning problems in the field of remote sensing. Given some common prior knowledge about the objects of interest and a sufficiently large unlabeled data set, the proposed framework can transfer the prior knowledge to the new data set in a reasonable way, leading to promising classification results. Therefore, we believe that the proposed framework has great practical value for various classification issues in remote sensing and might be applied to many potential applications in the near future.
Future work consists of three aspects. First, we plan to extend this method by using additional high-resolution global land cover data sets such as Corine Land Cover data. Second, a sufficient number of urban/non-urban pixels are needed in the coarse prediction step of this method. Therefore, we will try to improve the performance of this step by employing more prior knowledge. Finally, we are also interested in utilizing other semi-supervised learning methods, so that prior knowledge can be further integrated with the unlabeled dataset.
Acknowledgments
This work was supported in part by the GRENE Program (Green Network of Excellence, 2011-2016) funded by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) in Japan, and by the National Natural Science Foundation of China under Grant 41301365.
Author Contributions
Yulin Duan had the original idea for the study and, with all co-authors, carried out the design. Yulin Duan, Xiaowei Shao and Yun Shi were responsible for the design and implementation of the proposed algorithm, while Hiroyuki Miyazaki, Koki Iwao and Ryosuke Shibasaki were responsible for the preparation and verification of experimental data. Xiaowei Shao drafted the manuscript, which was revised by all authors. All authors read and approved the final manuscript.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Foley, J.A.; DeFries, R.; Asner, G.P.; Barford, C.; Bonan, G.; Carpenter, S.R.; Chapin, F.S.; Coe, M.T.; Daily, G.C.; Gibbs, H.K.; et al. Global consequences of land use. Science 2005, 309, 570–574.
- Seto, K.C.; Satterthwaite, D. Interactions between urbanization and global environmental change. Curr. Opin. Environ. Sustain. 2010, 2, 127–128.
- Angel, S.; Sheppard, S.; Civco, D.L.; Buckley, R.; Chabaeva, A.; Gitlin, L.; Kraley, A.; Parent, J.; Perlin, M. The Dynamics of Global Urban Expansion; Transport and Urban Development Department, The World Bank: Washington, DC, USA, 2005.
- Sanderson, D. Cities, disasters and livelihoods. Environ. Urban. 2000, 12, 93–102.
- Doocy, S.; Gorokhovich, Y.; Burnham, G.; Balk, D.; Robinson, C. Tsunami mortality estimates and vulnerability mapping in Aceh, Indonesia. Am. J. Public Health 2007, 97, S146–S151.
- Department for Economic and Social Affairs, United Nations. World Population Prospects: The 2012 Revision; Department for Economic and Social Affairs: New York, NY, USA, 2013.
- Weng, Q. Global Urban Monitoring and Assessment through Earth Observation; CRC Press: Boca Raton, FL, USA, 2014.
- Blaschke, T.; Hay, G.J.; Weng, Q.; Resch, B. Collective sensing: Integrating geospatial technologies to understand urban systems—An overview. Remote Sens. 2011, 3, 1743–1776.
- Batty, M. The size, scale, and shape of cities. Science 2008, 319, 769–771.
- Fan, J.; Ma, T.; Zhou, C.; Zhou, Y.; Xu, T. Comparative estimation of urban development in China’s cities using socioeconomic and DMSP/OLS night light data. Remote Sens. 2014, 6, 7840–7856.
- Schneider, A.; Friedl, M.A.; Potere, D. Mapping global urban areas using MODIS 500-m data: New methods and datasets based on “urban ecoregions”. Remote Sens. Environ. 2010, 114, 1733–1746.
- Schneider, A.; Friedl, M.A.; Potere, D. A new map of global urban extent from MODIS satellite data. Environ. Res. Lett. 2009, 4.
- Bartholomé, E.; Belward, A. GLC2000: A new approach to global land cover mapping from Earth observation data. Int. J. Remote Sens. 2005, 26, 1959–1977.
- Gao, F.; De Colstoun, E.B.; Ma, R.; Weng, Q.; Masek, J.G.; Chen, J.; Pan, Y.; Song, C. Mapping impervious surface expansion using medium resolution satellite image time series: A case study in the Yangtze River Delta, China. Int. J. Remote Sens. 2012, 33, 7609–7628.
- Weng, Q. Remote sensing of impervious surfaces in the urban areas: Requirements, methods, and trends. Remote Sens. Environ. 2012, 117, 34–49.
- Loveland, T.; Reed, B.; Brown, J.; Ohlen, D.; Zhu, Z.; Yang, L.; Merchant, J. Development of a global land cover characteristics database and IGBP DISCover from 1 km AVHRR data. Int. J. Remote Sens. 2000, 21, 1303–1330.
- Mountrakis, G.; Im, J.; Ogole, C. Support vector machines in remote sensing: A review. ISPRS J. Photogramm. Remote Sens. 2011, 66, 247–259.
- Nemmour, H.; Chibani, Y. Multiple support vector machines for land cover change detection: An application for mapping urban extensions. ISPRS J. Photogramm. Remote Sens. 2006, 61, 125–133.
- Cao, X.; Chen, J.; Imura, H.; Higashi, O. A SVM-based method to extract urban areas from DMSP-OLS and SPOT VGT data. Remote Sens. Environ. 2009, 113, 2205–2209.
- Mas, J.; Flores, J. The application of artificial neural networks to the analysis of remotely sensed data. Int. J. Remote Sens. 2008, 29, 617–663.
- Weng, Q.; Hu, X. Medium spatial resolution satellite imagery for estimating and mapping urban impervious surfaces using LSMA and ANN. IEEE Trans. Geosci. Remote Sens. 2008, 46, 2397–2406.
- CIESIN (Center for International Earth Science Information Network), Columbia University. Global Rural Urban Mapping Project (GRUMP), Alpha Version: Settlement Points. Available online: http://sedac.ciesin.columbia.edu/gpw (accessed on 8 October 2004).
- Rodriguez-Galiano, V.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J. An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS J. Photogramm. Remote Sens. 2012, 67, 93–104.
- Hansen, M.; DeFries, R.; Townshend, J.R.; Sohlberg, R. Global land cover classification at 1 km spatial resolution using a classification tree approach. Int. J. Remote Sens. 2000, 21, 1331–1364.
- Schneider, A.; Friedl, M.A.; McIver, D.K.; Woodcock, C.E. Mapping urban areas by fusing multiple sources of coarse resolution remotely sensed data. Photogramm. Eng. Remote Sens. 2003, 69, 1377–1386.
- Loveland, T.R.; Zhiliang, Z.; Ohlen, D.O.; Brown, J.F.; Reed, B.C.; Limin, Y. An analysis of the IGBP global land-cover characterization process. Photogramm. Eng. Remote Sens. 1999, 65, 1021–1032.
- Arino, O.; Gross, D.; Ranera, F.; Bourg, L.; Leroy, M.; Bicheron, P.; Latham, J.; di Gregorio, A.; Brockman, C.; Witt, R.; et al. GlobCover: ESA service for global land cover from MERIS. In Proceedings of the IEEE International Symposium on Geoscience and Remote Sensing, Barcelona, Spain, 23–28 July 2007; pp. 2412–2415.
- Bontemps, S.; Defourny, P.; Bogaert, E.V.; Arino, O.; Kalogirou, V.; Perez, J.R. GLOBCOVER 2009-Products Description and Validation Report. Available online: http://due.esrin.esa.int/globcover/LandCover2009/GLOBCOVER2009_Validation_Report_2.2.pdf (accessed on 18 February 2011).
- Thapa, R.B.; Murayama, Y. Urban mapping, accuracy, & image classification: A comparison of multiple approaches in Tsukuba City, Japan. Appl. Geogr. 2009, 29, 135–144.
- Montgomery, M.R. The urban transformation of the developing world. Science 2008, 319, 761–764.
- Orenstein, D.E.; Bradley, B.A.; Albert, J.; Mustard, J.F.; Hamburg, S.P. How much is built? Quantifying and interpreting patterns of built space from different data sources. Int. J. Remote Sens. 2011, 32, 2621–2644.
- Potere, D.; Schneider, A.; Angel, S.; Civco, D.L. Mapping urban areas on a global scale: Which of the eight maps now available is more accurate? Int. J. Remote Sens. 2009, 30, 6531–6558.
- Small, C. High spatial resolution spectral mixture analysis of urban reflectance. Remote Sens. Environ. 2003, 88, 170–186.
- Yamaguchi, Y.; Kahle, A.B.; Tsu, H.; Kawakami, T.; Pniel, M. Overview of advanced spaceborne thermal emission and reflection radiometer (ASTER). IEEE Trans. Geosci. Remote Sens. 1998, 36, 1062–1071.
- Galletti, C.S.; Myint, S.W. Land-use mapping in a mixed urban-agricultural arid landscape using object-based image analysis: A case study from Maricopa, Arizona. Remote Sens. 2014, 6, 6089–6110.
- Pu, R.; Gong, P.; Michishita, R.; Sasagawa, T. Spectral mixture analysis for mapping abundance of urban surface components from the Terra/ASTER data. Remote Sens. Environ. 2008, 112, 939–954.
- Chen, Y.; Shi, P.; Fung, T.; Wang, J.; Li, X. Object-oriented classification for urban land cover mapping with ASTER imagery. Int. J. Remote Sens. 2007, 28, 4645–4651.
- Stefanov, W.L.; Netzband, M. Assessment of ASTER land cover and MODIS NDVI data at multiple scales for ecological characterization of an arid urban center. Remote Sens. Environ. 2005, 99, 31–43.
- Clarke, K. A self-modifying cellular automaton model of historical. Environ. Plan. B. 1997, 24, 247–261.
- PALSAR Project. Available online: http://gds.palsar.ersdac.jspacesystems.or.jp/e/guide/ (accessed on 23 February 2006).
- Rosenqvist, A.; Shimada, M.; Ito, N.; Watanabe, M. ALOS PALSAR: A pathfinder mission for global-scale monitoring of the environment. IEEE Trans. Geosci. Remote Sens. 2007, 45, 3307–3316.
- Kajimoto, M.; Susaki, J. Urban-area extraction from polarimetric SAR images using polarization orientation angle. Geosci. Remote Sens. Lett. IEEE. 2013, 10, 337–341.
- Esch, T.; Taubenböck, H.; Roth, A.; Heldens, W.; Felbier, A.; Thiel, M.; Schmidt, M.; Müller, A.; Dech, S. TanDEM-X mission—New perspectives for the inventory and monitoring of global settlement patterns. J. Appl. Remote Sens. 2012, 6.
- Itabashi, K.; Miyazaki, H.; Iwao, K.; Nakamura, K.; Shibasaki, R. A method for constructing urban extent map from ALOS/PALSAR satellite data. In Proceedings of the Asian Conference on Remote Sensing, Taipei, Taiwan, 3–7 October 2011; pp. 432–437.
- Kellndorfer, J.M.; Pierce, L.E.; Dobson, M.C.; Ulaby, F.T. Toward consistent regional-to-global-scale vegetation characterization using orbital SAR systems. IEEE Trans. Geosci. Remote Sens. 1998, 36, 1396–1411.
- Luo, Y.; Tao, D.; Geng, B.; Xu, C.; Maybank, S.J. Manifold regularized multitask learning for semi-supervised multilabel image classification. IEEE Trans. Image Process 2013, 22, 523–536.
- Triguero, I.; García, S.; Herrera, F. Self-labeled techniques for semi-supervised learning: Taxonomy, software and empirical study. Knowl. Inf. Syst. 2013, 42, 1–40.
- Belkin, M.; Niyogi, P.; Sindhwani, V. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 2006, 7, 2399–2434.
- Zhou, D.; Bousquet, O.; Lal, T.N.; Weston, J.; Schölkopf, B. Learning with local and global consistency. Adv. Neural Inf. Process. Syst. 2004, 16, 321–328.
- Tucker, C.J. Red and photographic infrared linear combinations for monitoring vegetation. Remote Sens. Environ. 1979, 8, 127–150.
- McFeeters, S. The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 1996, 17, 1425–1432.
- Fensholt, R.; Sandholt, I.; Stisen, S. Evaluating MODIS, MERIS, and VEGETATION vegetation indices using in situ measurements in a semiarid environment. IEEE Trans. Geosci. Remote Sens. 2006, 44, 1774–1786.
- Ji, L.; Zhang, L.; Wylie, B. Analysis of dynamic thresholds for the normalized difference water index. Photogramm. Eng. Remote Sens. 2009, 75, 1307–1317.
- Eddins, S.L.; Gonzalez, R.; Woods, R. Digital Image Processing Using Matlab; Prentice Hall Pearson Education Inc.: Upper Saddle River, NJ, USA, 2004.
- MathWorks. Documentation about Entropyfilt. Available online: http://www.mathworks.com/help/images/ref/entropyfilt.html (accessed on 9 October 2014).
- Serra, J. Image Analysis and Mathematical Morphology; Academic Press: Waltham, MA, USA, 1982.
- Cristianini, N.; Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods; Cambridge University Press: Cambridge, UK, 2000.
- Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2011, 2.
- Foody, G.M. Status of land cover classification accuracy assessment. Remote Sens. Environ. 2002, 80, 185–201.
- Türk, G. Gt index: A measure of the success of prediction. Remote Sens. Environ. 1979, 8, 65–75.
- Yamamoto, N.; Nakamura, R.; Yamamoto, H.; Tsuchida, S.; Kojima, I.; Tanaka, Y.; Sekiguchi, S. Geo grid: Grid infrastructure for integration of huge satellite imagery and geoscience data sets. In Proceedings of the IEEE International Conference on Computer and Information Technology, Seoul, Korea, 20–22 September 2006; p. 75.
- Sekiguchi, S.; Tanaka, Y.; Kojima, I.; Yamamoto, N.; Yokoyama, S.; Tanimura, Y.; Nakamura, R.; Iwao, K.; Tsuchida, S. Design principles and IT overviews of the GEO grid. Syst. J. IEEE. 2008, 2, 374–389.
- LP DAAC (Land Processes Distributed Active Archive Center), U.S. Department of the Interior. Land Cover Type Yearly L3 Global 500 m SIN Grid. Available online: https://lpdaac.usgs.gov/products/modis_products_table/mcd12q1 (accessed on 18 June 2009).
- Neteler, M.; Bowman, M.H.; Landa, M.; Metz, M. GRASS GIS: A multi-purpose open source GIS. Environ. Model. Softw. 2012, 31, 124–130.
- Huang, X.; Zhang, L. A comparative study of spatial approaches for urban mapping using hyperspectral ROSIS images over Pavia City, northern Italy. Int. J. Remote Sens. 2009, 30, 3205–3221.

Table 1. Accuracy parameters by climate zone.

Climate Zone | Method | Overall Accuracy | Kappa Coefficient | Producer’s Accuracy | User’s Accuracy |
---|---|---|---|---|---|
Tropical | Ours | 84.5% | 0.635 | 68.4% | 81.5% |
Tropical | MCD | 70.6% | 0.445 | 94.3% | 53.0% |
Tropical | SVM | 89.6% | 0.760 | 80.5% | 87.0% |
Arid | Ours | 85.2% | 0.637 | 70.0% | 78.5% |
Arid | MCD | 72.2% | 0.454 | 90.7% | 52.3% |
Arid | SVM | 87.2% | 0.689 | 74.6% | 81.4% |
Temperate | Ours | 84.4% | 0.618 | 63.9% | 83.5% |
Temperate | MCD | 70.7% | 0.424 | 86.2% | 52.5% |
Temperate | SVM | 87.3% | 0.702 | 76.9% | 82.0% |
Cold | Ours | 83.2% | 0.629 | 67.9% | 84.8% |
Cold | MCD | 74.0% | 0.500 | 92.0% | 60.3% |
Cold | SVM | 87.9% | 0.744 | 84.3% | 83.9% |
Total | Ours | 84.4% | 0.628 | 66.9% | 82.3% |
Total | MCD | 71.8% | 0.453 | 89.7% | 54.4% |
Total | SVM | 87.7% | 0.717 | 78.7% | 83.0% |


Table 2. Confusion matrices by climate zone (NU: non-urban; U: urban). Rows are mapped classes and columns are ground truth classes, consistent with the producer’s and user’s accuracies in Table 1.

Climate Zone | Mapped Class | Ours: NU | Ours: U | Ours: Total | MCD: NU | MCD: U | MCD: Total | SVM: NU | SVM: U | SVM: Total |
---|---|---|---|---|---|---|---|---|---|---|
Tropical | NU | 460 | 77 | 537 | 294 | 14 | 308 | 469 | 48 | 517 |
Tropical | U | 38 | 167 | 205 | 204 | 230 | 434 | 29 | 196 | 225 |
Tropical | Total | 498 | 244 | 742 | 498 | 244 | 742 | 498 | 244 | 742 |
Arid | NU | 978 | 138 | 1116 | 685 | 43 | 728 | 988 | 117 | 1105 |
Arid | U | 89 | 324 | 413 | 382 | 419 | 801 | 79 | 345 | 424 |
Arid | Total | 1067 | 462 | 1529 | 1067 | 462 | 1529 | 1067 | 462 | 1529 |
Temperate | NU | 1617 | 291 | 1908 | 1091 | 111 | 1202 | 1583 | 186 | 1769 |
Temperate | U | 102 | 514 | 616 | 628 | 694 | 1322 | 136 | 619 | 755 |
Temperate | Total | 1719 | 805 | 2524 | 1719 | 805 | 2524 | 1719 | 805 | 2524 |
Cold | NU | 798 | 169 | 967 | 544 | 42 | 586 | 777 | 82 | 859 |
Cold | U | 64 | 356 | 420 | 318 | 483 | 801 | 85 | 443 | 528 |
Cold | Total | 862 | 525 | 1387 | 862 | 525 | 1387 | 862 | 525 | 1387 |
Total | NU | 3854 | 675 | 4529 | 2614 | 210 | 2824 | 3818 | 433 | 4251 |
Total | U | 292 | 1361 | 1653 | 1532 | 1826 | 3358 | 328 | 1603 | 1931 |
Total | Total | 4146 | 2036 | 6182 | 4146 | 2036 | 6182 | 4146 | 2036 | 6182 |

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Duan, Y.; Shao, X.; Shi, Y.; Miyazaki, H.; Iwao, K.; Shibasaki, R. Unsupervised Global Urban Area Mapping via Automatic Labeling from ASTER and PALSAR Satellite Images. Remote Sens. 2015, 7, 2171-2192. https://doi.org/10.3390/rs70202171