
Spatial neighborhood based anomaly detection in sensor datasets

Published: 01 March 2010

Abstract

Success of anomaly detection, similar to other spatial data mining techniques, relies on neighborhood definition. In this paper, we argue that the anomalous behavior of spatial objects in a neighborhood can be truly captured when both (a) spatial autocorrelation (similar behavior of nearby objects due to proximity) and (b) spatial heterogeneity (distinct behavior of nearby objects due to difference in the underlying processes in the region) are taken into consideration for the neighborhood definition. Our approach begins by generating micro neighborhoods around spatial objects encompassing all the information about a spatial object. We selectively merge these based on spatial relationships accounting for autocorrelation and inferential relationships accounting for heterogeneity, forming macro neighborhoods. In such neighborhoods, we then identify (i) spatio-temporal outliers, where individual sensor readings are anomalous, (ii) spatial outliers, where the entire sensor is an anomaly, and (iii) spatio-temporally coalesced outliers, where a group of spatio-temporal outliers in the macro neighborhood are separated by a small time lag indicating the traversal of the anomaly. We demonstrate the effectiveness of our approach in neighborhood formation and anomaly detection with experimental results in (i) water monitoring and (ii) highway traffic monitoring sensor datasets. We also compare the results of our approach with an existing approach for spatial anomaly detection.
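The merge-then-detect idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify how micro neighborhoods are built or which inferential test is used, so the sketch assumes each sensor starts as its own micro neighborhood, uses a distance threshold (`max_dist`) as a stand-in for the spatial relationship, uses a gap between mean readings (`max_mean_gap`) as a stand-in for the heterogeneity check, and flags individual readings with a plain z-score (`z_cut`). All function names, parameters, and data values are hypothetical.

```python
from math import hypot
from statistics import mean, pstdev


def macro_neighborhoods(sensors, max_dist, max_mean_gap):
    """Merge sensors into groups using union-find.

    sensors maps an id to ((x, y), [readings]). Two sensors are merged
    only if they are close (autocorrelation stand-in) AND their mean
    readings are similar (heterogeneity stand-in). Both criteria are
    assumptions made for this sketch.
    """
    parent = {s: s for s in sensors}

    def find(s):
        # Follow parent links to the root, halving the path as we go.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    ids = sorted(sensors)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            (ax, ay), ra = sensors[a]
            (bx, by), rb = sensors[b]
            near = hypot(ax - bx, ay - by) <= max_dist
            similar = abs(mean(ra) - mean(rb)) <= max_mean_gap
            if near and similar:
                parent[find(a)] = find(b)

    groups = {}
    for s in ids:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())


def reading_outliers(sensors, group, z_cut):
    """Return (sensor id, time index) pairs whose reading lies more than
    z_cut standard deviations from the mean of all readings in the group."""
    pooled = [r for s in group for r in sensors[s][1]]
    mu, sigma = mean(pooled), pstdev(pooled)
    if sigma == 0:
        return []
    return [
        (s, t)
        for s in group
        for t, r in enumerate(sensors[s][1])
        if abs(r - mu) / sigma > z_cut
    ]


# Made-up data: "d" sits close to "c" but behaves very differently.
sensors = {
    "a": ((0.0, 0.0), [10, 10, 11, 10]),
    "b": ((1.0, 0.0), [10, 11, 10, 30]),
    "c": ((2.0, 0.0), [11, 10, 10, 10]),
    "d": ((2.5, 0.0), [50, 51, 50, 49]),
}
groups = macro_neighborhoods(sensors, max_dist=1.5, max_mean_gap=6)
flagged = reading_outliers(sensors, groups[0], z_cut=2.0)
```

With this data, sensor "d" stays in its own neighborhood despite being the closest sensor to "c", which is the effect of accounting for heterogeneity, and the single spike at sensor "b" is flagged within the merged neighborhood of "a", "b", and "c". Spatial outliers (entire sensors) and coalesced outliers (time-lagged groups) are not covered by this sketch.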



    Information & Contributors

    Information

    Published In

    Data Mining and Knowledge Discovery, Volume 20, Issue 2
    March 2010
    136 pages

    Publisher

    Kluwer Academic Publishers

    United States


    Author Tags

    1. Outlier detection
    2. Sensors
    3. Spatial neighborhood

    Qualifiers

    • Article


    Cited By

    • (2018) Outlier Detection in Urban Traffic Data. Proceedings of the 8th International Conference on Web Intelligence, Mining and Semantics, pp 1-12. doi:10.1145/3227609.3227692. Online publication date: 25-Jun-2018
    • (2018) Support high-order tensor data description for outlier detection in high-dimensional big sensor data. Future Generation Computer Systems 81:C, pp 177-187. doi:10.1016/j.future.2017.10.013. Online publication date: 1-Apr-2018
    • (2015) Domain-driven co-location mining. Geoinformatica 19:1, pp 147-183. doi:10.1007/s10707-014-0209-3. Online publication date: 1-Jan-2015
    • (2015) STenSr. Knowledge and Information Systems 43:2, pp 333-353. doi:10.1007/s10115-014-0733-3. Online publication date: 1-May-2015
    • (2013) Detecting spatio-temporal outliers in crowdsourced bathymetry data. Proceedings of the Second ACM SIGSPATIAL International Workshop on Crowdsourced and Volunteered Geographic Information, pp 55-62. doi:10.1145/2534732.2534739. Online publication date: 5-Nov-2013
    • (2013) Multi-domain anomaly detection in spatial datasets. Knowledge and Information Systems 36:3, pp 749-788. doi:10.1007/s10115-012-0534-5. Online publication date: 1-Sep-2013
    • (2012) Probabilistic distance based abnormal pattern detection in uncertain series data. Knowledge-Based Systems 36, pp 182-190. doi:10.1016/j.knosys.2012.06.003. Online publication date: 1-Dec-2012
    • (2011) An Empirical Evaluation of Similarity Coefficients for Binary Valued Data. International Journal of Data Warehousing and Mining 7:2, pp 44-66. doi:10.4018/jdwm.2011040103. Online publication date: 1-Apr-2011
