forked from pridiltal/stray

stray {STReam AnomalY} : Robust Anomaly Detection in Data Streams with Concept Drift 🐶🐶🐶🐶🐱 🐶🐶


hoeztutar/stray

 
 


stray {STReam AnomalY}

Project Status: WIP. Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.


Anomaly Detection in High Dimensional Data Space

This package is a modification of the HDoutliers package. HDoutliers is a powerful algorithm for detecting anomalous observations in a dataset. Among other advantages, it can detect clusters of outliers in multi-dimensional data without requiring a model of the typical behavior of the system. However, it has some limitations that affect its accuracy. This package proposes solutions to those limitations and extends the algorithm to data streams that exhibit non-stationary behavior. The results show that the proposed algorithm improves accuracy and allows the trade-off between false positives and false negatives to be better balanced.

This package is still under development and this repository contains a development version of the R package stray.
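
To give a feel for why no model of typical behavior is needed, the sketch below scores each observation by the distance to its nearest neighbour, which is the kind of quantity HDoutliers-style detectors build on. It is only a base R illustration under simplifying assumptions: `nn_distance` is a hypothetical helper, not a function from stray or HDoutliers, and no thresholding rule is applied.

```r
# Hypothetical helper (not part of stray): distance from each observation
# to its nearest neighbour, computed by brute force in base R.
nn_distance <- function(x) {
  d <- as.matrix(dist(as.matrix(x)))  # all pairwise Euclidean distances
  diag(d) <- Inf                      # ignore each point's distance to itself
  apply(d, 1, min)                    # smallest remaining distance per row
}

set.seed(1234)
# Two well-separated clusters with a single isolated value between them
x <- c(rnorm(1000, mean = -6), 0, rnorm(1000, mean = 6))
score <- nn_distance(x)

# Position of the most isolated observation
which.max(score)
```

The isolated value sits at position 1001 of `x`, so that is the position a nearest-neighbour score would be expected to single out. Turning such scores into a decision is the part the package itself handles.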

Installation

You can install stray from GitHub with:

# install.packages("devtools")
devtools::install_github("pridiltal/stray")

Example

One dimensional data set with one outlier

library(stray)
require(ggplot2)
#> Loading required package: ggplot2
set.seed(1234)
data <- c(rnorm(1000, mean = -6), 0, rnorm(1000, mean = 6))
outliers <- find_HDoutliers(data)
display_HDoutliers(data, outliers)

plot of chunk onedim

Two dimensional data set with 8 outliers

set.seed(1234)
n <- 1000 # number of observations
nout <- 10 # number of outliers
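typical_data <- matrix(rnorm(2 * n), ncol = 2, byrow = TRUE)
out <- matrix(5 * runif(2 * nout, min = -5, max = 5), ncol = 2, byrow = TRUE)
# Name the columns up front so that as_tibble() does not have to repair them
colnames(typical_data) <- colnames(out) <- c("V1", "V2")
# as.tibble() is deprecated; as_tibble() is its replacement
data <- dplyr::bind_rows(tibble::as_tibble(out), tibble::as_tibble(typical_data))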
outliers <- find_HDoutliers(data)
display_HDoutliers(data, outliers)

plot of chunk twodim
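
When reading the detector's output for this example, it helps to know where the planted points are. The sketch below rebuilds the same data in base R (plain matrices instead of tibbles, which is an assumption made to keep it dependency-free) and compares the distance from the origin of the planted rows with that of the typical rows. The names `planted` and `radius` are illustrative only.

```r
set.seed(1234)
n <- 1000   # number of typical observations
nout <- 10  # number of planted points
typical_data <- matrix(rnorm(2 * n), ncol = 2, byrow = TRUE)
out <- matrix(5 * runif(2 * nout, min = -5, max = 5), ncol = 2, byrow = TRUE)
data <- rbind(out, typical_data)

# The planted points were bound first, so they occupy rows 1 to nout
planted <- seq_len(nout)

# Euclidean distance of every observation from the origin
radius <- sqrt(rowSums(data^2))

# Planted coordinates are drawn uniformly from [-25, 25] in each dimension,
# while the typical points are standard normal around the origin
summary(radius[planted])
summary(radius[-planted])
```

Because the planted coordinates are uniform on [-25, 25], nothing guarantees that every planted point lands far from the typical cloud, so the number of flagged observations need not equal `nout`.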

High dimensional data

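require(tourr) # provides the flea data set
# Two artificial observations far from the bulk of the data (6 numeric columns)
outpoints <- matrix(rnorm(12, mean = 200), nrow = 2)
# Column 7 of flea holds the species label, so it is dropped
colnames(outpoints) <- colnames(flea[, -7])
data <- rbind(flea[, -7], outpoints)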
outliers <- find_HDoutliers(data)
display_HDoutliers(data, outliers)
