CN105187383A - Abnormal behaviour detection method based on communication network - Google Patents

Info

Publication number
CN105187383A
Authority
CN
China
Prior art keywords
user
exceptional value
abnormal
value
addressee
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510475895.9A
Other languages
Chinese (zh)
Inventor
刘峤
刘瑶
秦志光
其他发明人请求不公开姓名
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-08-06
Filing date
2015-08-06
Publication date
2015-12-23
Application filed by University of Electronic Science and Technology of China
Priority to CN201510475895.9A
Publication of CN105187383A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an abnormal behaviour detection method based on a communication network. The abnormal behaviour detection method is capable of identifying abnormal behaviours in the communication network based on individual non-textual characteristics. The abnormal behaviour detection method disclosed by the invention can be widely applied to mining and analyzing user behaviours. The provided method comprises the following four steps: (1), dividing a given communication network into a plurality of network snapshots according to a time sequence; (2), extracting user data comprising the three characteristics, namely the communication traffic, the communication time distribution and the receiver frequency distribution, according to the network snapshots; (3), calculating abnormal values comprising the three characteristics, namely a communication traffic abnormal value, a communication time distribution abnormal value and a receiver frequency distribution abnormal value, according to the user data; and (4), standardizing the abnormal values through a conversion process, and converting the abnormal values into the same interval, so that the abnormal values are convenient to compare and analyze.

Description

An abnormal behaviour detection method based on a communication network
Technical field
The invention relates to the field of data mining, and in particular to a method for detecting abnormal behaviour.
Background technology
Mining user behaviour and analyzing abnormal behaviour are important research areas in data anomaly mining and insider threat detection.
A communication network is formed by communication services among many people, such as e-mail and telephone. Communication networks play an important role in daily life and offer an unprecedented opportunity to analyze and mine users' behaviour patterns and social relationships. User behaviour mining in communication networks has already been studied extensively, for example community mining, role analysis and simulation models.
Much recent work on communication networks has concentrated on mining individual behaviour models and on event mining. Anomaly detection, however, is closely tied to the underlying model, so how to define the normal model is an important research question.
The main challenge at present is how to model and represent user communication patterns conveniently and accurately. A commonly used technique is text-based semantic analysis, which derives user behaviour patterns and intentions by extracting and tracking the topics of text messages. However, privacy concerns and access restrictions make it difficult to obtain the content of user messages. Another popular technique mines user patterns from network structure and temporal attributes. Unlike this earlier work, our research focuses directly on the individual behaviour of users.
Tracking and monitoring how user behaviour evolves, and where it becomes abnormal, can help us predict potential threats and discover unknown events. Finding an effective way to study them is therefore very important. From the collected communication records we can obtain a network in which nodes represent user IDs and edges represent direct information exchange. A communication network is a typical time-series network and can be expressed as a series of snapshots. From a user's activity across the snapshots we can obtain a behaviour baseline for that user and detect the user's abnormal behaviour.
Summary of the invention
The invention provides an abnormal behaviour detection method based on a communication network. The method detects abnormal individual behaviour from the individual's historical behaviour, helping analysts quantify individual behavioural anomalies and providing related decision support.
From the communication records obtained, a communication network is first constructed. Nodes represent users and edges represent communication records. If sender u sends a message to receiver v at time t, a directed edge from u to v is created at time t and represented by the vector (u, v, t). The communication network is then divided into a series of snapshots according to a chosen time interval. When its time attribute is ignored, each snapshot can be regarded as a set of edges.
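The construction described above can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the function name `build_snapshots`, the numeric timestamps and the fixed-width `interval` are assumptions made for the example.

```python
from collections import defaultdict

def build_snapshots(records, interval):
    """Split communication records into consecutive snapshots.

    records  -- iterable of directed edges (u, v, t): sender, receiver, time
    interval -- snapshot width, in the same (assumed numeric) unit as t
    Returns a list of snapshots; each snapshot is a list of edges.
    """
    edges = sorted(records, key=lambda e: e[2])
    if not edges:
        return []
    start = edges[0][2]
    buckets = defaultdict(list)
    for u, v, t in edges:
        buckets[int((t - start) // interval)].append((u, v, t))
    # Keep empty snapshots so that list indices still line up with time.
    return [buckets.get(i, []) for i in range(max(buckets) + 1)]

# Hypothetical records: four messages over 250 time units.
records = [("a", "b", 0), ("a", "c", 30), ("b", "a", 100), ("a", "b", 250)]
snapshots = build_snapshots(records, interval=100)
```

With an interval of 100, the four example edges fall into three snapshots. Dropping the third element of each edge gives the plain edge set that the text refers to.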
Suppose $G = \{g_1, g_2, \ldots, g_m\}$ is the series of snapshots taken from the communication network. For each user, the basic information of the user in each snapshot is extracted first. We then focus on three non-textual features: communication traffic, communication time distribution and receiver frequency distribution.
To calculate a user's traffic abnormal value, the modified Z-score proposed by Iglewicz and Hoaglin, which is based on the median absolute deviation (MAD), is used; the absolute value of the modified Z-score, $|mz_i|$, is taken as the traffic abnormal value.
To calculate a user's communication time distribution abnormal value, the mean of all communication time distributions is used to define the baseline communication time distribution, and the Kullback-Leibler divergence is used to calculate the abnormal value.
To calculate a user's receiver frequency distribution abnormal value, a receiver who appears in k snapshots is defined to have frequency k. As above, a baseline receiver frequency distribution is defined, and the Kullback-Leibler divergence is used to calculate the abnormal value.
Finally, a transformation maps each abnormal value to a standard value in the interval [0, 1]. The standardized abnormal value can be interpreted as the likelihood of observing an anomaly, and it also makes abnormal behaviour much easier to compare across users.
Brief description of the drawing
Fig. 1 is the basic flowchart of the abnormal behaviour detection method proposed by the invention.
Embodiment
To make the purpose, technical solution and advantages of the embodiments of the invention clearer, the technical solution in the embodiments is described clearly and completely below with reference to the accompanying drawing. The described embodiments are evidently only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
Fig. 1 is the flowchart of the abnormal behaviour detection provided by the invention. The method may comprise the following steps:
101. Divide the network into snapshots by time interval:
A communication network is a typical time-series network and can be expressed as a series of snapshots. The communication network can be divided into several network snapshots according to a chosen time interval, ready for the next step of the analysis.
102. Extract user data from the network snapshots:
Once the network snapshots have been obtained, useful information about each user can be extracted from them. The invention focuses on three features: communication traffic, communication time distribution and receiver frequency distribution.
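As a rough sketch of this extraction step, the function below collects the raw counts behind the three features for one user in one snapshot. Several details are assumptions that the text does not fix: the name `extract_user_features`, counting only messages the user sends, and timestamps given in seconds so that the hour of day is `(t // 3600) % 24`.

```python
from collections import Counter

def extract_user_features(snapshot, user):
    """Raw per-snapshot counts for one user.

    snapshot -- list of edges (u, v, t), with t assumed to be in seconds
    user     -- sender whose behaviour is being profiled
    """
    sent = [(v, t) for (u, v, t) in snapshot if u == user]
    return {
        # Communication traffic: number of messages sent (an assumption;
        # the text does not say whether received messages are counted).
        "traffic": len(sent),
        # Message counts per hour of day, basis of the time distribution.
        "hours": Counter(int(t // 3600) % 24 for _, t in sent),
        # Message counts per receiver, basis of the receiver statistics.
        "receivers": Counter(v for v, _ in sent),
    }

# Hypothetical snapshot: user "a" sends three messages, "b" sends one.
snapshot = [
    ("a", "b", 9 * 3600 + 5),
    ("a", "b", 9 * 3600 + 50),
    ("a", "c", 14 * 3600),
    ("b", "a", 10 * 3600),
]
features = extract_user_features(snapshot, "a")
```

Dividing the hour and receiver counts by `traffic` turns them into the proportions that the later steps compare against a baseline.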
103. Construct the user baseline from the user data:
After the user data have been extracted, a user baseline is constructed from them. A baseline is usually the mean over a number of snapshot samples; having it makes the abnormal values straightforward to calculate.
104. Calculate abnormal values from the user data and the user baseline:
Feature anomalies are calculated for the three user features chosen in the invention, namely communication traffic, communication time distribution and receiver frequency distribution, as follows:
I. Communication traffic
A communication network is mainly used to pass information between users, so a user's communication traffic in the network is an important feature of that user's behaviour pattern. Suppose that the traffic within a given time interval remains relatively stable. Under this assumption, a change in a user's traffic can reflect the occurrence of some event in the real world. We use the modified Z-score to measure anomalies in a user's traffic series $\{n_1, n_2, \ldots, n_m\}$.
Z-scores are commonly used to flag outliers in numerical data. For a given data set $\{x_1, x_2, \ldots, x_n\}$, the Z-score of sample $x_i$ is calculated as:
$$z_i = \frac{x_i - \bar{x}}{s}$$
where
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}, \qquad s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}.$$
If the absolute value of $z_i$ exceeds 3, the corresponding $x_i$ is flagged as an outlier. This is also known as the three-sigma rule. However, because the mean $\bar{x}$ and the sample standard deviation $s$ are not robust (they are themselves shifted by outliers), the largest Z-score that can be obtained does not depend on the data values but only on the number of observations. The method is therefore not well suited to flagging outliers, especially in small data sets.
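The dependence on sample size can be checked numerically. With the $n-1$ denominator used above, the largest attainable $|z_i|$ is $(n-1)/\sqrt{n}$, a standard result that the patent text does not state explicitly. The sketch below (helper names are illustrative) shows that with ten observations even an extreme value cannot reach the threshold of 3.

```python
import math
import statistics

def z_scores(xs):
    """Classical Z-scores using the sample standard deviation (n - 1)."""
    mean = statistics.mean(xs)
    s = statistics.stdev(xs)
    return [(x - mean) / s for x in xs]

def max_abs_z(n):
    """Upper bound on |z_i| for a sample of size n."""
    return (n - 1) / math.sqrt(n)

# Nine identical observations and one extreme value.
data = [5.0] * 9 + [1e6]
worst = max(abs(z) for z in z_scores(data))
bound = max_abs_z(len(data))  # 9 / sqrt(10), roughly 2.846
```

Here `worst` equals `bound`, which is below 3, so the three-sigma rule would not flag the value `1e6` at all.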
To address this shortcoming, Iglewicz and Hoaglin used the median absolute deviation (MAD) to improve the Z-score method. For the data set above, the modified Z-score of $x_i$ is calculated as:
$$mz_i = \frac{0.6745\,(x_i - \tilde{x})}{\mathrm{MAD}}$$
where $\tilde{x}$ is the median of the data set and MAD is the median absolute deviation, that is, the median of $|x_i - \tilde{x}|$. The absolute value of the modified Z-score, $|mz_i|$, is taken as the abnormal value of $x_i$. The larger the abnormal value of a sample $x_i$, the further that sample departs from the average.
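A minimal sketch of the traffic abnormal value follows. The function name and the sample traffic series are made up for illustration, and raising an error when MAD is zero is an assumption: the text does not say how that case should be handled.

```python
import statistics

def traffic_abnormal_values(xs):
    """Absolute modified Z-scores |mz_i| based on the median and MAD."""
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    if mad == 0:
        # Assumption: more than half the values equal the median, so the
        # score is undefined; the caller must decide how to handle this.
        raise ValueError("MAD is zero; modified Z-score is undefined")
    return [abs(0.6745 * (x - med) / mad) for x in xs]

# Hypothetical per-snapshot message counts for one user.
traffic = [10, 12, 11, 13, 12, 60]
scores = traffic_abnormal_values(traffic)
```

For this series the median is 12 and MAD is 1, so the last snapshot scores 0.6745 × 48 ≈ 32.4, while snapshots with 12 messages score 0.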
II. Communication time distribution
Most users follow a fairly regular daily schedule, so a user's activity over a period of time can be regarded as periodic behaviour. We therefore take the communication time distribution as an important indicator for obtaining a user's normal behaviour model.
A user's communication time distribution $h^i$ gives, for snapshot $i$, the proportion of messages sent in each of the 24 hours of the day. What counts as a user's normal behaviour pattern depends heavily on the meaning of the chosen feature and can be defined in several ways. Here the mean of all communication time distributions is used to define the baseline communication time distribution:
$$T_t = \frac{\sum_{i=1}^{M} h_t^i}{M}$$
where $T_t$ is the $t$-th element of the baseline communication time distribution $T$, $h_t^i$ is the $t$-th element of the distribution in snapshot $i$, and $M$ is the number of snapshots. $T$ satisfies the condition of a discrete probability distribution, $\sum_t T_t = 1$.
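The baseline formula can be sketched as follows. The helper names are illustrative, and the input format (a list of sending hours, 0 to 23, for each snapshot) is an assumption; snapshots in which the user sent nothing would need separate handling, which the text does not describe.

```python
def hourly_distribution(hours):
    """Proportion of messages sent in each of the 24 hours (h^i)."""
    if not hours:
        raise ValueError("no messages in this snapshot")
    counts = [0] * 24
    for h in hours:
        counts[h] += 1
    return [c / len(hours) for c in counts]

def time_baseline(distributions):
    """Element-wise mean of the per-snapshot distributions (T)."""
    m = len(distributions)
    return [sum(d[t] for d in distributions) / m for t in range(24)]

# Hypothetical sending hours of one user in two snapshots.
h1 = hourly_distribution([9, 9, 10, 14])
h2 = hourly_distribution([9, 10, 10, 15])
baseline = time_baseline([h1, h2])
```

Because every $h^i$ sums to 1, their mean does as well, so `baseline` is itself a discrete probability distribution. In this example `baseline[9]` is (0.5 + 0.25) / 2 = 0.375.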
To obtain the abnormal value of the time distribution, this patent uses the Kullback-Leibler divergence to measure the difference between two discrete probability distributions. The Kullback-Leibler divergence is defined as follows:
$$D(P\,\|\,Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}$$
where $P$ is the baseline communication time distribution and $Q$ is an observed communication time distribution. The Kullback-Leibler divergence between two time distributions is 0 if and only if they are identical, and it is always non-negative. This patent therefore uses the Kullback-Leibler divergence to calculate the abnormal value of the communication time distribution in snapshot $m$.
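A small sketch of the divergence is given below. One practical point is not addressed in the text: if the observed distribution is zero in an hour where the baseline is positive, the divergence is infinite. The additive smoothing with `eps` used here is an assumption introduced only to keep the value finite, not something the patent prescribes.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """D(P || Q) for two discrete distributions of equal length.

    eps is an assumed smoothing constant: it is added to every entry and
    both distributions are renormalized, which avoids log(0) and division
    by zero.
    """
    ps = [x + eps for x in p]
    qs = [x + eps for x in q]
    zp, zq = sum(ps), sum(qs)
    return sum((a / zp) * math.log((a / zp) / (b / zq))
               for a, b in zip(ps, qs))

same = kl_divergence([0.5, 0.5], [0.5, 0.5])       # identical distributions
shifted = kl_divergence([0.5, 0.5], [0.9, 0.1])    # about 0.5108
with_zero = kl_divergence([0.5, 0.5], [1.0, 0.0])  # finite thanks to eps
```

The first argument plays the role of the baseline $P$ and the second the observed distribution $Q$, matching the definition above.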
III. Receiver frequency distribution
Mining receiver information is a very important research approach. Many users are in frequent direct contact with their receivers, and how often a receiver is contacted can reflect the user's social relationships and social status.
However, the receivers a user contacts in each snapshot are not fixed; they change over time. Analyzing user behaviour through the distribution over individual receivers is therefore very difficult, so we study the receiver frequency distribution of each snapshot instead.
To distinguish receivers by frequency, a receiver who appears in k snapshots is defined to have frequency k. A receiver with a high frequency is one who is contacted often.
Once the receiver frequencies have been determined, the communication records in each snapshot are counted. This yields every user's receiver frequency distribution in each snapshot, that is, the proportion of the snapshot's messages sent to each receiver. As above, a baseline receiver frequency distribution is also defined, and the Kullback-Leibler divergence is used as the abnormal value of the receiver frequency distribution in snapshot $m$.
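The text leaves the exact form of the receiver frequency distribution open. One possible reading, adopted here purely as an assumption, groups receivers by their frequency k and records, for each snapshot, the share of messages sent to receivers of each frequency. This gives every snapshot a distribution over the same support (k = 1 to the number of snapshots), so a mean baseline and the Kullback-Leibler divergence can be applied exactly as for the time distribution. All names below are illustrative.

```python
from collections import Counter

def receiver_frequencies(snapshots):
    """Frequency k of each receiver: number of snapshots it appears in.

    snapshots -- for one user, a list of lists of receiver ids,
                 one id per message sent in that snapshot
    """
    freq = Counter()
    for receivers in snapshots:
        freq.update(set(receivers))  # count each receiver once per snapshot
    return freq

def frequency_distribution(receivers, freq, num_snapshots):
    """Share of a snapshot's messages sent to receivers of frequency k."""
    if not receivers:
        raise ValueError("no messages in this snapshot")
    counts = [0] * num_snapshots
    for r in receivers:
        k = freq[r]
        if k == 0:
            raise ValueError(f"receiver {r!r} has no recorded frequency")
        counts[k - 1] += 1
    return [c / len(receivers) for c in counts]

# Hypothetical receivers of one user's messages in three snapshots.
snaps = [["a", "b", "a"], ["a", "c"], ["a"]]
freq = receiver_frequencies(snaps)
dist0 = frequency_distribution(snaps[0], freq, len(snaps))
```

In this example receiver `a` has frequency 3 and `b` and `c` have frequency 1, so the first snapshot's distribution is one third for k = 1 and two thirds for k = 3.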
105. Integrate the abnormal values to obtain standardized abnormal values:
Because the methods described above rest on different kinds of data and different calculations, the resulting abnormal values have different magnitudes and ranges, which makes it inconvenient to compare the abnormal behaviour of different users.
For this reason a transformation is introduced that maps each abnormal value to a standard value in the interval [0, 1]. The standardized abnormal value can be interpreted as the likelihood of observing an anomaly, and it also makes abnormal behaviour much easier to compare across users.
For a given set of abnormal values $\{s_1, s_2, \ldots, s_m\}$, the standardized value $ns_i$ is defined as
$$ns_i = \tanh(\theta \cdot s_i)$$
where $\theta$ is a tuning parameter. All standardized values lie in the interval [0, 1], and a value of 0 means that the observation agrees exactly with the baseline. We assume that the median $\tilde{s}$ of the set of abnormal values is mapped to 0.5, that is, $\tanh(\theta \tilde{s}) = 0.5$, so $\theta = \dfrac{\ln 3}{2\tilde{s}}$.
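The choice of $\theta$ follows from $\operatorname{artanh}(0.5) = \tfrac{1}{2}\ln\frac{1 + 0.5}{1 - 0.5} = \tfrac{1}{2}\ln 3$. A minimal sketch of the standardization step is shown below; the function name and sample values are illustrative, and rejecting a non-positive median is an assumption, since the text does not cover that case.

```python
import math
import statistics

def standardize(abnormal_values):
    """Map non-negative abnormal values into [0, 1] with tanh."""
    med = statistics.median(abnormal_values)
    if med <= 0:
        # Assumption: with a zero median, theta would be undefined.
        raise ValueError("median abnormal value must be positive")
    theta = math.log(3) / (2 * med)  # chosen so that tanh(theta * med) = 0.5
    return [math.tanh(theta * s) for s in abnormal_values]

# Hypothetical abnormal values for one feature across five snapshots.
values = [0.0, 1.0, 2.0, 4.0, 50.0]
normalized = standardize(values)
```

The median value 2.0 maps to 0.5, the value 0.0 stays at 0, and the extreme value 50.0 maps to just under 1, so values produced by the Z-score and by the Kullback-Leibler divergence end up on a common scale.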

Claims (6)

1. An abnormal behaviour detection method based on a communication network, characterized by comprising:
for the user communication records obtained, first constructing a series of communication network snapshots in time order; then extracting the users' non-textual features from the user data in the snapshots and obtaining user baselines; then calculating the users' abnormal values against those baselines; and finally integrating the abnormal values, which are on different scales, to obtain the final standardized abnormal values.
2. The method according to claim 1, characterized in that the non-textual features extracted from the user data in the communication network snapshots comprise three features: communication traffic, communication time distribution and receiver frequency distribution.
3. The method according to claim 2, characterized in that, after the user's communication traffic feature has been extracted, the traffic abnormal value is calculated using the modified Z-score, a larger abnormal value indicating a greater departure from the average.
4. The method according to claim 2, characterized in that, after the user's communication time distribution feature has been extracted, the mean of all communication time distributions is used to define the baseline communication time distribution, and the Kullback-Leibler divergence is then used to calculate the communication time distribution abnormal value.
5. The method according to claim 2, characterized in that, after the user's receiver frequency feature has been extracted, the baseline receiver frequency distribution is obtained, and the Kullback-Leibler divergence is then used to calculate the abnormal value of the receiver frequency distribution in a snapshot.
6. The method according to any one of claims 1 to 5, characterized in that, after the traffic abnormal value, the communication time distribution abnormal value and the receiver frequency distribution abnormal value have been calculated, the abnormal values are standardized by a formula so that they lie in the same interval and are convenient to compare and analyze.
CN201510475895.9A 2015-08-06 2015-08-06 Abnormal behaviour detection method based on communication network Pending CN105187383A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510475895.9A CN105187383A (en) 2015-08-06 2015-08-06 Abnormal behaviour detection method based on communication network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510475895.9A CN105187383A (en) 2015-08-06 2015-08-06 Abnormal behaviour detection method based on communication network

Publications (1)

Publication Number Publication Date
CN105187383A true CN105187383A (en) 2015-12-23

Family

ID=54909227

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510475895.9A Pending CN105187383A (en) 2015-08-06 2015-08-06 Abnormal behaviour detection method based on communication network

Country Status (1)

Country Link
CN (1) CN105187383A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8561184B1 (en) * 2010-02-04 2013-10-15 Adometry, Inc. System, method and computer program product for comprehensive collusion detection and network traffic quality prediction
CN103744994A (en) * 2014-01-22 2014-04-23 中国科学院信息工程研究所 Communication-network-oriented user behavior pattern mining method and system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
吴孙丹: ""基于聚类的入侵检测方法的研究"", 《中国优秀硕士学位论文全文数据库信息科技辑》 *
常淑影: ""基于流监测的网络流量异常检测算法研究与实现"", 《中国优秀硕士学位论文全文数据库信息科技辑》 *
李全刚,时金桥,秦志光,柳厅文: ""面向邮件网络事件检测的用户行为模式挖掘"", 《计算机学报》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106452955A (en) * 2016-09-29 2017-02-22 北京赛博兴安科技有限公司 Abnormal network connection detection method and system
CN106452955B (en) * 2016-09-29 2019-03-26 北京赛博兴安科技有限公司 A kind of detection method and system of abnormal network connection
CN107481090A (en) * 2017-07-06 2017-12-15 众安信息技术服务有限公司 A kind of user's anomaly detection method, device and system
WO2019007306A1 (en) * 2017-07-06 2019-01-10 众安信息技术服务有限公司 Method, device and system for detecting abnormal behavior of user
CN109035768A (en) * 2018-07-25 2018-12-18 北京交通大学 A kind of taxi detours the recognition methods of behavior
CN109035768B (en) * 2018-07-25 2020-11-06 北京交通大学 Method for identifying taxi detour behavior

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20151223